Hello Jim and Gabor,
Thanks for your inputs. The lines:

a <- as.matrix(read.table(pipe("awk -f cut.awk Data.file")))

where cut.awk contains:

{for (i = 1; i <= NF; i += 10) print $i, ""}

solved my problem. I know that 40k lines is not a large data set. I have
about 150 files, each of which has 40k rows, and in each file
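As a quick sanity check, the same awk program can be tried on a tiny hypothetical file (the file name and row contents below are made up, not from the real Data.file):

```shell
# Hypothetical 12-field row standing in for one line of Data.file
printf 'a b c d e f g h i j k l\n' > Data.file
# Same program as cut.awk: keep every 10th field (1, 11, 21, ...)
awk '{for (i = 1; i <= NF; i += 10) print $i, ""}' Data.file
```

Note that `print $i, ""` emits each selected field on its own line, which is why read.table above ends up seeing a single column of values.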
1. You can pipe your data through gawk (or another scripting language)
to preprocess it, as in:
http://tolstoy.newcastle.edu.au/R/e5/help/08/09/2129.html
2. read.csv.sql in the sqldf package on CRAN will set up a database
for you and read the file into the database, automatically defining the
layout of the table,
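The database approach in point 2 is an R/sqldf feature, but the underlying idea can be sketched with Python's standard sqlite3 module (the table name, column names, and data below are all made up for illustration):

```python
import csv
import io
import sqlite3

# Hypothetical CSV standing in for the large file
data = io.StringIO("x,y\n1,10\n2,20\n3,30\n4,40\n")

rows = list(csv.reader(data))
header, body = rows[0], rows[1:]

# Load the file into an in-memory database table
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE d (%s)" % ", ".join(header))
conn.executemany("INSERT INTO d VALUES (%s)" % ", ".join("?" * len(header)), body)

# Once the data is in a database, pulling a subset (e.g. every other row)
# is a single query instead of a full read into memory
subset = conn.execute("SELECT * FROM d WHERE rowid % 2 = 1").fetchall()
print(subset)
```

The point of the database route is that only the queried subset ever has to come back into the analysis environment.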
That does not seem like a "large" data set. How are you reading it?
How many columns does it have? What is "a lot of time" by your
definition? You have provided minimal data for obtaining help. I
commonly read in files with 300K rows in under 30 seconds. Maybe you
need to consider a relational database.
Hello All,
I have a 40k-row data set that is taking a lot of time to read in.
Is there a way to skip reading even- or odd-numbered rows, or to read in only
rows that are multiples of, say, 10? That way I would get the general trend of
the data without actually reading the entire thing. The option 'skip' in
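The row-sampling idea in the question can be sketched independently of R with a small made-up file (the contents below stand in for the 40k-row data set):

```python
import io

# Hypothetical file contents standing in for the 40k-row data set
f = io.StringIO("".join("row%d\n" % i for i in range(40)))

# Keep only every 10th row (indices 0, 10, 20, ...) to see the general trend
sampled = [line.strip() for i, line in enumerate(f) if i % 10 == 0]
print(sampled)
```

Each line is still scanned once, but only one in ten is kept in memory, which is usually where the time and space go.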