Re: [R] Conditional read-in of data

2009-11-04 Thread jim holtman
That does not seem like a large data set. How are you reading it? How many columns does it have? What is a lot of time by your definition? You have provided minimal data for obtaining help. I commonly read in files with 300K rows in under 30 seconds. Maybe you need to consider a relational

Re: [R] Conditional read-in of data

2009-11-04 Thread Gabor Grothendieck
1. You can pipe your data through gawk (or other scripting language) process as in: http://tolstoy.newcastle.edu.au/R/e5/help/08/09/2129.html 2. read.csv.sql in the sqldf package on CRAN will set up a database for you, read the file into the database automatically defining the layout of the
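
As a rough illustration of the first suggestion (filtering rows in awk before they reach R), here is a minimal sketch. It is not the code from the linked post. The file names sample_data.txt and every_tenth.txt are hypothetical, and the sample data is generated only to make the example self-contained.

```shell
# Build a small hypothetical sample file with 40 numbered rows.
seq 1 40 > sample_data.txt

# NR is awk's current line number; keep only rows where it is a multiple of 10.
awk 'NR % 10 == 0' sample_data.txt > every_tenth.txt

# Show the rows that survived the filter.
cat every_tenth.txt
```

The same awk command could presumably be handed to R through pipe() and read.table(), so that only the retained rows are ever parsed. Changing the condition to NR % 2 == 1 would keep the odd-numbered rows instead.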

Re: [R] Conditional read-in of data

2009-11-04 Thread mnstn
Hello Jim and Gabor, Thanks for your inputs. The lines: a <- as.matrix(read.table(pipe("awk -f cut.awk Data.file"))) cut.awk: {for(i = 1; i <= NF; i=i+10) print $i,} solved my problem. I know that 40k lines is not a large data set. I have about 150 files each of which has 40k rows and in each file I

[R] Conditional read-in of data

2009-11-03 Thread mnstn
Hello All, I have a 40k-row data set that is taking a lot of time to read in. Is there a way to skip reading even/odd numbered rows, or to read in only rows that are multiples of, say, 10? This way I get the general trend of the data w/o actually reading the entire thing. The option 'skip'