RE: [R] Performing Analysis on Subset of External data

2004-10-06 Thread Ted Harding
On 06-Oct-04 Laura Quinn wrote:
> I want to perform some analysis on subsets of huge data files.
> There are 20 of the files and I want to select the same subsets
> of each one (each subset is a chunk of 1500 or so consecutive
> rows from several million).
> To save time and processing power is the

Re: [R] Performing Analysis on Subset of External data

2004-10-06 Thread Prof Brian Ripley
1) Use the skip= and nrows= arguments to read.table.
2) Open a connection, read and discard rows, read the block you want, then close the connection. (Which is how 1 works, essentially.)
3) Use perl, awk or some such to extract the rows you want -- this is probably rather faster.

On Wed, 6 Oct
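
Suggestions 1 and 2 in this reply might look roughly like the following in R. This is only a minimal sketch: the temporary file, the column names `id` and `value`, and the starting row 5001 are hypothetical stand-ins for the poster's real data, which the thread does not describe.

```r
# Hypothetical demo file standing in for one of the large data files:
# one header line followed by 10000 data rows.
path <- tempfile(fileext = ".txt")
write.table(data.frame(id = 1:10000, value = (1:10000) * 2),
            file = path, row.names = FALSE, quote = FALSE)

first_row <- 5001   # first data row of the wanted block (assumed)
n_rows    <- 1500   # block size, as in the question
cols      <- c("id", "value")

# Suggestion 1: skip= and nrows= in read.table.
# skip= counts physical lines, so skipping `first_row` lines drops the
# header plus the (first_row - 1) data rows before the block; the column
# names therefore have to be supplied by hand.
block1 <- read.table(path, skip = first_row, nrows = n_rows,
                     col.names = cols)

# Suggestion 2: open a connection, read and discard lines, read the
# block, then close the connection.
con <- file(path, open = "r")
invisible(readLines(con, n = first_row))   # discard header + earlier rows
block2 <- read.table(con, nrows = n_rows, col.names = cols)
close(con)
```

To take the same block from each of the 20 files, the `read.table` call could be wrapped in `lapply()` over a vector of file paths. Both approaches still scan every line before the block, which is presumably why an external tool such as awk or perl is suggested as the faster option.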

Re: [R] Performing Analysis on Subset of External data

2004-10-06 Thread Thomas Lumley
On Wed, 6 Oct 2004, Laura Quinn wrote: Hi, I want to perform some analysis on subsets of huge data files. There are 20 of the files and I want to select the same subsets of each one (each subset is a chunk of 1500 or so consecutive rows from several million). To save time and processing power is th

Re: [R] Performing Analysis on Subset of External data

2004-10-06 Thread Gavin Simpson
Laura Quinn wrote: Hi, I want to perform some analysis on subsets of huge data files. There are 20 of the files and I want to select the same subsets of each one (each subset is a chunk of 1500 or so consecutive rows from several million). To save time and processing power is there a method to tell

[R] Performing Analysis on Subset of External data

2004-10-06 Thread Laura Quinn
Hi, I want to perform some analysis on subsets of huge data files. There are 20 of the files and I want to select the same subsets of each one (each subset is a chunk of 1500 or so consecutive rows from several million). To save time and processing power is there a method to tell R to *only* read