Re: [R] Fast way to determine number of lines in a file

2010-02-10 Thread Indian_R_Analyst
Hi Hadley, Hope this is what you are looking for. This approach provides the number of lines in a large 'bzip' file using chunks.

    testconn <- file("xyzxyz.csv.bz2", open = "r")
    csize <- 1
    nolines <- 0
    while ((readnlines <- length(readLines(testconn, csize))) > 0)
      nolines <- nolines + readnlines
    close(testconn)

Re: [R] Fast way to determine number of lines in a file

2010-02-09 Thread hadley wickham
> I was looking for a fast line counter as well a while ago and ended up writing a small function in R: countLines() in the R.utils package. At least at the time, it was faster than readLines() [for unknown reasons]. It is also more memory efficient. It supports connections. I don't think

Re: [R] Fast way to determine number of lines in a file

2010-02-09 Thread kMan
It depends on the type of file and your system. 'count.fields()' is impractical for large files because it generates a matrix with the same number of dimensions as the file. It would be easier to use scan() with the delimiter argument set up to read to the end-of-line marker, "\n" I believe, and the

[R] Fast way to determine number of lines in a file

2010-02-08 Thread Hadley Wickham
Hi all, Is there a fast way to determine the number of lines in a file? I'm looking for something like count.lines analogous to count.fields. Hadley -- http://had.co.nz/ __ R-help@r-project.org mailing list

Re: [R] Fast way to determine number of lines in a file

2010-02-08 Thread Romain Francois
Hi, parser::nlines does it in C. Romain

On 02/08/2010 03:16 PM, Hadley Wickham wrote:
> Hi all, Is there a fast way to determine the number of lines in a file? I'm looking for something like count.lines analogous to count.fields. Hadley

-- Romain Francois Professional R Enthusiast +33(0)

Re: [R] Fast way to determine number of lines in a file

2010-02-08 Thread Ken Knoblauch
Hadley Wickham <hadley at rice.edu> writes:
> Hi all, Is there a fast way to determine the number of lines in a file? I'm looking for something like count.lines analogous to count.fields. Hadley

How about something like length(readLines(fname))

Ken

Re: [R] Fast way to determine number of lines in a file

2010-02-08 Thread Gabor Grothendieck
If you are willing to use an external program, parse the result of:

    system("wc -l small.dat")
    10 small.dat

On Windows there is a wc.exe program in the Rtools distribution.

On Mon, Feb 8, 2010 at 9:16 AM, Hadley Wickham had...@rice.edu wrote:
> Hi all, Is there a fast way to determine the number
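A minimal sketch of parsing that output from R, assuming a Unix-like `wc` is on the PATH. The helper name `count_lines_wc` is made up for illustration; `system2()` and `shQuote()` are base R.

```r
# Hypothetical helper (not from the thread): count lines by calling `wc -l`.
# Assumes a `wc` executable is available on the PATH.
count_lines_wc <- function(path) {
  # Capture the output of wc, e.g. "10 small.dat" (some systems pad with spaces)
  out <- system2("wc", c("-l", shQuote(path)), stdout = TRUE)
  # The first whitespace-separated token is the count
  as.integer(strsplit(trimws(out), "[[:space:]]+")[[1]][1])
}
```

Note that `wc -l` counts newline characters, so a file whose last line has no trailing newline is reported as one line short.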

Re: [R] Fast way to determine number of lines in a file

2010-02-08 Thread Hadley Wickham
> parser::nlines does it in C.

Looks promising, but I need something that uses connections because I'm working with big bzipped files.

Hadley -- http://had.co.nz/

Re: [R] Fast way to determine number of lines in a file

2010-02-08 Thread hadley wickham
Hi Ken,

> How about something like length(readLines(fname))

I'm trying to avoid the overhead of reading the file in twice. (I'm trying to preallocate a data structure for a chunked read.)

Hadley -- http://had.co.nz/
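For context, a rough sketch of the two-pass pattern described here: count the lines first, preallocate, then fill the result chunk by chunk. The function name `read_preallocated` and the chunk size are assumptions for illustration, and the first pass uses plain chunked `readLines()`, which is exactly the cost a faster line counter would reduce.

```r
# Hypothetical sketch (not from the thread): preallocate, then read in chunks.
read_preallocated <- function(path, chunk_lines = 10000L) {
  # Pass 1: count lines without keeping them in memory
  con <- file(path, open = "r")
  n <- 0L
  while ((k <- length(readLines(con, chunk_lines))) > 0L) n <- n + k
  close(con)

  # Pass 2: fill a preallocated character vector chunk by chunk
  out <- character(n)
  con <- file(path, open = "r")
  on.exit(close(con))
  pos <- 0L
  while (length(chunk <- readLines(con, chunk_lines)) > 0L) {
    out[pos + seq_along(chunk)] <- chunk
    pos <- pos + length(chunk)
  }
  out
}
```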

Re: [R] Fast way to determine number of lines in a file

2010-02-08 Thread Romain Francois
On 02/08/2010 04:16 PM, Hadley Wickham wrote:
>> parser::nlines does it in C.
> Looks promising, but I need something that uses connections because I'm working with big bzipped files. Hadley

Ah... the lack of C-level API for connections again ;-)

-- Romain Francois Professional R Enthusiast

Re: [R] Fast way to determine number of lines in a file

2010-02-08 Thread Henrik Bengtsson
I was looking for a fast line counter as well a while ago and ended up writing a small function in R: countLines() in the R.utils package At least at the time, it was faster than readLines() [for unknown reasons]. It is also more memory efficient. It supports connections. I don't think it
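One way to count lines over a connection without building character vectors is to read raw bytes in chunks and count the newline bytes. The sketch below uses base R only; the name `count_newlines` and the default chunk size are assumptions, and it is not claimed to be how countLines() in R.utils is implemented.

```r
# Hypothetical helper (not part of R.utils): count "\n" bytes in raw chunks.
count_newlines <- function(con, chunk_bytes = 1e6) {
  # Open the connection in binary mode if the caller has not done so
  if (!isOpen(con)) {
    open(con, "rb")
    on.exit(close(con))
  }
  n <- 0  # double, so very large files do not overflow an integer
  repeat {
    buf <- readBin(con, what = "raw", n = chunk_bytes)
    if (length(buf) == 0L) break
    n <- n + sum(buf == as.raw(10L))  # 0x0A is the newline byte
  }
  n
}
```

Usage would look like `count_newlines(bzfile("xyzxyz.csv.bz2"))`, with the file name borrowed from the example elsewhere in this thread. A final line without a trailing newline is not counted.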