Re: [R] Suggestion for big files [was: Re: A comment about R:]

Prof Brian Ripley Fri, 06 Jan 2006 00:12:11 -0800

[Just one point extracted: Hadley Wickham has answered the random sample one]


On Thu, 5 Jan 2006, François Pinard wrote:

[Brian Ripley]

One problem with Francois Pinard's suggestion (the credit has got lost)
is that R's I/O is not line-oriented but stream-oriented.  So selecting
lines is not particularly easy in R.


I understand that you mean random access to lines, instead of random
selection of lines.  Once again, this chat comes out of reading someone
else's problem; this is not a problem I actually have.  SPSS was not
randomly accessing lines, as data files could well be held on magnetic
tapes, where random access is not possible in practice.  SPSS reads
(or was reading) lines sequentially from beginning to end, and the
_random_ sample is built while the reading goes on.
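
[Illustration: the single-pass idea described above, a random sample
built while lines are read sequentially from beginning to end, can be
sketched with reservoir sampling.  This is only a sketch under stated
assumptions: nothing in this thread says SPSS used this particular
algorithm, the function name reservoir_sample is hypothetical, and it
is written in Python rather than R purely for illustration.]

```python
import io
import random


def reservoir_sample(lines, k, rng=None):
    """Pick k lines uniformly at random from an iterable in one pass.

    Hypothetical helper for illustration; it never seeks backwards,
    so it would also work on a tape-like, sequential-only source.
    """
    if rng is None:
        rng = random.Random()
    sample = []
    for i, line in enumerate(lines):
        if i < k:
            # Fill the reservoir with the first k lines.
            sample.append(line)
        else:
            # Line i replaces a random reservoir slot with probability k/(i+1).
            j = rng.randint(0, i)  # both ends inclusive
            if j < k:
                sample[j] = line
    return sample


# Usage: an in-memory stream stands in for a large file.
data = io.StringIO("".join(f"record {n}\n" for n in range(1000)))
picked = reservoir_sample(data, 5, random.Random(42))
short = reservoir_sample(io.StringIO("a\nb\n"), 5, random.Random(1))
```

Every line is still read exactly once, so this approach bounds memory
use but does not avoid the cost of reading the lines that are not kept.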

That was not my point.  R's standard I/O is through connections, which
allow for pushbacks, changing line endings and re-encoding character
sets.  That does add overhead compared to C/Fortran line-buffered
reading of a file.  Skipping lines you do not need will take longer
than you might guess (based on some limited experience).


--
Brian D. Ripley,                  [EMAIL PROTECTED]
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford,             Tel:  +44 1865 272861 (self)
1 South Parks Road,                     +44 1865 272866 (PA)
Oxford OX1 3TG, UK                Fax:  +44 1865 272595

______________________________________________
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
