Brian Ripley wrote:

R-devel now has some improved versions of read.table and write.table.

For a million-row data frame containing one number, one factor with few levels and one logical column (a 56Mb object):
generating it takes 4.5 secs.
calling summary() on it takes 2.2 secs.
writing it takes 8 secs and an additional 10Mb.
saving it in .rda format takes 4 secs.
reading it naively takes 28 secs and an additional 240Mb.
reading it carefully (using nrows, colClasses and comment.char) takes 16 secs and an additional 150Mb (56Mb of which is for the object read in). (The overhead of read.table over scan was about 2 secs, mainly in the conversion back to a factor.)
loading from .rda format takes 3.4 secs.
[R 2.0.1 read in 23 secs using an additional 210Mb, and wrote in 50 secs using an additional 450Mb.]
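For concreteness, a minimal R sketch of this kind of test; the column names, number of factor levels and the temporary file are illustrative, and the nrows/colClasses/comment.char settings are the ones mentioned above.

    ## Roughly the object described above: one numeric, one factor with
    ## few levels, one logical column, a million rows.
    n  <- 1e6
    df <- data.frame(x  = rnorm(n),
                     g  = factor(sample(letters[1:4], n, replace = TRUE)),
                     ok = sample(c(TRUE, FALSE), n, replace = TRUE))

    f <- tempfile()
    write.table(df, f, row.names = FALSE)

    ## "Naive" read: read.table must guess the column classes, watch for
    ## comment characters and work out the number of rows itself.
    naive <- read.table(f, header = TRUE)

    ## "Careful" read: supply nrows and colClasses and turn off comment
    ## processing, as in the timings above.
    careful <- read.table(f, header = TRUE, nrows = n,
                          colClasses = c("numeric", "factor", "logical"),
                          comment.char = "")

    ## .rda save/load for comparison.
    rda <- tempfile(fileext = ".rda")
    save(df, file = rda)
    load(rda)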
Will Frank Harrell or someone else please explain to me a real application in which this is not fast enough?

---------------------------------------------------------------------------
Brian - I really appreciate your work on this, and the data. The wise use of read.table that you mentioned should be fine for almost everything I do. There may be other users who need to read larger datasets for which memory usage is an issue. They can speak for themselves though.
Sincerely,
Frank

--
Frank E Harrell Jr
Professor and Chair, Department of Biostatistics
School of Medicine, Vanderbilt University
______________________________________________
R-devel@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel