Re: [R] R File IO Slow?
I decided to run an experiment: just reading in a 78 MB binary file of ints. It takes less than 30 s on a laptop with 512 MB of RAM and a single 2.3 GHz Intel-4 processor. When I ran it, I had not noticed that Ramzi was talking about a .RData file.

For huge files, I usually do not save my files. I rerun the R code whenever I need the results: the entire exercise usually takes a few minutes at most. If something takes very long, I save its output to a file and read from there. I have found this to be more efficient (besides helping me reproduce my results).

HTH!
Ranjan

On Thu, 01 Mar 2007 13:04:54 -0500 "Roger D. Peng" <[EMAIL PROTECTED]> wrote:

> A 27MB .RData file is relatively big, in my experience. What do you think is
> slow? Maybe it's your computer that is slow?
>
> -roger
>
> ramzi abboud wrote:
> > Is R file IO slow in general or am I missing
> > something? It takes me 5 minutes to do a load(MYFILE)
> > where MYFILE is a 27 MB Rdata file. Is there any way
> > to speed this up?
> >
> > The one idea I have is having R call a C or Perl
> > routine, reading the file in that language, converting
> > the data into R objects, then sending them back into
> > R. This is more work than I want to do, however, in
> > loading Rdata files.
> >
> > Any ideas would be appreciated.
> > Ramzi Aboud
> > University of Rochester
>
> --
> Roger D. Peng | http://www.biostat.jhsph.edu/~rpeng/
__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
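Ranjan's experiment of reading a binary file of ints could look roughly like the sketch below. The message does not show the actual code, so the use of writeBin()/readBin(), the temporary file, and the vector length are all assumptions made for illustration; the file here is about 4 MB rather than 78 MB.

```r
# Hypothetical vector length and file path, chosen only for illustration.
n <- 1000000L
path <- tempfile(fileext = ".bin")

ints <- sample.int(1000L, n, replace = TRUE)

# Write the integers as raw 4-byte values, without any R object header.
con <- file(path, open = "wb")
writeBin(ints, con, size = 4L)
close(con)

# Read them back and time the read. readBin() needs an upper bound on
# the number of values to read, so the length must be known or stored.
con <- file(path, open = "rb")
read_time <- system.time(
  ints_back <- readBin(con, what = "integer", n = n, size = 4L)
)
close(con)

print(read_time)
unlink(path)
```

Unlike a .RData file, a raw binary file like this carries no names, types, or attributes, so the reading code has to know what the file contains.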
Re: [R] R File IO Slow?
On Thu, 2007-03-01 at 09:22 -0800, ramzi abboud wrote:
> Is R file IO slow in general or am I missing
> something? It takes me 5 minutes to do a load(MYFILE)
> where MYFILE is a 27 MB Rdata file. Is there any way
> to speed this up?
>
> The one idea I have is having R call a C or Perl
> routine, reading the file in that language, converting
> the data into R objects, then sending them back into
> R. This is more work than I want to do, however, in
> loading Rdata files.
>
> Any ideas would be appreciated.
> Ramzi Aboud
> University of Rochester

Here are some timings on my system, which runs Linux on a 3.2 GHz P4 with 2 GB of RAM and a 7200 rpm HD. I typically get around 28 MB/sec throughput on this drive, which is about 15% lower than normal, as it is an encrypted partition using 256-bit AES.

> Vec <- 1:1500

> system.time(save(Vec, file = "Vec.RData"))
[1] 33.297  0.565 38.889  0.000  0.000

# File is ~29 MB
> file.info("Vec.RData")$size
[1] 30112009

> system.time(load("Vec.RData"))
[1] 5.607 0.167 6.575 0.000 0.000

Not terribly burdensome...

You might want to check that you are not low on RAM, which would cause a lot of swapping to disk, or that you do not simply have a slow drive.

HTH,

Marc Schwartz
Re: [R] R File IO Slow?
It is not slow on my system. The file was 34 MB on disk and took about 37 seconds to write out (probably mostly disk I/O on my laptop) and 12 seconds to read in after I flushed the system cache.

> x <- runif(27e6/4)  # creates a 34MB file on disk
> object.size(x)
[1] 54000024
> system.time(save.image('test.xx'))
[1] 23.16  0.40 37.93    NA    NA
> gc()
          used (Mb) gc trigger (Mb) max used (Mb)
Ncells  258153  6.9     531268 14.2   350000  9.4
Vcells 6864025 52.4    8380235 64.0  6865364 52.4
> rm(x)
> gc()
         used (Mb) gc trigger (Mb) max used (Mb)
Ncells 258150  6.9     531268 14.2   350000  9.4
Vcells 114019  0.9    6704187 51.2  6865364 52.4
> system.time(load('test.xx'))  # without flushing system cache it takes 4 seconds
[1] 3.64 0.01 4.07   NA   NA
> gc()
          used (Mb) gc trigger (Mb) max used (Mb)
Ncells  258153  6.9     531268 14.2   350000  9.4
Vcells 6864025 52.4    7911488 60.4  6870189 52.5
> system.time(load('test.xx'))  # after flushing system cache
[1]  3.48  0.11 12.12    NA    NA
>

So it must be something else on your system.

Jim Holtman

"What is the problem you are trying to solve?"

----- Original Message -----
From: ramzi abboud <[EMAIL PROTECTED]>
To: r-help@stat.math.ethz.ch
Sent: Thursday, March 1, 2007 12:22:22 PM
Subject: [R] R File IO Slow?

Is R file IO slow in general or am I missing
something? It takes me 5 minutes to do a load(MYFILE)
where MYFILE is a 27 MB Rdata file. Is there any way
to speed this up?

The one idea I have is having R call a C or Perl
routine, reading the file in that language, converting
the data into R objects, then sending them back into
R. This is more work than I want to do, however, in
loading Rdata files.

Any ideas would be appreciated.
Ramzi Aboud
University of Rochester
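The timings in this thread separate CPU time from elapsed time, and the gap between the two is what points at disk I/O or swapping rather than at R itself. A minimal sketch of reading system.time() that way follows; the data size and temporary file are stand-ins chosen for illustration, and the "waiting" figure is only a rough heuristic, not a precise measurement.

```r
# Stand-in data and a temporary file, used only for illustration.
x <- runif(1e6)
path <- tempfile(fileext = ".RData")
save(x, file = path)

# Load into a scratch environment so the workspace is left untouched.
timing <- system.time(load(path, envir = new.env()))

cpu  <- timing[["user.self"]] + timing[["sys.self"]]  # time spent computing
wall <- timing[["elapsed"]]                           # wall-clock time
waiting <- max(wall - cpu, 0)                         # roughly: time not on the CPU

cat(sprintf("CPU: %.2f s, elapsed: %.2f s, waiting: %.2f s\n",
            cpu, wall, waiting))
unlink(path)
```

If elapsed time is far larger than CPU time, as in the 12-second load after flushing the cache, the process is mostly waiting on the disk (or on swap) and the hardware or memory pressure is the place to look.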
Re: [R] R File IO Slow?
A 27MB .RData file is relatively big, in my experience. What do you think is slow? Maybe it's your computer that is slow?

-roger

ramzi abboud wrote:
> Is R file IO slow in general or am I missing
> something? It takes me 5 minutes to do a load(MYFILE)
> where MYFILE is a 27 MB Rdata file. Is there any way
> to speed this up?
>
> The one idea I have is having R call a C or Perl
> routine, reading the file in that language, converting
> the data into R objects, then sending them back into
> R. This is more work than I want to do, however, in
> loading Rdata files.
>
> Any ideas would be appreciated.
> Ramzi Aboud
> University of Rochester

--
Roger D. Peng | http://www.biostat.jhsph.edu/~rpeng/
Re: [R] R File IO Slow?
Just an idea: two things that can slow down save()/load() are saving in ASCII format and saving in a compressed binary format. If either is the case for MYFILE, try resaving it in a non-compressed binary format. See ?save for details.

/HB

On 3/1/07, ramzi abboud <[EMAIL PROTECTED]> wrote:
> Is R file IO slow in general or am I missing
> something? It takes me 5 minutes to do a load(MYFILE)
> where MYFILE is a 27 MB Rdata file. Is there any way
> to speed this up?
>
> The one idea I have is having R call a C or Perl
> routine, reading the file in that language, converting
> the data into R objects, then sending them back into
> R. This is more work than I want to do, however, in
> loading Rdata files.
>
> Any ideas would be appreciated.
> Ramzi Aboud
> University of Rochester
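The resaving suggested above might be sketched as follows. The file names are temporary stand-ins for MYFILE and the data are made up for illustration; only the ascii and compress arguments of save() come from ?save. Whether the uncompressed file actually loads faster depends on the machine and the data, so the timings are printed rather than assumed.

```r
# Stand-in for MYFILE: a compressed .RData file with made-up contents.
old_file <- tempfile(fileext = ".RData")
x <- runif(1e6)
save(x, file = old_file, compress = TRUE)

# Load everything into a scratch environment, then resave the same
# objects as uncompressed binary.
scratch <- new.env()
loaded <- load(old_file, envir = scratch)   # names of the restored objects
new_file <- tempfile(fileext = ".RData")
save(list = loaded, envir = scratch, file = new_file,
     ascii = FALSE, compress = FALSE)

# Compare file sizes and load times for the two versions.
sizes <- file.info(c(old_file, new_file))$size
names(sizes) <- c("compressed", "uncompressed")
print(sizes)
print(system.time(load(old_file, envir = new.env())))
print(system.time(load(new_file, envir = new.env())))

unlink(c(old_file, new_file))
```

The uncompressed file is larger on disk, so the trade-off is CPU time spent decompressing against extra bytes read from the drive.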