Re: [R] R File IO Slow?

2007-03-01 Thread Ranjan Maitra
I decided to run an experiment: just reading in a 78 MB file in binary 
format (of ints). It took less than 30 s on a laptop with 512 MB RAM and a 2.3 
GHz Intel-4 single processor. At that point, I had not noticed that Ramzi was 
talking about a .RData file.
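
A minimal sketch of that kind of experiment, assuming a much smaller vector than the 78 MB file described above. The names n, ints, and bin_file are illustrative, and the timing will depend on the machine:

```r
# Write a vector of integers to a raw binary file, then time reading it back.
n <- 1000000L                  # hypothetical size; scale up to match your data
ints <- sample.int(1000L, n, replace = TRUE)

bin_file <- tempfile(fileext = ".bin")
con <- file(bin_file, open = "wb")
writeBin(ints, con)            # integers are written at their natural size (4 bytes)
close(con)

con <- file(bin_file, open = "rb")
elapsed <- system.time(
  ints_back <- readBin(con, what = "integer", n = n)
)
close(con)

print(elapsed)                   # user, system, and elapsed seconds
print(file.info(bin_file)$size)  # size of the binary file in bytes
```

This reads raw binary data, not a .RData file, so it measures plain disk input rather than the cost of load().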

For huge files, I usually do not save the workspace. I rerun the R code whenever I 
need it: the entire exercise usually takes a few minutes at most. If 
a step takes very long, I save its output to a file and read it back from 
there. I have found this to be more efficient (and it helps in reproducing 
my results).
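
The workflow described above could be sketched roughly as follows. The helper cached() and the names cache_file and slow_step are hypothetical, and saveRDS()/readRDS() are used here for brevity even though they were added to R after this thread was written:

```r
# Hypothetical helper: recompute a slow result only when no saved copy exists.
cached <- function(path, compute) {
  if (file.exists(path)) {
    return(readRDS(path))      # fast path: reuse the saved output
  }
  result <- compute()          # slow path: run the expensive step once
  saveRDS(result, path)
  result
}

cache_file <- tempfile(fileext = ".rds")   # hypothetical cache location
slow_step <- function() {
  Sys.sleep(0.1)               # stand-in for a long computation
  cumsum(as.numeric(1:1000))
}

first  <- cached(cache_file, slow_step)   # computes the result and writes the file
second <- cached(cache_file, slow_step)   # reads the file instead of recomputing
```

Only the slow step is written to disk; everything else is regenerated by rerunning the script, which keeps the results reproducible.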

HTH!
Ranjan


On Thu, 01 Mar 2007 13:04:54 -0500 "Roger D. Peng" <[EMAIL PROTECTED]> wrote:

> A 27MB .RData file is relatively big, in my experience.  What do you think is 
> slow?  Maybe it's your computer that is slow?
> 
> -roger
> 
> ramzi abboud wrote:
> > Is R file IO slow in general or am I missing
> > something?  It takes me 5 minutes to do a load(MYFILE)
> > where MYFILE is a 27 MB Rdata file.  Is there any way
> > to speed this up?  
> > 
> > The one idea I have is having R call a C or Perl
> > routine, reading the file in that language, converting
> > the data into R objects, then sending them back into
> > R.  This is more work than I want to do, however, in
> > loading Rdata files.
> > 
> > Any ideas would be appreciated.
> > Ramzi Aboud
> > University of Rochester
> 
> -- 
> Roger D. Peng  |  http://www.biostat.jhsph.edu/~rpeng/

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] R File IO Slow?

2007-03-01 Thread Marc Schwartz
On Thu, 2007-03-01 at 09:22 -0800, ramzi abboud wrote:
> Is R file IO slow in general or am I missing
> something?  It takes me 5 minutes to do a load(MYFILE)
> where MYFILE is a 27 MB Rdata file.  Is there any way
> to speed this up?  
> 
> The one idea I have is having R call a C or Perl
> routine, reading the file in that language, converting
> the data into R objects, then sending them back into
> R.  This is more work than I want to do, however, in
> loading Rdata files.
> 
> Any ideas would be appreciated.
> Ramzi Aboud
> University of Rochester

Here are some timings on my system, which runs Linux on a 3.2 GHz P4
with 2 GB of RAM and a 7200 rpm HD. I typically get around 28 MB/sec
throughput on this drive, which is about 15% lower than normal, as it is
an encrypted partition using 256-bit AES.


> Vec <- 1:15000000

> system.time(save(Vec, file = "Vec.RData"))
[1] 33.297  0.565 38.889  0.000  0.000

# File is ~29 MB
> file.info("Vec.RData")$size
[1] 30112009

> system.time(load("Vec.RData"))
[1] 5.607 0.167 6.575 0.000 0.000


Not terribly burdensome...

You might want to be sure that you are not low on RAM, resulting in a
lot of swapping to disk, or perhaps just a slow drive.
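
One way to check whether memory is the bottleneck is to look at what the file actually holds once it is loaded. A minimal sketch, assuming a small stand-in file rather than the real MYFILE; the names rdata_file, env, and sizes are illustrative:

```r
# Create a small stand-in for MYFILE so the example is self-contained.
rdata_file <- tempfile(fileext = ".RData")
Vec <- 1:100000
save(Vec, file = rdata_file)
rm(Vec)

# Load into a separate environment so the global workspace is not touched.
env <- new.env()
timing <- system.time(loaded <- load(rdata_file, envir = env))
print(timing)

# Size in bytes of each restored object.
sizes <- sapply(loaded, function(nm) as.numeric(object.size(get(nm, envir = env))))
print(sizes)

# Memory currently used by R after the load.
print(gc())
```

If the object sizes reported here approach the amount of physical RAM, swapping is a plausible explanation for a slow load().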

HTH,

Marc Schwartz



Re: [R] R File IO Slow?

2007-03-01 Thread jim holtman
It is not slow on my system.  The file was 34MB on disk and took about 37 
seconds to write out (probably mostly disk I/O on my laptop) and 12 seconds to 
read in after I flushed the system cache.

> x <- runif(27e6/4)  # creates a 34MB file on disk
> object.size(x)
[1] 54000024
> system.time(save.image('test.xx'))
[1] 23.16  0.40 37.93    NA    NA
> gc()
          used (Mb) gc trigger (Mb) max used (Mb)
Ncells  258153  6.9     531268 14.2   350000  9.4
Vcells 6864025 52.4    8380235 64.0  6865364 52.4
> rm(x)
> gc()
         used (Mb) gc trigger (Mb) max used (Mb)
Ncells 258150  6.9     531268 14.2   350000  9.4
Vcells 114019  0.9    6704187 51.2  6865364 52.4
> system.time(load('test.xx'))  # without flushing system cache it takes 4 seconds
[1] 3.64 0.01 4.07   NA   NA
> gc()
          used (Mb) gc trigger (Mb) max used (Mb)
Ncells  258153  6.9     531268 14.2   350000  9.4
Vcells 6864025 52.4    7911488 60.4  6870189 52.5
> system.time(load('test.xx'))  # after flushing system cache
[1]  3.48  0.11 12.12    NA    NA
> 

So it must be something else on your system.

 
Jim Holtman

"What is the problem you are trying to solve?"



----- Original Message -----
From: ramzi abboud <[EMAIL PROTECTED]>
To: r-help@stat.math.ethz.ch
Sent: Thursday, March 1, 2007 12:22:22 PM
Subject: [R] R File IO Slow?


Is R file IO slow in general or am I missing
something?  It takes me 5 minutes to do a load(MYFILE)
where MYFILE is a 27 MB Rdata file.  Is there any way
to speed this up?  

The one idea I have is having R call a C or Perl
routine, reading the file in that language, converting
the data into R objects, then sending them back into
R.  This is more work than I want to do, however, in
loading Rdata files.

Any ideas would be appreciated.
Ramzi Aboud
University of Rochester









Re: [R] R File IO Slow?

2007-03-01 Thread Roger D. Peng
A 27MB .RData file is relatively big, in my experience.  What do you think is 
slow?  Maybe it's your computer that is slow?

-roger

ramzi abboud wrote:
> Is R file IO slow in general or am I missing
> something?  It takes me 5 minutes to do a load(MYFILE)
> where MYFILE is a 27 MB Rdata file.  Is there any way
> to speed this up?  
> 
> The one idea I have is having R call a C or Perl
> routine, reading the file in that language, converting
> the data into R objects, then sending them back into
> R.  This is more work than I want to do, however, in
> loading Rdata files.
> 
> Any ideas would be appreciated.
> Ramzi Aboud
> University of Rochester

-- 
Roger D. Peng  |  http://www.biostat.jhsph.edu/~rpeng/



Re: [R] R File IO Slow?

2007-03-01 Thread Henrik Bengtsson
Just an idea: two things that can slow down save()/load() are saving
in ASCII format and saving in a compressed binary format.  If either is
the case for MYFILE, try resaving it in a non-compressed binary format.
See ?save for details.
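
A small sketch of that comparison, using the ascii and compress arguments documented in ?save. The test data x and the temporary file names are illustrative, and the relative timings will vary by machine and data:

```r
x <- runif(1e5)   # hypothetical test data

f_compressed   <- tempfile(fileext = ".RData")
f_uncompressed <- tempfile(fileext = ".RData")
f_ascii        <- tempfile(fileext = ".RData")

# Save the same object in the three formats under discussion.
save(x, file = f_compressed,   compress = TRUE)
save(x, file = f_uncompressed, compress = FALSE)
save(x, file = f_ascii,        ascii = TRUE, compress = FALSE)

# Report file size and load time for each format.
for (f in c(f_compressed, f_uncompressed, f_ascii)) {
  e <- new.env()
  timing <- system.time(load(f, envir = e))
  cat(basename(f), file.info(f)$size, "bytes, elapsed",
      timing[["elapsed"]], "s\n")
}
```

To convert an existing file, load() it once and save() the restored objects again with compress = FALSE.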

/HB

On 3/1/07, ramzi abboud <[EMAIL PROTECTED]> wrote:
> Is R file IO slow in general or am I missing
> something?  It takes me 5 minutes to do a load(MYFILE)
> where MYFILE is a 27 MB Rdata file.  Is there any way
> to speed this up?
>
> The one idea I have is having R call a C or Perl
> routine, reading the file in that language, converting
> the data into R objects, then sending them back into
> R.  This is more work than I want to do, however, in
> loading Rdata files.
>
> Any ideas would be appreciated.
> Ramzi Aboud
> University of Rochester
