> -----Original Message-----
> From: r-help-boun...@r-project.org
> [mailto:r-help-boun...@r-project.org] On Behalf Of Saptarshi Guha
> Sent: Sunday, August 15, 2010 9:23 AM
> To: R-help@r-project.org
> Subject: [R] time of serialization
>
> Hello,
> I have question about the overhead in lapply.
> x is a list of 3000 lists. Each of the i (1<=i<=3000) list elements is
> pair of two elements: a string vector and a data frame
>
> x is roughly 235MB.
>
> > gc()
> ##
>
> > z <- system.time(y <- lapply(x,function(r){
>     system.time(serialize(r,NULL))['elapsed']
>   }))
> > sum(unlist(y))
> 18.812
> > z
>    user  system elapsed
> 494.144   0.041 494.247
>
> So, the entire lapply takes ~26 times longer than the sum of the
> individual operations.
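For anyone who wants to repeat the experiments, a list matching that description could be built roughly as follows. This is only a sketch: the helper name make_x, the element names, the column names and the sizes are all assumptions for illustration, not the data from the original post, and the result will be much smaller than 235MB unless the sizes are increased.

```r
# Hypothetical helper: builds a list of n pairs, each holding a character
# vector and a data frame. All sizes and names are illustrative assumptions.
make_x <- function(n = 3000, n_strings = 5, n_rows = 100) {
  lapply(seq_len(n), function(i) {
    list(
      key   = paste0("id", i, "_", seq_len(n_strings)),            # string vector
      value = data.frame(a = rnorm(n_rows), b = seq_len(n_rows))   # data frame
    )
  })
}

x <- make_x()
length(x)                             # number of list elements
print(object.size(x), units = "MB")   # rough in-memory size of the test list
```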
Your test involves calling serialize(), system.time(), and `[`(), and
the anonymous function of 'r' 3000 times from lapply, so why pick on
lapply() as the culprit?

I made a 3000 long list 'x' according to your description and tried
the following experiments (I didn't bother to put [ into the mix):

   > system.time(lapply(x, function(xi)serialize(xi, NULL)))
      user  system elapsed
      0.21    0.02    0.22
   > system.time(lapply(x, serialize, NULL))
      user  system elapsed
      0.18    0.00    0.20
   > system.time(lapply(x, serialize, NULL))
      user  system elapsed
      0.20    0.00    0.22
   > system.time(lapply(x, function(xi)serialize(xi, NULL)))
      user  system elapsed
      0.19    0.00    0.20
   > system.time(lapply(x, function(xi)system.time(serialize(xi, NULL))))
      user  system elapsed
    103.17    0.03  101.47
   > system.time(lapply(x, function(xi)system.time(1.0)))
      user  system elapsed
    102.88    0.11  100.89
   > system.time(for(i in 1:3000)system.time(1.0))
      user  system elapsed
     48.82    0.33   48.50
   > system.time(for(xi in x)system.time(1.0))
      user  system elapsed
     97.06    0.35   97.70

It looks like system.time() eats the time and the following
experiment indicates that its call to gc() eats most of its time:

   > system.time(lapply(x, function(xi)system.time(serialize(xi, NULL), gcFirst=FALSE)))
      user  system elapsed
      0.79    0.02    0.78

You can use a profiled version of R to get this information
but some quick experimentation works pretty well.

Bill Dunlap
Spotfire, TIBCO Software
wdunlap tibco.com

>
> Have i missed something?
>
> Regards
> Saptarshi
>
> ______________________________________________
> R-help@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide
> http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
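The comparison between the default system.time() and gcFirst=FALSE can be put in one small script. What follows is a minimal sketch, not the exact code behind the timings in this thread: the stand-in list 'x' and its contents are assumptions, and it is kept short so the script finishes quickly. The third timing calls gc() once per element on its own, which gives a rough idea of how much of the overhead comes from the garbage collection that system.time() triggers by default.

```r
# Small stand-in list; its length and contents are illustrative assumptions.
x <- lapply(1:200, function(i) list(letters, data.frame(a = rnorm(50))))

# Per-element timing with the default gcFirst = TRUE:
# gc() runs before every inner system.time() call.
with_gc <- system.time(
  lapply(x, function(xi) system.time(serialize(xi, NULL)))
)

# Same per-element timing, but without the garbage collection.
no_gc <- system.time(
  lapply(x, function(xi) system.time(serialize(xi, NULL), gcFirst = FALSE))
)

# Cost of gc() alone, called once per list element.
gc_only <- system.time(for (i in seq_along(x)) gc())

# Collect the elapsed times side by side.
elapsed <- c(with_gc = with_gc[["elapsed"]],
             no_gc   = no_gc[["elapsed"]],
             gc_only = gc_only[["elapsed"]])
print(elapsed)
```

If only the total is of interest, timing the whole lapply() once from the outside avoids the per-element gc() calls altogether.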