Yes, that's actually quite helpful. I think this gets to the point that Patrick Burns was making in Ch. 4 of The R Inferno. My guess is that the "sapply" version is faster because it breaks the problem into row-sized chunks.
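For my own reference, here is a minimal sketch of how I would check that the three approaches give the same answer before timing them. It uses only base R (system.time rather than rbenchmark). The smaller sims value, the seed, the n_steps variable, and the names y_apply, y_loop and y_sapply are my own assumptions for illustration, not part of the original test.

```r
## Smaller version of the simulation from the thread; sims is cut from
## 1e5 to 1e3 (my assumption) so that the check runs in well under a second.
set.seed(1)
mu <- 0.05; sigma <- 0.20; dt <- 0.25; sims <- 1e3
n_steps <- 200                      # same as T/dt = 50/0.25 in the original
phi <- matrix(rnorm(n_steps * sims), nrow = sims, ncol = n_steps)
ds  <- mu * dt + sigma * sqrt(dt) * phi
ds1 <- 1 + ds                       # add 1 once, outside the timed code

## apply() over rows returns an n_steps x sims matrix, so transpose back
y_apply <- t(apply(ds1, 1, cumprod))

## for loop writing into a preallocated result matrix
y_loop <- matrix(NA_real_, nrow(ds1), ncol(ds1))
for (i in seq_len(nrow(ds1))) y_loop[i, ] <- cumprod(ds1[i, ])

## sapply() over row indices; the result also needs t()
y_sapply <- t(sapply(seq_len(nrow(ds1)), function(i) cumprod(ds1[i, ])))

## All three should agree exactly
stopifnot(identical(y_apply, y_loop), identical(y_loop, y_sapply))

## Rough timings; the numbers will vary by machine and R version
system.time(for (k in 1:20) t(apply(ds1, 1, cumprod)))
system.time(for (k in 1:20) t(sapply(seq_len(nrow(ds1)),
                                     function(i) cumprod(ds1[i, ]))))
```

Computing 1+ds once up front keeps that cost out of the comparison, which is the point Allan made about the temporary matrix.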
This is a prototype of a function that I will probably use quite often, so it's worthwhile for me to optimize. Thank you for the detailed response and for the benchmark syntax.

On Sat, Jul 10, 2010 at 8:44 AM, Allan Engelhardt <all...@cybaea.com> wrote:

> On 09/07/10 21:19, Duncan Murdoch wrote:
>
>> On 09/07/2010 4:11 PM, Gene Leynes wrote:
>>
>>> I thought the "apply" functions are faster than for loops, but my most
>>> recent test shows that apply actually takes significantly longer than a
>>> for loop. Am I missing something?
>>
>> Probably not. apply() needs to figure out the shape of the results it
>> gets from each row in order to put them into the final result matrix. You
>> know that in advance, and set up the result to hold them, so your
>> calculation would be more efficient.
>
> Plus, in the way it is set up, apply has to make a much larger temporary
> matrix. If you do
>
> ds1<- 1+ds
> library("rbenchmark")
> benchmark(apply=apply(1+ds, 1, cumprod),
>           forloop=for (i in 1:nrow(ds)){ y2[i,]<-cumprod(1+ds[i,]) },
>           apply2=apply(ds1, 1, cumprod),
>           replications=1,
>           columns=c("test","elapsed","relative","user.self","sys.self"),
>           order="elapsed")
> #      test elapsed relative user.self sys.self
> # 2 forloop   1.863 1.000000     1.861    0.000
> # 3  apply2   2.175 1.167472     1.934    0.239
> # 1   apply   2.443 1.311326     2.108    0.334
>
> you can sense some of the impact of that.
>
> But if it is speed you are after, sapply may be even faster (you'll need to
> t() the result again):
>
> benchmark(forloop=for (i in 1:nrow(ds)){ y2[i,]<-cumprod(1+ds[i,]) },
>           sapply={dsone<-1+ds;sapply(1:NROW(dsone), function(i)
>                   cumprod(dsone[i,]))},
>           replications=1,
>           columns=c("test","elapsed","relative","user.self","sys.self"),
>           order="elapsed")
> #      test elapsed relative user.self sys.self
> # 2  sapply   1.539 1.000000     1.300    0.239
> # 1 forloop   1.878 1.220273     1.878    0.000
> zz<- sapply(1:NROW(ds1), function(i) cumprod(dsone[i,]))
> identical(t(zz), y2)
> # [1] TRUE
>
> Hope this helps a little.
>
> Allan
>
>> The *apply functions are designed to be convenient and clear to read, not
>> necessarily fast.
>>
>> Duncan Murdoch
>>
>>> It doesn't matter much if I do column wise calculations rather than row
>>> wise
>>>
>>> ## Example of how apply is SLOWER than for loop:
>>>
>>> #rm(list=ls())
>>>
>>> ## DEFINE VARIABLES
>>> mu=0.05 ; sigma=0.20 ; dt=.25 ; T=50 ; sims=1e5
>>> timesteps = T/dt
>>>
>>> ## MAKE PHI AND DS
>>> phi = matrix(rnorm(timesteps*sims), nrow=sims, ncol=timesteps)
>>> ds = mu*dt + sigma * sqrt(dt) * phi
>>>
>>> ## USE APPLY TO CALCULATE ROWWISE CUMULATIVE PRODUCT
>>> system.time(y1 <- apply(1+ds, 1, cumprod))
>>> ## UNTRANSFORM Y1, BECAUSE ROW APPLY FLIPS THE MATRIX
>>> y1=t(y1)
>>>
>>> ## USE FOR LOOP TO CALCULATE ROWWISE CUMULATIVE PRODUCT
>>> y2=matrix(NA,nrow(ds),ncol(ds))
>>> system.time(
>>>   for (i in 1:nrow(ds)){
>>>     y2[i,]<-cumprod(1+ds[i,])
>>>   }
>>> )
>>>
>>> ## COMPARE RESULTS TO MAKE SURE THEY DID THE SAME THING
>>> str(y1)
>>> str(y2)
>>> all(y1==y2)
>>>
>>> ______________________________________________
>>> R-help@r-project.org mailing list
>>> https://stat.ethz.ch/mailman/listinfo/r-help
>>> PLEASE do read the posting guide
>>> http://www.R-project.org/posting-guide.html
>>> and provide commented, minimal, self-contained, reproducible code.