Yes, that's actually quite helpful.

I think this gets to the point that Patrick Burns was making in Ch. 4 of
The R Inferno.  It looks like "sapply" breaks the problem into row-sized
chunks, which makes it more efficient.

This is a prototype of a function that I will probably use quite often, so
it is worth optimizing.  Thank you for the detailed response and for the
benchmark syntax.
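
For the archive, here is a scaled-down sketch of the comparison, using only
base R.  The smaller "sims" value, the seed, and the names "horizon",
"y_loop" and "y_sapply" are my own choices for illustration, not part of the
original posts.  Timings will vary by machine and R version, so treat them as
indicative only.

```r
## Scaled-down version of the example quoted below.
## ASSUMPTION: sims is reduced from 1e5 to 1e3 so this runs in a moment.
set.seed(1)
mu <- 0.05; sigma <- 0.20; dt <- 0.25; horizon <- 50; sims <- 1e3
timesteps <- horizon / dt

phi <- matrix(rnorm(timesteps * sims), nrow = sims, ncol = timesteps)
ds  <- mu * dt + sigma * sqrt(dt) * phi
ds1 <- 1 + ds                      # add 1 once, outside any loop

## Preallocated for loop: one row of cumulative products per iteration
y_loop <- matrix(NA_real_, nrow(ds1), ncol(ds1))
t_loop <- system.time(
    for (i in seq_len(nrow(ds1))) y_loop[i, ] <- cumprod(ds1[i, ])
)

## sapply over row indices: each row's result comes back as a column,
## so the matrix has to be transposed afterwards
t_sapply <- system.time(
    y_sapply <- t(sapply(seq_len(nrow(ds1)), function(i) cumprod(ds1[i, ])))
)

## Both approaches should give exactly the same matrix
stopifnot(identical(y_loop, y_sapply))
print(rbind(forloop = t_loop, sapply = t_sapply)[, 1:3])
```

With the full sims=1e5 the gap should be easier to see, but I would rerun
the benchmark below on the real data before relying on either version.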


On Sat, Jul 10, 2010 at 8:44 AM, Allan Engelhardt <all...@cybaea.com> wrote:

>
>
> On 09/07/10 21:19, Duncan Murdoch wrote:
>
>> On 09/07/2010 4:11 PM, Gene Leynes wrote:
>>
>>> I thought the "apply" functions are faster than for loops, but my most
>>> recent test shows that apply actually takes significantly longer than a
>>> for loop.  Am I missing something?
>>>
>>
>> Probably not.  apply() needs to figure out the shape of the results it
>> gets from each row in order to put them into the final result matrix.  You
>> know that in advance, and set up the result to hold them, so your
>> calculation would be more efficient.
>>
>
> Plus, in the way it is set up, apply has to make a much larger temporary
> matrix.  If you do
>
> ds1<- 1+ds
> library("rbenchmark")
> benchmark(apply=apply(1+ds, 1, cumprod),
>           forloop=for (i in 1:nrow(ds)){ y2[i,]<-cumprod(1+ds[i,]) },
>           apply2=apply(ds1, 1, cumprod),
>           replications=1,
>           columns=c("test","elapsed","relative","user.self","sys.self"),
>           order="elapsed")
> #      test elapsed relative user.self sys.self
> # 2 forloop   1.863 1.000000     1.861    0.000
> # 3  apply2   2.175 1.167472     1.934    0.239
> # 1   apply   2.443 1.311326     2.108    0.334
>
>
>
> you can sense some of the impact of that.
>
> But if it is speed you are after, sapply may be even faster (you'll need to
> t() the result again):
>
> benchmark(forloop=for (i in 1:nrow(ds)){ y2[i,]<-cumprod(1+ds[i,]) },
>           sapply={dsone<-1+ds;
>                   sapply(1:NROW(dsone), function(i) cumprod(dsone[i,]))},
>           replications=1,
>           columns=c("test","elapsed","relative","user.self","sys.self"),
>           order="elapsed")
> #      test elapsed relative user.self sys.self
> # 2  sapply   1.539 1.000000     1.300    0.239
> # 1 forloop   1.878 1.220273     1.878    0.000
> zz<- sapply(1:NROW(ds1), function(i) cumprod(ds1[i,]))
> identical(t(zz), y2)
> # [1] TRUE
>
>
> Hope this helps a little.
>
> Allan
>
>
>
>> The *apply functions are designed to be convenient and clear to read, not
>> necessarily fast.
>>
>> Duncan Murdoch
>>
>>> It doesn't matter much if I do column-wise calculations rather than
>>> row-wise.
>>>
>>> ## Example of how apply is SLOWER than for loop:
>>>
>>> #rm(list=ls())
>>>
>>> ## DEFINE VARIABLES
>>> mu=0.05 ; sigma=0.20 ; dt=.25 ; T=50 ; sims=1e5
>>> timesteps = T/dt
>>>
>>> ## MAKE PHI AND DS
>>> phi = matrix(rnorm(timesteps*sims), nrow=sims, ncol=timesteps)
>>> ds = mu*dt + sigma * sqrt(dt) * phi
>>>
>>> ## USE APPLY TO CALCULATE ROWWISE CUMULATIVE PRODUCT
>>> system.time(y1 <- apply(1+ds, 1, cumprod))
>>> ## UNTRANSFORM Y1, BECAUSE ROW APPLY FLIPS THE MATRIX
>>> y1=t(y1)
>>>
>>> ## USE FOR LOOP TO CALCULATE ROWWISE CUMULATIVE PRODUCT
>>> y2=matrix(NA,nrow(ds),ncol(ds))
>>> system.time(
>>>    for (i in 1:nrow(ds)){
>>>        y2[i,]<-cumprod(1+ds[i,])
>>>    }
>>> )
>>>
>>> ## COMPARE RESULTS TO MAKE SURE THEY DID THE SAME THING
>>> str(y1)
>>> str(y2)
>>> all(y1==y2)
>>>
>>>
>>> ______________________________________________
>>> R-help@r-project.org mailing list
>>> https://stat.ethz.ch/mailman/listinfo/r-help
>>> PLEASE do read the posting guide
>>> http://www.R-project.org/posting-guide.html
>>> and provide commented, minimal, self-contained, reproducible code.
>>>
>>>
>
