Prof Brian Ripley wrote:

On Thu, 27 Jan 2005, Paul Gilbert wrote:

A few weeks ago I noticed

z <- matrix(rnorm(20000),10000,2)


system.time(for (i in 1:1000) apply(z,2,sum))

[1] 13.44 0.48 14.08 0.00 0.00

system.time(for (i in 1:1000) rep(1,10000) %*% z)

[1] 6.46 0.11 6.84 0.00 0.00


So both run in a few milliseconds for rather large problems.

which seemed  completely contrary to all my childhood teachings. Now

system.time(for (i in 1:1000) crossprod(rep(1,10000), z))

[1] 1.90 0.12 2.24 0.00 0.00

makes sense because it is suppose to be faster than %*% , but why is apply so slow?


`so slow' sic: what are you going to do in the 7ms you saved?

Yes, I think I've spent more time checking this than I will ever save.

(And should I go back and change apply in my code everywhere or is this likely to reverse again?)


It's not likely. apply is an R-level loop, and %*% is a C-level one.
However, %*% is not supposed to be much slower than crossprod, and the devil is in the details of how the BLAS is implemented: the code is very similar.


That %*% was faster than apply has been true in all my (17 years) of S/R experience. Your childhood may predate S3, of course.

I didn't say the teachings were correct. I clearly should have questioned some things sooner.


I still think one should use row/colSums for clarity with acceptable
efficiency. It must be very unusual for these operations to be a dominant part of a calculation, so let's not lose proportion here.

Agreed. I think I like the clarity reason best.

Thanks for the explanation.

Paul

______________________________________________
R-devel@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel

Reply via email to