> | Hi all,
> |
> | Inspired by "Rcpp is smoking fast for agent-based models in data
> | frames" (http://www.babelgraph.org/wp/?p=358), I've been doing some
>
> [ I liked that post, but we got flak afterwards as his example was not well
> chosen. The illustration of the language speed difference does of course
> hold. ]

The example is not particularly well chosen, but I think the problem of
vectorisation is a real one. To vectorise code in R you need to have a
big R vocabulary; to vectorise code in Rcpp, you need to be able to
write a loop. So even if it's not a completely fair comparison to R,
it's still reasonable because it's much easier to vectorise in C++.

> | exploration of vectorisation in R vs C++ at
> | https://gist.github.com/4111256
> |
> | I have five versions of the basic vaccinate function:
> |
> | * vacc1: vectorisation in R with a for loop
> | * vacc2: used vectorised R primitives
> | * vacc3: vectorised with loop in C++
> | * vacc4: vectorised with Rcpp sugar
> | * vacc5: vectorised with Rcpp sugar, explicitly labelled as containing
> |   no missing values
> |
> | And the timings I get are as follows:
> |
> | Unit: microseconds
> |                    expr     min      lq  median      uq      max  neval
> | vacc1(age, female, ily)  6816.8  7139.4  7285.7  7823.9  10055.5    100
> | vacc2(age, female, ily)   194.5   202.6   212.6   227.9    260.4    100
> | vacc3(age, female, ily)    21.8    22.4    23.4    24.9     35.5    100
> | vacc4(age, female, ily)    36.2    38.7    41.3    44.5     55.6    100
> | vacc5(age, female, ily)    29.3    31.3    34.0    36.4     52.1    100
> |
> | Unsurprisingly the R loop (vacc1) is very slow, and proper
> | vectorisation speeds it up immensely. Interestingly, however, the C++
> | loop still does considerably better (about 10x faster) - I'm not sure
> | exactly why this is the case, but I suspect it may be because it
> | avoids the many intermediate vectors that R requires. The sugar
> | version is about half as fast, but this gets quite a bit faster with
> | explicit no missing flags.
> |
> | I'd love any feedback on my code (https://gist.github.com/4111256) -
> | please let me know if I've missed anything obvious.
>
> I don't have a problem with sugar being a little slower than hand-rolling.
> The code is so much simpler and shorter. And we're still way faster than
> vectorised R. I like that place.

Agreed - most of the time these tiny differences are not going to
matter, and you're better off writing expressive code that you can
actually understand. But my feeling again is that this sort of implicit
vectorisation is hard for many people to get their heads around, and
there are many R users who feel more comfortable with explicit loops.
It's good to help educate these people into more vectorised/whole-object
ways of thinking, but it's also nice to have a high-performance
fallback.

Hadley

-- 
RStudio / Rice University
http://had.co.nz/
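
To make the "intermediate vectors" suspicion from the quoted message concrete, here is a rough sketch in plain standard C++ (no Rcpp, so it compiles on its own). It is not the code from the gist: the formula, the function names vacc_loop and vacc_stepwise, and the use of double for ily are all assumptions made up for illustration. The first function does everything in one pass with a single output allocation, the way a hand-written loop would; the second mimics the whole-vector style, where each step produces a full-length temporary.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// NOTE: the formula below is a hypothetical stand-in, not taken from the thread.

// Explicit loop: one pass over the data, one output vector allocated.
std::vector<double> vacc_loop(const std::vector<double>& age,
                              const std::vector<bool>& female,
                              const std::vector<double>& ily) {
    assert(age.size() == female.size() && age.size() == ily.size());
    std::vector<double> out(age.size());
    for (std::size_t i = 0; i < age.size(); ++i) {
        double p = 0.25 + 0.3 / (1.0 - std::exp(0.04 * age[i])) + 0.1 * ily[i];
        p *= female[i] ? 1.25 : 0.75;
        out[i] = std::min(1.0, std::max(0.0, p));  // clamp to [0, 1]
    }
    return out;
}

// Whole-vector style: every step walks the data again and fills its own
// full-length temporary, roughly what a chain of vectorised operations does.
std::vector<double> vacc_stepwise(const std::vector<double>& age,
                                  const std::vector<bool>& female,
                                  const std::vector<double>& ily) {
    assert(age.size() == female.size() && age.size() == ily.size());
    const std::size_t n = age.size();
    std::vector<double> e(n), base(n), scale(n), p(n), out(n);
    for (std::size_t i = 0; i < n; ++i) e[i] = std::exp(0.04 * age[i]);
    for (std::size_t i = 0; i < n; ++i)
        base[i] = 0.25 + 0.3 / (1.0 - e[i]) + 0.1 * ily[i];
    for (std::size_t i = 0; i < n; ++i) scale[i] = female[i] ? 1.25 : 0.75;
    for (std::size_t i = 0; i < n; ++i) p[i] = base[i] * scale[i];
    for (std::size_t i = 0; i < n; ++i)
        out[i] = std::min(1.0, std::max(0.0, p[i]));
    return out;
}
```

Both functions should return the same values for the same inputs; the difference is only in how many passes and allocations they make. Whether that accounts for the gap in the timings is still a guess, and this sketch does not measure it.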
