On 7 January 2011 at 12:54, Romain Francois wrote:
| On 05/01/11 14:01, Dirk Eddelbuettel wrote:
| >
| > On 5 January 2011 at 10:55, Cedric Ginestet wrote:
| > | Dear All,
| > |
| > | Here are some simulations that I have run this morning. Romain's suggestion to
| > | compute xV.size() before the loop and Douglas' idea of using accumulate appear
| > | to work best. However, both are substantially slower than the r-base function.
| > |
| > | I have also included two more versions: (i) one similar to Romain's but using
| > | pre-incrementation in the loop and (ii) one using the iterator in the loop.
| > | Another option may be to use the C++ boost library. I don't know if anyone on
| > | this list has experience with using boost.
| > |
| > | See the results of the simulations below (N=1000 data sets).
| > | Ced
| > |
| > | #####################################################################
| > | ## Functions.
| > | Summing1<- cxxfunction(signature(x="numeric"), '
| > |     NumericVector xV(x);
| > |     double out = sum(xV);
| > |     return wrap(out);
| > | ',plugin="Rcpp")
| > | Summing2<- cxxfunction(signature(x="numeric"), '
| > |     NumericVector xV(x);
| > |     double out = 0.0;
| > |     for(int i=0; i<xV.size(); i++) out += xV[i];
| > |     return wrap(out);
| > | ',plugin="Rcpp")
| > | Summing3<- cxxfunction(signature(x="numeric"), '
| > |     NumericVector xV(x);
| > |     double out = 0.0; int N=xV.size();
| > |     for(int i=0; i<N; i++) out += xV[i];
| > |     return wrap(out);
| > | ',plugin="Rcpp")
| > | Summing4<- cxxfunction(signature(x="numeric"), '
| > |     NumericVector xV(x);
| > |     return wrap(std::accumulate(xV.begin(), xV.end(), double()));
| > | ',plugin="Rcpp")
| > | Summing5<- cxxfunction(signature(x="numeric"), '
| > |     NumericVector xV(x);
| > |     double out = 0.0; int N=xV.size();
| > |     for(int i=0; i<N; ++i) out += xV[i];
| > |     return wrap(out);
| > | ',plugin="Rcpp")
| > | Summing6<- cxxfunction(signature(x="numeric"), '
| > |     NumericVector xV(x);
| > |     double out = 0.0;
| > |     for(NumericVector::iterator i=xV.begin(); i!=xV.end(); ++i) out += *i;
| > |     return wrap(out);
| > | ',plugin="Rcpp")
| > |
| > | #####################################################################
| > | ## Simulation: Time Testing.
| > | n<- 1000000; N<- 1000
| > | time.Sum<- matrix(0,N,7);
| > | for(i in 1:N){
| > |     x<- rnorm(n)
| > |     time.Sum[i,1]<- system.time(Summing1(x))[3];
| > |     time.Sum[i,2]<- system.time(Summing2(x))[3];
| > |     time.Sum[i,3]<- system.time(Summing3(x))[3];
| > |     time.Sum[i,4]<- system.time(Summing4(x))[3];
| > |     time.Sum[i,5]<- system.time(Summing5(x))[3];
| > |     time.Sum[i,6]<- system.time(Summing6(x))[3];
| > |     time.Sum[i,7]<- system.time(sum(x))[3];
| > | }# i
| > | time.df<- data.frame(time.Sum)
| > | names(time.df)<- c("Sugar","Rcpp","Rcpp_N","Accumulate","Pre-increment","Iterator","R")
| > | boxplot(time.df)
| > |
| > | #####################################################################
| > | ## RESULTS:
| > | formatC(summary(time.df),dec=3)
| > |        Sugar                Rcpp                 Rcpp_N
| > | " Min.   :0.01600  "  " Min.   :0.01000  "  "Min.   :0.005000  "
| > | " 1st Qu.:0.01600  "  " 1st Qu.:0.01000  "  "1st Qu.:0.005000  "
| > | " Median :0.01600  "  " Median :0.01100  "  "Median :0.006000  "
| > | " Mean   :0.01631  "  " Mean   :0.01060  "  "Mean   :0.005668  "
| > | " 3rd Qu.:0.01600  "  " 3rd Qu.:0.01100  "  "3rd Qu.:0.006000  "
| > | " Max.   :0.03700  "  " Max.   :0.02400  "  "Max.   :0.020000  "
| > |        Accumulate           Pre-increment        Iterator
| > | "Min.   :0.005000  "  "Min.   :0.005000  "  " Min.   :0.01000  "
| > | "1st Qu.:0.005000  "  "1st Qu.:0.005000  "  " 1st Qu.:0.01000  "
| > | "Median :0.006000  "  "Median :0.006000  "  " Median :0.01100  "
| > | "Mean   :0.005714  "  "Mean   :0.005697  "  " Mean   :0.01065  "
| > | "3rd Qu.:0.006000  "  "3rd Qu.:0.006000  "  " 3rd Qu.:0.01100  "
| > | "Max.   :0.029000  "  "Max.   :0.021000  "  " Max.   :0.03100  "
| > |        R
| > | "Min.   :0.002000  "
| > | "1st Qu.:0.002000  "
| > | "Median :0.002000  "
| > | "Mean   :0.002211  "
| > | "3rd Qu.:0.002000  "
| > | "Max.   :0.004000  "
| > | #####################################################################
| > |
| > | PS: Apologies to Dirk as I have not followed his advice, yet.
| >
| > Try this instead:
| >
| > ## Summing1 to Summing6 as above
| >
| > Summing1a<- cxxfunction(signature(x="numeric"), '
| >     NumericVector xV(x);
| >     double out = sum(noNA(xV));
| >     return wrap(out);
| > ',plugin="Rcpp")
| >
| > library(rbenchmark)
| > n<- 1000000
| > N<- 1000
| > x<- rnorm(n)
| >
| > bm<- benchmark(Sugar     = Summing1(x),
| >                SugarNoNA = Summing1a(x),
| >                Rcpp      = Summing2(x),
| >                Rcpp_N    = Summing3(x),
| >                Accumulate= Summing4(x),
| >                PreIncrem = Summing5(x),
| >                Iterator  = Summing6(x),
| >                R         = function(x){ sum(x) },
| >                columns=c("test", "elapsed", "relative", "user.self", "sys.self"),
| >                order="relative",
| >                replications=N)
| > print(bm)
| >
| > which on my box gets this
| >
| > e...@max:/tmp$ Rscript cedric.R
| > Loading required package: methods
| >         test elapsed relative user.self sys.self
| > 8          R   0.003     1.00      0.00        0
| > 5 Accumulate   1.212   404.00      1.22        0
| > 2  SugarNoNA   1.214   404.67      1.22        0
| > 6  PreIncrem   1.214   404.67      1.21        0
| > 4     Rcpp_N   1.215   405.00      1.21        0
| > 7   Iterator   5.301  1767.00      5.30        0
| > 3       Rcpp   5.302  1767.33      5.30        0
| > 1      Sugar   7.229  2409.67      7.21        0
| > e...@max:/tmp$
| >
| > indicating that you have four equivalent versions, none of which can go
| > as fast as an R builtin goes (well, doh).
| >
| > Basic sugar, as we said before, gives a lot of convenience along with some
| > safeties (exception checks, NA checks, ...).
| >
| > But you are not the first person, and surely not the last, to simply assume
| > that it would also be as fast as carefully tuned and crafted code.
| >
| > But that ain't so -- the No Free Lunch theorem is still valid.
| >
| > Dirk
|
| You can get a free lunch if you are friends with the cook.
| :)
|
| I committed some code (rev 2846) that makes sum faster. This is based on
| the same thing that made operators *, +, etc ... faster during Christmas.
|
| So, with this, I get:
|
| rom...@naxos /tmp $ Rscript /tmp/sum.R
| Loading required package: inline
| Loading required package: methods
| Loading required package: Rcpp
|         test elapsed relative user.self sys.self
| 1      Sugar   1.005 1.000000     1.003    0.003
| 3     Rcpp_N   1.005 1.000000     1.002    0.003
| 5  PreIncrem   1.005 1.000000     1.003    0.003
| 4 Accumulate   1.011 1.005970     1.007    0.003
| 7          R   1.648 1.639801     1.643    0.005
| 2       Rcpp   4.827 4.802985     4.813    0.015
| 6   Iterator   4.827 4.802985     4.812    0.014
|
| BTW, Dirk this line was wrong:
|
|     R = function(x){ sum(x) },
|
| The expression that was benchmarked was "create the function" not "call
| it", which explains why the R version was so much faster in your
| example: it did not do anything.

Eeek. My bad. I was wondering about that because the performance was so
different ... I should not have omitted the
'stopifnot(identical(res1, res2))' step we normally have there too.

Thanks for making sum() that much faster.

Dirk

--
Dirk Eddelbuettel | e...@debian.org | http://dirk.eddelbuettel.com
_______________________________________________
Rcpp-devel mailing list
Rcpp-devel@lists.r-forge.r-project.org
https://lists.r-forge.r-project.org/cgi-bin/mailman/listinfo/rcpp-devel
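
Two points from this thread can be sketched outside of Rcpp: read the vector size (or end iterator) once rather than on every pass, and check that the candidates agree before timing calls to them, as the stopifnot(identical(res1, res2)) step would have done. What follows is only a minimal sketch under stated assumptions, not the code from rev 2846: std::vector<double> stands in for NumericVector, and the names sum_indexed, sum_accumulate, sum_iterator, variants_agree and time_calls are hypothetical, introduced here for illustration.

```cpp
#include <cassert>
#include <chrono>
#include <cstddef>
#include <numeric>
#include <vector>

// Assumption: std::vector<double> stands in for Rcpp::NumericVector so the
// sketch compiles without R.
using Vec = std::vector<double>;

// Indexed loop with the size read once before the loop (the "Rcpp_N" idea).
double sum_indexed(const Vec& x) {
    double out = 0.0;
    const std::size_t n = x.size();
    for (std::size_t i = 0; i < n; ++i) out += x[i];
    return out;
}

// std::accumulate over the whole range (the "Accumulate" idea).
double sum_accumulate(const Vec& x) {
    return std::accumulate(x.begin(), x.end(), 0.0);
}

// Iterator loop with end() read once instead of on every pass.
double sum_iterator(const Vec& x) {
    double out = 0.0;
    for (Vec::const_iterator it = x.begin(), last = x.end(); it != last; ++it)
        out += *it;
    return out;
}

// Counterpart of stopifnot(identical(res1, res2)): every variant adds the
// elements left to right, so the results should match exactly.
bool variants_agree(const Vec& x) {
    const double ref = sum_indexed(x);
    return sum_accumulate(x) == ref && sum_iterator(x) == ref;
}

// Times `replications` calls of f and returns the elapsed seconds.
// f is invoked inside the timed loop, so the call itself is measured,
// not the construction of the callable.
template <typename F>
double time_calls(F f, int replications) {
    assert(replications > 0);
    volatile double sink = 0.0;  // keeps the calls from being optimised away
    const auto start = std::chrono::steady_clock::now();
    for (int r = 0; r < replications; ++r) sink = f();
    const auto stop = std::chrono::steady_clock::now();
    return std::chrono::duration<double>(stop - start).count();
}
```

A caller would first check variants_agree(x) and only then pass something like [&] { return sum_indexed(x); } to time_calls, so a candidate that silently does nothing shows up as a failed check rather than as a suspiciously fast timing.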