On 05/01/11 14:01, Dirk Eddelbuettel wrote:
On 5 January 2011 at 10:55, Cedric Ginestet wrote:
| Dear All,
|
| Here are some simulations that I have run this morning. Romain's suggestion to
| compute xV.size() before the loop and Douglas' idea of using accumulate appear
| to work best. However, both are substantially slower than the base R function.
|
| I have also included two more versions: (i) one similar to Romain's but using
| pre-incrementation in the loop and (ii) one using the iterator in the loop.
| Another option may be to use the C++ boost library. I don't know if anyone on
| this list has experience with using boost.
|
| See the results of the simulations below (N=1000 data sets).
| Ced
|
| #####################################################################
| ## Functions.
| Summing1<- cxxfunction(signature(x="numeric"), '
| NumericVector xV(x);
| double out = sum(xV);
| return wrap(out);
| ',plugin="Rcpp")
| Summing2<- cxxfunction(signature(x="numeric"), '
| NumericVector xV(x);
| double out = 0.0;
| for(int i=0; i<xV.size(); i++) out += xV[i];
| return wrap(out);
| ',plugin="Rcpp")
| Summing3<- cxxfunction(signature(x="numeric"), '
| NumericVector xV(x);
| double out = 0.0; int N=xV.size();
| for(int i=0; i<N; i++) out += xV[i];
| return wrap(out);
| ',plugin="Rcpp")
| Summing4<- cxxfunction(signature(x="numeric"), '
| NumericVector xV(x);
| return wrap(std::accumulate(xV.begin(), xV.end(), double()));
| ',plugin="Rcpp")
| Summing5<- cxxfunction(signature(x="numeric"), '
| NumericVector xV(x);
| double out = 0.0; int N=xV.size();
| for(int i=0; i<N; ++i) out += xV[i];
| return wrap(out);
| ',plugin="Rcpp")
| Summing6<- cxxfunction(signature(x="numeric"), '
| NumericVector xV(x);
| double out = 0.0;
| for(NumericVector::iterator i=xV.begin(); i!=xV.end(); ++i) out += *i;
| return wrap(out);
| ',plugin="Rcpp")
|
| #####################################################################
| ## Simulation: Time Testing.
| n<- 1000000; N<- 1000
| time.Sum<- matrix(0,N,7);
| for(i in 1:N){
| x<- rnorm(n)
| time.Sum[i,1]<- system.time(Summing1(x))[3];
| time.Sum[i,2]<- system.time(Summing2(x))[3];
| time.Sum[i,3]<- system.time(Summing3(x))[3];
| time.Sum[i,4]<- system.time(Summing4(x))[3];
| time.Sum[i,5]<- system.time(Summing5(x))[3];
| time.Sum[i,6]<- system.time(Summing6(x))[3];
| time.Sum[i,7]<- system.time(sum(x))[3];
| }# i
| time.df<- data.frame(time.Sum)
| names(time.df)<- c("Sugar","Rcpp","Rcpp_N","Accumulate","Pre-increment","Iterator","R")
| boxplot(time.df)
|
| #####################################################################
| ## RESULTS:
| formatC(summary(time.df),dec=3)
| Sugar Rcpp Rcpp_N
| " Min. :0.01600 " " Min. :0.01000 " "Min. :0.005000 "
| " 1st Qu.:0.01600 " " 1st Qu.:0.01000 " "1st Qu.:0.005000 "
| " Median :0.01600 " " Median :0.01100 " "Median :0.006000 "
| " Mean :0.01631 " " Mean :0.01060 " "Mean :0.005668 "
| " 3rd Qu.:0.01600 " " 3rd Qu.:0.01100 " "3rd Qu.:0.006000 "
| " Max. :0.03700 " " Max. :0.02400 " "Max. :0.020000 "
| Accumulate Pre-increment Iterator
| "Min. :0.005000 " "Min. :0.005000 " " Min. :0.01000 "
| "1st Qu.:0.005000 " "1st Qu.:0.005000 " " 1st Qu.:0.01000 "
| "Median :0.006000 " "Median :0.006000 " " Median :0.01100 "
| "Mean :0.005714 " "Mean :0.005697 " " Mean :0.01065 "
| "3rd Qu.:0.006000 " "3rd Qu.:0.006000 " " 3rd Qu.:0.01100 "
| "Max. :0.029000 " "Max. :0.021000 " " Max. :0.03100 "
| R
| "Min. :0.002000 "
| "1st Qu.:0.002000 "
| "Median :0.002000 "
| "Mean :0.002211 "
| "3rd Qu.:0.002000 "
| "Max. :0.004000 "
| #####################################################################
|
| PS: Apologies to Dirk as I have not followed his advice, yet.
Try this instead:
## Summing1 to Summing6 as above
Summing1a<- cxxfunction(signature(x="numeric"), '
NumericVector xV(x);
double out = sum(noNA(xV));
return wrap(out);
',plugin="Rcpp")
library(rbenchmark)
n<- 1000000
N<- 1000
x<- rnorm(n)
bm<- benchmark(Sugar = Summing1(x),
SugarNoNA = Summing1a(x),
Rcpp = Summing2(x),
Rcpp_N = Summing3(x),
Accumulate= Summing4(x),
PreIncrem = Summing5(x),
Iterator = Summing6(x),
R = function(x){ sum(x) },
columns=c("test", "elapsed", "relative", "user.self",
"sys.self"),
order="relative",
replications=N)
print(bm)
which on my box gives this:
e...@max:/tmp$ Rscript cedric.R
Loading required package: methods
test elapsed relative user.self sys.self
8 R 0.003 1.00 0.00 0
5 Accumulate 1.212 404.00 1.22 0
2 SugarNoNA 1.214 404.67 1.22 0
6 PreIncrem 1.214 404.67 1.21 0
4 Rcpp_N 1.215 405.00 1.21 0
7 Iterator 5.301 1767.00 5.30 0
3 Rcpp 5.302 1767.33 5.30 0
1 Sugar 7.229 2409.67 7.21 0
e...@max:/tmp$
indicating that you have four equivalent versions, none of which can go
as fast as an R builtin goes (well, doh).
Basic sugar, as we said before, gives a lot of convenience along with some
safeties (exception checks, NA checks, ...).
But you are not the first person, and surely not the last, to simply assume
that it would also be as fast as carefully tuned and crafted code.
But that ain't so -- the No Free Lunch theorem is still valid.
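To make the trade-off concrete, here is a rough C++ sketch of why a checked sum costs more than a bare loop. It is not Rcpp's implementation: std::vector<double> stands in for NumericVector, NaN stands in for R's NA, and the names sum_checked and sum_unchecked are made up for illustration.

```cpp
#include <cmath>
#include <limits>
#include <numeric>
#include <vector>

// Hypothetical stand-in for a "safe" sum: tests every element for a
// missing value (NaN here, as an assumption) and bails out early.
double sum_checked(const std::vector<double>& x) {
    double out = 0.0;
    // The size is read once, as in the Rcpp_N variant above.
    for (std::vector<double>::size_type i = 0, n = x.size(); i < n; ++i) {
        if (std::isnan(x[i])) {
            return std::numeric_limits<double>::quiet_NaN();
        }
        out += x[i];
    }
    return out;
}

// Hypothetical stand-in for the "no NA" path: no per-element branch,
// just a plain accumulation over the range.
double sum_unchecked(const std::vector<double>& x) {
    return std::accumulate(x.begin(), x.end(), 0.0);
}
```

On NaN-free input both functions return the same value; the per-element branch in sum_checked is the kind of safety one pays for.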
Dirk
You can get a free lunch if you are friends with the cook.
I committed some code (rev 2846) that makes sum faster. This is based on
the same thing that made operators *, +, etc. faster during Christmas.
So, with this, I get:
rom...@naxos /tmp $ Rscript /tmp/sum.R
Loading required package: inline
Loading required package: methods
Loading required package: Rcpp
test elapsed relative user.self sys.self
1 Sugar 1.005 1.000000 1.003 0.003
3 Rcpp_N 1.005 1.000000 1.002 0.003
5 PreIncrem 1.005 1.000000 1.003 0.003
4 Accumulate 1.011 1.005970 1.007 0.003
7 R 1.648 1.639801 1.643 0.005
2 Rcpp 4.827 4.802985 4.813 0.015
6 Iterator 4.827 4.802985 4.812 0.014
BTW, Dirk, this line was wrong:
R = function(x){ sum(x) },
The expression that was benchmarked was "create the function", not "call
it". That explains why the R version was so much faster in your
example: it did not do anything.
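The same pitfall can be sketched outside R. In the hypothetical C++ harness below (run_benchmark, slow_sum and the call counter are invented for illustration and say nothing about rbenchmark's internals), the first case only builds a callable on each replication and the second actually calls it:

```cpp
#include <cstddef>
#include <numeric>
#include <vector>

// Counts how often the real work runs, so the two cases can be told apart.
static std::size_t g_calls = 0;

double slow_sum(const std::vector<double>& x) {
    ++g_calls;
    return std::accumulate(x.begin(), x.end(), 0.0);
}

// Hypothetical harness: evaluates `expr` `replications` times.
template <typename Expr>
void run_benchmark(Expr expr, std::size_t replications) {
    for (std::size_t i = 0; i < replications; ++i) {
        expr();
    }
}

// Analogue of R = function(x){ sum(x) }: each evaluation creates a
// callable and throws it away; slow_sum() is never invoked.
std::size_t calls_when_creating(const std::vector<double>& x, std::size_t reps) {
    g_calls = 0;
    run_benchmark([&x] { auto f = [&x] { return slow_sum(x); }; (void)f; }, reps);
    return g_calls;
}

// Analogue of R = sum(x): each evaluation does the work.
std::size_t calls_when_calling(const std::vector<double>& x, std::size_t reps) {
    g_calls = 0;
    run_benchmark([&x] { (void)slow_sum(x); }, reps);
    return g_calls;
}
```

With 10 replications, calls_when_creating reports 0 calls to slow_sum and calls_when_calling reports 10, which is the difference between timing the construction of a function and timing the sum itself.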
Romain
require(inline)
require(Rcpp)
Summing1 <- cxxfunction(signature(x="numeric"), '
NumericVector xV(x);
double out = sum(xV);
return wrap(out);
',plugin="Rcpp")
Summing2 <- cxxfunction(signature(x="numeric"), '
NumericVector xV(x);
double out = 0.0;
for(int i=0; i<xV.size(); i++) out += xV[i];
return wrap(out);
',plugin="Rcpp")
Summing3 <- cxxfunction(signature(x="numeric"), '
NumericVector xV(x);
double out = 0.0; int N=xV.size();
for(int i=0; i<N; i++) out += xV[i];
return wrap(out);
',plugin="Rcpp")
Summing4 <- cxxfunction(signature(x="numeric"), '
NumericVector xV(x);
return wrap(std::accumulate(xV.begin(), xV.end(), double()));
',plugin="Rcpp")
Summing5 <- cxxfunction(signature(x="numeric"), '
NumericVector xV(x);
double out = 0.0; int N=xV.size();
for(int i=0; i<N; ++i) out += xV[i];
return wrap(out);
',plugin="Rcpp")
Summing6 <- cxxfunction(signature(x="numeric"), '
NumericVector xV(x);
double out = 0.0;
for(NumericVector::iterator i=xV.begin(); i!=xV.end(); ++i) out += *i;
return wrap(out);
',plugin="Rcpp")
library(rbenchmark)
n <- 1000000
N <- 1000
x <- rnorm(n)
bm <- benchmark(Sugar = Summing1(x),
Rcpp = Summing2(x),
Rcpp_N = Summing3(x),
Accumulate= Summing4(x),
PreIncrem = Summing5(x),
Iterator = Summing6(x),
R = sum(x),
columns=c("test", "elapsed", "relative", "user.self",
"sys.self"),
order="relative",
replications=N)
print(bm)
--
Romain Francois
Professional R Enthusiast
+33(0) 6 28 91 30 30
http://romainfrancois.blog.free.fr
|- http://bit.ly/fT2rZM : highlight 0.2-5
|- http://bit.ly/gpCSpH : Evolution of Rcpp code size
`- http://bit.ly/hovakS : RcppGSL initial release