Dear All,
Here are some simulations that I ran this morning. Romain's
suggestion to compute xV.size() before the loop and Douglas' idea of
using std::accumulate appear to work best. However, both are still
roughly 2.5 times slower on average than the R-base sum() function.
I have also included two more versions: (i) one similar to Romain's but
using pre-increment (++i) in the loop and (ii) one using an iterator in
the loop. Another option may be to use the C++ Boost library. I don't
know whether anyone on this list has experience with Boost.
See the results of the simulations below (N=1000 data sets).
Ced
#####################################################################
## Functions.
Summing1 <- cxxfunction(signature(x="numeric"), '
NumericVector xV(x);
double out = sum(xV);
return wrap(out);
',plugin="Rcpp")
Summing2 <- cxxfunction(signature(x="numeric"), '
NumericVector xV(x);
double out = 0.0;
for(int i=0; i<xV.size(); i++) out += xV[i];
return wrap(out);
',plugin="Rcpp")
Summing3 <- cxxfunction(signature(x="numeric"), '
NumericVector xV(x);
double out = 0.0; int N=xV.size();
for(int i=0; i<N; i++) out += xV[i];
return wrap(out);
',plugin="Rcpp")
Summing4 <- cxxfunction(signature(x="numeric"), '
NumericVector xV(x);
return wrap(std::accumulate(xV.begin(), xV.end(), double()));
',plugin="Rcpp")
Summing5 <- cxxfunction(signature(x="numeric"), '
NumericVector xV(x);
double out = 0.0; int N=xV.size();
for(int i=0; i<N; ++i) out += xV[i];
return wrap(out);
',plugin="Rcpp")
Summing6 <- cxxfunction(signature(x="numeric"), '
NumericVector xV(x);
double out = 0.0;
for(NumericVector::iterator i=xV.begin(); i!=xV.end(); ++i) out += *i;
return wrap(out);
',plugin="Rcpp")
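For anyone who wants to compare the loop variants outside of R, here is a minimal standalone sketch. It assumes std::vector<double> as a stand-in for NumericVector, and the function names (sum_index, sum_index_hoisted, sum_accumulate, sum_iterator, check_variants_agree) are hypothetical ones made up for this illustration; it is not Rcpp code and says nothing about timings.

```cpp
#include <cassert>
#include <cstddef>
#include <numeric>
#include <vector>

// Index loop that calls size() in the loop condition (in the spirit of Summing2).
double sum_index(const std::vector<double>& v) {
    double out = 0.0;
    for (std::size_t i = 0; i < v.size(); i++) out += v[i];
    return out;
}

// Index loop with the length computed once and pre-increment
// (in the spirit of Summing3 and Summing5).
double sum_index_hoisted(const std::vector<double>& v) {
    double out = 0.0;
    const std::size_t n = v.size();
    for (std::size_t i = 0; i < n; ++i) out += v[i];
    return out;
}

// std::accumulate with a double initial value (in the spirit of Summing4).
double sum_accumulate(const std::vector<double>& v) {
    return std::accumulate(v.begin(), v.end(), 0.0);
}

// Explicit iterator loop (in the spirit of Summing6).
double sum_iterator(const std::vector<double>& v) {
    double out = 0.0;
    for (std::vector<double>::const_iterator it = v.begin(); it != v.end(); ++it)
        out += *it;
    return out;
}

// All variants add the elements in the same left-to-right order, so they
// are expected to agree (assuming the compiler does not reorder
// floating-point additions, e.g. via fast-math options).
void check_variants_agree(const std::vector<double>& v) {
    const double ref = sum_index(v);
    assert(sum_index_hoisted(v) == ref);
    assert(sum_accumulate(v) == ref);
    assert(sum_iterator(v) == ref);
}
```

Since the four variants do the same arithmetic, any timing difference between them should come from loop overhead and from what the compiler can optimise, not from the summation itself.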
#####################################################################
## Simulation: Time Testing.
n <- 1000000; N <- 1000
time.Sum <- matrix(0,N,7);
for(i in 1:N){
x <- rnorm(n)
time.Sum[i,1] <- system.time(Summing1(x))[3];
time.Sum[i,2] <- system.time(Summing2(x))[3];
time.Sum[i,3] <- system.time(Summing3(x))[3];
time.Sum[i,4] <- system.time(Summing4(x))[3];
time.Sum[i,5] <- system.time(Summing5(x))[3];
time.Sum[i,6] <- system.time(Summing6(x))[3];
time.Sum[i,7] <- system.time(sum(x))[3];
}# i
time.df <- data.frame(time.Sum)
names(time.df) <-
c("Sugar","Rcpp","Rcpp_N","Accumulate","Pre-increment","Iterator","R")
boxplot(time.df)
#####################################################################
## RESULTS:
formatC(summary(time.df),dec=3)
(elapsed time in seconds)
          Sugar    Rcpp     Rcpp_N    Accumulate  Pre-increment  Iterator  R
Min.      0.01600  0.01000  0.005000  0.005000    0.005000       0.01000   0.002000
1st Qu.   0.01600  0.01000  0.005000  0.005000    0.005000       0.01000   0.002000
Median    0.01600  0.01100  0.006000  0.006000    0.006000       0.01100   0.002000
Mean      0.01631  0.01060  0.005668  0.005714    0.005697       0.01065   0.002211
3rd Qu.   0.01600  0.01100  0.006000  0.006000    0.006000       0.01100   0.002000
Max.      0.03700  0.02400  0.020000  0.029000    0.021000       0.03100   0.004000
#####################################################################
PS: Apologies to Dirk as I have not followed his advice, yet.
--
Cedric Ginestet
Centre for Neuroimaging Sciences (L3.04)
NIHR Biomedical Research Centre
Institute of Psychiatry, Box P089
Kings College London
De Crespigny Park
London
SE5 8AF
On 04/01/11 15:37, Dirk Eddelbuettel wrote:
On 4 January 2011 at 15:14, Cedric Ginestet wrote:
| Happy new year to everyone,
|
| I have made a very straightforward comparison of the performance of
| standard R, an Rcpp function and sugar, and found that the latter produces
| the poorest performance. Let me know what you think and how I could improve
| such a performance assessment.
|
| ###################################################
| Summing1<- cxxfunction(signature(x="numeric"), '
| NumericVector xV(x);
| double out = sum(xV);
| return wrap(out);
| ',plugin="Rcpp")
| Summing2<- cxxfunction(signature(x="numeric"), '
| NumericVector xV(x);
| double out = 0.0;
| for(int i=0; i<xV.size(); i++) out += xV[i];
| return wrap(out);
| ',plugin="Rcpp")
| ###################################################
| # Results.
| n<- 1000000; x<- rnorm(n)
| Summing1(x); Summing2(x); sum(x)
| #######################
| gives:
| [1] -396.6129
| [1] -396.6129
| [1] -396.6129
|
| ###################################################
| # Time.
| system.time(Summing1(x)); # Sugar
| system.time(Summing2(x)); # Rcpp
| system.time(sum(x)); # R-base
| ###################
|> system.time(Summing1(x));
| user system elapsed
| 0.016 0.000 0.016
|> system.time(Summing2(x));
| user system elapsed
| 0.008 0.000 0.011
|> system.time(sum(x));
| user system elapsed
| 0.000 0.000 0.003
|
|
| Sugar appears to be the slowest! What about the basic Rcpp loop? Why isn't it
| as fast as the standard sum() in R-base?
1) Try to think about measurement error here; these times are all minuscule.
2) Consider reading the list archive; we have made better use of benchmarks using
rbenchmark and replications. There are also some examples in the examples
right in Rcpp.
3) Consider reading the list archive and discussions about the NoNA tests.
4) Lastly, consider Romain's point about a baseline using an empty function.
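Points 1, 2 and 4 can be illustrated with a rough standalone C++ sketch: run each candidate many times, summarise with the median, and subtract the time of an empty call as a baseline. The helper names (median_of, time_replications, net_median_seconds, example_net_time) are hypothetical and exist only for this illustration; in R itself, rbenchmark's benchmark() with its replications argument covers the replication part.

```cpp
#include <algorithm>
#include <cassert>
#include <chrono>
#include <cstddef>
#include <numeric>
#include <vector>

// Median of a non-empty sample; less sensitive to outliers than the mean.
double median_of(std::vector<double> s) {
    assert(!s.empty());
    std::sort(s.begin(), s.end());
    const std::size_t n = s.size();
    return (n % 2 == 1) ? s[n / 2] : 0.5 * (s[n / 2 - 1] + s[n / 2]);
}

// Call f() `reps` times and return the elapsed seconds of each call.
template <typename F>
std::vector<double> time_replications(F f, std::size_t reps) {
    std::vector<double> secs;
    secs.reserve(reps);
    for (std::size_t r = 0; r < reps; ++r) {
        const auto t0 = std::chrono::steady_clock::now();
        f();
        const auto t1 = std::chrono::steady_clock::now();
        secs.push_back(std::chrono::duration<double>(t1 - t0).count());
    }
    return secs;
}

// Median time of `work` minus the median time of an empty call (the baseline).
// The result can be slightly negative when `work` is too cheap to measure.
template <typename F>
double net_median_seconds(F work, std::size_t reps) {
    const double baseline = median_of(time_replications([] {}, reps));
    return median_of(time_replications(work, reps)) - baseline;
}

// Example use: net median time of std::accumulate over n doubles.
double example_net_time(std::size_t n, std::size_t reps) {
    std::vector<double> x(n, 1.0);
    volatile double sink = 0.0;  // keeps the compiler from discarding the sum
    return net_median_seconds(
        [&] { sink = std::accumulate(x.begin(), x.end(), 0.0); }, reps);
}
```

A call such as example_net_time(1000000, 100) would mirror the vector length used in the simulation above. The exact numbers depend on the machine and compiler flags, so none are quoted here.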
Dirk
| Cheers,
| Cedric
|
| --
| Cedric Ginestet
| Centre for Neuroimaging Sciences (L3.04)
| NIHR Biomedical Research Centre
| Institute of Psychiatry, Box P089
| Kings College London
| De Crespigny Park
| London
| SE5 8AF
|
|
| ----------------------------------------------------------------------
| _______________________________________________
| Rcpp-devel mailing list
| Rcpp-devel@lists.r-forge.r-project.org
| https://lists.r-forge.r-project.org/cgi-bin/mailman/listinfo/rcpp-devel