Le 19/11/12 16:31, Hadley Wickham a écrit :
Hi all,

Inspired by "Rcpp is smoking fast for agent-based models in data
frames" (http://www.babelgraph.org/wp/?p=358), I've been doing some
exploration of vectorisation in R vs C++ at
https://gist.github.com/4111256

I have five versions of the basic vaccinate function:

* vacc1: vectorisation in R with a for loop
* vacc2: used vectorised R primitives
* vacc3: vectorised with loop in C++
* vacc4: vectorised with Rcpp sugar
* vacc5: vectorised with Rcpp sugar, explicitly labelled as containing
no missing values

And the timings I get are as follows:

Unit: microseconds
                     expr    min     lq median     uq     max neval
  vacc1(age, female, ily) 6816.8 7139.4 7285.7 7823.9 10055.5   100
  vacc2(age, female, ily)  194.5  202.6  212.6  227.9   260.4   100
  vacc3(age, female, ily)   21.8   22.4   23.4   24.9    35.5   100
  vacc4(age, female, ily)   36.2   38.7   41.3   44.5    55.6   100
  vacc5(age, female, ily)   29.3   31.3   34.0   36.4    52.1   100

Unsurprisingly the R loop (vacc1) is very slow, and proper
vectorisation speeds it up immensely.  Interestingly, however, the C++
loop still does considerably better (about 10x faster) - I'm not sure
exactly why this is the case, but I suspect it may be because it
avoids the many intermediate vectors that R requires.  The sugar
version is about half as fast, but this gets quite a bit faster with
explicit no missing flags.

I'd love any feedback on my code (https://gist.github.com/4111256) -
please let me know if I've missed anything obvious.

Hadley

I've put some code in that allows us to write mapply version. Here are two examples :

// [[Rcpp::export]]
NumericVector vacc6(NumericVector age, LogicalVector female, NumericVector ily) {
  NumericVector p = mapply( age, female, ily, vacc3a ) ;
  return p;
}


inline double minmax( double a, double x, double b) {
    return x < a ? a : ( x > b ? b : x ) ;
}

class Vaccinate {
public:
    typedef double result_type ;
    Vaccinate(){} ;

    inline double operator()(double age, bool female, double ily) const {
        double p = 0.25 + 0.3 * 1 / (1 - exp(0.04 * age)) + 0.1 * ily;
        p = minmax( 0.0, p * (female ? 1.25 : 0.75), 1.0 );
        return p;
    }


} ;

// [[Rcpp::export]]
NumericVector vacc7(NumericVector age, LogicalVector female, NumericVector ily) {
  NumericVector p = mapply( age, female, ily, Vaccinate() ) ;
  return p;
}


(C++11 lambdas would make this even funnier)


I get:

$ Rscript /tmp/vaccinate.R
Unit: microseconds
                     expr     min       lq   median       uq      max
1 vacc2(age, female, ily) 284.728 287.0085 298.3930 331.3630 1087.739
2 vacc3(age, female, ily)  23.010  23.5175  24.4360  27.4620   55.129
3 vacc4(age, female, ily)  42.698  43.0965  44.0405  49.2485   54.516
4 vacc6(age, female, ily)  28.936  29.2380  30.0405  34.2685  425.066
5 vacc7(age, female, ily)  24.751  25.2070  26.0705  29.9605   67.308

Not too far from the loop version. I expected mapply to be better. Anyway, we are splitting hair.


This gives me an opportunity to give insight about why vacc7 performs better than vecc6. The issue about vecc6 is that the construct:

NumericVector p = mapply( age, female, ily, vacc3a ) ;

uses a function pointer, and so the pointer has to be dereferenced each time to call the function. which has a penalty (minor, but still).

vacc7 can properly use inlining.


I'm just trying here to show some alternatives. I like these threads that let us think about alternatives and opportunities. This resulted in some improvements in Rcpp about the assuignment operator in Vector and pmin and pmax which just needed a fresh look.

Romain


--
Romain Francois
Professional R Enthusiast
+33(0) 6 28 91 30 30

R Graph Gallery: http://gallery.r-enthusiasts.com
`- http://bit.ly/SweN1Z : SuperStorm Sandy

blog:            http://romainfrancois.blog.free.fr
|- http://bit.ly/RE6sYH : OOP with Rcpp modules
`- http://bit.ly/Thw7IK : Rcpp modules more flexible

_______________________________________________
Rcpp-devel mailing list
Rcpp-devel@lists.r-forge.r-project.org
https://lists.r-forge.r-project.org/cgi-bin/mailman/listinfo/rcpp-devel

Reply via email to