Dear Dirk, Simon and Rcpp Experts,

This message follows up on the thread about using OpenMP directives with Rcpp to construct probability matrices in parallel.
I followed Dirk's hint and implemented the parallel matrix generation using only the C++ STL, with "#pragma omp parallel for" on the loop that carries the heaviest workload in each iteration, namely the generation of a matrix.

Good news: the code compiles and runs without errors.

Bad news: even though converting a large Rcpp List and the NumericMatrix objects it contains takes less than half a second, the parallel code on 10 cores runs approximately 10 times slower than the serial, pure Rcpp implementation.

                                     user   system  elapsed
  Serial implementation             9.657    0.100    9.785
  Parallel implementation, 10 cores 443.095  26.437  100.132
  Parallel implementation, 20 cores 719.173  35.418   85.663

Again: I measured the time required to convert the Rcpp objects, and it is only half a second. I have not even implemented the back conversion yet; I just wrap the resulting std::map< std::string, std::vector< std::vector<double> > >.

Does anyone have an idea what is going on?

The code can be reviewed on GitHub:
https://github.com/asishallab/PhyloFun_Rccp/blob/OpenMP

You'll find very short installation and test-run instructions in the README.textile.

Kind regards and all the best!
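
P.S. For anyone who does not want to open the repository, here is a minimal sketch of the pattern described above. It is not the actual PhyloFun code: buildMatrix, buildAll, the Matrix typedef and the exp-based fill are hypothetical stand-ins, and I am assuming one loop iteration per matrix. Only STL containers are touched inside the parallel region, each thread writes to its own pre-allocated slot, and the std::map is filled serially afterwards because concurrent inserts into a std::map are not safe.

```cpp
#include <cmath>
#include <cstddef>
#include <map>
#include <string>
#include <vector>

typedef std::vector< std::vector<double> > Matrix;

// Hypothetical stand-in for the real per-matrix computation.
Matrix buildMatrix(std::size_t dim, double branchLength) {
    Matrix m(dim, std::vector<double>(dim, 0.0));
    for (std::size_t i = 0; i < dim; ++i) {
        for (std::size_t j = 0; j < dim; ++j) {
            m[i][j] = std::exp(-branchLength * static_cast<double>(i + j));
        }
    }
    return m;
}

// Hypothetical driver: one matrix per name, generated in parallel.
std::map<std::string, Matrix> buildAll(const std::vector<std::string>& names,
                                       const std::vector<double>& lengths,
                                       std::size_t dim) {
    const int n = static_cast<int>(names.size());

    // Pre-sized so that every iteration writes only to its own element.
    std::vector<Matrix> results(n);

    #pragma omp parallel for
    for (int k = 0; k < n; ++k) {
        results[k] = buildMatrix(dim, lengths[k]);
    }

    // Serial step: collect the results into the map that is wrapped later.
    std::map<std::string, Matrix> out;
    for (int k = 0; k < n; ++k) {
        out[names[k]] = results[k];
    }
    return out;
}
```

With GCC this needs -fopenmp at compile and link time; without the flag the pragma is ignored and the loop simply runs serially.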
