Hi Asis,

Parallel computing is a delicate task in programming: performance depends on one side on your hardware architecture and on the other side on the directives in your software.

1. If the sequential code is faster than the parallel code, check whether something is programmed differently, or whether the code is identical except for an added '#pragma omp' directive.
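As a minimal sketch of what I mean (square_all is a hypothetical stand-in for your loop): the serial and parallel versions should differ in nothing but the pragma line, so any timing gap is attributable to OpenMP itself.

```cpp
#include <cassert>
#include <vector>

// Hypothetical example: removing the single pragma line below must yield
// exactly the serial version, so a fair serial-vs-parallel comparison
// measures only the cost/benefit of OpenMP.
std::vector<double> square_all(const std::vector<double>& x) {
    std::vector<double> out(x.size());
    #pragma omp parallel for
    for (long i = 0; i < static_cast<long>(x.size()); ++i)
        out[i] = x[i] * x[i];   // identical body to the serial version
    return out;
}
```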

2. How many iterations are you running in parallel? Building a thread pool and initializing parallel processing costs the computer time. If that overhead exceeds the time you gain from executing the loop iterations in parallel, you lose overall. Test your program by repeatedly doubling the input: at what size does the parallel code become faster than the sequential one? If it never does, there must be a serious red flag somewhere in your code.
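One mitigation, sketched below: OpenMP's if() clause makes the loop run serially when the input is too small to amortize the thread-pool start-up cost. The threshold of 10000 is an assumption; tune it using the doubling test above.

```cpp
#include <cassert>
#include <vector>

// Sketch: the if() clause tells OpenMP to skip creating a thread team
// when the loop is too small to pay for the start-up overhead.
// The cutoff 10000 is a made-up threshold; measure to find your own.
double sum_squares(const std::vector<double>& x) {
    double s = 0.0;
    #pragma omp parallel for reduction(+:s) if(x.size() > 10000)
    for (long i = 0; i < static_cast<long>(x.size()); ++i)
        s += x[i] * x[i];
    return s;
}
```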

3. If your computer has a NUMA architecture (which is the case for almost all modern home computers) and your parallelized tasks involve many memory accesses, there can be costly non-local memory accesses. In that case the workaround is careful data placement, i.e. putting the objects that are processed on a certain CPU into the memory local to that CPU.
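The usual way to get this placement on Linux is "first touch" initialization, sketched below under that assumption: initialize the data with the same parallel loop and static schedule that later does the real work, so each memory page is mapped local to the thread that will use it.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Sketch of first-touch placement (an assumption about your OS's NUMA
// policy, typical on Linux): the thread that first writes a page gets
// that page allocated in its local memory.
std::vector<double> numa_friendly_sqrt(std::size_t n) {
    std::vector<double> v(n);
    #pragma omp parallel for schedule(static)
    for (long i = 0; i < static_cast<long>(n); ++i)
        v[i] = 0.0;                       // first touch maps pages locally
    // Same static schedule: each thread revisits the pages it owns.
    #pragma omp parallel for schedule(static)
    for (long i = 0; i < static_cast<long>(n); ++i)
        v[i] = std::sqrt(static_cast<double>(i));
    return v;
}
```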

4. If you have too many cores for too few tasks, you get high setup costs (see point 2). Try fewer threads instead. Furthermore: use at most as many threads as you have cores. Hyperthreading is nice, but a real performance jump is only possible with real cores.
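A small sketch of capping the team size; note that omp_get_num_procs() reports logical CPUs (hyperthreads included), so the physical core count is passed in as a hypothetical parameter you have to supply yourself.

```cpp
#include <cassert>
#ifdef _OPENMP
#include <omp.h>
#endif

// Sketch: limit the OpenMP thread count to the number of physical cores.
// 'physical_cores' is a hypothetical input; OpenMP itself only knows the
// logical CPU count, which double-counts hyperthreads.
int cap_threads(int physical_cores) {
#ifdef _OPENMP
    int n = physical_cores < omp_get_num_procs() ? physical_cores
                                                 : omp_get_num_procs();
    omp_set_num_threads(n);
    return n;
#else
    return 1;   // without OpenMP everything runs on one thread anyway
#endif
}
```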

5. Switch off OpenMP's dynamic adjustment of the thread count! Set the environment variable OMP_DYNAMIC=false.
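You can also do this from inside the program: omp_set_dynamic(0) is the API equivalent of OMP_DYNAMIC=false and stops the runtime from silently shrinking the thread team below the number you requested.

```cpp
#include <cassert>
#ifdef _OPENMP
#include <omp.h>
#endif

// Programmatic equivalent of OMP_DYNAMIC=false: forbid the runtime from
// adjusting the number of threads on its own.
int disable_dynamic_teams() {
#ifdef _OPENMP
    omp_set_dynamic(0);
    return omp_get_dynamic();   // 0 once dynamic adjustment is off
#else
    return 0;
#endif
}
```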

6. Look at the workload! If you parallelize the 'wrong' loop, parallelizing something with little computation often costs more than you win by doing it in parallel. Instead, parallelize something with heavy computation. Use a performance tool to monitor your program and find its big workloads; this cannot be done by simply looking at the code, because except for very simple programs you also have to take the specific hardware structure of your computer into account.
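A sketch of picking the 'right' loop (row_sums is a made-up example, not your code): put the pragma on the outer loop, where each iteration carries a whole row of work, rather than on the cheap inner loop where per-iteration work is tiny.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Sketch: parallelize the coarse-grained outer loop. Each thread gets
// whole rows; the cheap inner loop stays serial within each thread,
// avoiding per-iteration parallel overhead on tiny work items.
std::vector<double> row_sums(const std::vector<std::vector<double> >& m) {
    std::vector<double> out(m.size(), 0.0);
    #pragma omp parallel for
    for (long i = 0; i < static_cast<long>(m.size()); ++i)
        for (std::size_t j = 0; j < m[i].size(); ++j)
            out[i] += m[i][j];
    return out;
}
```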

For performance tools, check out Vampir (http://www.vampir.eu/) or Scalasca (http://www.scalasca.org/). For debugging, check Valgrind's DRD tool (http://valgrind.org/docs/manual/drd-manual.html#drd-manual.openmp).

The commercial tools are much better, though. So if you have access to them, I would suggest Intel VTune Analyzer and Intel Inspector, and, especially for debugging, TotalView.

Hope this helps


Best

Simon

On Mon, 3 Jun 2013 12:44:20 +0200
 Asis Hallab <[email protected]> wrote:
Dear Dirk, Simon and Rcpp Experts.

This is a message following up the thread about using OpenMP directives with Rcpp to construct probability matrices in parallel.

I followed Dirk's hint and implemented the parallel matrix generation using just C++'s STL and "#pragma omp parallel for" on the loop with the heaviest workload in each iteration, that is, the generation of a matrix.

Good news: The code compiles and runs without errors.

Bad news: Even though converting a large Rcpp List and its contained NumericMatrix objects takes less than half a second, the parallel code on 10 cores runs approximately 10 times slower than the serial pure Rcpp implementation.

Serial implementation
user  system elapsed
 9.657   0.100   9.785

Parallel implementation on 10 cores
  user  system elapsed
443.095  26.437 100.132

Parallel implementation on 20 cores
  user  system elapsed
719.173  35.418  85.663

Again: I measured the time required to convert the Rcpp objects, and it is only half a second.
I have not yet implemented the back conversion; I just wrap the resulting std::map< std::string, std::vector< std::vector<double> > >.

Does anyone have an idea what is going on?

The code can be reviewed on github:
https://github.com/asishallab/PhyloFun_Rccp/blob/OpenMP

You'll find very short installation and test run instructions in the
README.textile.

Kind regards and all the best!
_______________________________________________
Rcpp-devel mailing list
[email protected]
https://lists.r-forge.r-project.org/cgi-bin/mailman/listinfo/rcpp-devel

