Re: Adding OpenMP support for some of the GSL functions

Maxime Boissonneault Thu, 13 Dec 2012 13:07:08 -0800

Hi Rhys,

While that is true in theory, it is not applicable in practice, since there can be no "return" inside a parallel region. We need one parallel region for each loop in this case.
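A minimal sketch of the constraint above, assuming a step routine with two loops separated by an early exit. The function name, arguments, and the meaning of `status` are hypothetical and are not taken from rkf45.c. Because OpenMP does not permit branching out of a parallel region, the `return` has to sit between two separate regions:

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a stepper: two loops with a status check
   between them. Not the actual GSL code. */
int two_stage_update(double *y, const double *k, double h, int n, int status)
{
    assert(y != NULL && k != NULL && n >= 0);

    /* First loop: its own parallel region. */
    #pragma omp parallel for
    for (int i = 0; i < n; i++)
        y[i] += h * k[i];

    /* The early return is outside any parallel region, as OpenMP requires. */
    if (status != 0)
        return status;

    /* Second loop: a second, separate parallel region. */
    #pragma omp parallel for
    for (int i = 0; i < n; i++)
        y[i] *= 0.5;

    return 0;
}
```

Compiled without OpenMP enabled, the pragmas are ignored and the routine runs serially with the same result.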


Maxime

On 2012-12-13 11:44, Rhys Ulerich wrote:

This feels like you're getting a small
memory/cache bandwidth increase for the rkf45_apply level-1-BLAS-like
operations by using multiple cores but the cores are otherwise not
being used effectively.  I say this because a state vector 1e6 doubles
long will not generally fit in cache.  Adding more cores increases the
amount of cache available.

Hmm... I tentatively take this back on re-thinking how you've added
the #pragma omp lines to the rkf45.c file you attached elsewhere in
this thread.  Try using a single
     #pragma omp parallel
and then individual lines like
     #pragma omp for
at each for loop.  Using
     #pragma omp parallel for
repeatedly as you've done can introduce excess overhead, depending on
your compiler, because each one may start and stop its own team of threads.
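A minimal sketch of that suggestion, assuming two consecutive level-1-BLAS-like loops with no early exit between them. The function and array names are hypothetical and this is not the actual rkf45.c code:

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical two-loop update sharing one parallel region. */
void single_region_update(double *ytmp, double *y, const double *k1,
                          const double *k2, double h, int n)
{
    assert(ytmp != NULL && y != NULL && k1 != NULL && k2 != NULL && n >= 0);

    /* One region for both loops, so the thread team is set up once. */
    #pragma omp parallel
    {
        #pragma omp for
        for (int i = 0; i < n; i++)
            ytmp[i] = y[i] + h * k1[i];

        /* The implicit barrier at the end of the first "omp for" ensures
           ytmp is fully written before the second loop reads it. */
        #pragma omp for
        for (int i = 0; i < n; i++)
            y[i] = ytmp[i] + h * k2[i];
    }
}
```

Whether this is measurably faster than repeated `#pragma omp parallel for` depends on the compiler and runtime, so it is worth timing both variants.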

- Rhys
