On 11/30/20 6:05 PM, Zachary Streeter wrote:
> I mainly would like a good old ```#pragma omp for``` to parallelize
> for-loops. Should I use a "workstream" for this? Can someone point me in the
> right direction within deal.II to look for these capabilities? I imagine all
> the classics like scan and reduction, etc., must be in this project.
The problem with #pragma omp for is that, almost universally, it only helps
you parallelize the innermost loops. But these loops are also almost
universally short, and breaking them up among 16 or 64 threads creates so many
synchronization points that it's not worth it on machines with substantial
core counts.
What you need to do instead is to parallelize the outermost loops -- say, the
loop over all cells when assembling a linear system. There, each iteration does
orders of magnitude more work, and consequently you have orders of magnitude
fewer synchronization points. But #pragma omp for can't express the
complexities of these kinds of loops: the loop counter over the cells may
actually be an iterator of class type (not supported by OpenMP), the variables
that are thread-local or shared are not just simple C-style objects but
classes with complex semantics, etc. As a consequence, we've always tried
to avoid using OpenMP for parallelization -- the only efficient way to do it
is using systems such as WorkStream or task-based parallelism.
Best
W.
--
------------------------------------------------------------------------
Wolfgang Bangerth email: [email protected]
www: http://www.math.colostate.edu/~bangerth/
--
The deal.II project is located at http://www.dealii.org/
For mailing list/forum options, see
https://groups.google.com/d/forum/dealii?hl=en