Riccardo,
> i am writing since i am an OpenMP user, but i am actually quite
> curious in understanding the future directions of c++.
>
> my parallel usage is actually relatively trivial, and is covered by OpenMP
> 2.5 (openmp 3.1 with supports for iterators would be better but is not
> available in msvc)
> 99% of my user needs are about parallel loops, and with c++11 lambdas i
> could do a lot.
Right. Turning an OpenMP parallel loop into the equivalent parallel algorithm
is a fairly simple transformation. We specifically added parallel::for_loop()
(which is not in the Parallelism TS/C++17) to support that migration:

    #pragma omp parallel for
    for(int i = 0; i != N; ++i)
    {
        // some iteration
    }

would be equivalent to

    hpx::parallel::for_loop(
        hpx::parallel::par,
        0, N, [](int i)
        {
            // some iteration
        });

(for more information about for_loop() see here:
http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2015/p0075r0.pdf)
> However i am really not clear on how i should equivalently handle
> "private" and "firstprivate" of OpenMP, which allow to create objects that
> persist in the threadprivate memory during the whole length of a for loop.
> I now use OpenMP 2.5 and i have a code that looks like the following
>
> https://kratos.cimne.upc.es/projects/kratos/repository/entry/kratos/kratos
> /solving_strategies/builder_and_solvers/residualbased_block_builder_and_so
> lver.h
> which does an openmp parallel Finite Element assembly.
> The code i am thinking of is something like:
[snipped code]
> the big question is ... how shall i handle the threadprivate scratchspace
> in HPX?? Lambdas do not allow to do this ...
> that is, what is the equivalent of private & of firstprivate??
> thank you in advance for any clarification or pointer to examples
For 'firstprivate' you can simply use lambda captures:

    int nelements = 42;
    for_loop(par, 0, N,
        [nelements](int i)
        {
            // the captured 'nelements' is initialized from the outer
            // variable and each copy of the lambda has its own private
            // copy
            //
            // use private 'nelements' here:
            std::cout << nelements << std::endl;
        });

Note that 'nelements' will be const by default. If you want to modify its
value, the lambda has to be made mutable:

    int nelements = 42;
    for_loop(par, 0, N,
        [nelements](int i) mutable    // makes captures non-const
        {
            ++nelements;
        });

Don't be misled into thinking that this gives you one variable instance per
iteration, however. HPX runs several iterations 'in one go' (depending on the
partitioning, very much like OpenMP), so you get one variable instance per
created partition. As long as you don't modify the variable, this makes no
difference.
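
To make the 'one instance per partition' behavior concrete, here is a minimal
sketch in plain standard C++. It does not use HPX: chunked_for_loop() and
observed_counters() are hypothetical helpers made up for this illustration,
and the fixed contiguous chunking is an assumption, not a description of how
HPX actually partitions work. It only mimics the effect described above: each
chunk gets its own copy of the mutable lambda, and all iterations in that
chunk share that copy.

```cpp
#include <algorithm>
#include <cassert>
#include <thread>
#include <vector>

// Hypothetical stand-in for a partitioning executor (NOT the HPX API):
// splits [first, last) into at most num_chunks contiguous chunks and runs
// each chunk on its own thread with its own copy of f.
template <typename F>
void chunked_for_loop(int first, int last, int num_chunks, F f)
{
    assert(num_chunks > 0 && first <= last);
    int const total = last - first;
    int const chunk = (total + num_chunks - 1) / num_chunks;  // ceiling
    std::vector<std::thread> workers;
    for (int begin = first; begin < last; begin += chunk)
    {
        int const end = std::min(begin + chunk, last);
        // f is captured by value here: one copy per chunk
        workers.emplace_back([begin, end, f]() mutable {
            for (int i = begin; i != end; ++i)
                f(i);
        });
    }
    for (auto& w : workers)
        w.join();
}

// Records, for every iteration, the value of the captured counter that the
// iteration saw right after incrementing it.
std::vector<int> observed_counters(int n, int num_chunks)
{
    std::vector<int> seen(n);
    int* out = seen.data();
    int counter = 0;
    chunked_for_loop(0, n, num_chunks,
        [counter, out](int i) mutable {
            ++counter;         // modifies this chunk's copy only
            out[i] = counter;  // each index is written by exactly one thread
        });
    return seen;
}
```

With 6 iterations and 2 chunks the counter restarts at the chunk boundary, so
the iterations observe 1, 2, 3, 1, 2, 3 rather than six independent values or
one shared running count.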
Emulating 'private' is even simpler. After all, all you need is a local
variable for each iteration, so simply create it on the stack inside the
lambda:

    for_loop(par, 0, N, [](int i)
    {
        // create 'private' variable
        int my_private = 0;
        // ...
    });

This also gives you a hint on how you can have one instance of your variable
per iteration and still initialize it as if it were firstprivate:

    int nelements = 42;
    for_loop(par, 0, N, [nelements](int i)
    {
        // create 'private' variable
        int my_private = nelements;
        // ...
        ++my_private;    // modifies instance for this iteration only.
    });

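
Applied to the scratch-space question, the same capture mechanism can carry a
buffer instead of an int. The following is only a sketch in plain standard
C++, not HPX code: chunked_for_loop() is a hypothetical stand-in that copies
the lambda once per chunk, and assemble() with its toy 'element' computation
is invented for illustration. The scratch vector is sized once outside the
loop, copied once per chunk through the mutable capture, and then reused by
every iteration of that chunk.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <numeric>
#include <thread>
#include <vector>

// Hypothetical stand-in for a partitioning executor (NOT the HPX API):
// one thread and one copy of f per contiguous chunk of [first, last).
template <typename F>
void chunked_for_loop(int first, int last, int num_chunks, F f)
{
    assert(num_chunks > 0 && first <= last);
    int const total = last - first;
    int const chunk = (total + num_chunks - 1) / num_chunks;  // ceiling
    std::vector<std::thread> workers;
    for (int begin = first; begin < last; begin += chunk)
    {
        int const end = std::min(begin + chunk, last);
        workers.emplace_back([begin, end, f]() mutable {
            for (int i = begin; i != end; ++i)
                f(i);
        });
    }
    for (auto& w : workers)
        w.join();
}

// Toy 'assembly': for element e, fill the scratch buffer with e * k for
// k in [0, width) and store the sum of the buffer in result[e].
std::vector<double> assemble(int n_elements, int width, int num_chunks)
{
    std::vector<double> result(n_elements);
    double* out = result.data();
    std::vector<double> scratch(width);  // allocated once out here ...
    chunked_for_loop(0, n_elements, num_chunks,
        [scratch, out](int e) mutable {  // ... and copied once per chunk
            for (std::size_t k = 0; k != scratch.size(); ++k)
                scratch[k] = e * static_cast<double>(k);
            out[e] = std::accumulate(scratch.begin(), scratch.end(), 0.0);
        });
    return result;
}
```

Because every iteration overwrites the whole buffer before reading it, it does
not matter that the buffer is shared by the iterations of one chunk. If an
iteration relied on the buffer's initial contents, a local copy inside the
lambda would be the safer choice.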
Things become a bit more interesting if you need reductions. Please see the
linked document above for more details, but here is a simple example (taken
from that paper):

    float dot_saxpy(int n, float a, float x[], float y[])
    {
        float s = 0;
        for_loop(par, 0, n,
            reduction(s, 0.0f, std::plus<float>()),
            [&](int i, float& s_)
            {
                y[i] += a*x[i];
                s_ += y[i]*y[i];
            });
        return s;
    }

Here 's' is the reduction variable, and 's_' is a reference to the thread-local
accumulator that is combined into 's' when the loop finishes.
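
For intuition, here is a rough sketch of what such a reduction amounts to,
written in plain standard C++ with std::thread. Neither chunked_reduce_loop()
nor dot_saxpy_sketch() comes from HPX or from the paper; both are hypothetical
names, and the fixed count of 4 chunks is an arbitrary assumption. Each chunk
accumulates into its own partial result that starts at the identity value, and
the partials are combined after all chunks have finished.

```cpp
#include <algorithm>
#include <cassert>
#include <functional>
#include <thread>
#include <vector>

// Hypothetical helper (NOT the HPX or P0075 API): runs f(i, local) over
// [first, last) in contiguous chunks, one thread and one accumulator per
// chunk, then folds the accumulators together with op.
template <typename T, typename Op, typename F>
T chunked_reduce_loop(int first, int last, int num_chunks,
    T identity, Op op, F f)
{
    assert(num_chunks > 0 && first <= last);
    int const total = last - first;
    int const chunk = (total + num_chunks - 1) / num_chunks;  // ceiling
    std::vector<T> partials(num_chunks, identity);  // one slot per chunk
    std::vector<std::thread> workers;
    int slot = 0;
    for (int begin = first; begin < last; begin += chunk, ++slot)
    {
        int const end = std::min(begin + chunk, last);
        T* local = &partials[slot];
        workers.emplace_back([begin, end, local, &f] {
            for (int i = begin; i != end; ++i)
                f(i, *local);  // plays the role of 's_' above
        });
    }
    for (auto& w : workers)
        w.join();

    // combine the per-chunk accumulators once all chunks are done
    T result = identity;
    for (T const& p : partials)
        result = op(result, p);
    return result;
}

float dot_saxpy_sketch(int n, float a, float x[], float y[])
{
    float s = 0;
    s += chunked_reduce_loop(0, n, 4, 0.0f, std::plus<float>(),
        [&](int i, float& s_) {
            y[i] += a * x[i];
            s_ += y[i] * y[i];
        });
    return s;
}
```

The loop body never touches 's' directly, which is why no locking is needed
inside the iterations. Note that the order in which partial sums are combined
can differ from a sequential loop, so floating-point results may not match a
serial run bit for bit in general.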
HTH
Regards Hartmut
---------------
http://boost-spirit.com
http://stellar.cct.lsu.edu