> First of all, thank you very much for your quick and detailed answer.
> Nevertheless, I think I did not explain my concern.
> Using your code snippet, imagine I have
> 
> 
>     int nelements = 42;
>     Matrix expensive_to_construct_scratchspace;
> 
>     for_each(par, 0, N,
>         [nelements, expensive_to_construct_scratchspace](int i)
>         {
>             // the captured 'nelements' is initialized from the outer
>             // variable and each copy of the lambda has its own private
>             // copy
> HERE, as I understand it, the lambda would capture my
> "expensive_to_construct_scratchspace" by value, which implies that I would
> have one allocation for every "i". --> Are you saying that this is not the
> case? If so, that would be a problem, since constructing it would be very
> expensive.

No, that would indeed be the case; your analysis is correct.

> On the contrary, if the lambda does not copy by value ... what if I do
> need that behaviour?
> 
> Note that I could definitely construct a blocked range of iterators and
> define a lambda acting on a given range of iterators, however that would
> be very, very verbose...

Looks like I misunderstood what firstprivate actually does... 

OTOH, in the openmp spec I read:

    firstprivate Specifies that each thread should have its own instance of 
    a variable, and that the variable should be initialized with the value 
    of the variable, because it exists before the parallel construct.

So each thread gets its own copy, which implies copying/allocation. What am
I missing?

If however you want to share the variable in between threads, just capture it 
by reference:

    Matrix expensive_to_construct_scratchspace;
    for_each(par, 0, N,
        [&expensive_to_construct_scratchspace](int i)
        {
            // all iterations operate on the single shared instance
        });

In this case you'd be responsible for making any operations on the shared 
variable thread safe, however. 

Is that what you need? 

Regards Hartmut
---------------
http://boost-spirit.com
http://stellar.cct.lsu.edu


> 
> 
> anyway,
> thanks again for your attention
> Riccardo
> 
> 
> On Sun, Sep 11, 2016 at 4:48 PM, Hartmut Kaiser <[email protected]>
> wrote:
> Riccardo,
> 
> >         I am writing since I am an OpenMP user, but I am actually quite
> > curious to understand the future directions of C++.
> >
> > My parallel usage is actually relatively trivial, and is covered by
> > OpenMP 2.5 (OpenMP 3.1 with support for iterators would be better, but
> > it is not available in MSVC).
> > 99% of my user needs are about parallel loops, and with C++11 lambdas I
> > could do a lot.
> 
> Right. It is a fairly simple transformation to turn an OpenMP parallel
> loop into the equivalent parallel algorithm. We specifically added
> parallel::for_loop() (not in the Parallelism TS/C++17) to support that
> migration:
> 
>     #pragma omp parallel for
>     for(int i = 0; i != N; ++i)
>     {
>         // some iteration
>     }
> 
> Would be equivalent to
> 
>     hpx::parallel::for_loop(
>         hpx::parallel::par,
>         0, N, [](int i)
>         {
>             // some iteration
>         });
> 
> (for more information about for_loop() see here:
> http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2015/p0075r0.pdf)
> 
> > However I am really not clear on how I should equivalently handle
> > "private" and "firstprivate" of OpenMP, which allow one to create
> > objects that persist in threadprivate memory during the whole length of
> > a for loop.
> > I now use OpenMP 2.5 and I have a code that looks like the following
> >
> > https://kratos.cimne.upc.es/projects/kratos/repository/entry/kratos/kratos/solving_strategies/builder_and_solvers/residualbased_block_builder_and_solver.h
> >
> > which does an OpenMP parallel Finite Element assembly.
> > The code I am thinking of is something like:
> 
> [snipped code]
> 
> > The big question is ... how shall I handle the threadprivate
> > scratchspace in HPX?? Lambdas do not allow me to do this ...
> > That is, what is the equivalent of private and of firstprivate??
> > Thank you in advance for any clarification or pointer to examples
> 
> For 'firstprivate' you can simply use lambda captures:
> 
>     int nelements = 42;
> 
>     for_each(par, 0, N,
>         [nelements](int i)
>         {
>             // the captured 'nelements' is initialized from the outer
>             // variable and each copy of the lambda has its own private
>             // copy
>             //
>             // use private 'nelements' here:
>             cout << nelements << endl;
>         });
> 
> Note that 'nelements' will be const by default. If you want to modify its
> value, the lambda has to be made mutable:
> 
>     int nelements = 42;
> 
>     for_each(par, 0, N,
>         [nelements](int i) mutable // makes captures non-const
>         {
>             ++nelements;
>         });
> 
> Please don't be fooled, however, into thinking that this gives you one
> variable instance per iteration. HPX runs several iterations 'in one go'
> (depending on the partitioning, very much like OpenMP), so you will get
> one variable instance per created partition. As long as you don't modify
> the variable, this shouldn't make a difference.
> 
> Emulating 'private' is even simpler. All you need is a local variable for
> each iteration after all. Thus simply creating it on the stack inside the
> lambda is the solution:
> 
>     for_loop(par, 0, N, [](int i)
>     {
>         // create 'private' variable
>         int my_private = 0;
>         // ...
>     });
> 
> This also gives you a hint on how you can have one instance of your
> variable per iteration and still initialize it like it was firstprivate:
> 
>     int nelements = 42;
>     for_loop(par, 0, N, [nelements](int i)
>     {
>         // create 'private' variable
>         int my_private = nelements;
>         // ...
>         ++my_private;   // modifies instance for this iteration only.
>     });
> 
> Things become a bit more interesting if you need reductions. Please see
> the linked document above for more details, but here is a simple example
> (taken from that paper):
> 
>     float dot_saxpy(int n, float a, float x[], float y[])
>     {
>         float s = 0;
>         for_loop(par, 0, n,
>             reduction(s, 0.0f, std::plus<float>()),
>             [&](int i, float& s_)
>             {
>                 y[i] += a*x[i];
>                 s_ += y[i]*y[i];
>             });
>         return s;
>     }
> 
> Here 's' is the reduction variable, and 's_' is the thread-local
> reference to it.
> 
> HTH
> Regards Hartmut
> ---------------
> http://boost-spirit.com
> http://stellar.cct.lsu.edu
> 
> 
> 
> 
> --
> Riccardo Rossi
> PhD, Civil Engineer
> 
> member of the Kratos Team: www.cimne.com/kratos
> Tenure Track Lecturer at Universitat Politècnica de Catalunya,
> BarcelonaTech (UPC)
> Full Research Professor at International Center for Numerical Methods in
> Engineering (CIMNE)
> 
> C/ Gran Capità, s/n, Campus Nord UPC, Ed. C1, Despatx C9
> 08034 – Barcelona – Spain – www.cimne.com  -
> T.(+34) 93 401 56 96 skype: rougered4
> 
> 
> 

_______________________________________________
hpx-users mailing list
[email protected]
https://mail.cct.lsu.edu/mailman/listinfo/hpx-users
