Dear list,
        I am writing as an OpenMP user who is quite curious about the
future directions of C++.

My parallel usage is relatively trivial and is covered by OpenMP 2.5
(OpenMP 3.1, with its support for iterators, would be better, but it is not
available in MSVC).

99% of my needs are about parallel loops, and with C++11 lambdas I
could do a lot.
However, it is really not clear to me how I should handle the equivalent of
OpenMP's "private" and "firstprivate" clauses, which allow creating objects
that persist in thread-private memory for the whole length of a for loop.
I currently use OpenMP 2.5, and my code looks like the following:

https://kratos.cimne.upc.es/projects/kratos/repository/entry/kratos/kratos/solving_strategies/builder_and_solvers/residualbased_block_builder_and_solver.h

which does an OpenMP-parallel finite element assembly.
The code I am thinking of is something like:

        // Some to-be-threadprivate matrices are allocated here. This
        // allocation is very SLOW, so I don't want to do it often.
        LocalSystemMatrixType LHS_Contribution = LocalSystemMatrixType(0, 0);
        LocalSystemVectorType RHS_Contribution = LocalSystemVectorType(0);
        Element::EquationIdVectorType EquationId;

        #pragma omp parallel for firstprivate(nelements, LHS_Contribution, \
                                              RHS_Contribution, EquationId)
        for (int k = 0; k < nelements; k++)
        {
            // The iterator is random access, so this trick gives a one-line
            // loop header that OpenMP 2.5 accepts.
            ModelPart::ElementsContainerType::iterator it = el_begin + k;

            // Calculate the elemental contribution. LHS_Contribution etc. are
            // used as scratch space here, so I don't have to reallocate them.
            pScheme->CalculateSystemContributions(*(it.base()),
                LHS_Contribution, RHS_Contribution, EquationId,
                CurrentProcessInfo);

            // Assemble the elemental contribution. Locks are used to sum the
            // contributions into the sparse matrix A (this appeared to work
            // faster than using atomics).
            Assemble(A, b, LHS_Contribution, RHS_Contribution,
                EquationId, mlock_array);
        }



The big question is: how should I handle the thread-private scratch space
in HPX? Lambdas do not allow this ...
That is, what is the equivalent of private and of firstprivate?
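
To make the question concrete, the semantics I am after can be emulated by
hand in plain C++11 with std::thread, roughly as in the sketch below. This is
not HPX code: parallel_for_firstprivate, ScratchSpace and assemble_demo are
made-up names for illustration only, and the static chunking of the loop is
an assumption. Each worker thread copies the prototype scratch object once
(as firstprivate does) and then reuses that copy for all of its iterations.

```cpp
#include <algorithm>
#include <atomic>
#include <cstddef>
#include <mutex>
#include <thread>
#include <vector>

// Hypothetical helper (not part of HPX or OpenMP): calls body(k, scratch)
// for k in [0, n), giving every worker thread its own copy of `prototype`.
template <typename Scratch, typename Body>
void parallel_for_firstprivate(int n, unsigned num_threads,
                               const Scratch& prototype, Body body)
{
    if (num_threads == 0) num_threads = 1;
    const int nt = static_cast<int>(num_threads);
    const int chunk = (n + nt - 1) / nt;  // static chunking (assumption)
    std::vector<std::thread> workers;
    for (int t = 0; t < nt; ++t) {
        const int begin = std::min(n, t * chunk);
        const int end = std::min(n, begin + chunk);
        workers.emplace_back([begin, end, &prototype, &body] {
            Scratch scratch(prototype);  // copied once per thread
            for (int k = begin; k < end; ++k) body(k, scratch);
        });
    }
    for (auto& w : workers) w.join();
}

// Stand-in for the expensive-to-allocate local matrices (hypothetical type).
// The copy counter only exists to show how often the copy happens.
struct ScratchSpace {
    std::vector<double> lhs;
    static std::atomic<int> copies;
    explicit ScratchSpace(std::size_t size) : lhs(size, 0.0) {}
    ScratchSpace(const ScratchSpace& other) : lhs(other.lhs) { ++copies; }
};
std::atomic<int> ScratchSpace::copies{0};

// Sums 0..n-1 into a shared total; the mutex plays the role of the locks
// used by Assemble in the real code.
double assemble_demo(int n, unsigned num_threads)
{
    ScratchSpace prototype(16);
    double total = 0.0;
    std::mutex total_lock;
    parallel_for_firstprivate(n, num_threads, prototype,
        [&](int k, ScratchSpace& scratch) {
            scratch.lhs[0] = static_cast<double>(k);  // reuse, no reallocation
            std::lock_guard<std::mutex> guard(total_lock);
            total += scratch.lhs[0];
        });
    return total;
}
```

With 4 threads the scratch object is copied 4 times in total, regardless of
the number of elements. A "private" variant would default-construct the
scratch object inside each worker instead of copying a prototype.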

Thank you in advance for any clarification or pointers to examples.

regards
Riccardo

-- 


*Riccardo Rossi*

PhD, Civil Engineer


member of the Kratos Team: www.cimne.com/kratos

Tenure Track Lecturer at Universitat Politècnica de Catalunya,
BarcelonaTech (UPC)

Full Research Professor at International Center for Numerical Methods in
Engineering (CIMNE)


C/ Gran Capità, s/n, Campus Nord UPC, Ed. C1, Despatx C9

08034 – Barcelona – Spain – www.cimne.com  -

T.(+34) 93 401 56 96 skype: *rougered4*



_______________________________________________
hpx-users mailing list
hpx-users@stellar.cct.lsu.edu
https://mail.cct.lsu.edu/mailman/listinfo/hpx-users
