Thank you very much Bruno and Martin for your very helpful replies. We will benchmark these things when we try them out.
best
praveen

> On 01-Oct-2021, at 9:27 PM, Martin Kronbichler <[email protected]> wrote:
>
> Dear Praveen,
>
> In addition to what Bruno said, I would add that I would benchmark the
> respective costs carefully before putting this into the code. As you
> observed, it requires some additional steps to figure out what work can
> reasonably be overlapped. We have this in the matrix-free framework, where
> there was enough interest, but not for the rest (where we often did not
> have overly performance-sensitive applications).
>
> But even if you are extremely performance-sensitive, I would not
> immediately expect a big benefit. My observation has been that it may bring
> a few percent, but not big factors, and much less than what "general
> wisdom" seems to indicate (you can hardly find papers that quantify the
> benefit in a reproducible way). I guess networks have improved quite a bit
> since that wisdom was shaped. Done naively, especially on CPU systems, it
> can even cause slowdowns: when the bulk of the MPI communication stays
> within a node (rather than going across nodes), it is typically the memory
> bandwidth that is limiting. You would hence just shift that cost to another
> stage of your code, at the expense of fooling the prefetchers with a
> non-standard loop over the cells. I found that this can cost more than it
> helps in some contexts. Things are different when you have lots of
> inter-node communication (say, over InfiniBand) and can do more useful work
> in the meantime, or when you have GPUs where the computations are quicker
> in general. But the biggest success stories I have seen are really
> complicated to implement, needing separate cores for the MPI communication
> as opposed to the other "worker threads", among other tricks. Again, I am
> not disputing that it can be worth it, but I would first check whether this
> is one of the low-hanging fruits or just disappears in the noise.
>
> Best,
> Martin
>
> On 01.10.21 14:46, Bruno Turcksin wrote:
>> Praveen,
>>
>> We do something like that in the CUDA MatrixFree framework. It is slightly
>> more complicated there because we need to update the ghost values on both
>> the source vector and the destination vector. The idea is to first loop
>> over the mesh and store all the vertices that are ghosted; then, when you
>> loop over the active cells, you just need to check whether any of the
>> cell's vertices is ghosted. The relevant code is here:
>> https://github.com/dealii/dealii/blob/master/include/deal.II/matrix_free/cuda_matrix_free.templates.h#L1012-L1057
>>
>> Best,
>>
>> Bruno
>>
>> On Thursday, September 30, 2021 at 11:48:13 PM UTC-4 Praveen C wrote:
>> Dear all
>>
>> I use la::d::Vector in a continuous nodal FE code. I want to interleave
>> the ghost update with assembly so that I don't have to wait for the ghost
>> update to finish:
>>
>> solution.update_ghost_values_start()
>> Assemble on all locally owned cells that do not need ghost values
>> solution.update_ghost_values_finish()
>> Assemble on all locally owned cells that need ghost values
>>
>> I use WorkStream for assembly.
>>
>> I have to create a FilteredIterator that identifies all cells that do not
>> need ghost dof values.
>>
>> One way is to check all dofs on a cell to see if all of them are locally
>> owned.
>>
>> Is there any other way to create such a filter?
>>
>> I have to use two WorkStreams here; the second one will have less work to
>> do. Is there a way to avoid using two WorkStreams?
>>
>> Thanks
>> praveen

--
The deal.II project is located at http://www.dealii.org/
For mailing list/forum options, see https://groups.google.com/d/forum/dealii?hl=en
---
You received this message because you are subscribed to the Google Groups "deal.II User Group" group.
To unsubscribe from this group and stop receiving emails from it, send an email to [email protected].
To view this discussion on the web visit https://groups.google.com/d/msgid/dealii/4EA213F4-BE77-4800-A1BE-E3B6D5184C83%40gmx.net.
