So I finally started to look into these... Pretty powerful for the unknown communication partner problem.
Check out the attached. Note that the MPI_Put() is inside a loop over all processors, but it need not be. There are a lot of places where we *know* we don't have any data to send to another processor, but the other processor does not know that until we send it a 0-sized message. This allows an alternative:

- On all procs, initialize recv_sizes(nprocs,0).
- MPI_Put() our send size only to those processors we have data to send to.
- Loop over the nonzero sizes from the previous step and do the receives.

This can get the number of sends to scale independently of the number of processors. So long as we are using a recv buffer like the one above, the memory will still scale with the number of processors, albeit that is only a vector<unsigned int>(nprocs). To eliminate that we'd need something like the nonblocking consensus algorithm Jed pointed me to and that I sent out a while back.

What do you think?

-Ben
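For anyone who wants to poke at the pattern without an MPI install, here is a rough sketch of the three steps, simulated in-process. It is not the attached one_sided.C and makes no real MPI calls: the `Windows` type, `exchange_sizes()` and `pending_receives()` are hypothetical names made up for illustration, and the plain vector write stands in for what would be an MPI_Put() into the target's window between two synchronization calls such as MPI_Win_fence().

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <utility>
#include <vector>

// Hypothetical stand-in for the per-rank one-sided window:
// windows[r] plays the role of the recv_sizes vector exposed by rank r.
typedef std::vector<std::vector<unsigned int> > Windows;

// sends[src] maps destination rank -> number of items src wants to send.
// Only destinations with data appear, so the number of "puts" issued by
// a rank does not depend on nprocs.
Windows exchange_sizes(const std::vector<std::map<std::size_t, unsigned int> >& sends)
{
  const std::size_t nprocs = sends.size();

  // Step 1: on all procs, initialize recv_sizes(nprocs, 0).
  Windows recv_sizes(nprocs, std::vector<unsigned int>(nprocs, 0));

  // Step 2: each rank writes its send size into slot [src] of the
  // destination's window. In real code this write would be an MPI_Put().
  for (std::size_t src = 0; src < nprocs; ++src)
    {
      std::map<std::size_t, unsigned int>::const_iterator it;
      for (it = sends[src].begin(); it != sends[src].end(); ++it)
        {
          assert(it->first < nprocs);
          recv_sizes[it->first][src] = it->second;
        }
    }

  return recv_sizes;
}

// Step 3: scan one rank's window for nonzero sizes; each (source, size)
// pair is a receive that rank still has to post.
std::vector<std::pair<std::size_t, unsigned int> >
pending_receives(const std::vector<unsigned int>& my_recv_sizes)
{
  std::vector<std::pair<std::size_t, unsigned int> > result;
  for (std::size_t src = 0; src < my_recv_sizes.size(); ++src)
    if (my_recv_sizes[src] != 0)
      result.push_back(std::make_pair(src, my_recv_sizes[src]));
  return result;
}
```

Note that the scan in step 3 is still O(nprocs) per rank, which is the same memory and loop cost mentioned above; only the number of messages shrinks.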
one_sided.C
_______________________________________________ Libmesh-devel mailing list [email protected] https://lists.sourceforge.net/lists/listinfo/libmesh-devel
