So I finally started to look into these… Pretty powerful for the unknown 
communication partner problem.

Check out the attached.  Note that the MPI_Put() is inside a loop over all 
processors, but it need not be.

There are a lot of places we *know* we don't have any data to send to another 
processor, but the other processor does not know that until we send a 0-sized 
message.

This allows an alternative:

 - On all procs, initialize recv_sizes(nprocs,0)
 - MPI_Put() our send size only to those processors we have data to send to.
 - Loop over the nonzero sizes from the previous step and post the matching 
   receives.


This can get the number of sends to scale independently of the number of 
processors.  So long as we are using a recv buffer like the one above, the 
memory will still scale with the number of processors, albeit only as a 
vector<unsigned int>(nprocs).  To eliminate that we'd need something like the 
nonblocking consensus algorithm Jed pointed me to & I sent out a while back.

What do you think?

-Ben

Attachment: one_sided.C
Description: one_sided.C

_______________________________________________
Libmesh-devel mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/libmesh-devel
