On Sep 28, 2013, at 7:59 PM, Jed Brown <[email protected]> wrote:
> "Kirk, Benjamin (JSC-EG311)" <[email protected]> writes:
>
>> MPI_Win_create (&recv_buf[0], recv_buf.size()*sizeof(unsigned int),
>>                 sizeof(unsigned int),
>>                 MPI_INFO_NULL, MPI_COMM_WORLD,
>>                 &win);
>
> In practice, this is implemented using something like an MPI_Allreduce
> with 3*int*comm_size data.

Thanks, that's helpful!

>> for (unsigned int proc=0; proc<numprocs; proc++)
>>   MPI_Put (&procid, 1, MPI_UNSIGNED,
>>            proc, procid, 1, MPI_UNSIGNED,
>>            win);
>>
>> MPI_Win_fence (0, win);
>
> If you don't send to all the comms, how will the receiver know which
> slots have been written to?  Or is your idea to initialize with
> known-invalid data and then scan through it to find the now-valid data?

The idea would be to initialize the buffer underlying the window with 0;
any nonzero values on the receiver would then have been sent by a remote
rank through MPI_Put().

> Given the quite high cost of creating the window, I think you're
> definitely better off using MPI_Reduce_scatter from MPI-2 (or the MPI-3
> nonblocking consensus algorithm implemented in PetscCommBuildTwoSided)
> to determine who you need to receive from, then using point-to-point.
> If you have the window and rendezvous worked out, one-sided can be
> faster on some hardware, but point-to-point is still very good and it's
> not going to be your bottleneck if it's used well.

Thanks - the idea is to use point-to-point, but without relying on
zero-size messages to indicate that ranks do not need to communicate.
Alternatively, an Alltoall() could send the sizes so everyone knows who
to receive from, but I was thinking this approach (using a window) might
be faster, particularly if the window is created once and reused in this
way for multiple communication operations.  But perhaps not…

If the gov't shuts down next week I might have some time to play with
these options on a few platforms.
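
For concreteness, here is a rough serial model of the zero-initialized
window scan described above.  It makes no MPI calls: each "window" is just
a vector standing in for the buffer a rank would expose through
MPI_Win_create, and the write loop stands in for MPI_Put followed by
MPI_Win_fence.  The names exchange_flags, senders_for, send_to and Window
are hypothetical, chosen only for illustration.  One deliberate change from
the snippet quoted above: the sketch stores sender+1 instead of the
sender's rank, because rank 0 would otherwise write a 0 that cannot be
told apart from an untouched slot.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Serial stand-in for one rank's window: a zero-initialized buffer with
// one slot per rank (hypothetical type name, no MPI involved).
using Window = std::vector<unsigned int>;

// Models the MPI_Put loop plus the closing fence.  send_to[s] lists the
// ranks that rank s intends to send to (assumed input for this sketch).
std::vector<Window>
exchange_flags(const std::vector<std::vector<unsigned int>> &send_to)
{
  const std::size_t numprocs = send_to.size();
  std::vector<Window> windows(numprocs, Window(numprocs, 0u));

  for (std::size_t sender = 0; sender < numprocs; ++sender)
    for (unsigned int target : send_to[sender])
      {
        assert(target < numprocs);
        // Store sender+1, not sender, so that rank 0 remains
        // distinguishable from a slot nobody wrote to.
        windows[target][sender] = static_cast<unsigned int>(sender) + 1u;
      }

  return windows;
}

// Receiver-side scan: each nonzero slot names a rank that wrote to us.
std::vector<unsigned int> senders_for(const Window &window)
{
  std::vector<unsigned int> senders;
  for (unsigned int value : window)
    if (value != 0u)
      senders.push_back(value - 1u);
  return senders;
}
```

In a real exchange each rank would only scan its own window after the
fence, then post one receive per entry returned by the scan; the window
would also need to be reset to 0 before being reused for the next
operation.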

-Ben

_______________________________________________
Libmesh-devel mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/libmesh-devel
