Hi Devs,

When calling MPI_Reduce in a loop (collecting on rank 0), is it correct that ranks other than the root (0 in this case) will return from the collective as soon as their data has been written to MPI buffers, without waiting for all contributions to be received at the root?
If that's the case, what would happen (semantically) if we execute MPI_Reduce in a loop without a barrier, allowing non-root ranks to enter the collective multiple times while the root is still processing an earlier reduce? For example, the root could be in the first reduce invocation while another rank is already in the second.

Thank you,
Saliya

--
Saliya Ekanayake, Ph.D
Postdoctoral Scholar
Performance and Algorithms Research (PAR) Group
Lawrence Berkeley National Laboratory
Phone: 510-486-5772
_______________________________________________
devel mailing list
devel@lists.open-mpi.org
https://lists.open-mpi.org/mailman/listinfo/devel