Hi Devs,

When calling MPI_Reduce in a loop (reducing onto rank 0), is it correct to
understand that ranks other than the root (0 in this case) return from the
collective as soon as their data has been written to MPI buffers, without
waiting for all contributions to be received at the root?

If that is the case, what would happen (semantically) if we execute
MPI_Reduce in a loop without a barrier, so that non-root ranks can enter the
collective multiple times while the root is still processing an earlier
reduce? For example, the root could be in the first reduce invocation while
another rank is already in the second reduce invocation.

Thank you,

Saliya Ekanayake, Ph.D
Postdoctoral Scholar
Performance and Algorithms Research (PAR) Group
Lawrence Berkeley National Laboratory
Phone: 510-486-5772