Hi Ralph,

Thanks for your response. My problem is removing all leaf nodes from a directed graph that is distributed across a number of processes. Each process iterates over its portion of the graph; if a node is a leaf (indegree(n) == 0 || outdegree(n) == 0), the process removes it (a purely local operation) and notifies each of the node's neighbours (using MPI_Ibsend) to remove any edges incident to the removed node. If a neighbouring node becomes a leaf as a result, it is removed in turn and the process cascades. I use the following algorithm to decide when this cascade is complete:

loop {
    MPI_Ibsend            (one message for every edge of every removed leaf node)
    MPI_Barrier
    MPI_Iprobe/MPI_Recv   (until no messages are pending)
    MPI_Allreduce         (number of nodes removed this iteration)
} until (no nodes removed by any process)
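
In C, one iteration looks roughly like this (only a sketch: remove_local_leaves_and_notify and apply_remote_removal are placeholders for my local graph code, and both return the number of nodes they removed):

#include <mpi.h>

/* Placeholders for my local graph code.  remove_local_leaves_and_notify()
   also does the MPI_Ibsend calls (the buffered-send buffer is attached
   elsewhere with MPI_Buffer_attach). */
int remove_local_leaves_and_notify(MPI_Comm comm);
int apply_remote_removal(long removed_node);

/* One iteration of the loop above; returns the total number of nodes
   removed across all processes, so 0 means the cascade has finished. */
int removal_round(MPI_Comm comm)
{
    int local_removed = 0;

    /* Remove local leaves and notify neighbouring processes of every
       removed node that has an edge crossing the process boundary. */
    local_removed += remove_local_leaves_and_notify(comm);

    /* Make sure every process has issued its sends for this round. */
    MPI_Barrier(comm);

    /* Drain pending notifications and delete the corresponding edges;
       nodes that become leaves here are picked up in the next round. */
    int pending = 1;
    while (pending) {
        MPI_Status st;
        MPI_Iprobe(MPI_ANY_SOURCE, MPI_ANY_TAG, comm, &pending, &st);
        if (pending) {
            long removed_node;
            MPI_Recv(&removed_node, 1, MPI_LONG, st.MPI_SOURCE, st.MPI_TAG,
                     comm, MPI_STATUS_IGNORE);
            local_removed += apply_remote_removal(removed_node);
        }
    }

    /* Agree on whether anything was removed anywhere this round. */
    int global_removed = 0;
    MPI_Allreduce(&local_removed, &global_removed, 1, MPI_INT, MPI_SUM, comm);
    return global_removed;
}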

Previously, I attempted to use a single MPI_Allreduce without the MPI_Barrier:

loop {
    MPI_Iprobe/MPI_Recv   (until no messages are pending)
    MPI_Ibsend            (one message for every edge of every removed leaf node)
    MPI_Allreduce         (number of nodes removed this iteration)
} until (no nodes removed by any process)

This latter algorithm did not terminate correctly. Now that I've written it out in pseudo-code, the problem looks a little clearer: there must be a race between the MPI_Ibsend on one process and the MPI_Iprobe on another, since a notification can still be in flight when the receiver probes and finds nothing pending. I wonder if using MPI_Irecv would clear it up.
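
By MPI_Irecv I mean something along these lines (only a sketch, I haven't tried it; apply_remote_removal is the same placeholder as above):

#include <mpi.h>

int apply_remote_removal(long removed_node);   /* same placeholder as above */

/* Keep one receive posted at all times and test it, instead of probing. */
void drain_with_irecv(MPI_Comm comm)
{
    long removed_node;
    MPI_Request req;
    MPI_Status st;
    int done = 0;

    MPI_Irecv(&removed_node, 1, MPI_LONG, MPI_ANY_SOURCE, MPI_ANY_TAG,
              comm, &req);
    for (;;) {
        MPI_Test(&req, &done, &st);
        if (!done)
            break;                       /* nothing delivered right now */
        apply_remote_removal(removed_node);
        MPI_Irecv(&removed_node, 1, MPI_LONG, MPI_ANY_SOURCE, MPI_ANY_TAG,
                  comm, &req);
    }
    MPI_Cancel(&req);                    /* retire the still-posted receive */
    MPI_Wait(&req, &st);
}

Although, writing it out, I suspect the same window remains: a notification can still be in flight when MPI_Test runs, so a pre-posted receive alone probably isn't enough.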

Cheers,
Shaun

Ralph Castain wrote:
I think perhaps you folks are all caught up a tad too much in the standard and not reading the intent of someone's question... :-)

I believe the original question was concerned with ensuring that all procs had completed MPI_Allreduce before his algorithm attempted other operations. As you folks know, procs can leave MPI_Allreduce at significantly different times. Using an MPI_Barrier after MPI_Allreduce would accomplish the questioner's objective.

Whether or not the questioner's particular program really -needs- to do that is another matter - one I personally wouldn't attempt to answer without knowing a lot more about what that next step after MPI_Allreduce does.
