Hi Ralph,
Thanks for your response. My problem is removing all leaf nodes from a
directed graph, which is distributed among a number of processes. Each
process iterates over its portion of the graph, and if a node is a
leaf (indegree(n) == 0 || outdegree(n) == 0), it removes the node
(which is a local operation) and notifies each of its neighbours
(using MPI_Ibsend) to remove any edges incident to the removed node.
If a neighbouring node becomes a leaf as a result, it is removed as
well, and the removal cascades. I use the following algorithm to
detect when this cascade is complete:
loop {
    MPI_Ibsend (for every edge of every removed leaf node)
    MPI_Barrier
    MPI_Iprobe/MPI_Recv (until no messages are pending)
    MPI_Allreduce (number of nodes removed)
} until (no nodes removed by any process)
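
In actual C the loop looks roughly like this. It's only a sketch: the
graph storage, the leaf scan, and the edge-removal handler are
application-specific, so remove_local_leaves() and
handle_edge_removal() are just placeholders, and the
MPI_Buffer_attach / request-completion housekeeping that MPI_Ibsend
needs is left out.

/* Termination loop for the leaf-removal cascade (sketch only). */
#include <mpi.h>

#define TAG_REMOVE_EDGE 1

/* Placeholder: remove every local leaf, MPI_Ibsend one notification
 * per remote edge of each removed node, and return the number of
 * nodes removed in this pass. */
int remove_local_leaves(void);

/* Placeholder: drop the edges incident to the given remote node;
 * this may turn a local node into a leaf, which the next pass of
 * remove_local_leaves() then picks up. */
void handle_edge_removal(int node_id);

void prune_leaves(MPI_Comm comm)
{
    int removed_anywhere;

    do {
        /* 1. Remove local leaves and notify the owners of their
         *    remote neighbours. */
        int removed_here = remove_local_leaves();

        /* 2. Make sure every process has issued its notifications
         *    before anyone starts draining. */
        MPI_Barrier(comm);

        /* 3. Drain the notifications that have arrived so far. */
        int flag = 1;
        while (flag) {
            MPI_Status st;
            MPI_Iprobe(MPI_ANY_SOURCE, TAG_REMOVE_EDGE, comm,
                       &flag, &st);
            if (flag) {
                int node_id;
                MPI_Recv(&node_id, 1, MPI_INT, st.MPI_SOURCE,
                         TAG_REMOVE_EDGE, comm, MPI_STATUS_IGNORE);
                handle_edge_removal(node_id);
            }
        }

        /* 4. Keep looping until no process removed anything. */
        MPI_Allreduce(&removed_here, &removed_anywhere, 1, MPI_INT,
                      MPI_SUM, comm);
    } while (removed_anywhere > 0);
}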
Previously, I had tried a version without the MPI_Barrier, relying on
the MPI_Allreduce alone:
loop {
    MPI_Iprobe/MPI_Recv (until no messages are pending)
    MPI_Ibsend (for every edge of every removed leaf node)
    MPI_Allreduce (number of nodes removed)
} until (no nodes removed by any process)
This latter algorithm did not complete correctly. Now that I've
written the algorithm out in pseudo-code, it looks a little clearer:
there must be a race between the MPI_Ibsend calls and the
MPI_Iprobe/MPI_Recv drain, since a notification can still be in
flight when the receiver probes and finds nothing pending. I wonder
whether using MPI_Irecv instead would clear it up.
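
What I have in mind is keeping one receive pre-posted and testing it
instead of probing, roughly like the sketch below (same placeholder
tag and handler as above; the caller posts the first MPI_Irecv and
cancels the last outstanding one once the whole algorithm finishes).

/* Sketch of the MPI_Irecv variant of the drain step. */
#include <mpi.h>

#define TAG_REMOVE_EDGE 1

void handle_edge_removal(int node_id);   /* placeholder, as above */

void drain_notifications(MPI_Comm comm, MPI_Request *req, int *buf)
{
    int flag = 1;
    while (flag) {
        /* Completes only if a notification has already arrived. */
        MPI_Test(req, &flag, MPI_STATUS_IGNORE);
        if (flag) {
            handle_edge_removal(*buf);
            /* Re-post so the next notification has somewhere to land. */
            MPI_Irecv(buf, 1, MPI_INT, MPI_ANY_SOURCE,
                      TAG_REMOVE_EDGE, comm, req);
        }
    }
}

Though I'm not sure this really changes the timing, since MPI_Test,
like MPI_Iprobe, only reports what has already arrived.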
Cheers,
Shaun
Ralph Castain wrote:
I think perhaps you folks are all caught up a tad too much in the
standard and not reading the intent of someone's question... :-)
I believe the original question was concerned with ensuring that all
procs had completed MPI_Allreduce before his algorithm attempted other
operations. As you folks know, procs can leave MPI_Allreduce at
significantly different times. Using an MPI_Barrier after
MPI_Allreduce would accomplish the questioner's objective.
Whether or not the questioner's particular program really -needs- to
do that is another matter - one I personally wouldn't attempt to
answer without knowing a lot more about what that next step after
MPI_Allreduce does.