Gijsbert Wiesenekker wrote:

My MPI program consists of a number of processes that send 0 or more messages 
(using MPI_Isend) to 0 or more other processes. The processes check 
periodically whether messages are available to be processed. It was running fine 
until I increased the message size, and then I ran into deadlock problems. Some googling 
showed I was hitting a classic deadlock scenario (see for example 
http://www.cs.ucsb.edu/~hnielsen/cs140/mpi-deadlocks.html). The suggested workarounds, 
such as changing the order of MPI_Send and MPI_Recv, do not work in my 
case, since one process may not send any message at all to the 
other processes, so MPI_Recv would wait indefinitely.
Any suggestions on how to avoid deadlock in this case?
The problems you describe would seem to arise with the blocking functions MPI_Send and MPI_Recv. With the non-blocking variants MPI_Isend/MPI_Irecv, there shouldn't be this problem. There is no requirement to order the calls the way that web page describes; that workaround applies to the blocking calls. It feels to me that something is missing from your description.

If you know the maximum size any message will be, you can post an MPI_Irecv with wildcard tag and source rank. You can post MPI_Isend calls for whatever messages you want to send. You can use MPI_Test to check whether a message has been received; if so, process it and re-post the MPI_Irecv. You can also use MPI_Test to check whether any of the sends have completed; if so, you can reuse those send buffers. You will need some signal to tell the processes that no further messages will be arriving.
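A minimal sketch of that pattern, assuming an upper bound MAX_MSG_BYTES on the message size; the tags TAG_DATA/TAG_DONE and the termination-by-counting scheme are illustrative choices, not part of your program:

/* Sketch of a polling loop with a wildcard non-blocking receive. */
#include <mpi.h>
#include <stdlib.h>

#define MAX_MSG_BYTES 4096   /* assumed upper bound on message size */
#define TAG_DATA 1
#define TAG_DONE 2           /* "no more messages" signal */

int main(int argc, char **argv)
{
    int rank, nprocs, done_count = 0;
    char recv_buf[MAX_MSG_BYTES];
    MPI_Request recv_req;

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &nprocs);

    /* Post one wildcard receive up front. */
    MPI_Irecv(recv_buf, MAX_MSG_BYTES, MPI_CHAR,
              MPI_ANY_SOURCE, MPI_ANY_TAG, MPI_COMM_WORLD, &recv_req);

    /* Here each rank only sends a zero-length "done" message to every
       other rank.  A real program would first post MPI_Isend for its
       data messages and keep those requests so MPI_Test can tell it
       when the send buffers may be reused. */
    MPI_Request *send_reqs = malloc((nprocs - 1) * sizeof(MPI_Request));
    int nsends = 0;
    for (int dest = 0; dest < nprocs; dest++) {
        if (dest == rank) continue;
        MPI_Isend(NULL, 0, MPI_CHAR, dest, TAG_DONE,
                  MPI_COMM_WORLD, &send_reqs[nsends++]);
    }

    /* Poll for incoming messages in between useful work. */
    while (done_count < nprocs - 1) {
        int flag;
        MPI_Status status;
        MPI_Test(&recv_req, &flag, &status);
        if (flag) {
            if (status.MPI_TAG == TAG_DONE)
                done_count++;
            /* else: process the data in recv_buf here */

            /* Re-post the wildcard receive for the next message. */
            MPI_Irecv(recv_buf, MAX_MSG_BYTES, MPI_CHAR,
                      MPI_ANY_SOURCE, MPI_ANY_TAG, MPI_COMM_WORLD, &recv_req);
        }
        /* ... do other work here ... */
    }

    /* Clean up the sends and the one still-posted receive. */
    MPI_Waitall(nsends, send_reqs, MPI_STATUSES_IGNORE);
    MPI_Cancel(&recv_req);
    MPI_Wait(&recv_req, MPI_STATUS_IGNORE);
    free(send_reqs);
    MPI_Finalize();
    return 0;
}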
