Hi all,

I am with the ZIH developers working on VampirTrace and have just discovered possibly erroneous behavior in OpenMPI (v1.4.3). I cancel an active persistent request created with MPI_Ssend_init, and the process then hangs in the subsequent MPI_Wait call. According to the MPI standard this should never happen, since a wait on a request that has been marked for cancellation is guaranteed to return.

The pseudo code is as follows:
if (rank == 0)
    MPI_Ssend_init (&buf, 1, MPI_INT, 1, 666, MPI_COMM_WORLD, &r);
if (rank == 1)
    MPI_Recv_init (&buf, 1, MPI_INT, 0, 666, MPI_COMM_WORLD, &r);

// Start
MPI_Start (&r);

// Cancel
MPI_Cancel (&r);

// Wait (this is the call that hangs)
MPI_Wait (&r, &status);

// Free
MPI_Request_free (&r);

The full (minimal reproducer) source code along with a dump of ompi_info is attached.

Either I am missing some passage of the standard that forbids canceling a synchronous send, or there is an error in OpenMPI's implementation. If it is already fixed, sorry for the spam.
(Note: changing the Ssend to Send or Bsend removes the hang)

-Tobias
 

Attachment: ssend_init_cancel.c

Attachment: ssend_init_cancel.ompi_info


--
Dipl.-Inf. Tobias Hilbrich
Wissenschaftlicher Mitarbeiter

Technische Universitaet Dresden
Zentrum fuer Informationsdienste und Hochleistungsrechnen (ZIH)
(Center for Information Services and High Performance Computing (ZIH))
Interdisziplinäre Anwenderunterstützung und Koordination
(Interdisciplinary Application Development and Coordination)
01062 Dresden
Tel.: +49 (351) 463-32041
Fax: +49 (351) 463-37773
