Hi all,

I am with the ZIH developers working on VampirTrace and just discovered a possibly erroneous behavior of OpenMPI (v1.4.3). I am canceling an active persistent request created with MPI_Ssend_init; in the subsequent MPI_Wait call the process hangs, even though according to the MPI standard this should never happen. The pseudocode is as follows:

    if (rank == 0)
        MPI_Ssend_init (&buf, 1, MPI_INT, 1, 666, MPI_COMM_WORLD, &r);
    if (rank == 1)
        MPI_Recv_init (&buf, 1, MPI_INT, 0, 666, MPI_COMM_WORLD, &r);

    //Start
    MPI_Start (&r);
    //Cancel
    MPI_Cancel (&r);
    //Wait
    MPI_Wait (&r, &status);
    //Free
    MPI_Request_free (&r);

The full source code (a minimal reproducer) along with a dump of ompi_info is attached.

Either I am missing some passage of the standard mentioning that it is forbidden to cancel a synchronous send, or there appears to be an error in OpenMPI's implementation. If it is already fixed, sorry for the spam.

(Note: changing the Ssend to Send or Bsend removes the hang.)

-Tobias
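For reference, the pseudocode above could be expanded into a complete program roughly as follows. This is only a sketch and not the attached ssend_init_cancel.c: the process-count check, the MPI_Test_cancelled call, and the printed message are assumptions added for illustration, and it expects to be launched on at least two processes.

```c
#include <mpi.h>
#include <stdio.h>

int main(int argc, char **argv)
{
    int rank, size;
    int buf = 0;
    int cancelled = 0;
    MPI_Request r = MPI_REQUEST_NULL;
    MPI_Status status;

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &size);

    /* Assumption: the sketch needs ranks 0 and 1 to exist. */
    if (size < 2) {
        if (rank == 0)
            fprintf(stderr, "run with at least 2 processes\n");
        MPI_Finalize();
        return 1;
    }

    /* Create the persistent requests, as in the pseudocode. */
    if (rank == 0)
        MPI_Ssend_init(&buf, 1, MPI_INT, 1, 666, MPI_COMM_WORLD, &r);
    else if (rank == 1)
        MPI_Recv_init(&buf, 1, MPI_INT, 0, 666, MPI_COMM_WORLD, &r);

    /* Only ranks 0 and 1 own a request; other ranks skip this part. */
    if (rank < 2) {
        MPI_Start(&r);            /* activate the persistent request */
        MPI_Cancel(&r);           /* request cancellation */
        MPI_Wait(&r, &status);    /* the reported hang occurs in this call */

        /* Added for illustration: check whether the cancel took effect. */
        MPI_Test_cancelled(&status, &cancelled);
        printf("rank %d: wait returned, cancelled=%d\n", rank, cancelled);

        MPI_Request_free(&r);     /* release the now inactive request */
    }

    MPI_Finalize();
    return 0;
}
```

Built with mpicc and launched on two processes (for example `mpirun -np 2 ./a.out`), both ranks should return from MPI_Wait if the cancel is handled as the standard describes.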
Attachments: ssend_init_cancel.c, ssend_init_cancel.ompi_info
--
Dipl.-Inf. Tobias Hilbrich
Wissenschaftlicher Mitarbeiter (Research Associate)
Technische Universitaet Dresden
Zentrum fuer Informationsdienste und Hochleistungsrechnen (ZIH)
(Center for Information Services and High Performance Computing (ZIH))
Interdisziplinäre Anwenderunterstützung und Koordination
(Interdisciplinary Application Development and Coordination)
01062 Dresden
Tel.: +49 (351) 463-32041
Fax: +49 (351) 463-37773