If this changes the behavior of MPI_Abort so that it aborts only the processes in the specified communicator, how does it not affect the default user experience (today it aborts everything)?
If we accept that MPI_Abort will abort only the processes in the current communicator, what happens to the other processes in the same MPI_COMM_WORLD (those not in the communicator passed to MPI_Abort)? What about all the other connected processes (based on connectivity as defined in Section 10.5.4 of the MPI standard)? Do they see this as a fault?

  george.

On Jun 9, 2011, at 16:32, Josh Hursey wrote:

> WHAT: Fix missing code in MPI_Abort
>
> WHY: MPI_Abort is missing logic to ask for termination of the process
> group defined by the communicator
>
> WHERE: Mostly orte/mca/errmgr
>
> WHEN: Open MPI trunk
>
> TIMEOUT: Tuesday, June 14, 2011 (after teleconf)
>
> Details:
> -------------------------------------------
> A bitbucket branch is available here (last sync to r24757 of trunk)
> https://bitbucket.org/jjhursey/ompi-abort/
>
> In the MPI Standard (v2.2) Section 8.7 after the introduction of
> MPI_Abort, it states:
> "This routine makes a best attempt to abort all tasks in the group of comm."
>
> Open MPI currently only calls orte_errmgr.abort() to abort the calling
> process itself. The code to ask for the abort of the other processes
> in the group defined by the communicator is commented out. Since one
> process calling abort currently causes all processes in the job to
> abort, it has not been a big deal. However as the group starts
> exploring better resilience in the OMPI layer (with further support
> from the ORTE layer) this aspect of MPI_Abort will become more
> necessary to get right.
>
> This branch adds back the logic necessary for a single process calling
> MPI_Abort to request, from ORTE errmgr, that a defined subgroup of
> processes be aborted. Once the request is sent to the HNP, the local
> process then calls abort on itself. The HNP requests that the defined
> subgroup of processes be terminated using the existing plm mechanisms
> for doing so.
>
> This change has no effect on the current default user experienced
> behavior of MPI_Abort.
>
> --
> Joshua Hursey
> Postdoctoral Research Associate
> Oak Ridge National Laboratory
> http://users.nccs.gov/~jjhursey
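
To make the question above concrete, here is a minimal sketch in plain C that models the two behaviors under discussion: aborting the whole job (what users see today) versus terminating only the group of the communicator passed to MPI_Abort. It is a toy model, not Open MPI or ORTE code; the names abort_policy, in_group and mark_terminated are hypothetical, and processes are represented only by their MPI_COMM_WORLD ranks.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical policies for illustration; not part of Open MPI or ORTE. */
typedef enum {
    ABORT_WHOLE_JOB,  /* one abort takes down every process in the job */
    ABORT_GROUP_ONLY  /* only members of the communicator's group are terminated */
} abort_policy;

/* Return true if world rank `rank` appears in `group`. */
static bool in_group(int rank, const int *group, size_t group_len)
{
    for (size_t i = 0; i < group_len; i++) {
        if (group[i] == rank) {
            return true;
        }
    }
    return false;
}

/*
 * Fill terminated[0 .. world_size-1] for the given policy and return the
 * number of surviving processes. `group` lists the world ranks that belong
 * to the communicator handed to the abort call.
 */
static int mark_terminated(abort_policy policy, int world_size,
                           const int *group, size_t group_len,
                           bool *terminated)
{
    int survivors = 0;
    for (int r = 0; r < world_size; r++) {
        terminated[r] = (policy == ABORT_WHOLE_JOB)
                        || in_group(r, group, group_len);
        if (!terminated[r]) {
            survivors++;
        }
    }
    return survivors;
}
```

With a world of four ranks and a group of {1, 3}, ABORT_WHOLE_JOB leaves no survivors, while ABORT_GROUP_ONLY leaves ranks 0 and 2 running. Those survivors are exactly the processes the questions above are about: the model says nothing about whether they should observe the abort as a fault.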