Re: [OMPI devel] RFC: Fix missing code in MPI_Abort functionality

George Bosilca Thu, 9 Jun 2011 18:13:37 -0400

What I'm saying is that there is no reason to have any other type of MPI_Abort 
if we are not able to compute the set of connected processes.


With this RFC, the processes in the communicator passed to MPI_Abort will abort. 
The other processes in the same MPI_COMM_WORLD (in fact, the same jobid) will 
then be notified (assuming that ORTE makes no distinction between aborted and 
faulty processes). As a result, the entire MPI_COMM_WORLD will be aborted, if we 
consider a sane application where everyone uses the same type of error handler. 
However, this is not enough. We have to distribute the abort signal to every 
other "connected" process, and I don't see how we can compute this list of 
connected processes in Open MPI today. It is not that I don't see it in your 
patch; it is that the definition of connectivity in the MPI standard is 
transitive and relies heavily on a correct implementation of MPI_Comm_disconnect.
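
To make the transitivity point concrete, here is a minimal sketch of how a set 
of connected processes could be computed. This is not Open MPI code: the 
conn_* names, the MAX_PROCS bound, and the use of plain integer process ids are 
all assumptions made for illustration. It uses a union-find structure, where 
each call to conn_link() records that two processes share a communicator.

```c
#include <assert.h>
#include <stdbool.h>

#define MAX_PROCS 64  /* hypothetical upper bound, chosen for this sketch */

/* Union-find over process ids 0..count-1 (hypothetical ids, not ORTE names). */
typedef struct {
    int parent[MAX_PROCS];
    int count;
} conn_set_t;

/* Start with every process connected only to itself. */
static void conn_init(conn_set_t *s, int nprocs)
{
    assert(nprocs > 0 && nprocs <= MAX_PROCS);
    s->count = nprocs;
    for (int i = 0; i < nprocs; i++) {
        s->parent[i] = i;
    }
}

/* Return the representative of p's connected set (with path halving). */
static int conn_find(conn_set_t *s, int p)
{
    assert(p >= 0 && p < s->count);
    while (s->parent[p] != p) {
        s->parent[p] = s->parent[s->parent[p]];
        p = s->parent[p];
    }
    return p;
}

/* Record that a and b share a communicator; this merges their sets. */
static void conn_link(conn_set_t *s, int a, int b)
{
    int ra = conn_find(s, a);
    int rb = conn_find(s, b);
    if (ra != rb) {
        s->parent[ra] = rb;
    }
}

/* True if a and b are connected, directly or through other processes. */
static bool conn_connected(conn_set_t *s, int a, int b)
{
    return conn_find(s, a) == conn_find(s, b);
}

/* Fill out[] with every process connected to 'aborter' (itself included),
 * in increasing id order; returns how many were written. */
static int conn_abort_set(conn_set_t *s, int aborter, int *out)
{
    int n = 0;
    int root = conn_find(s, aborter);
    for (int i = 0; i < s->count; i++) {
        if (conn_find(s, i) == root) {
            out[n++] = i;
        }
    }
    return n;
}
```

For example, after conn_link(0, 1) and conn_link(1, 2), process 2 belongs to the 
abort set of process 0 even though the two never shared a communicator. Note 
that this sketch only ever adds links. Removing one, which is what 
MPI_Comm_disconnect would require, is not supported by a plain union-find and 
would mean rebuilding the sets from the remaining communicators.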

  george.

On Jun 9, 2011, at 16:59 , Josh Hursey wrote:

> On Thu, Jun 9, 2011 at 4:47 PM, George Bosilca <bosi...@eecs.utk.edu> wrote:
>> If this changes the behavior of MPI_Abort to abort only the processes in the 
>> specified communicator, how does this not affect the default user experience 
>> (when today it aborts everything)?
> 
> Open MPI does abort everything by default - decided by the runtime at
> the moment (but addressed in your RFC). So it does not matter if one
> process aborts or if many do. So the behavior of MPI_Abort experienced
> by the user will not change. Effectively the only change is an extra
> message in the runtime before the process actually calls
> errmgr.abort().
> 
> This branch just makes the implementation complete by first telling
> ORTE that a group of processes, defined by the communicator, should be
> terminated along with the calling process. Currently ORTE notices that
> there was an abort, and terminates the job. Once your RFC goes through
> then this may no longer be the case, and OMPI can determine what to do
> when it receives a process failure notification.
> 
>> 
>> If we accept the fact that MPI_Abort will only abort the processes in the 
>> current communicator, what happens to the other processes in the same 
>> MPI_COMM_WORLD (but not in the communicator that was passed to MPI_Abort)?
> 
> Currently, ORTE will abort them as well. When your RFC goes through
> then the OMPI layer will be notified of the error and can take the
> appropriate action, as determined by the MPI standard.
> 
>> What about all the other connected processes (based on the connectivity as 
>> defined in the MPI standard in Section 10.5.4)? Do they see this as a fault?
> 
> They are informed of the fault via the ORTE errmgr callback routine
> (that we have an RFC for), and then can take the appropriate action
> based on MPI semantics. So we are pushing the decision of the
> implication of the fault to the OMPI layer - where it should be.
> 
> 
> The remainder of the OMPI layer logic for MPI_ERRORS_RETURN and other
> connected error management scenarios is not included in this patch
> since that depends on there being a callback to the OMPI layer - which
> does not exist just yet. So this is a small patch to wire in the ORTE
> piece that allows the OMPI layer to request that a set of processes be
> terminated - to more accurately support MPI_Abort semantics.
> 
> Does that answer your questions?
> 
> -- Josh
> 
> 
>> 
>>  george.
>> 
>> On Jun 9, 2011, at 16:32 , Josh Hursey wrote:
>> 
>>> WHAT: Fix missing code in MPI_Abort
>>> 
>>> WHY: MPI_Abort is missing logic to ask for termination of the process
>>> group defined by the communicator
>>> 
>>> WHERE: Mostly orte/mca/errmgr
>>> 
>>> WHEN: Open MPI trunk
>>> 
>>> TIMEOUT: Tuesday, June 14, 2011 (after teleconf)
>>> 
>>> Details:
>>> -------------------------------------------
>>> A bitbucket branch is available here (last sync to r24757 of trunk)
>>> https://bitbucket.org/jjhursey/ompi-abort/
>>> 
>>> In the MPI Standard (v2.2) Section 8.7 after the introduction of
>>> MPI_Abort, it states:
>>> "This routine makes a best attempt to abort all tasks in the group of comm."
>>> 
>>> Open MPI currently only calls orte_errmgr.abort() to abort the calling
>>> process itself. The code to ask for the abort of the other processes
>>> in the group defined by the communicator is commented out. Since one
>>> process calling abort currently causes all processes in the job to
>>> abort, it has not been a big deal. However as the group starts
>>> exploring better resilience in the OMPI layer (with further support
>>> from the ORTE layer) this aspect of MPI_Abort will become more
>>> necessary to get right.
>>> 
>>> This branch adds back the logic necessary for a single process calling
>>> MPI_Abort to request, from ORTE errmgr, that a defined subgroup of
>>> processes be aborted. Once the request is sent to the HNP, the local
>>> process then calls abort on itself. The HNP requests that the defined
>>> subgroup of processes be terminated using the existing plm mechanisms
>>> for doing so.
>>> 
>>> This change has no effect on the current default behavior of
>>> MPI_Abort as experienced by the user.
>>> 
>>> --
>>> Joshua Hursey
>>> Postdoctoral Research Associate
>>> Oak Ridge National Laboratory
>>> http://users.nccs.gov/~jjhursey
>>> _______________________________________________
>>> devel mailing list
>>> de...@open-mpi.org
>>> http://www.open-mpi.org/mailman/listinfo.cgi/devel
>> 
>> 
>> 
>> 
> 
> 
> 
> -- 
> Joshua Hursey
> Postdoctoral Research Associate
> Oak Ridge National Laboratory
> http://users.nccs.gov/~jjhursey
> 
