Re: [OMPI devel] [devel-core] [RFC] Exit without finalize

Aurelien Bouteiller Sat, 8 Sep 2007 14:33:32 -0400


On 6 Sept. 07, at 09:27, Terry D. Dontje wrote:

Gleb Natapov wrote:
On Thu, Sep 06, 2007 at 06:50:43AM -0600, Ralph H Castain wrote:
WHAT:   Decide upon how to handle MPI applications where one or more
       processes exit without calling MPI_Finalize

WHY:    Some applications can abort via an exit call instead of
       calling MPI_Abort when a library (or something else) calls
       exit. This situation is outside a user's control, so they
       cannot fix it.

WHERE:  Refer to ticket #1144 - code changes are TBD

WHEN:   Up to the group
[snip]
Does the general community feel we should do anything here, or is this a
"bug" that should be fixed by the entity calling "exit"? I should note that
it actually is bad behavior (IMHO) for any library to call "exit" - but
then, we do that in some situations too, so perhaps we shouldn't cast
stones!
Any suggested solutions or comments on whether or not we should do anything
would be appreciated.
IMO (a) should be implemented.
I don't think (b) should be implemented. However, one could register an
atexit handler that calls MPI_Finalize. Therefore, the exiting process
would be stuck until everyone else reaches their exits or finalize.
That being said I think (a) probably makes more sense and adheres to the
MPI standard.
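
For illustration, the atexit idea above could look roughly like the sketch below. It is not Open MPI code: runtime_finalize, runtime_finalized, finalize_at_exit and install_exit_hook are hypothetical names, and the stand-in finalize only sets a flag. In a real MPI program the hook would presumably check MPI_Finalized() and then call MPI_Finalize(), which is where the exiting process would block until its peers finalize.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical stand-ins for the MPI library state. In a real program
 * MPI_Finalized() and MPI_Finalize() would play these roles. */
bool runtime_finalized = false;
int finalize_calls = 0;

/* Stand-in for MPI_Finalize(): the real call would block until all
 * peers have finalized as well. */
void runtime_finalize(void)
{
    assert(!runtime_finalized);  /* finalizing twice is an error */
    runtime_finalized = true;
    finalize_calls++;
}

/* Exit hook: finalize only if the application has not already done so. */
void finalize_at_exit(void)
{
    if (!runtime_finalized) {
        runtime_finalize();
    }
}

/* Register the hook once, e.g. from the init path.
 * Returns 0 on success, as atexit() does. */
int install_exit_hook(void)
{
    return atexit(finalize_at_exit);
}
```

The guard matters because a well-behaved application finalizes explicitly before it exits, so the hook must be a no-op in that case.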

I agree (b) is not a good idea. However, I am not very pleased with (a) either. It totally prevents any process fault-tolerance mechanism if we go that way. If we plan to add a failure detection mechanism to the RTE, plus failure management (to keep Finalize from hanging), we should add the ability to plug in FT-specific error handlers. The default error handler should do exactly what Ralph proposes, but nowhere else (than in this handler) should the RTE code assume that the application is aborting when a failure occurs. An FT application might just not abort, and recover instead.
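
A minimal sketch of what such a pluggable handler could look like, to make the idea concrete. Every name here (ft_action_t, ft_failure_handler_t, ft_set_failure_handler, ft_report_failure, and the two handlers) is hypothetical and not an existing Open MPI or RTE interface. The point is only that the abort decision lives in the default handler, and the rest of the runtime asks the installed handler instead of assuming an abort.

```c
#include <assert.h>
#include <stddef.h>

/* What the runtime should do once a handler has seen a peer failure. */
typedef enum { FT_ACTION_ABORT_JOB, FT_ACTION_CONTINUE } ft_action_t;

/* Hypothetical handler signature: receives the rank of the failed process. */
typedef ft_action_t (*ft_failure_handler_t)(int failed_rank);

/* Default policy: treat any failure as fatal for the whole job. */
ft_action_t ft_default_handler(int failed_rank)
{
    (void)failed_rank;
    return FT_ACTION_ABORT_JOB;
}

static ft_failure_handler_t current_handler = ft_default_handler;

/* Plug in an FT-specific handler; returns the previously installed one. */
ft_failure_handler_t ft_set_failure_handler(ft_failure_handler_t handler)
{
    ft_failure_handler_t previous = current_handler;
    assert(handler != NULL);
    current_handler = handler;
    return previous;
}

/* The only place where the runtime decides what a failure means. */
ft_action_t ft_report_failure(int failed_rank)
{
    return current_handler(failed_rank);
}

/* Example FT handler: record the failure and keep the job running. */
int ft_last_failed_rank = -1;

ft_action_t ft_recovering_handler(int failed_rank)
{
    ft_last_failed_rank = failed_rank;
    return FT_ACTION_CONTINUE;
}
```

With nothing installed, ft_report_failure returns FT_ACTION_ABORT_JOB. After ft_set_failure_handler(ft_recovering_handler), the same call returns FT_ACTION_CONTINUE and the failed rank is left in ft_last_failed_rank for the application to act on.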


Aurelien

--td

_______________________________________________
devel mailing list
de...@open-mpi.org
http://www.open-mpi.org/mailman/listinfo.cgi/devel
