On May 12, 2010, at 1:07 PM, Abhishek Kulkarni wrote:

> Updated RFC (w/ discussed changes):
>
> ======================================================================
> [RFC 2/2] merge the OPAL SOS development branch into trunk
> ======================================================================
>
> WHAT: Merge the OPAL SOS development branch into the OMPI trunk.
>
> WHY: Bring over some of the work done to enhance error reporting capabilities.
>
> WHERE: opal/util/ and a few changes in the ORTE notifier.
>
> TIMEOUT: May 17, Monday, COB.
>
> REFERENCE BRANCHES: http://bitbucket.org/jsquyres/opal-sos-fixed/
>
> ======================================================================
>
> BACKGROUND:
>
> The OPAL SOS framework tries to meet the following objectives:
>
> - Reduce the cascading error messages and the amount of code needed to
>   print an error message.
> - Build and aggregate stacks of encountered errors and associate
>   related individual errors with each other.
> - Allow registration of custom callbacks to intercept error events.
>
> The SOS system provides an interface to log events of varying
> severities. These events are associated with an "encoded" error code
> which can be used to refer to stacks of SOS events. When logging
> events, they can also be transparently relayed to all the activated
> notifier components.
>
> The SOS system is described in detail on this wiki page:
>
> http://svn.open-mpi.org/trac/ompi/wiki/ErrorMessages
> https://svn.open-mpi.org/trac/ompi/attachment/wiki/ErrorMessages/OPAL_SOS.pdf
>
> CHANGES (since the last RFC):
>
> * Wrapped all hard-coded error-code checks (OMPI_ERR_* == ret),
>   OPAL_SOS_GET_ERR_CODE(ret). There were about 30-40 such checks
>   each in the OMPI and ORTE layer and about 15 in the OPAL layer.
>   Since OPAL_SUCCESS is preserved by SOS, also changed calls of
>   the form (OPAL_SUCCESS != ret) to (OPAL_ERROR == ret).

You mean the other way around, right? You changed code that previously looked like (OPAL_ERROR == ret) to (OPAL_SUCCESS != ret) where appropriate.

>
> * If the error is an SOS-encoded error, ORTE_ERROR_LOG decodes
>   the error, prints out the error stack and frees the errors.
>
> ======================================================================
>
>
> On Mar 29, 2010, at 10:58 AM, Abhishek Kulkarni wrote:
>
>>
>> ======================================================================
>> [RFC 2/2]
>> ======================================================================
>>
>> WHAT: Merge the OPAL SOS development branch into the OMPI trunk.
>>
>> WHY: Bring over some of the work done to enhance error reporting
>> capabilities.
>>
>> WHERE: opal/util/ and a few changes in the ORTE notifier.
>>
>> TIMEOUT: April 6, Wednesday, COB.
>>
>> REFERENCE BRANCHES: http://bitbucket.org/jsquyres/opal-sos-fixed/
>>
>> ======================================================================
>>
>> BACKGROUND:
>>
>> The OPAL SOS framework tries to meet the following objectives:
>>
>> - Reduce the cascading error messages and the amount of code needed to
>>   print an error message.
>> - Build and aggregate stacks of encountered errors and associate
>>   related individual errors with each other.
>> - Allow registration of custom callbacks to intercept error events.
>>
>> The SOS system provides an interface to log events of varying
>> severities. These events are associated with an "encoded" error code
>> which can be used to refer to stacks of SOS events. When logging
>> events, they can also be transparently relayed to all the activated
>> notifier components.
>>
>> The SOS system is described in detail on this wiki page:
>>
>> http://svn.open-mpi.org/trac/ompi/wiki/ErrorMessages
>>
>> Feel free to comment and/or provide suggestions.
>>
>> ======================================================================
>> _______________________________________________
>> devel mailing list
>> de...@open-mpi.org
>> http://www.open-mpi.org/mailman/listinfo.cgi/devel
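
For anyone following the thread, here is a rough sketch of why the direction of that last change matters. It is not the SOS implementation: every SKETCH_* name, the numeric values, and the bit layout (native code in the low 8 bits of the magnitude, an event index above it) are assumptions made up for illustration. Only the idea comes from the RFC: an encoded value still carries the native error code, and success is left untouched.

```c
#include <assert.h>

/* Hypothetical stand-ins for illustration; not the real OPAL values. */
enum {
    SKETCH_SUCCESS             =  0,
    SKETCH_ERROR               = -1,
    SKETCH_ERR_OUT_OF_RESOURCE = -2
};

/* Assumed layout: the low 8 bits of the magnitude hold the native code,
 * the bits above them hold an index into a table of logged events. */
#define SKETCH_CODE_BITS 8
#define SKETCH_CODE_MASK ((1 << SKETCH_CODE_BITS) - 1)

/* Attach an event index (>= 1) to a native error code (<= 0). */
static int sketch_encode(int native, int event_index)
{
    if (SKETCH_SUCCESS == native) {
        return SKETCH_SUCCESS;          /* success is never encoded */
    }
    return -((event_index << SKETCH_CODE_BITS) | (-native));
}

/* Recover the native code from either a plain or an encoded value. */
static int sketch_get_err_code(int ret)
{
    if (ret >= 0) {
        return ret;
    }
    return -((-ret) & SKETCH_CODE_MASK);
}

static void sketch_demo(void)
{
    int ret = sketch_encode(SKETCH_ERR_OUT_OF_RESOURCE, 3);

    /* A hard-coded comparison no longer matches the encoded value... */
    assert(SKETCH_ERR_OUT_OF_RESOURCE != ret);
    /* ...but it does once the check is wrapped with the decoder. */
    assert(SKETCH_ERR_OUT_OF_RESOURCE == sketch_get_err_code(ret));

    /* Likewise (SKETCH_ERROR == ret) misses an encoded generic error,
     * while (SKETCH_SUCCESS != ret) still catches it. */
    ret = sketch_encode(SKETCH_ERROR, 1);
    assert(SKETCH_ERROR != ret);
    assert(SKETCH_SUCCESS != ret);

    /* Success passes through unchanged. */
    assert(SKETCH_SUCCESS == sketch_encode(SKETCH_SUCCESS, 7));
}
```

Under these assumptions, a test against a specific error needs the decoding step, whereas a test against success works on plain and encoded values alike, which is the reading of the change suggested in the reply above.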