Indeed. Nice job yesterday, Abhishek. You did it better than my hwloc merge into the trunk! :-)
On May 18, 2010, at 9:20 AM, Josh Hursey wrote:

> Abhishek and Jeff,
>
> Awesome! Thanks for all your hard work maintaining and shepherding
> this branch into the trunk.
>
> -- Josh
>
> On May 17, 2010, at 9:20 PM, Abhishek Kulkarni wrote:
>
> >
> > On May 14, 2010, at 12:24 PM, Josh Hursey wrote:
> >
> >>
> >> On May 12, 2010, at 1:07 PM, Abhishek Kulkarni wrote:
> >>
> >>> Updated RFC (w/ discussed changes):
> >>>
> >>> ====================================================================
> >>> [RFC 2/2] merge the OPAL SOS development branch into trunk
> >>> ====================================================================
> >>>
> >>> WHAT: Merge the OPAL SOS development branch into the OMPI trunk.
> >>>
> >>> WHY: Bring over some of the work done to enhance error reporting
> >>> capabilities.
> >>>
> >>> WHERE: opal/util/ and a few changes in the ORTE notifier.
> >>>
> >>> TIMEOUT: May 17, Monday, COB.
> >>>
> >>> REFERENCE BRANCHES: http://bitbucket.org/jsquyres/opal-sos-fixed/
> >>>
> >>> ====================================================================
> >>>
> >>> BACKGROUND:
> >>>
> >>> The OPAL SOS framework tries to meet the following objectives:
> >>>
> >>> - Reduce the cascading error messages and the amount of code
> >>>   needed to print an error message.
> >>> - Build and aggregate stacks of encountered errors and associate
> >>>   related individual errors with each other.
> >>> - Allow registration of custom callbacks to intercept error events.
> >>>
> >>> The SOS system provides an interface to log events of varying
> >>> severities. These events are associated with an "encoded" error
> >>> code which can be used to refer to stacks of SOS events. When
> >>> logging events, they can also be transparently relayed to all the
> >>> activated notifier components.
> >>>
> >>> The SOS system is described in detail on this wiki page:
> >>>
> >>> http://svn.open-mpi.org/trac/ompi/wiki/ErrorMessages
> >>> https://svn.open-mpi.org/trac/ompi/attachment/wiki/ErrorMessages/OPAL_SOS.pdf
> >>>
> >>> CHANGES (since the last RFC):
> >>>
> >>> * Wrapped all hard-coded error-code checks (OMPI_ERR_* == ret)
> >>>   with OPAL_SOS_GET_ERR_CODE(ret). There were about 30-40 such
> >>>   checks each in the OMPI and ORTE layers and about 15 in the
> >>>   OPAL layer. Since OPAL_SUCCESS is preserved by SOS, also changed
> >>>   calls of the form (OPAL_SUCCESS != ret) to (OPAL_ERROR == ret).
> >>
> >> You mean the other way around, right?
> >> You changed code that previously looked like (OPAL_ERROR == ret) to
> >> (OPAL_SUCCESS != ret) where appropriate.
> >>
> >
> > Yes, thanks for the correction! This (and ORTE WDC) is all in trunk
> > now -- I've split the changes into smaller patches (see commits
> > r23155 - r23164) so that they are easier to sift through.
> >
> > Abhishek
> >
> >>>
> >>> * If the error is an SOS-encoded error, ORTE_ERROR_LOG decodes
> >>>   the error, prints out the error stack and frees the errors.
> >>>
> >>> ====================================================================
> >>>
> >>> On Mar 29, 2010, at 10:58 AM, Abhishek Kulkarni wrote:
> >>>
> >>>>
> >>>> ===================================================================
> >>>> [RFC 2/2]
> >>>> ===================================================================
> >>>>
> >>>> WHAT: Merge the OPAL SOS development branch into the OMPI trunk.
> >>>>
> >>>> WHY: Bring over some of the work done to enhance error reporting
> >>>> capabilities.
> >>>>
> >>>> WHERE: opal/util/ and a few changes in the ORTE notifier.
> >>>>
> >>>> TIMEOUT: April 6, Wednesday, COB.
> >>>>
> >>>> REFERENCE BRANCHES: http://bitbucket.org/jsquyres/opal-sos-fixed/
> >>>>
> >>>> ===================================================================
> >>>>
> >>>> BACKGROUND:
> >>>>
> >>>> The OPAL SOS framework tries to meet the following objectives:
> >>>>
> >>>> - Reduce the cascading error messages and the amount of code
> >>>>   needed to print an error message.
> >>>> - Build and aggregate stacks of encountered errors and associate
> >>>>   related individual errors with each other.
> >>>> - Allow registration of custom callbacks to intercept error events.
> >>>>
> >>>> The SOS system provides an interface to log events of varying
> >>>> severities. These events are associated with an "encoded" error
> >>>> code which can be used to refer to stacks of SOS events. When
> >>>> logging events, they can also be transparently relayed to all the
> >>>> activated notifier components.
> >>>>
> >>>> The SOS system is described in detail on this wiki page:
> >>>>
> >>>> http://svn.open-mpi.org/trac/ompi/wiki/ErrorMessages
> >>>>
> >>>> Feel free to comment and/or provide suggestions.
> >>>>
> >>>> ===================================================================
> >>>> _______________________________________________
> >>>> devel mailing list
> >>>> de...@open-mpi.org
> >>>> http://www.open-mpi.org/mailman/listinfo.cgi/devel
>

--
Jeff Squyres
jsquy...@cisco.com
For corporate legal information go to:
http://www.cisco.com/web/about/doing_business/legal/cri/
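
A note on the first CHANGES item in the quoted RFC: once a return value may be SOS-encoded, a bare comparison such as (OMPI_ERR_* == ret) can miss an error that carries extra encoded information, which is why the checks were wrapped with OPAL_SOS_GET_ERR_CODE(ret). The C sketch below illustrates the idea with a toy encoding. It is not the Open MPI implementation: every toy_* and TOY_* name, the bit layout, and the numeric values are assumptions made up for illustration.

```c
#include <assert.h>

/* Hypothetical native error codes: zero for success, negative for errors. */
enum { TOY_SUCCESS = 0, TOY_ERROR = -1, TOY_ERR_OUT_OF_RESOURCE = -2 };

/* Toy encoding (an assumption, not the real SOS layout): keep the magnitude
 * of the native code in the low 8 bits and pack an event index above it.
 * Success is left untouched, mirroring "OPAL_SUCCESS is preserved by SOS". */
int toy_sos_encode(int native, unsigned index)
{
    assert(native <= 0 && native > -256 && index < 1024);
    if (TOY_SUCCESS == native) {
        return TOY_SUCCESS;
    }
    return -(int)((index << 8) | (unsigned)(-native));
}

/* Stand-in for OPAL_SOS_GET_ERR_CODE: recover the native code from a value
 * that may or may not be encoded. Plain native codes map to themselves. */
int toy_sos_get_err_code(int ret)
{
    if (ret >= 0) {
        return ret;
    }
    return -(int)((unsigned)(-ret) & 0xFFu);
}

/* Old style: compares the raw value, so it misses encoded errors. */
int bare_check(int ret)
{
    return TOY_ERR_OUT_OF_RESOURCE == ret;
}

/* New style: decode first, then compare against the native code. */
int wrapped_check(int ret)
{
    return TOY_ERR_OUT_OF_RESOURCE == toy_sos_get_err_code(ret);
}

/* Generic failure test, before and after the correction in the thread. */
int failed_old(int ret) { return TOY_ERROR == ret; }
int failed_new(int ret) { return TOY_SUCCESS != ret; }
```

With this toy layout, toy_sos_encode(TOY_ERR_OUT_OF_RESOURCE, 3) no longer compares equal to TOY_ERR_OUT_OF_RESOURCE, so bare_check() misses it while wrapped_check() still matches. Likewise, failed_new() catches an encoded TOY_ERROR that failed_old() would let through, which is the direction of the change Josh points out in the thread.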