This may be me miscommunicating with Mike off-list. I had suggested he add this "feature" to help catch a rare race condition in his MTT runs. However, I had expected him to do it on his private branch, not commit it to the main repo.
I agree; I'm not sure what I think about it for the trunk, either. Hitting that warning is indicative of a bug in the code, but if someone hits that bug at scale, generating core files on every process can be really bad.

On Tue, Oct 7, 2014 at 5:54 AM, Jeff Squyres (jsquyres) <jsquy...@cisco.com> wrote:

> I'm not sure how I feel about this commit:
>
> 1. It blindly ignores the "return" statement. I.e., if the intent for
> this commit was to kill the process, that "return" statement should have
> been deleted, too.
>
> 2. We clearly decided a long time ago that removing an item from a list
> from which it does not belong is NOT a fatal error. This commit is a
> fundamental change in behavior that really should have been RFC'ed (e.g., I
> RFC'ed the calloc-vs-malloc idea last week).
>
> I'm not saying that this is a bad change in core behavior, but I would
> have appreciated a little heads-up and a chance to think about it before it
> was made (I'm still not sure what I think about this).
>
>
> On Oct 7, 2014, at 7:09 AM, <git...@crest.iu.edu> <git...@crest.iu.edu> wrote:
>
> > This is an automated email from the git hooks/post-receive script. It was
> > generated because a ref change was pushed to the repository containing
> > the project "open-mpi/ompi".
> >
> > The branch, master has been updated
> >        via  86f1d5af3ee484f34092ad3f7a645d9a5ccbcb6c (commit)
> >       from  cd48fbeec67f1a511a9cf5ce890fef6cc535ef60 (commit)
> >
> > Those revisions listed above that are new to this repository have
> > not appeared on any other notification email; so we list those
> > revisions in full, below.
> >
> > - Log -----------------------------------------------------------------
> > https://github.com/open-mpi/ompi/commit/86f1d5af3ee484f34092ad3f7a645d9a5ccbcb6c
> >
> > commit 86f1d5af3ee484f34092ad3f7a645d9a5ccbcb6c
> > Author: Mike Dubman <mi...@mellanox.com>
> > Date:   Tue Oct 7 14:07:41 2014 +0300
> >
> >     OPAL: drop dead with core on bad flow. rarely happens with helloworld on large scale.
> >
> > diff --git a/opal/class/opal_list.h b/opal/class/opal_list.h
> > index b66438e..bad4cbf 100644
> > --- a/opal/class/opal_list.h
> > +++ b/opal/class/opal_list.h
> > @@ -486,6 +486,7 @@ static inline opal_list_item_t *opal_list_remove_item
> >      if (!found) {
> >          fprintf(stderr," Warning :: opal_list_remove_item - the item %p is not on the list %p \n",(void*) item, (void*) list);
> >          fflush(stderr);
> > +        abort();
> >          return (opal_list_item_t *)NULL;
> >      }
> >
> > -----------------------------------------------------------------------
> >
> > Summary of changes:
> >  opal/class/opal_list.h | 1 +
> >  1 file changed, 1 insertion(+)
> >
> >
> > hooks/post-receive
> > --
> > open-mpi/ompi
> > _______________________________________________
> > ompi-commits mailing list
> > ompi-comm...@open-mpi.org
> > http://www.open-mpi.org/mailman/listinfo.cgi/ompi-commits
>
>
> --
> Jeff Squyres
> jsquy...@cisco.com
> For corporate legal information go to:
> http://www.cisco.com/web/about/doing_business/legal/cri/
>
> _______________________________________________
> devel mailing list
> de...@open-mpi.org
> Subscription: http://www.open-mpi.org/mailman/listinfo.cgi/devel
> Link to this post:
> http://www.open-mpi.org/community/lists/devel/2014/10/16019.php
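
To make the trade-off concrete, here is a rough sketch of one possible middle ground: keep the long-standing non-fatal behavior (warn and return NULL) as the default, and only drop a core when someone explicitly asks for it while chasing the race. This is not the OPAL code. The `demo_*` types and functions are hypothetical stand-ins for the real list class, and the `DEMO_LIST_ABORT` environment variable is an invented name used purely for illustration. Returning the predecessor on a successful removal is just the choice made in this sketch.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-ins for the real list types; NOT the OPAL definitions. */
typedef struct demo_item {
    struct demo_item *prev;
    struct demo_item *next;
} demo_item_t;

typedef struct demo_list {
    demo_item_t sentinel;   /* circular doubly linked list with a sentinel node */
    size_t length;
} demo_list_t;

static void demo_list_init(demo_list_t *list)
{
    list->sentinel.prev = &list->sentinel;
    list->sentinel.next = &list->sentinel;
    list->length = 0;
}

static void demo_list_append(demo_list_t *list, demo_item_t *item)
{
    item->prev = list->sentinel.prev;
    item->next = &list->sentinel;
    list->sentinel.prev->next = item;
    list->sentinel.prev = item;
    list->length++;
}

/* Opt-in switch: true only when the (invented) DEMO_LIST_ABORT variable is "1". */
static int demo_abort_requested(void)
{
    const char *value = getenv("DEMO_LIST_ABORT");
    return value != NULL && strcmp(value, "1") == 0;
}

/* Remove item from list. If the item is not on the list, warn and return NULL
 * (non-fatal by default); abort with a core only when explicitly requested. */
static demo_item_t *demo_list_remove_item(demo_list_t *list, demo_item_t *item)
{
    assert(list != NULL && item != NULL);

    int found = 0;
    for (demo_item_t *p = list->sentinel.next; p != &list->sentinel; p = p->next) {
        if (p == item) {
            found = 1;
            break;
        }
    }

    if (!found) {
        fprintf(stderr, "Warning: demo_list_remove_item - the item %p is not on the list %p\n",
                (void *) item, (void *) list);
        fflush(stderr);
        if (demo_abort_requested()) {
            abort();        /* debugging aid only; never reached by default */
        }
        return NULL;        /* default path stays live, so the return is not dead code */
    }

    /* Unlink the item and hand back its predecessor. */
    demo_item_t *prev = item->prev;
    prev->next = item->next;
    item->next->prev = prev;
    item->prev = NULL;
    item->next = NULL;
    list->length--;
    return prev;
}
```

With something along these lines, an MTT run hunting the race could set the variable on a small job to get a core at the point of failure, while a production job at scale that hits the same bug would still only see the warning. Whether an environment variable, an MCA parameter, or a debug-build-only check is the right switch is exactly the kind of thing an RFC would settle.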