Let me get this straight - you are advocating that I call “exit” directly from within a library?? I thought that was “verboten” - MPI_Init should just return an error somehow, yes?
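
To make the question concrete, here is a rough sketch of the "return an error" alternative. Every name in it (sketch_init, sketch_component_init, struct sketch_component, the SKETCH_* codes) is made up for illustration and is not an Open MPI symbol. It only assumes that a component init can hand a failure code up to the top-level init instead of calling exit itself:

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical status codes, not Open MPI's real constants. */
enum { SKETCH_SUCCESS = 0, SKETCH_ERR_NOT_AVAILABLE = -1 };

/* Hypothetical stand-in for the component state in the quoted snippet. */
struct sketch_component {
    int  use_knem;        /* > 0 means the user explicitly asked for knem */
    bool knem_available;  /* what the system actually provides */
};

/* Component init: report a hard failure through the return code
 * instead of calling exit() from inside the library. */
static int sketch_component_init(const struct sketch_component *c,
                                 bool *enabled)
{
    *enabled = false;

    if (c->knem_available) {
        *enabled = true;
        return SKETCH_SUCCESS;
    }

    if (c->use_knem > 0) {
        /* The user requested something we cannot provide: fail loudly. */
        fprintf(stderr, "knem requested but not available\n");
        return SKETCH_ERR_NOT_AVAILABLE;
    }

    /* knem was not requested, so running without it is acceptable. */
    *enabled = true;
    return SKETCH_SUCCESS;
}

/* Top-level init: pass the component's error straight to the caller,
 * who then decides whether to terminate the process. */
static int sketch_init(const struct sketch_component *c)
{
    bool enabled = false;
    int rc = sketch_component_init(c, &enabled);

    if (rc != SKETCH_SUCCESS) {
        return rc;
    }
    return SKETCH_SUCCESS;
}
```

With this shape, the application that called the top-level init sees a nonzero return and can clean up and exit on its own terms. Whether the real init path can carry such a code up from the BTL is exactly what I am asking.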

> On Dec 4, 2014, at 12:35 PM, Jeff Squyres (jsquyres) <jsquy...@cisco.com> wrote:
>
> Oh, good catch -- thanks.
>
> I wouldn't call abort -- that will dump core. Just show_help() and
> exit(nonzero), I guess.
>
>
> On Dec 4, 2014, at 3:31 PM, George Bosilca <bosi...@icl.utk.edu> wrote:
>
>> You can't use the PML error reporting mechanism in this particular instance,
>> it is too early in the setup process (in the BTL component init function)
>> and the PML has not setup the error callback yet.
>>
>> This function is called during the MPI_Init, at a time where most of the
>> Open MPI infrastructure is not yet setup. I guess the safest way to force
>> the process to fail is to call exit or maybe abort.
>>
>> George.
>>
>>
>>
>> On Fri, Dec 5, 2014 at 3:40 AM, Jeff Squyres (jsquyres) <jsquy...@cisco.com> wrote:
>> You're supposed to call the PML error handler, which was passed down to the
>> BTL during initialization.
>>
>> That is, the BTL registers a btl_register_error function with the PML. The
>> PML then calls this function and passes in its error handler function
>> pointer. The BTL can then use that error handler to tell the PML when an
>> error occurs.
>>
>> Right now, the only PML error handler aborts the job. So this should be a
>> sufficient mechanism.
>>
>>
>> On Dec 3, 2014, at 12:15 PM, Ralph Castain <r...@open-mpi.org> wrote:
>>
>>> We talked during the telecon about the user-reported issue where they asked
>>> for knem support, it wasn’t available on the system, but we ran anyway at a
>>> reduced performance level. The agreement we had was that OMPI should
>>> instead fail at that point since the user had requested something we could
>>> not do. I got tasked with implementing this.
>>>
>>> Here is the problem code:
>>>
>>> /* If "use_knem" is positive, then it's an error if knem support
>>>    is not available -- deactivate the sm btl. */
>>> if (mca_btl_sm_component.use_knem > 0) {
>>>     opal_show_help("help-mpi-btl-sm.txt",
>>>                    "knem requested but not available",
>>>                    true, opal_process_info.nodename);
>>>     return NULL;
>>>
>>> As you can see, we deactivate sm but do not necessarily fail. Question for
>>> you folks: how do I cause us to safely fail from within a BTL??
>>>
>>> Thanks
>>> Ralph
>>>
>>> _______________________________________________
>>> devel mailing list
>>> de...@open-mpi.org
>>> Subscription: http://www.open-mpi.org/mailman/listinfo.cgi/devel
>>> Link to this post:
>>> http://www.open-mpi.org/community/lists/devel/2014/12/16425.php
>>
>>
>> --
>> Jeff Squyres
>> jsquy...@cisco.com
>> For corporate legal information go to:
>> http://www.cisco.com/web/about/doing_business/legal/cri/
>>
>> _______________________________________________
>> devel mailing list
>> de...@open-mpi.org
>> Subscription: http://www.open-mpi.org/mailman/listinfo.cgi/devel
>> Link to this post:
>> http://www.open-mpi.org/community/lists/devel/2014/12/16435.php
>>
>> _______________________________________________
>> devel mailing list
>> de...@open-mpi.org
>> Subscription: http://www.open-mpi.org/mailman/listinfo.cgi/devel
>> Link to this post:
>> http://www.open-mpi.org/community/lists/devel/2014/12/16436.php
>
>
> --
> Jeff Squyres
> jsquy...@cisco.com
> For corporate legal information go to:
> http://www.cisco.com/web/about/doing_business/legal/cri/
>
> _______________________________________________
> devel mailing list
> de...@open-mpi.org
> Subscription: http://www.open-mpi.org/mailman/listinfo.cgi/devel
> Link to this post:
> http://www.open-mpi.org/community/lists/devel/2014/12/16437.php