You're supposed to call the PML error handler, which was passed down to the BTL during initialization.
That is, the BTL registers a btl_register_error function with the PML. The PML then calls this function and passes in its error handler function pointer. The BTL can then use that error handler to tell the PML when an error occurs.

Right now, the only PML error handler aborts the job. So this should be a sufficient mechanism.

On Dec 3, 2014, at 12:15 PM, Ralph Castain <r...@open-mpi.org> wrote:

> We talked during the telecon about the user-reported issue where they asked
> for knem support, it wasn’t available on the system, but we ran anyway at a
> reduced performance level. The agreement we had was that OMPI should instead
> fail at that point since the user had requested something we could not do. I
> got tasked with implementing this.
>
> Here is the problem code:
>
>     /* If "use_knem" is positive, then it's an error if knem support
>        is not available -- deactivate the sm btl. */
>     if (mca_btl_sm_component.use_knem > 0) {
>         opal_show_help("help-mpi-btl-sm.txt",
>                        "knem requested but not available",
>                        true, opal_process_info.nodename);
>         return NULL;
>
> As you can see, we deactivate sm but do not necessarily fail. Question for
> you folks: how do I cause us to safely fail from within a BTL??
>
> Thanks
> Ralph

-- 
Jeff Squyres
jsquy...@cisco.com
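
For illustration only, here is a minimal, self-contained sketch of the registration handoff described in the reply above. Every name in it (toy_btl, toy_error_cb_fn, toy_pml_add_btl, and so on) is a hypothetical stand-in and does not reproduce Open MPI's actual types or signatures; only the btl_register_error member name is taken from the message. The handler here just counts and logs so the sketch can be run, whereas the real PML handler aborts the job.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical, simplified stand-ins; NOT Open MPI's real types. */
struct toy_btl;

/* Error callback supplied by the upper layer (the "PML"). */
typedef void (*toy_error_cb_fn)(struct toy_btl *btl, int fatal,
                                const char *info);

struct toy_btl {
    const char *name;
    toy_error_cb_fn error_cb;   /* saved PML handler, NULL until registered */
    int (*btl_register_error)(struct toy_btl *btl, toy_error_cb_fn cb);
};

/* BTL side: remember the handler that the PML passes down. */
static int toy_sm_register_error(struct toy_btl *btl, toy_error_cb_fn cb)
{
    btl->error_cb = cb;
    return 0;
}

/* Counter used only so the sketch can be checked without aborting. */
static int toy_errors_seen = 0;

/* PML side: the error handler. Per the message, the real one aborts
 * the job; this stand-in only records and logs the error. */
static void toy_pml_error_cb(struct toy_btl *btl, int fatal,
                             const char *info)
{
    assert(btl != NULL);
    toy_errors_seen++;
    fprintf(stderr, "[%s] %s error: %s\n", btl->name,
            fatal ? "fatal" : "non-fatal", info);
}

/* PML side: during initialization, call the BTL's registration
 * function and hand over the error handler pointer. */
static int toy_pml_add_btl(struct toy_btl *btl)
{
    if (NULL == btl->btl_register_error) {
        return -1;              /* this BTL cannot accept a handler */
    }
    return btl->btl_register_error(btl, toy_pml_error_cb);
}

/* BTL side: report the failure upward instead of silently degrading.
 * Returns -1 if no handler has been registered yet. */
static int toy_sm_report_knem_missing(struct toy_btl *btl)
{
    if (NULL == btl->error_cb) {
        return -1;
    }
    btl->error_cb(btl, 1, "knem requested but not available");
    return 0;
}
```

In this sketch the report call fails with -1 until toy_pml_add_btl() has run, which makes the dependency explicit: the BTL can only escalate an error after the upper layer has handed it a handler.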