Let me get this straight - you are advocating that I call “exit” directly from 
within a library?? I thought that was “verboten” - MPI_Init should just return 
an error somehow, yes?
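
To make the two options in this thread concrete, here is a minimal sketch in C. It is not Open MPI code, and every name in it (transport_register_error, transport_report_failure, record_error, and so on) is hypothetical. It assumes a lower layer that holds an error callback which an upper layer registers later in startup, as described for the BTL and PML in the quoted messages below. If the callback is already registered, the failure goes through it. If it is not, the sketch prints a message and hands a status code back to the caller to propagate, rather than calling exit from inside the library.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical names throughout; none of this is the Open MPI API. */
typedef void (*error_cb_fn)(int errcode, const char *msg);

/* Which path a failure report took. */
typedef enum { REPORT_VIA_CALLBACK, REPORT_VIA_RETURN } report_path;

/* Callback slot that the upper layer fills in later during startup. */
static error_cb_fn registered_error_cb = NULL;

/* Last error code seen by the sample callback below. */
static int last_reported = 0;

/* Sample upper-layer handler: it only records the error code. */
void record_error(int errcode, const char *msg)
{
    (void)msg;
    last_reported = errcode;
}

/* Called by the upper layer once it is ready to receive error reports. */
void transport_register_error(error_cb_fn cb)
{
    registered_error_cb = cb;
}

/* Report a failure. Use the registered callback when there is one;
 * otherwise print a message and store the code in *status_out so the
 * caller can return it up the stack. */
report_path transport_report_failure(int errcode, const char *msg,
                                     int *status_out)
{
    assert(status_out != NULL);
    if (registered_error_cb != NULL) {
        registered_error_cb(errcode, msg);
        return REPORT_VIA_CALLBACK;
    }
    fprintf(stderr, "transport init failed: %s\n", msg);
    *status_out = errcode;
    return REPORT_VIA_RETURN;
}
```

Whether the real init path can carry such a status all the way back to MPI_Init is exactly the open question here; the sketch only shows the shape of the fallback.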

> On Dec 4, 2014, at 12:35 PM, Jeff Squyres (jsquyres) <jsquy...@cisco.com> 
> wrote:
> 
> Oh, good catch -- thanks.
> 
> I wouldn't call abort -- that will dump core.  Just show_help() and 
> exit(nonzero), I guess.
> 
> 
> On Dec 4, 2014, at 3:31 PM, George Bosilca <bosi...@icl.utk.edu> wrote:
> 
>> You can't use the PML error reporting mechanism in this particular instance: 
>> it is too early in the setup process (in the BTL component init function), 
>> and the PML has not set up the error callback yet.
>> 
>> This function is called during MPI_Init, at a time when most of the 
>> Open MPI infrastructure is not yet set up. I guess the safest way to force 
>> the process to fail is to call exit or maybe abort.
>> 
>> George.
>> 
>> 
>> 
>> On Fri, Dec 5, 2014 at 3:40 AM, Jeff Squyres (jsquyres) <jsquy...@cisco.com> 
>> wrote:
>> You're supposed to call the PML error handler, which was passed down to the 
>> BTL during initialization.
>> 
>> That is, the BTL registers a btl_register_error function with the PML.  The 
>> PML then calls this function and passes in its error handler function 
>> pointer.  The BTL can then use that error handler to tell the PML when an 
>> error occurs.
>> 
>> Right now, the only PML error handler aborts the job.  So this should be a 
>> sufficient mechanism.
>> 
>> 
>> On Dec 3, 2014, at 12:15 PM, Ralph Castain <r...@open-mpi.org> wrote:
>> 
>>> We talked during the telecon about the user-reported issue where they asked 
>>> for knem support, it wasn’t available on the system, but we ran anyway at a 
>>> reduced performance level. The agreement we had was that OMPI should 
>>> instead fail at that point since the user had requested something we could 
>>> not do. I got tasked with implementing this.
>>> 
>>> Here is the problem code:
>>> 
>>>   /* If "use_knem" is positive, then it's an error if knem support
>>>      is not available -- deactivate the sm btl. */
>>>   if (mca_btl_sm_component.use_knem > 0) {
>>>       opal_show_help("help-mpi-btl-sm.txt",
>>>                      "knem requested but not available",
>>>                      true, opal_process_info.nodename);
>>>       return NULL;
>>>   }
>>> 
>>> As you can see, we deactivate sm but do not necessarily fail. Question for 
>>> you folks: how do I cause us to safely fail from within a BTL??
>>> 
>>> Thanks
>>> Ralph
>>> 
>>> _______________________________________________
>>> devel mailing list
>>> de...@open-mpi.org
>>> Subscription: http://www.open-mpi.org/mailman/listinfo.cgi/devel
>>> Link to this post: 
>>> http://www.open-mpi.org/community/lists/devel/2014/12/16425.php
>> 
>> 
>> --
>> Jeff Squyres
>> jsquy...@cisco.com
>> For corporate legal information go to: 
>> http://www.cisco.com/web/about/doing_business/legal/cri/
>> 
> 
> 
