Brian: you were the one that had an allergic reaction to #1 on the call.

Thoughts?


On Nov 2, 2011, at 1:23 PM, George Bosilca wrote:

> As it has been said, this is not something supposed to make it in a release. 
> On the unfortunate case where it does, always having a show_help will ensure 
> a quick complaint on one of our mailing lists and increase the probability of 
> a [very] quick fix.
> 
>   george.
> 
> On Nov 2, 2011, at 06:26 , TERRY DONTJE wrote:
> 
>> 
>> 
>> On 11/1/2011 7:48 PM, Jeff Squyres wrote:
>>> So this was slightly different than the opinion that was discussed on the 
>>> call today, which was 2.  The rationale for #2 was to punish developers, 
>>> but if such a bug did make it through to production, users wouldn't be 
>>> annoyed with show_help messages all the time.
>>> 
>>> Does anyone have strong opinions here?  I don't.
>>> 
>>> I offer the following two points:
>>> 
>>> - this is a coding error on the OMPI developer
>>> - it's pretty rare
>>> 
>>> 
>> I think a show_help + return is very helpful in this case.  I wouldn't think 
>> that we'd run into this case that much and it would seem that it would be a 
>> rare occurance that one could just fix when they run into it.  However, 
>> since there was some opposition to having show_help messages possibly coming 
>> up all over the place I     thought a fall back of only doing the show_help 
>> on enable_debug builds was a reasonable middle ground.
>> 
>> --td
>>> On Nov 1, 2011, at 7:30 PM, George Bosilca wrote:
>>> 
>>> 
>>>> 1
>>>> 
>>>>  george.
>>>> 
>>>> On Nov 1, 2011, at 17:23 , Jeff Squyres wrote:
>>>> 
>>>> 
>>>>> Can you clarify -- I can parse your text multiple ways.  Which are you 
>>>>> voting for?
>>>>> 
>>>>> 1. show_help + return error code in all cases.
>>>>> 2. if OPAL_ENABLE_DEBUG, show_help + exit(1), else silently return error 
>>>>> code.
>>>>> 3. show_help.  if OPAL_ENABLE_DEBUG, exit(1), else return error code.
>>>>> 
>>>>> 
>>>>> 
>>>>> On Nov 1, 2011, at 4:50 PM, George Bosilca wrote:
>>>>> 
>>>>> 
>>>>>> This is a much saner solution. We [mostly] stayed away from calling exit 
>>>>>> deep into our libraries, there is no reason to add it now. I'll vote in 
>>>>>> favor of show_help + return code.
>>>>>> 
>>>>>> george.
>>>>>> 
>>>>>> On Nov 1, 2011, at 15:14 , Jeff Squyres wrote:
>>>>>> 
>>>>>> 
>>>>>>> We talked about this on the call today.
>>>>>>> 
>>>>>>> A good suggestion was made: call show_help/opal_finalize/exit only when 
>>>>>>> OPAL_ENABLE_DEBUG is true.  Otherwise, return an error code.
>>>>>>> 
>>>>>>> If no one objects to this, I'll commit this tomorrow.
>>>>>>> 
>>>>>>> 
>>>>>>> 
>>>>>>> On Oct 31, 2011, at 4:16 PM, Jeff Squyres wrote:
>>>>>>> 
>>>>>>> 
>>>>>>>> WHAT: what to do if registering an MCA param results in an error?
>>>>>>>> 
>>>>>>>> WHERE: opal/mca/base/mca_base_param.c
>>>>>>>> 
>>>>>>>> WHY: MCA param re-registration issues should be treated as OMPI 
>>>>>>>> developer errors
>>>>>>>> 
>>>>>>>> WHEN: COB Friday, 4 Nov 2011
>>>>>>>> 
>>>>>>>> -----------------
>>>>>>>> 
>>>>>>>> Short version: 
>>>>>>>> 
>>>>>>>> Re-registering an MCA param to be a different type (e.g., it was 
>>>>>>>> initially registered to be a string, but was later re-registered to be 
>>>>>>>> an int) should be treated as an OMPI developer error, and should 
>>>>>>>> opal_finalize()/exit(1).
>>>>>>>> 
>>>>>>>> More details:
>>>>>>>> 
>>>>>>>> A mistaken MCA param re-registration recently caused an orted segv.
>>>>>>>> 
>>>>>>>> The MCA param subsystem was fixed to avoid this segv, but silently 
>>>>>>>> convert the MCA param to the newly-registered type.  Upon reflection 
>>>>>>>> and some discussion, this seems to be a bad idea.  Instead, we should 
>>>>>>>> loudly complain via a show_help message and then exit(1).
>>>>>>>> 
>>>>>>>> Specifically: this kind of behavior is clearly an error and should be 
>>>>>>>> fixed.  Unfortunately, in most cases, we don't actually check the 
>>>>>>>> return value from MCA param registration functions, so if we change 
>>>>>>>> the MCA param function to simply return a non OPAL_SUCCESS status, 
>>>>>>>> it's unlikely that anyone will notice until some code tries to read 
>>>>>>>> the param value, likely still resulting in a segv.
>>>>>>>> 
>>>>>>>> Does anyone have heartburn if I change the error behavior to 
>>>>>>>> opal_finalize()/exit(1)?
>>>>>>>> 
>>>>>>>> -- 
>>>>>>>> Jeff Squyres
>>>>>>>> 
>>>>>>>> jsquy...@cisco.com
>>>>>>>> 
>>>>>>>> For corporate legal information go to:
>>>>>>>> 
>>>>>>>> http://www.cisco.com/web/about/doing_business/legal/cri/
>>>>>>>> 
>>>>>>>> 
>>>>>>>> 
>>>>>>>> _______________________________________________
>>>>>>>> devel mailing list
>>>>>>>> 
>>>>>>>> de...@open-mpi.org
>>>>>>>> http://www.open-mpi.org/mailman/listinfo.cgi/devel
>>>>>>> -- 
>>>>>>> Jeff Squyres
>>>>>>> 
>>>>>>> jsquy...@cisco.com
>>>>>>> 
>>>>>>> For corporate legal information go to:
>>>>>>> 
>>>>>>> http://www.cisco.com/web/about/doing_business/legal/cri/
>>>>>>> 
>>>>>>> 
>>>>>>> 
>>>>>>> _______________________________________________
>>>>>>> devel mailing list
>>>>>>> 
>>>>>>> de...@open-mpi.org
>>>>>>> http://www.open-mpi.org/mailman/listinfo.cgi/devel
>>>>>> _______________________________________________
>>>>>> devel mailing list
>>>>>> 
>>>>>> de...@open-mpi.org
>>>>>> http://www.open-mpi.org/mailman/listinfo.cgi/devel
>>>>> -- 
>>>>> Jeff Squyres
>>>>> 
>>>>> jsquy...@cisco.com
>>>>> 
>>>>> For corporate legal information go to:
>>>>> 
>>>>> http://www.cisco.com/web/about/doing_business/legal/cri/
>>>>> 
>>>>> 
>>>>> 
>>>>> _______________________________________________
>>>>> devel mailing list
>>>>> 
>>>>> de...@open-mpi.org
>>>>> http://www.open-mpi.org/mailman/listinfo.cgi/devel
>>>> _______________________________________________
>>>> devel mailing list
>>>> 
>>>> de...@open-mpi.org
>>>> http://www.open-mpi.org/mailman/listinfo.cgi/devel
>> 
>> -- 
>> <Mail Attachment.gif>
>> Terry D. Dontje | Principal Software Engineer
>> Developer Tools Engineering | +1.781.442.2631
>> Oracle - Performance Technologies
>> 95 Network Drive, Burlington, MA 01803
>> Email terry.don...@oracle.com
>> 
>> 
>> 
>> _______________________________________________
>> devel mailing list
>> de...@open-mpi.org
>> http://www.open-mpi.org/mailman/listinfo.cgi/devel
> 
> _______________________________________________
> devel mailing list
> de...@open-mpi.org
> http://www.open-mpi.org/mailman/listinfo.cgi/devel


-- 
Jeff Squyres
jsquy...@cisco.com
For corporate legal information go to:
http://www.cisco.com/web/about/doing_business/legal/cri/


Reply via email to