Re: [OMPI devel] [OMPI svn] svn:open-mpi r25323

George Bosilca Wed, 19 Oct 2011 16:54:51 -0400

Can I have an example of how the current trunk is broken due to this change?


Thanks,
  george.

On Oct 19, 2011, at 16:32 , Ralph Castain wrote:

> I propose that we retain the rest of the changeset, but revert the removal of 
> the OMPI constants so that each ORTE constant has its OMPI-prefixed equivalent 
> again. We clearly should scrub those and update them to ensure they are both 
> used and current, but it seems to me we lose more than we gain by removing 
> them.
> 
> 
> On Oct 19, 2011, at 12:09 PM, Jeff Squyres wrote:
> 
>> Oy, yes, that is bad -- we cannot have overlapping ORTE and OMPI error 
>> codes. That seems like a very bad idea (in addition to the mixing of + and 
>> -).
>> 
>> For one thing, that breaks opal_strerror().  That, in itself, seems like a 
>> dealbreaker.
>> 
>> 
>> On Oct 19, 2011, at 1:51 PM, Barrett, Brian W wrote:
>> 
>>> I actually think it's worse than that.  An ORTE error code can now have
>>> the same error code as an OMPI error.  OMPI_ERR_REQUEST and
>>> ORTE_ERR_RECV_LESS_THAN_POSTED now share the same integer return code.
>>> Or, they should, if George hadn't made a mistake (see below).  The sharing
>>> of return codes seems... bad.
>>> 
>>> Also, there's a bug in George's patch.  Error codes are all negative, so
>>> OMPI_ERR_REQUEST should be OMPI_ERR_BASE - 1 and OMPI_ERR_MAX should be
>>> OMPI_ERR_BASE - 1, not plus 2.
>>> 
>>> Brian
>>> 
>>> On 10/19/11 1:32 PM, "Ralph Castain" <r...@open-mpi.org> wrote:
>>> 
>>>> I've been wrestling with something from this commit, and I'm unsure of
>>>> the right answer. So please consider this a general design question for
>>>> the community.
>>>> 
>>>> This commit removes all the OMPI <-> ORTE equivalent constants - i.e., we
>>>> used to declare OMPI-prefixed equivalents to every ORTE-prefixed
>>>> constant. I understand the thinking (or at least, what I suspect was the
>>>> thought), but it creates an issue.
>>>> 
>>>> Suppose I have an ompi-level function (A) that calls another ompi-level
>>>> function (B). Invisible to A is that B calls an orte-level function. B
>>>> dutifully checks the error return from the orte-level function against an
>>>> ORTE-prefixed constant.
>>>> 
>>>> However, if that return isn't "success", what does B return up to A? It
>>>> cannot return the OMPI equivalent to the orte error constant because it
>>>> no longer exists. It could return the orte error code, but A has no way
>>>> of knowing it is going to get a non-OMPI constant, and therefore won't be
>>>> able to understand it - it will be an "unrecognized error".
>>>> 
>>>> I guess one option is to require that B "translate" the return code and
>>>> pass some OMPI error up the chain, but this prevents anything upwards
>>>> from understanding the nature of the problem and potentially taking
>>>> corrective and/or alternative action. Seems awfully limiting, as most of
>>>> the time the only option will be the vanilla "OMPI_ERROR".
>>>> 
>>>> Thoughts?
>>> -- 
>>> Brian W. Barrett
>>> Dept. 1423: Scalable System Software
>>> Sandia National Laboratories
>>> 
>>> _______________________________________________
>>> devel mailing list
>>> de...@open-mpi.org
>>> http://www.open-mpi.org/mailman/listinfo.cgi/devel
>> 
>> 
>> -- 
>> Jeff Squyres
>> jsquy...@cisco.com
>> For corporate legal information go to:
>> http://www.cisco.com/web/about/doing_business/legal/cri/
>> 
