Oy, yes, that is bad -- we cannot have overlapping ORTE and OMPI error codes. 
That seems like a very bad idea (in addition to the mixing of + and -).
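
To make the range concern concrete, here is a minimal sketch of per-layer error ranges that all grow downward and cannot overlap, assuming a C11 compiler. Every DEMO_ name and every numeric value is hypothetical, chosen for illustration only; none of them are the real Open MPI constants.

```c
#include <assert.h>   /* static_assert (C11) */

/* Hypothetical values for illustration; NOT the real Open MPI constants. */
enum {
    DEMO_OPAL_ERR_BASE = 0,
    DEMO_OPAL_ERR_MAX  = DEMO_OPAL_ERR_BASE - 100,

    DEMO_ORTE_ERR_BASE = DEMO_OPAL_ERR_MAX,
    DEMO_ORTE_ERR_RECV_LESS_THAN_POSTED = DEMO_ORTE_ERR_BASE - 1,
    DEMO_ORTE_ERR_MAX  = DEMO_ORTE_ERR_BASE - 100,

    DEMO_OMPI_ERR_BASE = DEMO_ORTE_ERR_MAX,
    DEMO_OMPI_ERR_REQUEST = DEMO_OMPI_ERR_BASE - 1,
    DEMO_OMPI_ERR_MAX  = DEMO_OMPI_ERR_BASE - 100
};

/* Every range grows downward, so each MAX must lie below its BASE. */
static_assert(DEMO_ORTE_ERR_MAX < DEMO_ORTE_ERR_BASE, "ORTE range must grow downward");
static_assert(DEMO_OMPI_ERR_MAX < DEMO_OMPI_ERR_BASE, "OMPI range must grow downward");
/* The OMPI range starts where the ORTE range ends, so no value is in both. */
static_assert(DEMO_OMPI_ERR_BASE <= DEMO_ORTE_ERR_MAX, "OMPI and ORTE ranges overlap");

typedef enum {
    DEMO_LAYER_NONE,
    DEMO_LAYER_OPAL,
    DEMO_LAYER_ORTE,
    DEMO_LAYER_OMPI
} demo_layer_t;

/* Map a return code to the layer whose half-open range [MAX, BASE) holds it. */
static demo_layer_t demo_layer_of(int code)
{
    if (code < DEMO_OPAL_ERR_BASE && code >= DEMO_OPAL_ERR_MAX) return DEMO_LAYER_OPAL;
    if (code < DEMO_ORTE_ERR_BASE && code >= DEMO_ORTE_ERR_MAX) return DEMO_LAYER_ORTE;
    if (code < DEMO_OMPI_ERR_BASE && code >= DEMO_OMPI_ERR_MAX) return DEMO_LAYER_OMPI;
    return DEMO_LAYER_NONE;
}
```

With a layout like this, a definition that adds to the base instead of subtracting from it would land inside the previous layer's range, which is the kind of mistake the compile-time checks are meant to catch.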

For one thing, that breaks opal_strerror().  That, in itself, seems like a 
dealbreaker.
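
As a rough illustration of why shared values are a problem for any code-to-string lookup, the sketch below assumes a lookup keyed only on the integer value. It is not how opal_strerror() is actually implemented; the DEMO_ names, the value -101, and the message strings are all made up for the example.

```c
#include <assert.h>   /* static_assert (C11) */
#include <stddef.h>   /* size_t, NULL */

/* Hypothetical colliding values, for illustration only. */
#define DEMO_ORTE_ERR_RECV_LESS_THAN_POSTED (-101)
#define DEMO_OMPI_ERR_REQUEST               (-101)

/* This demo depends on the two codes sharing one integer. */
static_assert(DEMO_ORTE_ERR_RECV_LESS_THAN_POSTED == DEMO_OMPI_ERR_REQUEST,
              "demo expects colliding codes");

struct demo_err_entry {
    int code;
    const char *msg;
};

static const struct demo_err_entry demo_table[] = {
    { DEMO_ORTE_ERR_RECV_LESS_THAN_POSTED, "ORTE: receive less than posted" },
    { DEMO_OMPI_ERR_REQUEST,               "OMPI: invalid request" },
};

/* Return the first message registered for code, or NULL if none matches. */
static const char *demo_strerror(int code)
{
    size_t i;
    for (i = 0; i < sizeof(demo_table) / sizeof(demo_table[0]); ++i) {
        if (demo_table[i].code == code) {
            return demo_table[i].msg;
        }
    }
    return NULL;
}
```

Because the lookup sees only the integer, asking for the OMPI code returns the ORTE message, and the second table entry can never be reached.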


On Oct 19, 2011, at 1:51 PM, Barrett, Brian W wrote:

> I actually think it's worse than that.  An ORTE error code can now have
> the same error code as an OMPI error.  OMPI_ERR_REQUEST and
> ORTE_ERR_RECV_LESS_THAN_POSTED now share the same integer return code.
> Or, they should, if George hadn't made a mistake (see below).  The sharing
> of return codes seems... bad.
> 
> Also, there's a bug in George's patch.  Error codes are all negative, so
> OMPI_ERR_REQUEST should be OMPI_ERR_BASE - 1 and OMPI_ERR_MAX should be
> OMPI_ERR_BASE - 1, not plus 2.
> 
> Brian
> 
> On 10/19/11 1:32 PM, "Ralph Castain" <r...@open-mpi.org> wrote:
> 
>> I've been wrestling with something from this commit, and I'm unsure of
>> the right answer. So please consider this a general design question for
>> the community.
>> 
>> This commit removes all the OMPI <-> ORTE equivalent constants - i.e., we
>> used to declare OMPI-prefixed equivalents to every ORTE-prefixed
>> constant. I understand the thinking (or at least, what I suspect was the
>> thought), but it creates an issue.
>> 
>> Suppose I have an ompi-level function (A) that calls another ompi-level
>> function (B). Invisible to A is that B calls an orte-level function. B
>> dutifully checks the error return from the orte-level function against an
>> ORTE-prefixed constant.
>> 
>> However, if that return isn't "success", what does B return up to A? It
>> cannot return the OMPI equivalent to the orte error constant because it
>> no longer exists. It could return the orte error code, but A has no way
>> of knowing it is going to get a non-OMPI constant, and therefore won't be
>> able to understand it - it will be an "unrecognized error".
>> 
>> I guess one option is to require that B "translate" the return code and
>> pass some OMPI error up the chain, but this prevents anything upwards
>> from understanding the nature of the problem and potentially taking
>> corrective and/or alternative action. Seems awfully limiting, as most of
>> the time the only option will be the vanilla "OMPI_ERROR".
>> 
>> Thoughts?
> -- 
>  Brian W. Barrett
>  Dept. 1423: Scalable System Software
>  Sandia National Laboratories
> 
> 
> 
> 
> 
> 
> _______________________________________________
> devel mailing list
> de...@open-mpi.org
> http://www.open-mpi.org/mailman/listinfo.cgi/devel


-- 
Jeff Squyres
jsquy...@cisco.com
For corporate legal information go to:
http://www.cisco.com/web/about/doing_business/legal/cri/

