On Oct 19, 2011, at 5:05 PM, George Bosilca wrote:
> Wonderful!!! We've been waiting for such functionality for a while.
My pleasure :-)
>
> I do have some questions/remarks related to this patch.
>
> What is the my_node_rank in the orte_proc_info_t structure?
The node rank is a local
Wonderful!!! We've been waiting for such functionality for a while.
I do have some questions/remarks related to this patch.
What is the my_node_rank in the orte_proc_info_t structure? Is there any
difference between using the field my_node_rank or the vpid part of the
my_daemon? What is the
There are several OPAL level error codes not used in the current code.
OPAL_ERR_TOPO_SLOT_LIST_NOT_SUPPORTED
OPAL_ERR_TOPO_SOCKET_NOT_SUPPORTED
OPAL_ERR_TOPO_CORE_NOT_SUPPORTED
OPAL_ERR_NOT_ENOUGH_SOCKETS
OPAL_ERR_NOT_ENOUGH_CORES
OPAL_ERR_INVALID_PHYS_CPU
OPAL_ERR_MULTIPLE_AFFINITIES
If
A careful reading of the committed patch would have shown that none of the
concerns raised so far were true: the "old-way" behavior of the OMPI code
was preserved. Moreover, none of the error codes removed had been used
in ages.
What Brian pointed out as evil, evil being a
WHAT: upgrade to libevent 2.0.13
WHY: libevent bug fixes
WHEN: Nov 2, 2011
TIMEOUT: 2 weeks
***
Jeff, Ralph, and I have been using the libevent2013 component for the last
month without issue. In 2 weeks I will:
- remove
George -
I wrote the error code gorp; I'm pretty sure I know exactly how it was
supposed to work.
There are 58 codes unused between OPAL_NETWORK_NOT_PARSEABLE and
OPAL_ERR_MAX. I now see what you did with ERR_REQUEST, and it's evil.
That's not the intent of the error code logic at all. If you
On Oct 19, 2011, at 2:50 PM, George Bosilca wrote:
> I don't know how you think that the error codes work in Open MPI, so I'll
> take the liberty to depict it here so we all agree we're talking about the
> same thing.
>
> opal_strerror is a nice feature: it allows registering a range of
Can I have an example of how the current trunk is broken due to this change?
Thanks,
george.
On Oct 19, 2011, at 16:32 , Ralph Castain wrote:
> I propose that we retain the rest of the changeset, but revert the OMPI
> constants to bring back their ORTE equivalents. We clearly should scrub
I don't know how you think that the error codes work in Open MPI, so I'll take
the liberty to depict it here so we all agree we're talking about the same
thing.
opal_strerror is a nice feature: it allows registering a range of error
codes with a particular error converter. Every time you
I propose that we retain the rest of the changeset, but revert the OMPI
constants to bring back their ORTE equivalents. We clearly should scrub those
and update them to ensure they are both used and current, but it seems to me we
lose more than we gain by removing them.
On Oct 19, 2011, at
Sorry - referenced the wrong commit. It was r25331
On Oct 19, 2011, at 2:28 PM, Ralph Castain wrote:
> Hi folks
>
> For those of you who don't follow the commits...
>
> I just committed (r25323) an extension of the orte_ess.proc_get_locality
> function that allows a process to get its
Hi folks
For those of you who don't follow the commits...
I just committed (r25323) an extension of the orte_ess.proc_get_locality
function that allows a process to get its relative resource usage with any
other proc in the job. In other words, you can provide a process name to the
function,
I posted my findings about the bad version number macros to the same
thread that described the Intel V12.1 optimizer bug
(http://software.intel.com/en-us/forums/showthread.php?t=87132).
The response I got is:
Posted By: Hubert Haberstock (Intel)
__
The
Oy, yes, that is bad -- we cannot have overlapping ORTE and OMPI error codes.
That seems like a very bad idea (in addition to the mixing of + and -).
For one thing, that breaks opal_strerror(). That, in itself, seems like a
dealbreaker.
On Oct 19, 2011, at 1:51 PM, Barrett, Brian W wrote:
>
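To illustrate the concern about overlapping error codes, here is a minimal sketch of a range-based error-string registry. It is not the actual opal_strerror implementation: every name in it (register_range, lookup_strerror, layer_a_strerror, layer_b_strerror) and every numeric value is hypothetical. It only assumes what the thread describes, namely that each layer registers a range of codes with its own converter. Under that assumption a lookup can pick one converter per code, so a code shared by two layers could only be reported by one of them, which is why this sketch rejects overlapping ranges at registration time.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical converter: returns a message, or NULL if the code is unknown. */
typedef const char *(*err_converter_fn)(int errnum);

/* A registered range of (negative) error codes: max < code <= base. */
struct err_range {
    int base;                   /* least negative code, inclusive */
    int max;                    /* most negative bound, exclusive */
    err_converter_fn converter; /* converter owning this range */
};

enum { MAX_RANGES = 8 };
static struct err_range ranges[MAX_RANGES];
static int num_ranges = 0;

/* Register a range; returns 0 on success, -1 if full or overlapping. */
int register_range(int base, int max, err_converter_fn fn)
{
    assert(base > max);
    if (num_ranges == MAX_RANGES) {
        return -1;
    }
    for (int i = 0; i < num_ranges; i++) {
        /* Two half-open ranges overlap if each starts above the other's end. */
        if (base > ranges[i].max && ranges[i].base > max) {
            return -1;
        }
    }
    ranges[num_ranges].base = base;
    ranges[num_ranges].max = max;
    ranges[num_ranges].converter = fn;
    num_ranges++;
    return 0;
}

/* Find the range that owns errnum and ask its converter for a message. */
const char *lookup_strerror(int errnum)
{
    for (int i = 0; i < num_ranges; i++) {
        if (errnum <= ranges[i].base && errnum > ranges[i].max) {
            const char *msg = ranges[i].converter(errnum);
            return msg != NULL ? msg : "Unknown error";
        }
    }
    return "Unknown error";
}

/* Two hypothetical layers that both claim the value -101. */
const char *layer_a_strerror(int errnum)
{
    return errnum == -101 ? "layer A: example error" : NULL;
}

const char *layer_b_strerror(int errnum)
{
    return errnum == -101 ? "layer B: example error" : NULL;
}
```

With these definitions, registering layer A for the range (-200, -100] succeeds, and a later attempt to register layer B for (-250, -150] is rejected because the two ranges share codes. If the second registration were allowed instead, lookup_strerror(-101) would still return only layer A's message.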
I actually think it's worse than that. An ORTE error code can now have
the same error code as an OMPI error. OMPI_ERR_REQUEST and
ORTE_ERR_RECV_LESS_THAN_POSTED now share the same integer return code.
Or, they should, if George hadn't made a mistake (see below). The sharing
of return codes
Did this get reported to the Intel compiler support people?
On Oct 19, 2011, at 8:24 AM, George Bosilca wrote:
> Thanks Larry,
>
> Will forward this info upstream.
>
> george.
>
> On Oct 18, 2011, at 21:56 , Larry Baker wrote:
>
>> George,
>>
>> Thanks for the update. FYI, here's all
I've been wrestling with something from this commit, and I'm unsure of the
right answer. So please consider this a general design question for the
community.
This commit removes all the OMPI <-> ORTE equivalent constants - i.e., we used
to declare OMPI-prefixed equivalents to every
Indeed, I removed some of the OMPI level error codes. As you can see in the
patch they were defined but never used.
I don't think they were worth an RFC, as they are never used, not only in the
trunk but also on 1.5 and 1.4. And I did check, because I was wondering why they
existed in the first
George --
Did you actually remove some of the error codes?
I think that should have been worthy of a (quick) RFC first, just to let people
who are working in non-trunk branches, and who might have been using them, know.
On Oct 18, 2011, at 11:51 PM, bosi...@osl.iu.edu wrote:
> Author: bosilca
>
George --
Can you put this back?
I don't think the error message is meaningless. It's there because people
typically copy-n-paste the error message to the users' list (or whatever their
support channel is). That error message will mean something to an OMPI
developer; (I'm guessing/assuming)
It's not just my components, George - there are people with branches out there
that have OMPI components and changes in them. If you are going to gripe when
others make changes without warning, then you should abide by your own rules.
:-)
On Oct 19, 2011, at 8:16 AM, George Bosilca wrote:
>
I ran an entire battery of tests on these without any issues. Moreover, it is an
OMPI-related thing, and these error messages were never used. Anyway, please
let me know what exactly failed, and I'll fix it asap.
Thanks,
george.
On Oct 19, 2011, at 10:06 , Ralph Castain wrote:
> If you are
If you are going to make such sweeping changes, could you please provide a
little warning as per our usual methods? This broke several things. They can be
repaired, but it would have been nice to know that we were going to make such a
change.
Thx
On Oct 18, 2011, at 9:51 PM, bosi...@osl.iu.edu
Thanks Larry,
Will forward this info upstream.
george.
On Oct 18, 2011, at 21:56 , Larry Baker wrote:
> George,
>
> Thanks for the update. FYI, here's all the version numbers reported by the
> compiler releases I have installed:
>
>> [baker@hydra ~]$ module load compilers/intel/11.1.080