Re: [OMPI devel] Locality info

2011-10-19 Thread Ralph Castain
On Oct 19, 2011, at 5:05 PM, George Bosilca wrote: > Wonderful!!! We've been waiting for such functionality for a while. My pleasure :-) > > I do have some questions/remarks related to this patch. > > What is the my_node_rank in the orte_proc_info_t structure? The node rank is a local

Re: [OMPI devel] Locality info

2011-10-19 Thread George Bosilca
Wonderful!!! We've been waiting for such functionality for a while. I do have some questions/remarks related to this patch. What is the my_node_rank in the orte_proc_info_t structure? Is there any difference between using the field my_node_rank or the vpid part of the my_daemon? What is the

Re: [OMPI devel] [OMPI svn] svn:open-mpi r25323

2011-10-19 Thread George Bosilca
There are several OPAL level error codes not used in the current code. OPAL_ERR_TOPO_SLOT_LIST_NOT_SUPPORTED OPAL_ERR_TOPO_SOCKET_NOT_SUPPORTED OPAL_ERR_TOPO_CORE_NOT_SUPPORTED OPAL_ERR_NOT_ENOUGH_SOCKETS OPAL_ERR_NOT_ENOUGH_CORES OPAL_ERR_INVALID_PHYS_CPU OPAL_ERR_MULTIPLE_AFFINITIES If

Re: [OMPI devel] [OMPI svn] svn:open-mpi r25323

2011-10-19 Thread George Bosilca
A careful reading of the committed patch would have shown that none of the concerns raised so far were true: the "old-way" behavior of the OMPI code was preserved. Moreover, every single one of the removed error codes had not been used in ages. What Brian pointed out as evil, evil being a

[OMPI devel] RFC: upgrade to libevent 2.0.13 (removing 2.0.7)

2011-10-19 Thread Nathan Hjelm
WHAT: upgrade to libevent 2.0.13 WHY: libevent bug fixes WHEN: Nov 2, 2011 TIMEOUT: 2 weeks *** Jeff, Ralph, and I have been using the libevent2013 component for the last month without issue. In 2 weeks I will: - remove

Re: [OMPI devel] [OMPI svn] svn:open-mpi r25323

2011-10-19 Thread Barrett, Brian W
George - I wrote the error code gorp; I'm pretty sure I know exactly how it was supposed to work. There are 58 codes unused between OPAL_NETWORK_NOT_PARSEABLE and OPAL_ERR_MAX. I now see what you did with ERR_REQUEST, and it's evil. That's not the intent of the error code logic at all. If you

Re: [OMPI devel] [OMPI svn] svn:open-mpi r25323

2011-10-19 Thread Ralph Castain
On Oct 19, 2011, at 2:50 PM, George Bosilca wrote: > I don't know how you think that the error codes work in Open MPI, so I'll > take the liberty to depict it here so we all agree we're talking about the > same thing. > > The opal_strerror is a nice feature: it allows registering a range of

Re: [OMPI devel] [OMPI svn] svn:open-mpi r25323

2011-10-19 Thread George Bosilca
Can I have an example on how the current trunk is broken due to this change? Thanks, george. On Oct 19, 2011, at 16:32 , Ralph Castain wrote: > I propose that we retain the rest of the changeset, but revert the OMPI > constants to bring back their ORTE equivalents. We clearly should scrub

Re: [OMPI devel] [OMPI svn] svn:open-mpi r25323

2011-10-19 Thread George Bosilca
I don't know how you think that the error codes work in Open MPI, so I'll take the liberty to depict it here so we all agree we're talking about the same thing. The opal_strerror is a nice feature: it allows registering a range of error codes with a particular error converter. Every time you
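The range-plus-converter idea described in this message can be sketched roughly as follows. This is only an illustration of the general technique, not Open MPI's actual opal_strerror code: every name here (register_error_range, lookup_error, the two converter functions, and the numeric ranges) is hypothetical. The sketch deliberately refuses overlapping ranges, since a code that belongs to two ranges has no single converter to go to, which is the kind of ambiguity raised elsewhere in this thread.

```c
#include <stddef.h>

/* Hypothetical converter type: turns an error code into a message. */
typedef const char *(*err_converter_fn)(int errnum);

/* One registered range of error codes, [min, max] inclusive. */
typedef struct {
    const char *project;      /* label for the layer that owns the range */
    int min;
    int max;
    err_converter_fn convert; /* converter responsible for this range */
} err_range;

#define MAX_RANGES 8
static err_range ranges[MAX_RANGES];
static size_t num_ranges = 0;

/* Register [min, max] with a converter.
 * Returns 0 on success, -1 if the arguments are invalid, the table is
 * full, or the range overlaps one that is already registered. */
int register_error_range(const char *project, int min, int max,
                         err_converter_fn fn)
{
    if (min > max || fn == NULL || num_ranges == MAX_RANGES) {
        return -1;
    }
    for (size_t i = 0; i < num_ranges; i++) {
        if (min <= ranges[i].max && ranges[i].min <= max) {
            return -1; /* overlap: the code-to-converter mapping would be ambiguous */
        }
    }
    ranges[num_ranges].project = project;
    ranges[num_ranges].min = min;
    ranges[num_ranges].max = max;
    ranges[num_ranges].convert = fn;
    num_ranges++;
    return 0;
}

/* Find the range that owns errnum and let its converter answer. */
const char *lookup_error(int errnum)
{
    for (size_t i = 0; i < num_ranges; i++) {
        if (errnum >= ranges[i].min && errnum <= ranges[i].max) {
            return ranges[i].convert(errnum);
        }
    }
    return "unknown error";
}

/* Two hypothetical converters, one per layer. */
const char *base_layer_message(int errnum)
{
    return errnum == -1 ? "base: generic error" : "base: other error";
}

const char *upper_layer_message(int errnum)
{
    return errnum == -100 ? "upper: request error" : "upper: other error";
}
```

In this sketch, a base layer would register something like [-99, -1] and an upper layer [-199, -100]; a later attempt to register [-150, -50] is rejected because it intersects both. Whether the real code rejects, ignores, or silently shadows overlapping ranges is not stated in the snippet above, so treat that choice as an assumption of the example.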

Re: [OMPI devel] [OMPI svn] svn:open-mpi r25323

2011-10-19 Thread Ralph Castain
I propose that we retain the rest of the changeset, but revert the OMPI constants to bring back their ORTE equivalents. We clearly should scrub those and update them to ensure they are both used and current, but it seems to me we lose more than we gain by removing them. On Oct 19, 2011, at

Re: [OMPI devel] Locality info

2011-10-19 Thread Ralph Castain
Sorry - referenced the wrong commit. It was r25331 On Oct 19, 2011, at 2:28 PM, Ralph Castain wrote: > Hi folks > > For those of you who don't follow the commits... > > I just committed (r25323) an extension of the orte_ess.proc_get_locality > function that allows a process to get its

[OMPI devel] Locality info

2011-10-19 Thread Ralph Castain
Hi folks For those of you who don't follow the commits... I just committed (r25323) an extension of the orte_ess.proc_get_locality function that allows a process to get its relative resource usage with any other proc in the job. In other words, you can provide a process name to the function,

Re: [OMPI devel] make check fails for Intel 2011.6.233 (OpenMPI 1.4.3)

2011-10-19 Thread Larry Baker
I posted my findings about the bad version no. macros to the same thread that described the Intel V12.1 optimizer bug (http://software.intel.com/en-us/forums/showthread.php?t=87132 ). The response I got is: Posted By: Hubert Haberstock (Intel) __ The

Re: [OMPI devel] [OMPI svn] svn:open-mpi r25323

2011-10-19 Thread Jeff Squyres
Oy, yes, that is bad -- we cannot have overlapping ORTE and OMPI error codes. That seems like a very bad idea (in addition to the mixing of + and -). For one thing, that breaks opal_strerror(). That, in itself, seems like a dealbreaker. On Oct 19, 2011, at 1:51 PM, Barrett, Brian W wrote: >

Re: [OMPI devel] [OMPI svn] svn:open-mpi r25323

2011-10-19 Thread Barrett, Brian W
I actually think it's worse than that. An ORTE error code can now have the same value as an OMPI error code. OMPI_ERR_REQUEST and ORTE_ERR_RECV_LESS_THAN_POSTED now share the same integer return code. Or, they should, if George hadn't made a mistake (see below). The sharing of return codes

Re: [OMPI devel] make check fails for Intel 2011.6.233 (OpenMPI 1.4.3)

2011-10-19 Thread Jeff Squyres
Did this get reported to the Intel compiler support people? On Oct 19, 2011, at 8:24 AM, George Bosilca wrote: > Thanks Larry, > > Will forward this info upstream. > > george. > > On Oct 18, 2011, at 21:56 , Larry Baker wrote: > >> George, >> >> Thanks for the update. FYI, here's all

Re: [OMPI devel] [OMPI svn] svn:open-mpi r25323

2011-10-19 Thread Ralph Castain
I've been wrestling with something from this commit, and I'm unsure of the right answer. So please consider this a general design question for the community. This commit removes all the OMPI <-> ORTE equivalent constants - i.e., we used to declare OMPI-prefixed equivalents to every

Re: [OMPI devel] [OMPI svn-full] svn:open-mpi r25323

2011-10-19 Thread George Bosilca
Indeed, I removed some of the OMPI level error codes. As you can see in the patch, they were defined but never used. I don't think they were worth an RFC, as they are never used, not only in the trunk but also on 1.5 and 1.4. And I did check it because I was wondering why they existed in the first

Re: [OMPI devel] [OMPI svn-full] svn:open-mpi r25323

2011-10-19 Thread Jeff Squyres
George -- Did you actually remove some of the error codes? I think that should have been worthy of a (quick) RFC first, just to let people know who are working in non-trunk branches who might have been using them. On Oct 18, 2011, at 11:51 PM, bosi...@osl.iu.edu wrote: > Author: bosilca >

[OMPI devel] Removing error message

2011-10-19 Thread Jeff Squyres
George -- Can you put this back? I don't think the error message is meaningless. It's there because people typically copy-n-paste the error message to the users' list (or whatever their support channel is). That error message will mean something to an OMPI developer; (I'm guessing/assuming)

Re: [OMPI devel] [OMPI svn] svn:open-mpi r25323

2011-10-19 Thread Ralph Castain
It's not just my components, George - there are people with branches out there that have OMPI components and changes in them. If you are going to gripe when others make changes without warning, then you should abide by your own rules. :-) On Oct 19, 2011, at 8:16 AM, George Bosilca wrote: >

Re: [OMPI devel] [OMPI svn] svn:open-mpi r25323

2011-10-19 Thread George Bosilca
I ran an entire battery of tests on these without any issues. Moreover, it is an OMPI related thing, and these error messages were never used. Anyway, please let me know what exactly failed, I'll fix it asap. Thanks, george. On Oct 19, 2011, at 10:06 , Ralph Castain wrote: > If you are

Re: [OMPI devel] [OMPI svn] svn:open-mpi r25323

2011-10-19 Thread Ralph Castain
If you are going to make such sweeping changes, could you please provide a little warning as per our usual methods? This broke several things which can be repaired, but it would have been nice to know that we were going to make such a change. Thx On Oct 18, 2011, at 9:51 PM, bosi...@osl.iu.edu

Re: [OMPI devel] make check fails for Intel 2011.6.233 (OpenMPI 1.4.3)

2011-10-19 Thread George Bosilca
Thanks Larry, Will forward this info upstream. george. On Oct 18, 2011, at 21:56 , Larry Baker wrote: > George, > > Thanks for the update. FYI, here's all the version numbers reported by the > compiler releases I have installed: > >> [baker@hydra ~]$ module load compilers/intel/11.1.080