from:"Terry . Dontje"

Re: [OMPI devel] Multirail + Open MPI 1.6.1 = very big latency for the first communication

2012-11-01 Thread TERRY DONTJE

IIRC, the first 16 or so messages over the openib btl use the send/recv 
API as opposed to rdma, which is significantly faster.  I am not sure 
how 1.5.3 and multi-rail affect this, but I believe preconnecting 
short-circuits the cut-over to rdma for eager messages.

--td

On 10/31/2012 3:36 PM, Paul Kapinos wrote:

Hello all,

Open MPI is clever and uses multiple IB adapters by default, if available.
http://www.open-mpi.org/faq/?category=openfabrics#ofa-port-wireup

Open MPI is lazy and establishes connections only if needed.

Both are good.

We have kinda special nodes: up to 16 sockets, 128 cores, 4 boards, 4 
IB cards. Multirail works!

The crucial thing is that, starting with v1.6.1, the very first 
PingPong sample between two nodes takes a really long time, some 
100x - 200x the usual latency. You cannot see this using the usual 
latency benchmarks(*) because they tend to omit the first samples as a 
"warmup phase", but we use a kinda self-written parallel test which 
clearly shows this (and left me musing for some days).
If Multirail is forbidden (-mca btl_openib_max_btls 1), or if v1.5.3 is 
used, or if the MPI processes are preconnected 
(http://www.open-mpi.org/faq/?category=running#mpi-preconnect), there 
are no such huge latency outliers for the first sample.

Well, we know about the warm-up and lazy connections.

But 200x ?!

Any comments on whether that is OK?

Best,

Paul Kapinos

(*) E.g. HPCC explicitly says in 
http://icl.cs.utk.edu/hpcc/faq/index.html#132
> Additional startup latencies are masked out by starting the measurement after
> one non-measured ping-pong.

P.S. Sorry for cross-posting to both Users and Developers, but my last 
questions to Users have had no reply yet, so I am trying to broadcast...
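
One way to make the first-sample outlier described above visible is to report the very first ping-pong time separately instead of discarding it as warmup. The following is a minimal sketch in plain C, not the test program mentioned in the message; the function names are hypothetical, and the timings are assumed to have been collected already (for example with MPI_Wtime around each ping-pong).

```c
#include <assert.h>
#include <stddef.h>

/* Mean of samples[first..n-1]; the caller guarantees first < n. */
static double mean_from(const double *samples, size_t n, size_t first)
{
    double sum = 0.0;
    assert(samples != NULL && first < n);
    for (size_t i = first; i < n; i++) {
        sum += samples[i];
    }
    return sum / (double)(n - first);
}

/* Ratio of the very first ping-pong sample to the mean of the
 * remaining ("warmed-up") samples. Needs at least two samples.
 * A value near 1 means no startup penalty; a value in the hundreds
 * corresponds to the 100x - 200x outlier reported above. */
double first_sample_ratio(const double *samples, size_t n)
{
    assert(n >= 2);
    return samples[0] / mean_from(samples, n, 1);
}
```

Calling first_sample_ratio() on the per-iteration timings of each node pair shows directly whether the first exchange is the outlier, which a benchmark that drops its warmup iterations cannot show.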

___
devel mailing list
de...@open-mpi.org
http://www.open-mpi.org/mailman/listinfo.cgi/devel

Re: [OMPI devel] [OMPI svn] svn:open-mpi r26868 - in trunk/orte/mca/plm: base rsh

2012-07-26 Thread TERRY DONTJE

Interestingly enough it worked for me for a while and then after many 
runs I started seeing the below too.


--td

On 7/26/2012 11:07 AM, Ralph Castain wrote:

Hmmm...it was working for me, but I'll recheck. Thanks!

On Jul 26, 2012, at 8:04 AM, George Bosilca wrote:


r26868 seems to have some issues. It works well as long as all processes are 
started on the same node (aka. there is a single daemon), but it breaks with 
the error message attached below if there are more than two daemons.

$ mpirun -np 2 --bynode ./runme
[node01:07767] [[21341,0],1] ORTE_ERROR_LOG: A message is attempting to be sent to a process whose contact information is unknown in file ../../../../../ompi/orte/mca/rml/oob/rml_oob_send.c at line 362
[node01:07767] [[21341,0],1] attempted to send to [[21341,0],2]: tag 15
[node01:07767] [[21341,0],1] ORTE_ERROR_LOG: A message is attempting to be sent to a process whose contact information is unknown in file ../../../../ompi/orte/mca/grpcomm/base/grpcomm_base_xcast.c at line 157

I confirm that reverting the commit brings the trunk back to a normal state.

Please - a tad more care in what gets committed??

  george.


On Jul 25, 2012, at 23:46 , svn-commit-mai...@open-mpi.org wrote:


Author: rhc (Ralph Castain)
Date: 2012-07-25 17:46:45 EDT (Wed, 25 Jul 2012)
New Revision: 26868
URL: https://svn.open-mpi.org/trac/ompi/changeset/26868

Log:
Reconnect the rsh/ssh error reporting code for remote spawns to report failure 
to launch. Ensure the HNP correctly reports non-zero exit status when ssh 
encounters a problem.

Thanks to Terry for spotting it!

Text files modified:
  trunk/orte/mca/plm/base/plm_base_launch_support.c |    44
  trunk/orte/mca/plm/base/plm_base_receive.c        |     6 +
  trunk/orte/mca/plm/base/plm_private.h             |     4 +++
  trunk/orte/mca/plm/rsh/plm_rsh_module.c           |    18 +++-
  4 files changed, 62 insertions(+), 10 deletions(-)

Modified: trunk/orte/mca/plm/base/plm_base_launch_support.c
==
--- trunk/orte/mca/plm/base/plm_base_launch_support.c   Wed Jul 25 12:32:51 2012 (r26867)
+++ trunk/orte/mca/plm/base/plm_base_launch_support.c   2012-07-25 17:46:45 EDT (Wed, 25 Jul 2012)  (r26868)
@@ -741,6 +741,50 @@

 }

+void orte_plm_base_daemon_failed(int st, orte_process_name_t* sender,
+                                 opal_buffer_t *buffer,
+                                 orte_rml_tag_t tag, void *cbdata)
+{
+    int status, rc;
+    int32_t n;
+    orte_vpid_t vpid;
+    orte_proc_t *daemon;
+
+    /* get the daemon job, if necessary */
+    if (NULL == jdatorted) {
+        jdatorted = orte_get_job_data_object(ORTE_PROC_MY_NAME->jobid);
+    }
+
+    /* unpack the daemon that failed */
+    n=1;
+    if (OPAL_SUCCESS != (rc = opal_dss.unpack(buffer, &vpid, &n, ORTE_VPID))) {
+        ORTE_ERROR_LOG(rc);
+        ORTE_UPDATE_EXIT_STATUS(ORTE_ERROR_DEFAULT_EXIT_CODE);
+        goto finish;
+    }
+
+    /* unpack the exit status */
+    n=1;
+    if (OPAL_SUCCESS != (rc = opal_dss.unpack(buffer, &status, &n, OPAL_INT))) {
+        ORTE_ERROR_LOG(rc);
+        status = ORTE_ERROR_DEFAULT_EXIT_CODE;
+        ORTE_UPDATE_EXIT_STATUS(ORTE_ERROR_DEFAULT_EXIT_CODE);
+    } else {
+        ORTE_UPDATE_EXIT_STATUS(WEXITSTATUS(status));
+    }
+
+    /* find the daemon and update its state/status */
+    if (NULL == (daemon = (orte_proc_t*)opal_pointer_array_get_item(jdatorted->procs, vpid))) {
+        ORTE_ERROR_LOG(ORTE_ERR_NOT_FOUND);
+        goto finish;
+    }
+    daemon->state = ORTE_PROC_STATE_FAILED_TO_START;
+    daemon->exit_code = status;
+
+ finish:
+    ORTE_ACTIVATE_PROC_STATE(&daemon->name, ORTE_PROC_STATE_FAILED_TO_START);
+}
+
 int orte_plm_base_setup_orted_cmd(int *argc, char ***argv)
 {
     int i, loc;

Modified: trunk/orte/mca/plm/base/plm_base_receive.c
==
--- trunk/orte/mca/plm/base/plm_base_receive.c  Wed Jul 25 12:32:51 2012 (r26867)
+++ trunk/orte/mca/plm/base/plm_base_receive.c  2012-07-25 17:46:45 EDT (Wed, 25 Jul 2012)  (r26868)
@@ -87,6 +87,12 @@
                                                           orte_plm_base_daemon_callback, NULL))) {
             ORTE_ERROR_LOG(rc);
         }
+        if (ORTE_SUCCESS != (rc = orte_rml.recv_buffer_nb(ORTE_NAME_WILDCARD,
+                                                          ORTE_RML_TAG_REPORT_REMOTE_LAUNCH,
+                                                          ORTE_RML_PERSISTENT,
+                                                          orte_plm_base_daemon_failed, NULL))) {
+            ORTE_ERROR_LOG(rc);
+        }
     }
     recv_issued = true;


Modified: trunk/orte/mca/plm/base/plm_private.h
==
--- trunk/orte/mca/plm/base/plm_private.h   Wed Jul 25 12:32:51 2012

Re: [hwloc-devel] hwloc v1.5rc1 coming soon

2012-07-10 Thread TERRY DONTJE


Is this also going to include the topology_solaris.c improvements?

--td

On 7/10/2012 3:16 PM, Brice Goglin wrote:

I am preparing a v1.5rc1 release, so here's the current status in case
somebody wants to comment.


Major changes for v1.5:

* instruction caches
* lstopo becomes lstopo + lstopo-no-graphics
* improvements to misc backends (AIX, FreeBSD)


Full v1.5 NEWS list:

* Backends
   + Do not limit the number of processors to 1024 on Solaris anymore.
   + Gather total machine memory on FreeBSD.
   + XML topology files do not depend on the locale anymore. Float numbers
 such as NUMA distances or PCI link speeds now always use a dot as a
 decimal separator.
   + Add instruction caches detection on Linux, AIX, Windows and Darwin.
   + Add get_last_cpu_location() support for the current thread on AIX.
   + Support binding on AIX when threads or processes were bound with
 bindprocessor(). Thanks to Hendryk Bockelmann for reporting the issue
 and testing patches, and to Farid Parpia for explaining the binding
 interfaces.
   + Improve AMD topology detection in the x86 backend (for FreeBSD) using
 the topoext feature.
* API
   + Increase HWLOC_API_VERSION to 0x00010500 so that API changes may be
 detected at build-time.
   + Add a cache type attribute describing Data, Instruction and Unified
 caches. Caches with different types but same depth (for instance L1d
 and L1i) are placed on different levels.
   + Add hwloc_get_cache_type_depth() to retrieve the hwloc level depth of
 the given cache depth and type, for instance L1i or L2.
 It helps disambiguate the case where hwloc_get_type_depth() returns
 HWLOC_TYPE_DEPTH_MULTIPLE.
   + Instruction caches are ignored unless HWLOC_TOPOLOGY_FLAG_ICACHES is
 passed to hwloc_topology_set_flags() before load.
   + Add hwloc_ibv_get_device_osdev_by_name() OpenFabrics helper in
 openfabrics-verbs.h to find the hwloc OS device object corresponding to
 an OpenFabrics device.
* Tools
   + Add lstopo-no-graphics, a lstopo built without graphical support to
 avoid dependencies on external libraries such as Cairo and X11. When
 supported, graphical outputs are only available in the original lstopo
 program.
 - Packagers splitting lstopo and lstopo-no-graphics into different
   packages are advised to use the alternatives system so that lstopo
   points to the best available binary.
   + Instruction caches are enabled in lstopo by default. Use --no-icaches
 to disable them.
   + Add -t/--threads to show threads in hwloc-ps.
* Removal of obsolete components
   + Remove the old cpuset interface (hwloc/cpuset.h) which is deprecated and
 superseded by the bitmap API (hwloc/bitmap.h) since v1.1.
 hwloc_cpuset and nodeset types are still defined, but all hwloc_cpuset_*
 compatibility wrappers are now gone.
   + Remove Linux libnuma conversion helpers for the deprecated and
 broken nodemask_t interface.
   + Remove support for "Proc" type name, it was superseded by "PU" in v1.0.
   + Remove hwloc-mask symlinks, it was replaced by hwloc-calc in v1.0.
* Misc
   + Non-printable characters are dropped from strings during XML export.
   + Assert hwloc_is_thissystem() in several I/O related helpers.
   + Limit the number of retries when operating on all threads within a
 process on Linux if the list of threads is heavily getting modified.


Plus some items currently only listed in the v1.4 branch NEWS:

* Fix PCIe 3.0 link speed computation.
* Fix importing of escaped characters with the minimalistic XML backend.
* Fix a memory leak in the x86 backend.


Open tickets against v1.5:

* #77: improve windows get_cpubind. I don't think I'll have time to work
   on this soon since only Hartmut can test such patches on large windows
   machines.
* #79: annotate the lstopo textual output for offline/unavailable/bound
   CPUs (red/green/black in the graphical output). Easy to implement but
   I don't have any obviously good solution yet.
* There's an OMPI ticket about hwloc fixes for a native windows build.
   We're supposed to get a patch one day.


Brice

___
hwloc-devel mailing list
hwloc-de...@open-mpi.org
http://www.open-mpi.org/mailman/listinfo.cgi/hwloc-devel


--
Terry D. Dontje | Principal Software Engineer
Developer Tools Engineering | +1.781.442.2631
Oracle *- Performance Technologies*
95 Network Drive, Burlington, MA 01803
Email terry.don...@oracle.com

Re: [OMPI devel] openib compiler warnings

2012-07-10 Thread TERRY DONTJE




On 7/10/2012 1:57 PM, TERRY DONTJE wrote:



On 7/10/2012 12:50 PM, Jeff Squyres wrote:

I'm getting these compiler warnings on the SVN trunk HEAD (r26776):

btl_openib.c: In function 'mca_btl_openib_put':
btl_openib.c:1652: warning: assignment makes integer from pointer without a cast
btl_openib.c: In function 'mca_btl_openib_get':
btl_openib.c:1734: warning: assignment makes integer from pointer without a cast

This is mine, I'll fix.


That is, the above is mine.  I didn't touch the one below as far as I know.

connect/btl_openib_connect_udcm.c:948: warning: 'i_initiate' defined but not used




Re: [OMPI devel] openib compiler warnings

2012-07-10 Thread TERRY DONTJE




On 7/10/2012 12:50 PM, Jeff Squyres wrote:

I'm getting these compiler warnings on the SVN trunk HEAD (r26776):

btl_openib.c: In function 'mca_btl_openib_put':
btl_openib.c:1652: warning: assignment makes integer from pointer without a cast
btl_openib.c: In function 'mca_btl_openib_get':
btl_openib.c:1734: warning: assignment makes integer from pointer without a cast

This is mine, I'll fix.


connect/btl_openib_connect_udcm.c:948: warning: 'i_initiate' defined but not used




[OMPI devel] openib max_cqe

2012-07-05 Thread TERRY DONTJE

With Jeff's latest changes to how we set up the cq_size I am now seeing 
error messages saying that my machine's memlocked limits are too low.  I 
am concerned that it might be something else because my max'd locked 
memory is unlimited on my machine.


So if I do a run of -np 2 across two separate nodes, then the use of the 
max_cqe of my ib device (4194303) is ok.  Once I go beyond 1 process on 
the node I start getting the memlocked limits message.  So how much 
memory does a cqe take?  Is it 1k by any chance?  I ask this because the 
machine I am running on has 4GB of memory and so I am wondering if I 
just don't have enough backing memory and, if that is so, I am wondering 
how common a case this may be?
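
The arithmetic behind the question above can be sketched as follows. This is only an illustration: the 1 KiB entry size is the guess from the message, not a verified figure, the real size of a CQE depends on the device and driver, and the function names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Bytes needed to back one completion queue, given a number of
 * entries and a per-entry size. The per-entry size is a parameter
 * because the real CQE size is device- and driver-specific. */
uint64_t cq_bytes(uint64_t num_cqe, uint64_t bytes_per_cqe)
{
    /* Guard against overflow of the 64-bit product. */
    assert(bytes_per_cqe == 0 || num_cqe <= UINT64_MAX / bytes_per_cqe);
    return num_cqe * bytes_per_cqe;
}

/* Total for several processes on one node, assuming (hypothetically)
 * that each process creates exactly one CQ of that size. */
uint64_t node_cq_bytes(uint64_t num_cqe, uint64_t bytes_per_cqe,
                       uint64_t procs_per_node)
{
    uint64_t per_cq = cq_bytes(num_cqe, bytes_per_cqe);
    assert(per_cq == 0 || procs_per_node <= UINT64_MAX / per_cq);
    return per_cq * procs_per_node;
}
```

Under the 1 KiB guess, cq_bytes(4194303, 1024) is 4294966272 bytes, just under 4 GiB, so a single such CQ would already be about the size of the machine's 4GB of memory and a second process on the node would double it. A smaller entry size scales the totals down proportionally.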



Re: [OMPI devel] [OMPI svn] svn:open-mpi r26707 - in trunk/ompi: config mca/btl/ofud mca/btl/openib mca/common/ofacm mca/common/ofautils mca/dpm

2012-07-02 Thread TERRY DONTJE


So is ofacm another replacement for ibcm and rdmacm?

--td

On 7/2/2012 11:20 AM, Nathan Hjelm wrote:

Nice! Are we moving this to 1.7 as well?

-Nathan

On Mon, Jul 02, 2012 at 11:20:12AM -0400, svn-commit-mai...@open-mpi.org wrote:

Author: pasha (Pavel Shamis)
Date: 2012-07-02 11:20:12 EDT (Mon, 02 Jul 2012)
New Revision: 26707
URL: https://svn.open-mpi.org/trac/ompi/changeset/26707

Log:
1. Adding 2 new components:
ofacm - generic connection manager for IB interconnects.
ofautils - IB common utilities and compatibility code

2. Updating OpenIB configure code

- ORNL & Mellanox Teams

Added:
trunk/ompi/config/ompi_check_openfabrics.m4
trunk/ompi/mca/common/ofacm/
trunk/ompi/mca/common/ofacm/Makefile.am
trunk/ompi/mca/common/ofacm/base.h
trunk/ompi/mca/common/ofacm/common_ofacm_base.c
trunk/ompi/mca/common/ofacm/common_ofacm_empty.c
trunk/ompi/mca/common/ofacm/common_ofacm_empty.h
trunk/ompi/mca/common/ofacm/common_ofacm_oob.c
trunk/ompi/mca/common/ofacm/common_ofacm_oob.h
trunk/ompi/mca/common/ofacm/common_ofacm_xoob.c
trunk/ompi/mca/common/ofacm/common_ofacm_xoob.h
trunk/ompi/mca/common/ofacm/configure.m4
trunk/ompi/mca/common/ofacm/configure.params
trunk/ompi/mca/common/ofacm/connect.h
trunk/ompi/mca/common/ofacm/help-mpi-common-ofacm-base.txt
trunk/ompi/mca/common/ofacm/help-mpi-common-ofacm-oob.txt
trunk/ompi/mca/common/ofautils/
trunk/ompi/mca/common/ofautils/Makefile.am
trunk/ompi/mca/common/ofautils/common_ofautils.c
trunk/ompi/mca/common/ofautils/common_ofautils.h
trunk/ompi/mca/common/ofautils/configure.m4
trunk/ompi/mca/common/ofautils/configure.params
Deleted:
trunk/ompi/config/ompi_check_openib.m4
Text files modified:
trunk/ompi/config/ompi_check_openfabrics.m4                |   403 +
/dev/null                                                  |   329 ---
trunk/ompi/mca/btl/ofud/configure.m4                       |     2
trunk/ompi/mca/btl/openib/Makefile.am                      |     4
trunk/ompi/mca/btl/openib/btl_openib_component.c           |    49 -
trunk/ompi/mca/btl/openib/configure.m4                     |     5
trunk/ompi/mca/common/ofacm/Makefile.am                    |    76 +
trunk/ompi/mca/common/ofacm/base.h                         |   193
trunk/ompi/mca/common/ofacm/common_ofacm_base.c            |   678
trunk/ompi/mca/common/ofacm/common_ofacm_empty.c           |    48 +
trunk/ompi/mca/common/ofacm/common_ofacm_empty.h           |    22
trunk/ompi/mca/common/ofacm/common_ofacm_oob.c             |  1672
trunk/ompi/mca/common/ofacm/common_ofacm_oob.h             |    20
trunk/ompi/mca/common/ofacm/common_ofacm_xoob.c            |  1537
trunk/ompi/mca/common/ofacm/common_ofacm_xoob.h            |    69 +
trunk/ompi/mca/common/ofacm/configure.m4                   |    63 +
trunk/ompi/mca/common/ofacm/configure.params               |    26
trunk/ompi/mca/common/ofacm/connect.h                      |   541
trunk/ompi/mca/common/ofacm/help-mpi-common-ofacm-base.txt |    41
trunk/ompi/mca/common/ofacm/help-mpi-common-ofacm-oob.txt  |    20
trunk/ompi/mca/common/ofautils/Makefile.am                 |    68 +
trunk/ompi/mca/common/ofautils/common_ofautils.c           |    89 ++
trunk/ompi/mca/common/ofautils/common_ofautils.h           |    26
trunk/ompi/mca/common/ofautils/configure.m4                |    43 +
trunk/ompi/mca/common/ofautils/configure.params            |    26
trunk/ompi/mca/dpm/dpm.h                                   |     4
26 files changed, 5674 insertions(+), 380 deletions(-)


Diff not shown due to size (240057 bytes).
To see the diff, run the following command:

svn diff -r 26706:26707 --no-diff-deleted

___
svn mailing list
s...@open-mpi.org
http://www.open-mpi.org/mailman/listinfo.cgi/svn


Re: [hwloc-devel] HWLOC_NBMAXCPUS

2012-06-29 Thread TERRY DONTJE

How soon will 1.5 be released?  I would like to get it into the OMPI 
trunk and 1.7.


--td

On 6/29/2012 8:19 AM, Brice Goglin wrote:

Thanks for testing.
I'll commit to trunk for sure. I don't know about 1.4 since I am 
planning a 1.5 rather than a 1.4.3 in the near future.

Brice



TERRY DONTJE <terry.don...@oracle.com> wrote:

I finally got access to the big machine again.  The code you sent
me seems to work nicely.  Are you going to put it back to the
hwloc trunk and 1.4 branches?

--td

On 6/25/2012 11:31 AM, TERRY DONTJE wrote:

I'll test out the patch once the test machine is available again.

--td

On 6/25/2012 3:42 AM, Brice Goglin wrote:

Hello Terry,

Here's a patch that should help. It cleans the code and makes all arrays
dynamic. I artificially set the initial array sizes to 4 to exercise
the code on our 24-way T1 machine. I will set it to 256 or so in the
final commit. Please let me know if it helps on your 1440-way machine.

Brice




Re: [hwloc-devel] HWLOC_NBMAXCPUS

2012-06-25 Thread TERRY DONTJE


I'll test out the patch once the test machine is available again.

--td

On 6/25/2012 3:42 AM, Brice Goglin wrote:

Hello Terry,

Here's a patch that should help. It cleans the code and makes all arrays
dynamic. I artificially set the initial array sizes to 4 to exercise
the code on our 24-way T1 machine. I will set it to 256 or so in the
final commit. Please let me know if it helps on your 1440-way machine.

Brice
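
The approach described in the message above (replace a fixed HWLOC_NBMAXCPUS-sized table with arrays that start small and grow on demand) can be sketched as follows. This is not the patch itself, which is not shown in the thread; the struct and function names are hypothetical, and doubling on overflow is an assumed growth policy.

```c
#include <assert.h>
#include <stdlib.h>

/* A growable array of CPU ids, standing in for a table with a
 * compile-time maximum. All names here are hypothetical. */
struct cpu_array {
    unsigned *ids;
    size_t count;     /* entries in use */
    size_t capacity;  /* entries allocated */
};

/* Allocate the initial storage. Returns 0 on success, -1 on failure. */
int cpu_array_init(struct cpu_array *a, size_t initial_capacity)
{
    assert(a != NULL && initial_capacity > 0);
    a->ids = malloc(initial_capacity * sizeof(*a->ids));
    a->count = 0;
    a->capacity = a->ids ? initial_capacity : 0;
    return a->ids ? 0 : -1;
}

/* Append one id, doubling the allocation when it is full.
 * Returns 0 on success, -1 if memory could not be obtained. */
int cpu_array_push(struct cpu_array *a, unsigned id)
{
    assert(a != NULL && a->capacity > 0);
    if (a->count == a->capacity) {
        size_t new_capacity = a->capacity * 2;
        unsigned *tmp = realloc(a->ids, new_capacity * sizeof(*tmp));
        if (tmp == NULL) {
            return -1;  /* the old buffer is still valid */
        }
        a->ids = tmp;
        a->capacity = new_capacity;
    }
    a->ids[a->count++] = id;
    return 0;
}

/* Release the storage and reset the bookkeeping. */
void cpu_array_free(struct cpu_array *a)
{
    free(a->ids);
    a->ids = NULL;
    a->count = a->capacity = 0;
}
```

Starting with a deliberately tiny capacity such as 4 forces the growth path to run even on a small machine, which is the point of the test setup described above; on a 1440-way machine the same code simply grows a few more times.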




Re: [OMPI devel] OpenIB compile error

2012-06-25 Thread TERRY DONTJE




On 6/25/2012 10:12 AM, Jeff Squyres wrote:

On Jun 25, 2012, at 5:44 AM, TERRY DONTJE wrote:


Hmmm, I guess I could see the thinking of tying ofud and openib btls 
configuring together.  However it seems inconsistent to me that one btl doesn't 
allow you to control configuring it in or not directly.  What if I really do 
not want to build ofud but do want to build openib?

What if you don't want to build the TCP BTL?  Or the sm BTL?  Or you want to 
build the MX BTL but not the MX MTL?
Funny, I thought there was a --without-XXX option for each btl but there 
is not.  Guess my mind was thinking more of mca parameters and not 
configure options.  I withdraw any objection I have regarding 
configuration options.

The fine-grained control we have for such things is --enable-mca-no-build.




Re: [OMPI devel] OpenIB compile error

2012-06-25 Thread TERRY DONTJE




On 6/23/2012 6:32 AM, Jeff Squyres wrote:

On Jun 22, 2012, at 11:26 PM, TERRY DONTJE wrote:


4. The behavior of --with[out]-verbs is as was described in a prior mail:
   - if --with-verbs is specified, all 3 verbs-based components must succeed
   - if --without-verbs is specified, all 4 verbs-based components will not 
build
   - if --with-verbs=DIR is specified, all 3 verbs-based components must 
succeed and will use DIR to find verbs headers and libraries


What does it mean that "all 3 verbs-based components must succeed"?
Does that mean you cannot specify --with-verbs=DIR --with-openib --without-ofud?

Yes.  --with/without-ofud never worked, anyway (i.e., there was no code that 
implemented it).  Ditto that there was no --with/without-ud.


Does it mean that if you specify --with-verbs=DIR  and some other dependency is 
not found for openib btl that the configure fails?

Yes.  Same was true for --with-openib=DIR.
Hmmm, I guess I could see the thinking of tying ofud and openib btls 
configuring together.  However it seems inconsistent to me that one btl 
doesn't allow you to control configuring it in or not directly.  What if 
I really do not want to build ofud but do want to build openib?


That being said it seems this happened some time ago so I guess I'll 
grin and bear it.



What is the 4th verbs-based component that is not built when one specifies 
--without-verbs?

There isn't one.
You're probably thinking of hwloc; hwloc can *use* verbs, but it doesn't 
*require* verbs.  The other 3 (OOB UD, BTL OFUD, BTL openib) all *require* 
verbs and cannot be built without it.

Ok, well I just asked because in the list above *you* mention 4 verbs 
components in one of the items and I was just curious what that might be.



Re: [OMPI devel] OpenIB compile error

2012-06-23 Thread TERRY DONTJE

On 6/22/2012 3:36 PM, Jeff Squyres wrote:

To update everyone: there was much more discussion about this off-list. :-)

We decided to do the following:

1. The name --with-verbs seems better than --with-openfabrics, if for no other reason
than the name "openfabrics" encompasses more things than we intend with this
name (e.g., udapl, psm, etc.).

2. There is a definite problem that needs to be fixed, but it is only partially
related to the renaming. Basically: the proposed option renaming is occurring
opportunistically with this bug fix.

3. We are *not* renaming the openib BTL at this time. It would be great if
someone would do this in the future, hint hint!

4. The behavior of --with[out]-verbs is as was described in a prior mail:
- if --with-verbs is specified, all 3 verbs-based components must succeed
- if --without-verbs is specified, all 4 verbs-based components will not
build
- if --with-verbs=DIR is specified, all 3 verbs-based components must
succeed and will use DIR to find verbs headers and libraries
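
The --with[out]-verbs behavior listed in item 4 can be sketched as a small decision function. This is an illustration in C, not the actual configure (m4) logic; the enum and function names are hypothetical. The "must succeed" cases follow the answers given elsewhere in this thread (configure fails if a dependency is missing). The case where neither option is given is not described in the thread, so its handling below is an assumption.

```c
#include <assert.h>
#include <stdbool.h>

/* How the user invoked configure with respect to verbs. */
enum verbs_request {
    VERBS_UNSPECIFIED,  /* neither --with-verbs nor --without-verbs */
    VERBS_WITH,         /* --with-verbs or --with-verbs=DIR */
    VERBS_WITHOUT       /* --without-verbs */
};

enum configure_result {
    RESULT_BUILD,       /* the verbs-based components are built */
    RESULT_SKIP,        /* the components are left out */
    RESULT_ERROR        /* configure aborts */
};

/* all_components_ok: every verbs-based component found its headers,
 * libraries and other dependencies. */
enum configure_result verbs_outcome(enum verbs_request req,
                                    bool all_components_ok)
{
    switch (req) {
    case VERBS_WITHOUT:
        /* Explicitly disabled: nothing verbs-based is built. */
        return RESULT_SKIP;
    case VERBS_WITH:
        /* Explicitly requested: any failure is fatal. */
        return all_components_ok ? RESULT_BUILD : RESULT_ERROR;
    case VERBS_UNSPECIFIED:
    default:
        /* ASSUMPTION (not stated in the thread): build when
         * possible, otherwise skip without failing. */
        return all_components_ok ? RESULT_BUILD : RESULT_SKIP;
    }
}
```

With --with-verbs=DIR the decision is the same as for plain --with-verbs; DIR only changes where the headers and libraries are searched for, which is why the sketch does not model it separately.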

What does it mean that "all 3 verbs-based components must succeed"?
Does that mean you cannot specify --with-verbs=DIR --with-openib
--without-ofud?
Does it mean that if you specify --with-verbs=DIR and some other
dependency is not found for openib btl that the configure fails?
What is the 4th verbs-based component that is not built when one
specifies --without-verbs?

--td

5. The new collectives / non-blocking collectives will likely revise some more
configury in this area when it comes in mid/late next week, because a bunch of
verbs stuff moved out of the openib BTL and into common. Pasha and I will
revisit this when all that stuff is merged in next week.

6. I'll be committing the current --with-verbs stuff right now in order to fix
the bug that Brian is running in to.

On Jun 21, 2012, at 2:41 PM, Jeff Squyres wrote:

On Jun 21, 2012, at 1:54 PM, Shamis, Pavel wrote:

About naming - I would agree with Terry, it makes sense to name it after network API used
for this btl - "verbs" (it is not obverts).

I don't think that "verbs" is terrible, but I think that "openfabrics" has more
user-level recognition.

For example, if you ask a customer what kind of network stack they have installed on their machine,
they'll say "I have OFED installed". They won't say "I have the verbs stack
installed."

--
Jeff Squyres
jsquy...@cisco.com
For corporate legal information go to:
http://www.cisco.com/web/about/doing_business/legal/cri/


[OMPI devel] bug in r26626

2012-06-22 Thread TERRY DONTJE

It looks like compilation on 32-bit platforms is failing due to a 
missing field.  It looks to me that for some reason r26626 deleted 
hdr_segkey in ompi/mca/osc/rdma/osc_rdma_header.h, which is used in the 
macros OMPI_OSC_RDMA_RDMA_INFO_HDR_NTOH and HTON.  Is there a reason that 
hdr_segkey was removed?  If so, more changes are needed.



Re: [hwloc-devel] HWLOC_NBMAXCPUS

2012-06-21 Thread TERRY DONTJE

I guess I was looking at the wrong version of code since I now see that 
topology-linux.c has fixed this issue.  I guess I will need to look to 
port this change over to solaris-topology.c


--td

On 6/21/2012 9:46 AM, TERRY DONTJE wrote:
I see a couple places where HWLOC_NBMAXCPUS is defined with a comment 
of "FIXME: drop".  This static size just bit me on a machine that has 
1440 CPUs.  I can bump up the define in my clone but I was wondering 
if this fixed size might change in the near future?



[hwloc-devel] HWLOC_NBMAXCPUS

2012-06-21 Thread TERRY DONTJE

I see a couple places where HWLOC_NBMAXCPUS is defined with a comment of 
"FIXME: drop".  This static size just bit me on a machine that has 1440 
CPUs.  I can bump up the define in my clone but I was wondering if this 
fixed size might change in the near future?



Re: [OMPI devel] OpenIB compile error

2012-06-21 Thread TERRY DONTJE

On 6/21/2012 6:38 AM, Jeff Squyres wrote:

On Jun 21, 2012, at 6:11 AM, TERRY DONTJE wrote:

As far as I understand it is not reason to rename it. The OFED-lovin components
should look at $with_openib.

I agree with Pasha that the reason you give for renaming openib btl seem
orthogonal to renaming the btl.

Note that I'm not talking about renaming the BTL (*).

I'm only talking about renaming --with-openib to --with- (see
below).
So you specify --with-ofed and you get mca_btl_openib generated?
ICK!!! I think that will just make things more confusing. I am against
this unless you change the btl name.

I don't like the ofed name because isn't "ofed" the name of the standards body
and not the protocol being used? I'd be in favor of renaming the btl ibverbs after the
library the btl accesses. However isn't this btl (and the underlying library) used with
networks other than IB?

Yes, it is used with at least 2 flavors of Ethernet networks, too -- that's why I shied away from
anything with "ib" in the name. But "verbs" is another possibility. Here's
some options:

1. --with-ofed
+++ Everyone recognizes the name
--- OFED refers to a specific software package; the name is not accurate

2. --with-of
--- "of" could mean anything; seems too generic

3. --with-openfabrics
+++ Explicit, obvious as to what it is for
--- A little long, but who cares?

4. --with-verbs
+++ A little shorter than "openfabrics"
--- A little generic of a name; not as specific as "openfabrics"

I'm personally gravitating towards --with-openfabrics.

(*) Although renaming the openib BTL would certainly be nice, that can be a different
effort. It would definitely need some additional "synonym" work in the MCA for
backwards compatibility during 1.7/1.8, though. To be clear: this email thread is NOT
about renaming the openib BTL.


Re: [OMPI devel] OpenIB compile error

2012-06-21 Thread TERRY DONTJE




On 6/20/2012 5:02 PM, Jeff Squyres wrote:

On Jun 20, 2012, at 4:25 PM, Shamis, Pavel wrote:


I hate it ...

As far as I understand it is not reason to rename it. The OFED-lovin components 
should look at $with_openib.
I agree with Pasha that the reason you give for renaming openib btl seem 
orthogonal to renaming the btl.

Ah, sorry -- I didn't think this would be controversial.

Just curious: why do you hate it?  OpenIB is a name that hasn't existed in 
years -- we already have to 'splain it.

Why not use a name that is commonly recognizable, like --with-ofed, or 
--with-of?

I don't like the ofed name because isn't "ofed" the name of the 
standards body and not the protocol being used?  I'd be in favor of 
renaming the btl ibverbs after the library the btl accesses.  However 
isn't this btl (and the underlying library) used with networks other 
than IB?



[OMPI devel] openib Dynamic SL opensm-devel usage

2012-06-18 Thread TERRY DONTJE

I've run into an issue compiling openib's Dynamic SL support on a RH 
6.2 based system with the Oracle Studio compilers.


Turns out if I compile btl_openib_connect_sl.c with the Oracle Studio 
compilers with the "-g" option the compiler compiles some static inline 
functions in ib_types.h standalone (as opposed to ignoring the functions 
since they are not called in the btl_openib_connect_sl.c source).  This 
creates a dependency on the symbol ib_error_str in 
btl_openib_connect_sl.o .  Note this symbol is defined in libosmcomp.so.


My question is should btl_openib_connect_sl.c be linking to 
libosmcomp.so since btl_openib_connect_sl.c  is including ib_types.h or 
is there an assumption being made that btl_openib_connect_sl.c is just 
using macros/defines provided by the header and nothing requiring access 
to libosmcomp.so?


I ask this because I can make my original issue go away on RH 5.X 
systems if I link in libosmcomp.so.  However, this library doesn't exist 
on RH 6.2 systems without the RH 5 compat headers package, and it doesn't 
have a 32 bit version on RH 6.2 systems at all.  The point is that if I 
try to fix the libosmcomp.so dependency by doing an AC_CHECK_LIB, RH 6.X 
systems will actually stop configuring in Dynamic SL.
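
A minimal sketch of the pattern described above, using made-up names 
(demo_error_str, demo_status_text, DEMO_SL_MASK) rather than the real 
ib_types.h contents.  The file only wants a constant from the "header", 
but the header also carries a static inline helper that calls into a 
library.  Whether the resulting object file references the library 
symbol depends on whether the compiler emits the unused helper.  Here 
the symbol is defined in the same file so the sketch links either way; 
in the real case it would live in a separate shared library.

```c
/* Stand-in for a function that, in the real case, lives in a separate
 * shared library (the message names ib_error_str in libosmcomp.so). */
const char *demo_error_str(int status)
{
    return status == 0 ? "success" : "failure";
}

/* Stand-in for a helper that a header defines as static inline.
 * The function below never calls it. */
static inline const char *demo_status_text(int status)
{
    return demo_error_str(status);
}

/* The only thing this file really wants from the "header": a constant. */
#define DEMO_SL_MASK 0x0F

/* Uses the macro only; no call into the library is intended. */
int demo_extract_sl(int raw)
{
    return raw & DEMO_SL_MASK;
}
```

If the compiler drops demo_status_text() because nothing calls it, the 
object needs nothing from the library; if it emits the helper anyway, 
the reference to demo_error_str has to be satisfied at link time.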


--
Terry D. Dontje | Principal Software Engineer
Developer Tools Engineering | +1.781.442.2631
Oracle *- Performance Technologies*
95 Network Drive, Burlington, MA 01803
Email terry.don...@oracle.com

Re: [OMPI devel] MPI_Cart_coords_f segv with Intel compiler

2012-05-24 Thread TERRY DONTJE

I forgot to add that the date of my compiler is 2011.10.11, so I wonder 
if it might not have the issue you mention below.  Anyway, I'll keep the 
below in mind as I try to run more tests.


thanks,

--td

On 5/24/2012 2:06 PM, Larry Baker wrote:

Terry,

What you are seeing is a bug in the vectorizer in the Intel 2011.6.233 
release.  We've talked about this before.  You should probably remove 
that compiler from your system(s).  I think the new release of OpenMPI 
describes this problem, but does not stop it from occurring.  I wrote 
a patch for ptmalloc2/malloc.c for OpenMPI 1.4.3 which automatically 
adjusts the optimization level for _int_malloc(), which is where the 
bug occurs.


Larry Baker
US Geological Survey
650-329-5608
ba...@usgs.gov <mailto:ba...@usgs.gov>

-- Start of Patch --
--- opal/mca/memory/ptmalloc2/malloc.c.original 2010-04-13 10:30:26.0 -0700
+++ opal/mca/memory/ptmalloc2/malloc.c 2011-11-04 15:01:37.0 -0700
@@ -2,6 +2,17 @@
 /* Copyright (c) 2010  Cisco Systems, Inc.  All rights reserved.
  */
+/* With Intel Composer XE V12.1.0, release 2011.6.233, any launch   */
+/* fails, even before main(), due to a bug in the vectorizer (see   */
+/* https://svn.open-mpi.org/trac/ompi/changeset/25290).  The fix is */
+/* to disable vectorization by reducing the optimization level to   */
+/* -O1 for _int_malloc().  The only reliable method to identify */
+/* release 2011.6.233 is the predefined __INTEL_COMPILER_BUILD_DATE */
+/* macro, which will have the value 20110811 (Linux, Windows, and   */
+/* Mac OS X).  (The predefined __INTEL_COMPILER macro is nonsense,  */
+/* , and both the 2011.6.233 and 2011.7.256 releases identify   */
+/* themselves as V12.1.0 from the -v command line option.)  */
+
 #define OPAL_DISABLE_ENABLE_MEM_DEBUG 1
 #include "opal_config.h"
@@ -3945,6 +3956,12 @@
   -- malloc --
 */
+#ifdef __INTEL_COMPILER_BUILD_DATE
+#if __INTEL_COMPILER_BUILD_DATE == 20110811
+#pragma GCC optimization_level 1
+#endif
+#endif
+
 Void_t*
 _int_malloc(mstate av, size_t bytes)
 {
-- End of Patch --
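
The build-date check in the patch can be sketched in isolation as 
follows.  Only __INTEL_COMPILER_BUILD_DATE and the value 20110811 come 
from the patch; the DEMO_* names and both helper functions are 
illustrative assumptions.  With any compiler that does not define the 
macro, the workaround flag is simply 0.

```c
/* Build date that the quoted patch associates with release 2011.6.233. */
#define DEMO_BAD_BUILD_DATE 20110811

/* Decide once, at preprocessing time, whether the workaround applies. */
#if defined(__INTEL_COMPILER_BUILD_DATE) && \
    (__INTEL_COMPILER_BUILD_DATE == DEMO_BAD_BUILD_DATE)
#define DEMO_NEEDS_WORKAROUND 1
#else
#define DEMO_NEEDS_WORKAROUND 0
#endif

/* Pure helper so the same comparison can be exercised with other dates. */
int demo_is_affected_build(long build_date)
{
    return build_date == DEMO_BAD_BUILD_DATE;
}

/* Reports what the preprocessor decided for the compiler in use. */
int demo_workaround_enabled(void)
{
    return DEMO_NEEDS_WORKAROUND;
}
```

Keying on one exact build date keeps the workaround from affecting 
other releases, at the cost of missing any other build that might 
share the bug.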

On 24 May 2012, at 6:54 AM, TERRY DONTJE wrote:

I am seeing several Cart Fortran tests (like MPI_Cart_coords_f) segv 
in opal_memory_ptmalloc2_int_free when OMPI trunk is compiled with 
icc 12.1.0 for 64 bit on linux.   Just wondering if anyone has seen 
anything similar to this with a different version of icc.  Other 
non-Intel compilers seem to not exhibit this issue.


--
Terry D. Dontje | Principal Software Engineer
Developer Tools Engineering | +1.781.442.2631
Oracle *- Performance Technologies*
95 Network Drive, Burlington, MA 01803
Email terry.don...@oracle.com <mailto:terry.don...@oracle.com>



___
devel mailing list
de...@open-mpi.org <mailto:de...@open-mpi.org>
http://www.open-mpi.org/mailman/listinfo.cgi/devel




--
Terry D. Dontje | Principal Software Engineer
Developer Tools Engineering | +1.781.442.2631
Oracle *- Performance Technologies*
95 Network Drive, Burlington, MA 01803
Email terry.don...@oracle.com <mailto:terry.don...@oracle.com>

Re: [OMPI devel] MPI_Cart_coords_f segv with Intel compiler

2012-05-24 Thread TERRY DONTJE

Actually, I don't think the below is the issue.  I think the 
OMPI_ARRAY_INT_2_LOGICAL macro is doing a free on line 193 when it 
shouldn't, because the OMPI_ARRAY_LOGICAL_2_INT macro calls an empty 
OMPI_ARRAY_LOGICAL__2_INT_ALLOC macro, whereas in the other case that 
macro actually does a malloc.


It was interesting looking at the diff between 26283 and the prior 
version for fint_2_int.h and seeing commented out "frees" being 
uncommented.  I suspect only one of the frees should have been commented 
out.
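
To illustrate the kind of mismatch suspected above, here is a small 
sketch with hypothetical macros (DEMO_TMP_ALLOC, DEMO_TMP_FREE, 
DEMO_SIZES_MATCH); these are not the real fint_2_int.h definitions.  
The point is that when the allocation macro expands to nothing in one 
configuration, the matching cleanup macro has to expand to nothing in 
that same configuration, otherwise a pointer that was never malloc'ed 
gets freed.

```c
#include <stdlib.h>
#include <string.h>

/* Assumption for this sketch: 1 means no conversion buffer is needed. */
#define DEMO_SIZES_MATCH 1

#if DEMO_SIZES_MATCH
/* Nothing is allocated, so nothing may be freed either. */
#define DEMO_TMP_ALLOC(tmp, n)  ((void)0)
#define DEMO_TMP_FREE(tmp)      ((void)0)
#else
/* A temporary buffer is allocated, so the cleanup must free it. */
#define DEMO_TMP_ALLOC(tmp, n)  ((tmp) = malloc((size_t)(n) * sizeof(int)))
#define DEMO_TMP_FREE(tmp)      free(tmp)
#endif

/* Counts the non-zero entries, going through the temporary if one exists. */
int demo_count_true(const int *flags, int n)
{
    int *tmp = NULL;
    const int *src = flags;
    int i, count = 0;

    DEMO_TMP_ALLOC(tmp, n);
    if (tmp != NULL) {
        memcpy(tmp, flags, (size_t)n * sizeof(int));
        src = tmp;
    }
    for (i = 0; i < n; i++) {
        if (src[i] != 0) {
            count++;
        }
    }
    DEMO_TMP_FREE(tmp);   /* paired with DEMO_TMP_ALLOC in both branches */
    return count;
}
```

Defining both macros inside the same #if branch makes it harder for 
the two halves of the pair to drift apart.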


--td

On 5/24/2012 2:06 PM, Larry Baker wrote:

Terry,

What you are seeing is a bug in the vectorizer in the Intel 2011.6.233 
release.  We've talked about this before.  You should probably remove 
that compiler from your system(s).  I think the new release of OpenMPI 
describes this problem, but does not stop it from occurring.  I wrote 
a patch for ptmalloc2/malloc.c for OpenMPI 1.4.3 which automatically 
adjusts the optimization level for _int_malloc(), which is where the 
bug occurs.


Larry Baker
US Geological Survey
650-329-5608
ba...@usgs.gov <mailto:ba...@usgs.gov>

-- Start of Patch --
--- opal/mca/memory/ptmalloc2/malloc.c.original 2010-04-13 10:30:26.0 -0700
+++ opal/mca/memory/ptmalloc2/malloc.c 2011-11-04 15:01:37.0 -0700
@@ -2,6 +2,17 @@
 /* Copyright (c) 2010  Cisco Systems, Inc.  All rights reserved.
  */
+/* With Intel Composer XE V12.1.0, release 2011.6.233, any launch   */
+/* fails, even before main(), due to a bug in the vectorizer (see   */
+/* https://svn.open-mpi.org/trac/ompi/changeset/25290).  The fix is */
+/* to disable vectorization by reducing the optimization level to   */
+/* -O1 for _int_malloc().  The only reliable method to identify */
+/* release 2011.6.233 is the predefined __INTEL_COMPILER_BUILD_DATE */
+/* macro, which will have the value 20110811 (Linux, Windows, and   */
+/* Mac OS X).  (The predefined __INTEL_COMPILER macro is nonsense,  */
+/* , and both the 2011.6.233 and 2011.7.256 releases identify   */
+/* themselves as V12.1.0 from the -v command line option.)  */
+
 #define OPAL_DISABLE_ENABLE_MEM_DEBUG 1
 #include "opal_config.h"
@@ -3945,6 +3956,12 @@
   -- malloc --
 */
+#ifdef __INTEL_COMPILER_BUILD_DATE
+#if __INTEL_COMPILER_BUILD_DATE == 20110811
+#pragma GCC optimization_level 1
+#endif
+#endif
+
 Void_t*
 _int_malloc(mstate av, size_t bytes)
 {
-- End of Patch --

On 24 May 2012, at 6:54 AM, TERRY DONTJE wrote:

I am seeing several Cart Fortran tests (like MPI_Cart_coords_f) segv 
in opal_memory_ptmalloc2_int_free when OMPI trunk is compiled with 
icc 12.1.0 for 64 bit on linux.   Just wondering if anyone has seen 
anything similar to this with a different version of icc.  Other 
non-Intel compilers seem to not exhibit this issue.


--
Terry D. Dontje | Principal Software Engineer
Developer Tools Engineering | +1.781.442.2631
Oracle *- Performance Technologies*
95 Network Drive, Burlington, MA 01803
Email terry.don...@oracle.com <mailto:terry.don...@oracle.com>







--
Terry D. Dontje | Principal Software Engineer
Developer Tools Engineering | +1.781.442.2631
Oracle *- Performance Technologies*
95 Network Drive, Burlington, MA 01803
Email terry.don...@oracle.com <mailto:terry.don...@oracle.com>

[OMPI devel] MPI_Cart_coords_f segv with Intel compiler

2012-05-24 Thread TERRY DONTJE

I am seeing several Cart Fortran tests (like MPI_Cart_coords_f) segv in 
opal_memory_ptmalloc2_int_free when OMPI trunk is compiled with icc 
12.1.0 for 64 bit on linux.   Just wondering if anyone has seen anything 
similar to this with a different version of icc.  Other non-Intel 
compilers seem to not exhibit this issue.


--
Terry D. Dontje | Principal Software Engineer
Developer Tools Engineering | +1.781.442.2631
Oracle *- Performance Technologies*
95 Network Drive, Burlington, MA 01803
Email terry.don...@oracle.com

Re: [OMPI devel] Non-zero exit status

2012-04-14 Thread TERRY DONTJE


On 4/13/2012 6:40 PM, Ralph Castain wrote:
Did you have the param set? I found some missing code in the orted 
errmgr that contributed to it, but unless you had set the param in 
your test, there is no way it would abort no matter how many procs 
exit with non-zero status.


Is it a bug that mpirun sticks around after all procs have gone?  If 
not, then what is the use of leaving mpirun hanging around?
I'm guessing you have that param set in your test due to our earlier 
defining the default to "no abort". I'm content to leave it there, but 
wanted to ensure your tests ran clean.


I don't believe we are setting the env-var which is why I think we have 
a regression.  It also seems very suspicious to me that both Oracle and 
IU are seeing the same condition in MTT.  I'll look into this more on 
Monday.


--td


On Apr 13, 2012, at 4:32 PM, TERRY DONTJE wrote:

I could see that if fewer than N processes exit with a non-zero exit 
code, ORTE may choose not to abort the job.  However, if all N 
processes have exited or aborted, I expect everything to clean up and 
mpirun to exit.  It does not do that at the moment, which I think is 
what is causing most of the hangs in the MTT trunk runs, which did not 
occur prior to this week.


--td

On 4/13/2012 5:18 PM, Ralph Castain wrote:

This has come up again because some of the MTT tests depend on a specific 
behavior when a process exits with a non-zero status - in this case, they 
expect ORTE to abort the job. At some point, the default had been switched to 
NOT abort the job if a process exited with a non-zero status.

So I'll throw this out to the community: if any process exits with a non-zero 
status, should ORTE abort the job?

I don't personally care, but we ought to decide on something. In the meantime, 
I will set the default so we DO abort, thus allowing the MTT runs to complete 
correctly.

FWIW: the MCA param orte_abort_non_zero_exit can always be set to control this 
behavior.

Ralph




--
Terry D. Dontje | Principal Software Engineer
Developer Tools Engineering | +1.781.442.2631
Oracle *- Performance Technologies*
95 Network Drive, Burlington, MA 01803
Email terry.don...@oracle.com <mailto:terry.don...@oracle.com>





--
Terry D. Dontje | Principal Software Engineer
Developer Tools Engineering | +1.781.442.2631
Oracle *- Performance Technologies*
95 Network Drive, Burlington, MA 01803
Email terry.don...@oracle.com <mailto:terry.don...@oracle.com>

Re: [OMPI devel] Non-zero exit status

2012-04-13 Thread TERRY DONTJE

I could see that if fewer than N processes exit with a non-zero exit 
code, ORTE may choose not to abort the job.  However, if all N processes 
have exited or aborted, I expect everything to clean up and mpirun to 
exit.  It does not do that at the moment, which I think is what is 
causing most of the hangs in the MTT trunk runs, which did not occur 
prior to this week.
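
The behavior being asked for can be written down as a small decision 
function.  This is only a model of the policy under discussion, not 
ORTE code: the type, the function, and the DEMO_STILL_RUNNING sentinel 
are all invented for the sketch, and abort_on_nonzero stands in for 
whatever the orte_abort_non_zero_exit setting resolves to.

```c
#include <stddef.h>

/* Sentinel for "this process has not exited yet" (sketch assumption). */
#define DEMO_STILL_RUNNING (-1)

typedef enum {
    DEMO_KEEP_WAITING,   /* some procs still running, nothing to do yet */
    DEMO_ABORT_JOB,      /* kill the remaining procs and exit           */
    DEMO_EXIT_LAUNCHER   /* every proc is gone: clean up and exit       */
} demo_action_t;

demo_action_t demo_launcher_action(const int *exit_codes, size_t nprocs,
                                   int abort_on_nonzero)
{
    size_t i, finished = 0;
    int saw_nonzero = 0;

    for (i = 0; i < nprocs; i++) {
        if (exit_codes[i] == DEMO_STILL_RUNNING) {
            continue;
        }
        finished++;
        if (exit_codes[i] != 0) {
            saw_nonzero = 1;
        }
    }

    /* Once all N procs have exited there is nothing left to wait for,
     * regardless of the abort setting. */
    if (finished == nprocs) {
        return DEMO_EXIT_LAUNCHER;
    }
    /* Fewer than N have exited: the setting decides whether to abort. */
    if (saw_nonzero && abort_on_nonzero) {
        return DEMO_ABORT_JOB;
    }
    return DEMO_KEEP_WAITING;
}
```

In this model the abort setting only matters while some processes are 
still alive; the "all N have exited" case always ends the launcher.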


--td

On 4/13/2012 5:18 PM, Ralph Castain wrote:

This has come up again because some of the MTT tests depend on a specific 
behavior when a process exits with a non-zero status - in this case, they 
expect ORTE to abort the job. At some point, the default had been switched to 
NOT abort the job if a process exited with a non-zero status.

So I'll throw this out to the community: if any process exits with a non-zero 
status, should ORTE abort the job?

I don't personally care, but we ought to decide on something. In the meantime, 
I will set the default so we DO abort, thus allowing the MTT runs to complete 
correctly.

FWIW: the MCA param orte_abort_non_zero_exit can always be set to control this 
behavior.

Ralph




--
Terry D. Dontje | Principal Software Engineer
Developer Tools Engineering | +1.781.442.2631
Oracle *- Performance Technologies*
95 Network Drive, Burlington, MA 01803
Email terry.don...@oracle.com

Re: [OMPI devel] r26255 has made openib unusable on Solaris platforms

2012-04-13 Thread TERRY DONTJE

I am thinking MEMORY_LINUX_PTMALLOC2 is probably the right define to key 
off of, but it is really going to look gross ifdef'ing out the lines 
that access the Linux memory module.  One other idea I have is to 
create a dummy __malloc_hook in the Solaris memory module, but might 
other OSes run into the same problem?  And what happens if PTMALLOC2 
is not used (does that happen)?
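
One way to avoid scattering ifdefs through the caller is to confine 
the conditional to a pair of small functions and give the unsupported 
case a do-nothing stub.  The sketch below assumes a hypothetical 
DEMO_HAVE_LINUX_PTMALLOC2 setting (a stand-in for a define such as 
MEMORY_LINUX_PTMALLOC2) and hypothetical function names; it does not 
touch the real __malloc_hook.

```c
/* Hypothetical configure result; defaults to "not available". */
#ifndef DEMO_HAVE_LINUX_PTMALLOC2
#define DEMO_HAVE_LINUX_PTMALLOC2 0
#endif

#if DEMO_HAVE_LINUX_PTMALLOC2
int demo_hooks_supported(void) { return 1; }
int demo_install_hooks(void)
{
    /* Real code would reference the Linux memory component here. */
    return 1;
}
#else
/* Stub versions: no symbol from the Linux memory component is referenced,
 * so nothing is left unresolved on platforms that do not build it. */
int demo_hooks_supported(void) { return 0; }
int demo_install_hooks(void)   { return 0; }
#endif

/* The caller stays free of #ifdefs; returns how many hooks were set up. */
int demo_btl_init(void)
{
    return demo_install_hooks();
}
```

The stub covers any platform where the component is not built, not 
just Solaris, which is one possible answer to the "other OSes" concern.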


--td

On 4/13/2012 10:45 AM, TERRY DONTJE wrote:
r26255 is forcing the use of __malloc_hook, which is implemented in 
opal/mca/memory/linux; however, that is not compiled into the library 
when built on Solaris, thus causing a "referenced symbol not found" 
error when libmpi tries to load the openib btl.


I am looking at how to fix this now, but if someone has a good idea how 
to detect when __malloc_hook is used (or not) I'd be interested in 
hearing it.




--
Terry D. Dontje | Principal Software Engineer
Developer Tools Engineering | +1.781.442.2631
Oracle *- Performance Technologies*
95 Network Drive, Burlington, MA 01803
Email terry.don...@oracle.com <mailto:terry.don...@oracle.com>

[OMPI devel] r26255 has made openib unusable on Solaris platforms

2012-04-13 Thread TERRY DONTJE

r26255 is forcing the use of __malloc_hook, which is implemented in 
opal/mca/memory/linux; however, that is not compiled into the library 
when built on Solaris, thus causing a "referenced symbol not found" 
error when libmpi tries to load the openib btl.


I am looking at how to fix this now, but if someone has a good idea how 
to detect when __malloc_hook is used (or not) I'd be interested in hearing it.


--
Terry D. Dontje | Principal Software Engineer
Developer Tools Engineering | +1.781.442.2631
Oracle *- Performance Technologies*
95 Network Drive, Burlington, MA 01803
Email terry.don...@oracle.com

[OMPI devel] trunk regressions

2012-04-09 Thread TERRY DONTJE

After looking at Oracle's MTT results there seem to be one or more 
regressions between r26240 and r26249 detected by the ibm and intel test 
suites.  An example of this is the failures in the comm_join, final and 
loop_spawn tests of the ibm test suite as seen in 
http://www.open-mpi.org/mtt/index.php?do_redir=2055.


Note, I've seen similar errors detected by IU runs too.

I'll look further into this but I thought I would post this just in case 
someone else has seen this.

--
Terry D. Dontje | Principal Software Engineer
Developer Tools Engineering | +1.781.442.2631
Oracle *- Performance Technologies*
95 Network Drive, Burlington, MA 01803
Email terry.don...@oracle.com

Re: [OMPI devel] [EXTERNAL] Re: Developers Meeting

2012-04-09 Thread TERRY DONTJE


+1 here too.

--td

On 4/6/2012 11:19 PM, Barrett, Brian W wrote:

Agreed.

Brian

On Apr 6, 2012, at 7:31 PM, Ralph Castain wrote:


+1 for SJ - much easier to be someplace with a major airport.


On Apr 5, 2012, at 7:54 AM, Gutierrez, Samuel K wrote:


My vote is for San Jose.

Sam


From: devel-boun...@open-mpi.org [devel-boun...@open-mpi.org] on behalf of Josh 
Hursey [jjhur...@open-mpi.org]
Sent: Wednesday, April 04, 2012 5:14 AM
To: Open MPI Developers
Subject: Re: [OMPI devel] [EXTERNAL] Re: Developers Meeting

I second Oak Ridge (or even UTK) sometime in June.

-- Josh

On Tue, Apr 3, 2012 at 3:07 PM, Barrett, Brian W  wrote:

On 4/3/12 11:08 AM, "Jeffrey Squyres"  wrote:


On Apr 3, 2012, at 11:44 AM, Barrett, Brian W wrote:


There is discussion of attempting to have a developers meeting this
summer.  We haven't had one in a while and people thought it would be
good
to work through some of the ideas on how to implement features for 1.7.
We don't have a location yet, but possibilities include Los Alamos and
San
Jose.  To help us get an idea of who can attend, please add your
information to the doodle poll below.

http://www.doodle.com/cei3ve3qyeer9bv9


Since the meeting is likely to take a whole week, might I suggest making
each Doodle entry represent an entire week?  E.g., June 4-11, June 11-15,
etc.

We talked about 3 days, so I was thinking that perhaps there were half
weeks that worked well for people.

Brian

--
Brian W. Barrett
Dept. 1423: Scalable System Software
Sandia National Laboratories









--
Joshua Hursey
Postdoctoral Research Associate
Oak Ridge National Laboratory
http://users.nccs.gov/~jjhursey



--
Terry D. Dontje | Principal Software Engineer
Developer Tools Engineering | +1.781.442.2631
Oracle *- Performance Technologies*
95 Network Drive, Burlington, MA 01803
Email terry.don...@oracle.com

Re: [OMPI devel] OpenMPI and R

2012-04-06 Thread TERRY DONTJE

Have you tried to compile and run a simple MPI program with your 
installed Open MPI?  If that works then you need to figure out what is 
being done by the Makefile when it is "testing if installed package can 
be loaded" and try and reproduce the issue manually.


BTW, I normally configure my OMPI with --enable-orterun-prefix-by-default 
to get OMPI to pull in the right library paths instead of using ldconfig.


In the ldconfig -p output below, you may also want to grep for mca to 
make sure the plugins being complained about in the R testing are found 
too.  I suspect they are, but it would be good to double check.


--td

On 4/5/2012 7:59 PM, Benedict Holland wrote:
So I am now back on this full time as I need this to work. OpenMPI 
1.4.3 is deadlocking with Rmpi and I need the latest code. I still get 
the exact same problem. I configured it with a --prefix=/usr to get it 
to install everything in default directories and added 
/usr/lib/openmpi to my ldconfig. I don't have a LD_LIBRARY_PATH global 
variable on ubuntu 11.10.


ldconfig -p |grep mpi
libvt-mpi.so.0 (libc6,x86-64) => /usr/lib/libvt-mpi.so.0
libvt-mpi.so (libc6,x86-64) => /usr/lib/libvt-mpi.so
libvt-mpi-unify.so.0 (libc6,x86-64) => /usr/lib/libvt-mpi-unify.so.0
libvt-mpi-unify.so (libc6,x86-64) => /usr/lib/libvt-mpi-unify.so
libopenmpi_malloc.so.0 (libc6,x86-64) => /usr/lib/libopenmpi_malloc.so.0
libompitrace.so.0 (libc6,x86-64) => /usr/lib/libompitrace.so.0
libompitrace.so (libc6,x86-64) => /usr/lib/libompitrace.so
libompi_dbg_msgq.so (libc6,x86-64) => /usr/lib/openmpi/libompi_dbg_msgq.so
libmpi_f90.so.1 (libc6,x86-64) => /usr/lib/libmpi_f90.so.1
libmpi_f90.so.0 (libc6,x86-64) => /usr/lib/libmpi_f90.so.0
libmpi_f90.so (libc6,x86-64) => /usr/lib/libmpi_f90.so
libmpi_f77.so.1 (libc6,x86-64) => /usr/lib/libmpi_f77.so.1
libmpi_f77.so.0 (libc6,x86-64) => /usr/lib/libmpi_f77.so.0
libmpi_f77.so (libc6,x86-64) => /usr/lib/libmpi_f77.so
libmpi_cxx.so.1 (libc6,x86-64) => /usr/lib/libmpi_cxx.so.1
libmpi_cxx.so.0 (libc6,x86-64) => /usr/lib/libmpi_cxx.so.0
libmpi_cxx.so (libc6,x86-64) => /usr/lib/libmpi_cxx.so
libmpi.so.1 (libc6,x86-64) => /usr/lib/libmpi.so.1
libmpi.so.0 (libc6,x86-64) => /usr/lib/libmpi.so.0
libmpi.so (libc6,x86-64) => /usr/lib/libmpi.so
libexempi.so.3 (libc6,x86-64) => /usr/lib/libexempi.so.3
libcompizconfig.so.0 (libc6,x86-64) => /usr/lib/libcompizconfig.so.0

Compiling Rmpi from inside R gives me:

* installing *source* package 'Rmpi' ...
checking for gcc... gcc -std=gnu99
checking for C compiler default output file name... a.out
checking whether the C compiler works... yes
checking whether we are cross compiling... no
checking for suffix of executables...
checking for suffix of object files... o
checking whether we are using the GNU C compiler... yes
checking whether gcc -std=gnu99 accepts -g... yes
checking for gcc -std=gnu99 option to accept ISO C89... none needed
I am here /usr and it is OpenMPI
Trying to find mpi.h ...
Found in /usr/include
Trying to find libmpi.so or libmpich.a ...
Found libmpi in /usr/lib
checking for openpty in -lutil... yes
checking for main in -lpthread... yes
configure: creating ./config.status
config.status: creating src/Makevars
** Creating default NAMESPACE file
** libs
gcc -std=gnu99 -I/usr/share/R/include -DNDEBUG -DPACKAGE_NAME=\"\" 
-DPACKAGE_TARNAME=\"\" -DPACKAGE_VERSION=\"\" -DPACKAGE_STRING=\"\" 
-DPACKAGE_BUGREPORT=\"\" -I/usr/include  -DMPI2 -DOPENMPI -fpic 
 -O3 -pipe  -g  -c RegQuery.c -o RegQuery.o
gcc -std=gnu99 -I/usr/share/R/include -DNDEBUG -DPACKAGE_NAME=\"\" 
-DPACKAGE_TARNAME=\"\" -DPACKAGE_VERSION=\"\" -DPACKAGE_STRING=\"\" 
-DPACKAGE_BUGREPORT=\"\" -I/usr/include  -DMPI2 -DOPENMPI -fpic 
 -O3 -pipe  -g  -c Rmpi.c -o Rmpi.o
gcc -std=gnu99 -I/usr/share/R/include -DNDEBUG -DPACKAGE_NAME=\"\" 
-DPACKAGE_TARNAME=\"\" -DPACKAGE_VERSION=\"\" -DPACKAGE_STRING=\"\" 
-DPACKAGE_BUGREPORT=\"\" -I/usr/include  -DMPI2 -DOPENMPI -fpic 
 -O3 -pipe  -g  -c conversion.c -o conversion.o
gcc -std=gnu99 -I/usr/share/R/include -DNDEBUG -DPACKAGE_NAME=\"\" 
-DPACKAGE_TARNAME=\"\" -DPACKAGE_VERSION=\"\" -DPACKAGE_STRING=\"\" 
-DPACKAGE_BUGREPORT=\"\" -I/usr/include  -DMPI2 -DOPENMPI -fpic 
 -O3 -pipe  -g  -c internal.c -o internal.o
gcc -std=gnu99 -shared -o Rmpi.so RegQuery.o Rmpi.o conversion.o 
internal.o -L/usr/lib -lmpi -lutil -lpthread -L/usr/lib/R/lib -lR

installing to /usr/local/lib/R/site-library/Rmpi/libs
** R
** demo
** inst
** preparing package for lazy loading
** help
*** installing help indices
** building package indices
** testing if installed package can be loaded
[ben-Inspiron-1764:18216] mca: base: component_find: unable to open 
/usr/lib/openmpi/mca_paffinity_hwloc: 
/usr/lib/openmpi/mca_paffinity_hwloc.so: undefined symbol: 
opal_hwloc_topology (ignored)
[ben-Inspiron-1764:18216] mca: base: component_find: unable to open 
/usr/lib/openmpi/mca_shmem_posix: /usr/lib/openmpi/mca_shmem_posix.so: 
undefined symbol: opal_shmem_base_output (ignored)
[ben-Inspiron-1764:18216]

Re: [OMPI devel] Intel test MPI_Keyval3_f now failing

2012-04-05 Thread TERRY DONTJE

Ok, I'll leave it alone then.  You may want to keep this in mind just in 
case your merge with the trunk pollutes your bindings somehow.


--td

On 4/5/2012 8:45 AM, Jeffrey Squyres wrote:

I'm able to duplicate the problem, but I don't know if this is worth digging in 
to.

The entire Fortran bindings will be replaced in about 2 weeks, and the problem 
doesn't occur on my mpi3-fortran bitbucket.



On Apr 5, 2012, at 7:03 AM, TERRY DONTJE wrote:


I noticed both IU and Oracle are seeing failures on the trunk with Intel test 
MPI_Keyval3_f.  This was with r26237 and the last successful MTT run of this 
test was r26232.  I looked at the log and nothing popped out at me.  I'll try 
and narrow this down a little further but that won't be until later this 
morning.

--
Terry D. Dontje | Principal Software Engineer
Developer Tools Engineering | +1.781.442.2631
Oracle - Performance Technologies
95 Network Drive, Burlington, MA 01803
Email terry.don...@oracle.com







--
Terry D. Dontje | Principal Software Engineer
Developer Tools Engineering | +1.781.442.2631
Oracle *- Performance Technologies*
95 Network Drive, Burlington, MA 01803
Email terry.don...@oracle.com <mailto:terry.don...@oracle.com>

[OMPI devel] Intel test MPI_Keyval3_f now failing

2012-04-05 Thread TERRY DONTJE

I noticed both IU and Oracle are seeing failures on the trunk with Intel 
test MPI_Keyval3_f.  This was with r26237 and the last successful MTT 
run of this test was r26232.  I looked at the log and nothing popped out 
at me.  I'll try and narrow this down a little further but that won't be 
until later this morning.


--
Terry D. Dontje | Principal Software Engineer
Developer Tools Engineering | +1.781.442.2631
Oracle *- Performance Technologies*
95 Network Drive, Burlington, MA 01803
Email terry.don...@oracle.com

Re: [OMPI devel] SUSE verification

2012-03-21 Thread TERRY DONTJE

Sorry, the cc line below is for the Solaris Studio compilers; if you 
have gcc, replace "-G" with "-shared".


thanks,

--td

On 3/21/2012 11:32 AM, TERRY DONTJE wrote:
I ran into a problem on a Suse 10.1 system and was wondering if anyone 
has a version of Suse newer than 10.1 that can try the following test 
and send me the results.

-testpci
cat <<EOF >testpci.c
#include <pci/pci.h>

int testpci() {
struct pci_access *pciaccess;

pci_init(pciaccess);
}

EOF

cc -G -m64 testpci.c -lpci




--
Terry D. Dontje | Principal Software Engineer
Developer Tools Engineering | +1.781.442.2631
Oracle *- Performance Technologies*
95 Network Drive, Burlington, MA 01803
Email terry.don...@oracle.com <mailto:terry.don...@oracle.com>

[OMPI devel] SUSE verification

2012-03-21 Thread TERRY DONTJE

I ran into a problem on a Suse 10.1 system and was wondering if anyone 
has a version of Suse newer than 10.1 that can try the following test 
and send me the results.

-testpci
cat

Re: [OMPI devel] v1.5 build failure w/ Solaris Studio 12.2 on Linux

2012-02-23 Thread TERRY DONTJE




On 2/22/2012 8:53 PM, Jeffrey Squyres wrote:

Terry / Eugene --

Can you comment?

Sorry I cannot.

--td


On Feb 22, 2012, at 3:16 PM, Paul H. Hargrove wrote:


I think I have the beginning of a fix for this issue.

I had not even noticed earlier that the error in event.h is from the C++ compiler, when 
compiling file.cxx in the c++ bindings.  That makes the vendor-specific addition of 
"-library=stlport4" to CXXFLAGS quite relevant to the problem/solution.

It eventually occurred to me that when VT's sub-configure told me to add 
configure arguments, I could have used --with-contrib-vt-flags to pass that 
ONLY to VT and perhaps NOT mess with whatever karma was providing the 
definition of u_char.  However, when I tried that I was disappointed to find 
that the bit of configure logic that suggests/requires 
CXXFLAGS=-library=stlport4 (from ompi/contrib/vt/configure.m4) runs BEFORE the 
processing of --with-contrib-vt-flags.  So, that was a dead end.

So, the next idea was to look for a fix specific to stlport.  I tried adding 
near the top of opal/event/event.h (after the WINDOWS equivalent):

#ifdef STLPORT
typedef unsigned char u_char;
#endif

That managed to clear up the original problem w/ SS12.2.
With SS12.3, things also built fine.
This suggests the typedef is not "conflicting" with whatever other defn was 
present.
I think the "safety" of this needs to be examined more widely before this can 
be adopted.
My concern is that some system could "typedef char u_char" if it has char 
unsigned by default, leading to a conflict.
Now that would, I suppose, only be a problem if STLPORT is also defined.
So, maybe I am over thinking this.
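
A guarded fallback of the kind discussed above can be sketched like 
this.  The names are illustrative only: DEMO_SYSTEM_HAS_U_CHAR stands 
in for whatever test decides that the system headers already provide 
u_char, and demo_u_char is used instead of u_char itself so the sketch 
cannot clash with an existing definition.

```c
#include <stddef.h>

#if defined(DEMO_SYSTEM_HAS_U_CHAR)
/* The platform provides u_char; reuse it rather than redefining it. */
#include <sys/types.h>
typedef u_char demo_u_char;
#else
/* Fallback: spell the type out explicitly as unsigned. */
typedef unsigned char demo_u_char;
#endif

/* Sums raw bytes; the result depends on the bytes being unsigned. */
unsigned long demo_sum_bytes(const demo_u_char *buf, size_t len)
{
    unsigned long sum = 0;
    size_t i;

    for (i = 0; i < len; i++) {
        sum += buf[i];
    }
    return sum;
}
```

Using a project-local name sidesteps the conflict worry entirely, at 
the price of touching every prototype that currently says u_char.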

-Paul

On 2/21/2012 11:10 PM, Paul H. Hargrove wrote:

More notes:

I've tested ompi-1.5.4 and it has the same problem.  So, this is NOT a 
regression.

Terry D. has observed that Ubuntu is NOT a supported platform for the Solaris 
Studio compilers.
So, I've reproduced on a Scientific Linux 5.5 platform (Red Hat Enterprise 
Linux 5.5 clone, like CentOS) to be sure that was NOT the cause.

When I configure for the SS12.x compilers, I've been passing  
CXXFLAGS="-library=stlport4" as the VT sub-configure has informed me I should, 
due to something wrong with the default STL.  I tried dropping that from configure, and 
THE BUILD WAS SUCCESSFUL.

So, one has 2 choices:
+ build w/ SS12.2 without VT
+ update to SS12.3 and have VT

I don't think there is sufficient reason to delay 1.5.5 for this.

-Paul

On 2/21/2012 4:39 PM, Paul H. Hargrove wrote:

A few things to note:

1) This is NOT a problem w/ the SS12.3 compilers on the same machine.
So, one could say "upgrade your compiler" (a free download) and not delay 1.5.5 
for this issue.

2) This is ONLY a problem on Linux, and not on Solaris (both SS12.2 and SS12.3 
tested for x86, x86-64, Sparc/v9 and Sparc/v8plus)

3) Testing the trunk I DON'T see the problem with either SS12.2 or SS12.3.
This is interesting, because it probably means that a u_char definition is 
SOMEWHERE in the headers (because libevent *is* getting built).

Whatever else may be done, I think this should be fixed "properly" (whatever 
that may equate to) for 1.6.
The way I see it now, it feels like OMPI is getting a definition of u_char only "by 
accident".

-Paul

On 2/21/2012 12:16 PM, Paul H. Hargrove wrote:

Building the v1.5 branch on Linux with the Solaris Studio 12.2 compilers I see 
the following failure:

"[srcdir]/opal/event/event.h", line 797: Error: Type name expected instead of 
"u_char".
"[srcdir]/opal/event/event.h", line 798: Error: Type name expected instead of 
"u_char".
"[srcdir]/opal/event/event.h", line 1184: Error: "," expected instead of "*".

Where line 1184 is a prototype containing "u_char *".

As far as I can find, only several files below opal/event/ contain any use of 
"u_char".
There is a typedef for u_char in hwloc, but no use that I could see.

To the best of my knowledge u_char is NOT defined by any standard, and thus 
there is no particular header one can reliably find it in.
The alternatives, of course are "unsigned char" or "uint8_t" (defined in 
stdint.h).

I had a look at the trunk and VISUALLY it appears the same problem exists in:
   opal/event/event.h
   opal/mca/event/libevent2013/libevent/event.h
However, my testing is currently confined to the v1.5 branch in the hopes of 
finally getting the next 1.5.5rc out the door.

-Paul


--
Paul H. Hargrove  phhargr...@lbl.gov
Future Technologies Group
HPC Research Department   Tel: +1-510-495-2352
Lawrence Berkeley National Laboratory Fax: +1-510-486-6900





--
Terry D. Dontje | Principal Software Engineer
Developer Tools Engineering | +1.781.442.2631
Oracle *- Performance Technologies*
95 Network Drive, Burlington, MA 01803
Email terry.don...@oracle.com

Re: [OMPI devel] 1.5 supported systems

2012-02-23 Thread TERRY DONTJE


I actually think the systems tested line for Solaris should read:
- Oracle Solaris 10 and 11, 32 and 64 bit (SPARC, i386, x86_64), with
 Oracle Solaris Studio 12.2 and 12.3

--td
On 2/22/2012 8:55 PM, Paul H. Hargrove wrote:
Folks at Oracle should decide, but I suspect "Solaris 10" should be 
updated to "Solaris 10 and 11", or just "11".


-Paul

On 2/22/2012 2:44 PM, Jeffrey Squyres wrote:

Please verify this list of supported systems for the v1.5.5 release:

- The run-time systems that are currently supported are:
   - rsh / ssh
   - LoadLeveler
   - PBS Pro, Open PBS, Torque
   - Platform LSF (v7.0.2 and later)
   - SLURM
   - Cray XT-3, XT-4, and XT-5
   - Oracle Grid Engine (OGE) 6.1, 6.2 and open source Grid Engine
   - Microsoft Windows CCP (Microsoft Windows server 2003 and 2008)

- Systems that have been tested are:
   - Linux (various flavors/distros), 32 bit, with gcc, and Oracle
 Solaris Studio 12
   - Linux (various flavors/distros), 64 bit (x86), with gcc, Absoft,
 Intel, Portland, and Oracle Solaris Studio 12 compilers (*)
   - OS X (10.5, 10.6, 10.7), 32 and 64 bit (x86_64), with gcc and
 Absoft compilers (*)
   - Oracle Solaris 10, 32 and 64 bit (SPARC, i386, x86_64), with
 Oracle Solaris Studio 12

   (*) Be sure to read the Compiler Notes, below.

- Other systems have been lightly (but not fully) tested:
   - Other 64 bit platforms (e.g., Linux on PPC64)
   - Microsoft Windows CCP (Microsoft Windows server 2003 and 2008);
 see the README.WINDOWS file.





--
Terry D. Dontje | Principal Software Engineer
Developer Tools Engineering | +1.781.442.2631
Oracle *- Performance Technologies*
95 Network Drive, Burlington, MA 01803
Email terry.don...@oracle.com

Re: [OMPI devel] non-portable code in examples/Makefile

2012-02-21 Thread TERRY DONTJE




On 2/21/2012 5:55 AM, Jeff Squyres (jsquyres) wrote:

That is truly bizarre "make" behavior.

Heads up that in the upcoming fortran revamp, we *only* use FC. I.E., 
there's only mpifort wrapper compiler (mpif77 and mpif90 still exist, 
but only as sym links to mpifort, signifying that mpifort is the way 
of the future).


This was done because there have been no f77 compilers for decades 
(literally), and no f90 compilers for 10+ years. All the fortran 
compiler vendors have long-since moved to a single compiler executable 
name (e.g., ifort, gfortran), so mpifort just reflects that.


Hmmm, well Oracle's compiler is still named f90 :-).   (now to duck and 
cover)


--td

Sent from my phone. No type good.

On Feb 21, 2012, at 5:01 AM, "Paul H. Hargrove" wrote:



Thanks, Ralph.
Excellent point about not needing to use the "FC" name with its 
special (absurd?) behavior.


-Paul

On 2/21/2012 1:52 AM, Ralph Castain wrote:
I went ahead and applied this, with a tweak. There is no reason to 
call our flag "FC" as all we use it for is to call the right 
wrapper. So I renamed it to something less problematic.


On Feb 21, 2012, at 1:20 AM, Paul H. Hargrove wrote:

And while we are looking at examples/Makefile on Solaris-10, why 
are the F77 examples getting built w/ mpif90?
Because w/ the Solaris make setting FC also silently sets F77 (yes, 
I am NOT kidding)!
So, reordering the F77= and FC= lines in Makefile resolves that 
mis-behavior.


Attached is my patch to fix both F77/FC and the "better" ompi_info 
queries mentioned in my previous post.

This REPLACES the patch in the previous post.

-Paul

On 2/20/2012 11:36 PM, Paul H. Hargrove wrote:
The addition on Monday of the Java cases to examples/Makefile has 
shown that the default "make" in Solaris-10 will stop on the first 
failed grep command in the "all" rule:

$ make
mpicc -g   -o hello_c hello_c.c
mpicc -g   -o ring_c ring_c.c
mpicc -g   -o connectivity_c connectivity_c.c
mpic++ -g   -o hello_cxx hello_cxx.cc
mpic++ -g   -o ring_cxx ring_cxx.cc
mpif90 -g hello_f77.f -o hello_f77
mpif90 -g ring_f77.f -o ring_f77
mpif90 -g hello_f90.f90 -o hello_f90
mpif90 -g ring_f90.f90 -o ring_f90
*** Error code 1
The following command caused the error:
if test "`ompi_info --parsable | grep bindings:java:yes`" != ""; then \
make Hello.class; \
fi
make: Fatal error: Command failed for target `all'


The addition of java did NOT break anything, but exposed a 
pre-existing problem which was not evident in my prior testing 
because all language bindings were being built prior to adding java.


The attached patch resolves the problem in my (admittedly minimal) 
testing with the smallest possible change.
However, an entirely different approach avoids both "test" and "true" and 
simply looks like:
@ if ompi_info --parsable | grep bindings:cxx:yes >/dev/null; then

I have *also* tested that approach, and it works fine too.
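
The difference between the two forms can be tried outside of make. The following is a minimal sketch, assuming a stand-in function `fake_ompi_info` (hypothetical, with made-up output lines) in place of the real `ompi_info --parsable`; `set -e` is used here only to approximate a make that aborts a rule on the first failing command:

```shell
# Stand-in for the real `ompi_info --parsable`; these lines are illustrative only.
fake_ompi_info() {
    printf '%s\n' 'bindings:c:yes' 'bindings:cxx:yes' 'bindings:java:no'
}

# Exit status 0 when the named binding is reported as built, non-zero otherwise.
has_binding() {
    fake_ompi_info | grep "bindings:$1:yes" >/dev/null
}

set -e   # approximate a make that stops on any failing command

for lang in cxx java; do
    # The pipeline is the condition of the `if`, so a grep that finds
    # nothing selects the else branch instead of ending the script.
    if has_binding "$lang"; then
        echo "$lang: build examples"
    else
        echo "$lang: skip"
    fi
done
```

With the grep used directly as the `if` condition there is no separate command whose failure could stop the rule, which is presumably why this form behaves the same under different make implementations.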

I *did* warn that the introduction of the java bindings would 
bring collateral damage.

I just didn't anticipate encountering it personally.

-Paul





--
Terry D. Dontje | Principal Software Engineer
Developer Tools Engineering | +1.781.442.2631
Oracle *- Performance Technologies*
95 Network Drive, Burlington, MA 01803
Email terry.don...@oracle.com

Re: [OMPI devel] RFC: Add "virbr0" to [btl|oob]_tcp_if_exclude?

2012-02-10 Thread TERRY DONTJE

On 2/10/2012 11:50 AM, Jeff Squyres wrote:

This is an open question to OMPI developers...

It looks like RHEL (and maybe others?) adds the "virbr0" IP interface when Xen
is activated. This IP interface is only used to communicate with the local Xen
instance(s); it is not used to communicate over the real network.

In a case that I saw, the interface is created, set to "up", and is given an IP address
in the 192.168.1.x range. This was done by default -- all the user had done was either say
"yes, I want Xen enabled", or he didn't say he wanted it *disabled* (I'm not sure which).
I've done the latter and hit the same problem. I found instructions
somewhere on the web on how to disable virbr0.

This causes a problem if you have Xen enabled on multiple machines in an OMPI job. OMPI
will see the 192.168.1.x address and see that it's "up", so it'll add it to the
eligible subnets that can be used. When OMPI sees that its peer processes also have
192.168.1.x, it'll try to use that network for OOB/BTL traffic -- which will fail,
because these are local-only interfaces.

Should we add "virbr0" to the default value for [btl|oob]_tcp_if_exclude?
What happens to that value if you then set btl_tcp_if_exclude to some
value on the mpirun command line?  So this brings me to something that
has annoyed me for a bit.  It seems to me that it would be nice to
have a conf file into which you can dump interface names to exclude, but
which would not be interpreted as a btl_tcp_if_exclude option.  For example,
there were some interfaces on certain Sun machines (a long time ago) that
went to the diagnostic processor and caused a similar issue to the virbr0
issue.  So we started delivering a conf file that set btl_tcp_if_exclude,
but this then precluded anyone from being able to set
btl_tcp_if_include.  If we had a file in which one could specify the set of
interfaces to use or exclude, while still allowing the user to operate on the
result of that set, that would be nice.

--td

Or is there another way to detect that an interface is local-only and should
not be used for OOB/BTL communication?

See this post on the user's list:

http://www.open-mpi.org/community/lists/users/2012/02/18432.php
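
One way to picture the "site list plus user list" idea from the reply above is a small wrapper that merges the two before calling mpirun. This is only a sketch of the proposal, not an existing Open MPI feature: the helper `merge_if_exclude`, both interface lists, and `./a.out` are hypothetical, and the command is printed rather than run. Setting `btl_tcp_if_exclude` replaces the default value, so the loopback interface is listed explicitly:

```shell
# Join a site-wide exclusion list with per-run additions, dropping duplicates.
# Both arguments are comma-separated interface names.
merge_if_exclude() {
    printf '%s,%s\n' "$1" "$2" | tr ',' '\n' | awk 'NF && !seen[$0]++' | paste -s -d, -
}

site_exclude="lo,virbr0"      # hypothetical site-wide list
user_exclude="virbr0,eth1"    # hypothetical per-run additions

excl=$(merge_if_exclude "$site_exclude" "$user_exclude")

# Print the resulting command line instead of executing it.
echo "mpirun --mca btl_tcp_if_exclude $excl --mca oob_tcp_if_exclude $excl -np 2 ./a.out"
```

Keeping the site list outside of the MCA parameter itself is what would leave `btl_tcp_if_include` free for the user, which is the point raised above.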

--
Terry D. Dontje | Principal Software Engineer
Developer Tools Engineering | +1.781.442.2631
Oracle *- Performance Technologies*
95 Network Drive, Burlington, MA 01803
Email terry.don...@oracle.com

Re: [OMPI devel] 1.4.5rc5 has been released

2012-02-08 Thread TERRY DONTJE




On 2/7/2012 9:57 PM, Paul H. Hargrove wrote:


On 2/7/2012 2:37 PM, Paul H. Hargrove wrote:


+ "make check" fails atomics tests using GCCFSS-4.0.4 compilers on 
Solaris10/SPARC
Originally reported in: 
http://www.open-mpi.org/community/lists/devel/2012/01/10234.php
This is a matter of the Sun/Oracle fork of GCC (known as GCC For 
SPARC Systems, or GCCFSS) being buggy with respect to GNU inline asm.
The original failures were with gccfss-4.0.4, but I have now retested 
with gccfss-4.3.3.
I'll report on those results later. 


Use of gccfss-4.3.3 is not an improvement.
Instead of failing the atomic_cmpset test, the compiler HANGS when 
compiling atomic_cmpset.c.
I allowed the compiler just over 4 hours accumulated CPU time before 
being convinced it was hung.


So, I'd like to request documenting "gccfss" as unusable in README.
This is important because this broken compiler is installed as 
/usr/bin/gcc on some Solaris systems.


Ugh.  Isn't there a configure option to not use inline asm (looks like 
no to me)?  I'll have to see if Oracle has a bug on this but I think 
Paul is right that this should be documented in the README.  I'll do it 
in conjunction with the 64 bit /lib issue once Paul confirms my suspicions.


--td

-Paul



--
Terry D. Dontje | Principal Software Engineer
Developer Tools Engineering | +1.781.442.2631
Oracle *- Performance Technologies*
95 Network Drive, Burlington, MA 01803
Email terry.don...@oracle.com

Re: [OMPI devel] 1.4.5rc5 has been released

2012-02-08 Thread TERRY DONTJE




On 2/7/2012 4:25 PM, Paul H. Hargrove wrote:



On 2/7/2012 8:59 AM, Jeff Squyres wrote:

This fixes all known issues.


Well, not quite...

I've SUCCESSFULLY retested 44 out of the 55 cpu/os/compiler/abi 
combinations currently on my list.
I expect 9 more by the end of the day (the older/slower hosts), but 
two of my test hosts are down.


So far I see only two problems that remain:

+ I can't build w/ the PGI compilers on MacOS Lion.
This was previously reported in 
http://www.open-mpi.org/community/lists/devel/2012/01/10258.php


+ Building w/ Solaris Studio 12.2 or 12.3 on Linux x86-64, with "-m32" 
required setting LD_LIBRARY_PATH.
Can the LD_LIBRARY_PATH be substituted with an rpath change in LDFLAGS of 
the build?
This could either be Oracle's bug in the compiler, or a libtool 
problem.
My report was: 
http://www.open-mpi.org/community/lists/devel/2012/01/10272.php


I thought I responded to the above issue.  I think this may be an OS 
distribution (Solaris Studio assumption) issue.  On my RH system /lib 
contains the 32 bit libraries and /lib64 has the 64 bit libs.  I assume your 
system may have it the other way around (/lib = 64 bit libs and /lib32 
has 32 bit).  Can you confirm that your /lib contains 64 bit libs?  Also, 
can you do a "cc -### -m32" compile and link of a simple program and 
confirm that the compiler is pulling in /lib (I am 99% certain it is)?  
Also, is having 64 bit libraries in /lib a common thing?  None of my Linux 
systems are set up this way.


Anyways, I think maybe a note in the README is in order for such setups.
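
A quick way to answer the /lib question without involving the compiler is to look at the ELF header of a library in each directory. A minimal sketch, assuming libc is installed under the usual names (the directory list and the `libc.so.*` glob are assumptions about the system, not taken from the report above):

```shell
# Print 32 or 64 depending on the ELF class of the given file.
# Byte 4 of an ELF file (EI_CLASS) is 1 for 32 bit objects and 2 for 64 bit ones.
elf_class() {
    case "$(od -An -tu1 -j4 -N1 "$1" | tr -d ' \n')" in
        1) echo 32 ;;
        2) echo 64 ;;
        *) echo unknown ;;
    esac
}

# Report the class of libc in each candidate directory that has one.
for dir in /lib /lib32 /lib64; do
    for f in "$dir"/libc.so.*; do
        if [ -f "$f" ]; then
            echo "$f: $(elf_class "$f") bit"
        fi
    done
done
```

If /lib reports 64 bit and /lib32 reports 32 bit, that would match the layout suspected above.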
--
Terry D. Dontje | Principal Software Engineer
Developer Tools Engineering | +1.781.442.2631
Oracle *- Performance Technologies*
95 Network Drive, Burlington, MA 01803
Email terry.don...@oracle.com

Re: [OMPI devel] 1.4.5rc2 more Solaris Studio compiler results

2012-01-30 Thread TERRY DONTJE




On 1/29/2012 7:40 PM, Paul Hargrove wrote:
I can additionally report success w/ ILP32 builds with both SS12.2 and 
12.3 compilers on x86-64 and sun4v systems running Solaris and 
x86-64/Linux:
   solaris-10 Generic_137111-07/sun4v (*FLAGS="-m32 -xarch=sparc" for 
v8plus ABI)

   solaris-11 snv_151a/amd64 [incl. ofud, openib and dapl]  (*FLAGS=-m32)
   linux/x86-64 (*FLAGS=-m32)

On Linux I had to "LD_LIBRARY_PATH=:/lib32:/usr/lib32", but that seems 
to be a Solaris Studio issue, rather than an OMPI or libtool one. 
 That was NOT necessary to get an ILP32 build using GCC.


This sounds like more a runpath (mis)setting to me.  Can you send me 
your config.log and a copy of your make output?  Did you run into the 
same issue with -m64?
My sun4u (single-CPU UltraSparcIII) system is just too painfully slow 
to test yet again.



I'd imagine so :-).

Thanks,

--td

-Paul

On Wed, Jan 25, 2012 at 11:49 PM, Paul H. Hargrove wrote:


I am pleased to report that w/ help from Terry I can now build
nearly everything w/ the Solaris Studio 12.2 and 12.3 compilers.
Upon comparing our build environments we discovered that CXX=CC
works but CXX=sunCC does not, even though they are both symlinks
to the same compiler executable.  I still don't know the root
cause (though libtool and associated configure logic is still the
obvious suspect), but the work around is simple:
   When using the Solaris Studio compilers on Solaris, one must
set CXX=CC rather than  CXX=sunCC.

So I am following that advice, and have additionally:
+ gotten  up-to-date patches applied to resolve my FORTRAN and OMP
issues on the SPARC-T2 system.
+ installed both 12.2 and 12.3 compilers on Linux/x86-64

So, I can now report the following ALL work (defined as "make all
check install") with BOTH 12.2 and 12.3 Solaris Studio compilers.
The only configure flags are --prefix, setting the CC, CXX, F77
and FC variables, and (when appropriate) setting *FLAGS=-m64.
   solaris-10 s10_69/sun4u (w/ -m64)
   solaris-10 Generic_137111-07/sun4v (w/ -m64)
   solaris-11 snv_151a/amd64 [including ofud, openib and dapl] (w/
-m64)
   linux/x86-64 (no -m64 needed because it is the default)

The following works w/ the 12.2 compilers:
   solaris-10 Generic_142901-03/i386
However, the f77/f90 compilers in 12.3 are generating code using
SSE2 instructions even when passed -xarch=pentium_pro.
So this machine cannot run the resulting executables.  So, I had
to --disable-mpi-f77 to get things to work.
That, however, is NOT an OMPI problem.

-Paul

On 1/19/2012 11:21 PM, Paul H. Hargrove wrote:

As promised earlier today, here are results from my Solaris
platforms.
Note that there are libtool-related failures below that may be
worth pursuing.
If necessary, access to most of my machines can be arranged
for qualified persons.

== GNU compilers with {C,CXX,F77,FC}FLAGS=-mcpu=v9 on SPARCs,
and -m64 on amd64

PASS:
   solaris-10 s10_69/sun4u (w/ g77, no FC)
   solaris-10 Generic_142901-03/i386 (w/ Sun's f77 and f95,
both dated April 2009)
   solaris-11 snv_151a/amd64 [including ofud, openib and dapl]
(w/ g77, no FC)

FAIL:
   solaris-10 Generic_137111-07/sun4v with default GNU compilers
Using system default gcc, which is actually Sun's
gccfss-4.0.4, there are assertion failures seen in the atomics
in "make check".  I can provide details if anybody cares, but
I know from past experience that support for gcc-style inline
asm is marginal in this compiler.

== Sun Studio 12.2 compilers w/ {C,CXX,F77,FC}=-m64 on SPARCs
and amd64

Both of my SPARC systems appear to have an out-of-date
libmtsk.so, which both prevents the Sun f77 and f90 compilers
from running at all, and additionally leads to failure like
the following when building OpenMP support in VT:

/bin/bash ../../libtool --tag=CXX --mode=link sunCC
-xopenmp -DVT_OMP  -m64 -xopenmp  -o vtfilter
vtfilter-vt_filter.o  vtfilter-vt_filthandler.o
 vtfilter-vt_otfhandler.o  vtfilter-vt_tracefilter.o
../../util/util.o  -L../../extlib/otf/otflib
-L../../extlib/otf/otflib/.libs -lotf  -lz -lsocket -lnsl
 -lrt -lm -lthread
libtool: link: sunCC -xopenmp -DVT_OMP -m64 -xopenmp -o
vtfilter vtfilter-vt_filter.o vtfilter-vt_filthandler.o
vtfilter-vt_otfhandler.o vtfilter-vt_tracefilter.o
../../util/util.o
 
-L/home/hargrove/OMPI/openmpi-1.4.5rc2-solaris10-sparcT2-ss12u2/BLD/ompi/contrib/vt/vt/extlib/otf/otflib/.libs

Re: [OMPI devel] 1.4.5rc2 now released

2012-01-20 Thread TERRY DONTJE




On 1/19/2012 5:22 PM, Paul H. Hargrove wrote:
Minor documentation nit, which might apply to the 1.5 branch as well 
(didn't check).


README says:

- Open MPI does not support the Sparc v8 CPU target, which is the
  default on Sun Solaris.  The v8plus (32 bit) or v9 (64 bit)
  targets must be used to build Open MPI on Solaris.  This can be
  done by including a flag in CFLAGS, CXXFLAGS, FFLAGS, and FCFLAGS,
  -xarch=v8plus for the Sun compilers, -mcpu=v9 for GCC.


However, following that instruction w/ Sun Studio 12 Update 2 yields:

cc: Warning: -xarch=v8plus is deprecated, use -m32 -xarch=sparc instead

for every single compilation.

I vaguely recall noting this once before, perhaps 2 years or so ago.

Additionally, it appears that the "Sun" example is for the 32-bit ABI 
and the "GCC" example for the 64-bit ABI.
Actually I think the whole comment is incorrect (at least from Solaris 
Studio 12u2 on) in that the default is no longer SPARC v8 target and 
that one can actually specify just -m32 and -m64 without the -xarch 
option.  So I wonder if we should just strike that whole block of text 
from the README.


Oracle
Terry D. Dontje | Principal Software Engineer
Developer Tools Engineering | +1.781.442.2631
Oracle *- Performance Technologies*
95 Network Drive, Burlington, MA 01803
Email terry.don...@oracle.com

Re: [hwloc-devel] Solaris visibility issue

2012-01-18 Thread TERRY DONTJE


Ok, I tried the patch and it seems to work for me.

On 1/18/2012 2:10 PM, Samuel Thibault wrote:

TERRY DONTJE, le Wed 18 Jan 2012 19:23:03 +0100, a écrit :

Don't you need the function to make lstopo work?

lstopo itself does not need it, only internal functions of libhwloc
need it. Could you try the attached patch?

Thanks,
Samuel


--
Oracle
Terry D. Dontje | Principal Software Engineer
Developer Tools Engineering | +1.781.442.2631
Oracle *- Performance Technologies*
95 Network Drive, Burlington, MA 01803
Email terry.don...@oracle.com

Re: [hwloc-devel] Solaris visibility issue

2012-01-18 Thread TERRY DONTJE




On 1/18/2012 1:17 PM, Samuel Thibault wrote:

TERRY DONTJE, le Wed 18 Jan 2012 18:52:50 +0100, a écrit :

Also, I tried to build v1.4 and the trunk and I keep getting linkage errors
on lstopo-lstopo-draw.o complaining about hwloc_insert_object_by_cpuset
being undefined.

It is defined in ./src/topology.c. Please check with make V=1 that
topology.o is really included in the link. Also paste the whole log
output, the issue actually comes from somewhere.


There must be something screwy with how visibility is done because when I
disabled visibility I got a workable lstopo and friends.

Are you building with optimizations disabled?  I notice that
hwloc_insert_object_by_cpuset is the only function called in header
inlines which is not external. Maybe we can simply ifdef that inline out
when not building the library.

Samuel
I didn't specify using optimizations or not on my configure line.  Don't 
you need the function to make lstopo work?


--
Oracle
Terry D. Dontje | Principal Software Engineer
Developer Tools Engineering | +1.781.442.2631
Oracle *- Performance Technologies*
95 Network Drive, Burlington, MA 01803
Email terry.don...@oracle.com

Re: [hwloc-devel] topology-solaris-chiptype.c bugs on processortype

2012-01-18 Thread TERRY DONTJE




On 1/18/2012 12:59 PM, TERRY DONTJE wrote:



On 1/18/2012 12:51 PM, Samuel Thibault wrote:

Samuel Thibault, le Wed 18 Jan 2012 15:55:18 +0100, a écrit :

I'm getting an issue with the topology-solaris-chiptype.c work on
Solaris 10: in the ProcessorType case, the returned value does not look
like a readable string, I'm getting "¨", which consequently poses
problems in the xml parser. Of course, googling picl_get_propinfo
"ProcessorType" returns... topology-solaris-chiptype.c ...

Can somebody have a look?

using prtpicl -v, I'm getting

   :name  cpus
  cpu (cpu, 890735)
   :StateBegin  Wed Jan 18 07:18:41 2012
   :FPUType
   :ProcessorType \360\275^T^H

which matches what I'm getting with hwloc (0xf0 0xbd 0x14 0x8...) (yes,
it's different from my previous report because my job request got
another but similar hardware machine).
Weird, my prtpicl on my v20z (which is a Sun system) has readable 
fields FPUType and ProcessorType.
However, I am using an older S10 Generic_120012-11.  I'll see if I can 
find something newer to try.


--td
Just tried an S10 Generic_144489-05 which has the same version of picl 
packages as you show below.  The system is showing readable fields, 
unlike yours.


I can only infer that there must be some issue with compatibility 
between Dell platforms and S10's initialization of the picl information.
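
Whatever the root cause turns out to be, a consumer of the value can guard against it. A minimal sketch of such a check, using the bytes reported above (`\360\275^T^H`) and a hypothetical readable value for comparison; the helper name `is_printable` is made up for this example:

```shell
# Exit status 0 when the argument holds only printable ASCII characters.
is_printable() {
    [ -z "$(printf '%s' "$1" | LC_ALL=C tr -d '[:print:]')" ]
}

good="AMD Opteron"                   # hypothetical readable value
bad=$(printf '\360\275\024\010')     # the bytes seen above: \360 \275 ^T ^H

for value in "$good" "$bad"; do
    if is_printable "$value"; then
        echo "usable"
    else
        echo "not printable, skip it"
    fi
done
```

A caller could fall back to an empty or placeholder string when the check fails, rather than passing the raw bytes on to an XML writer.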


Sorry can't be much more help,

--td

pkginfo reports picl packages as being the following version:

VERSION:  11.10.0,REV=2005.01.21.16.34
 PSTAMP:  on10ptchfeatx20080814051153

Samuel






___
hwloc-devel mailing list
hwloc-de...@open-mpi.org
http://www.open-mpi.org/mailman/listinfo.cgi/hwloc-devel


--
Oracle
Terry D. Dontje | Principal Software Engineer
Developer Tools Engineering | +1.781.442.2631
Oracle *- Performance Technologies*
95 Network Drive, Burlington, MA 01803
Email terry.don...@oracle.com

Re: [hwloc-devel] topology-solaris-chiptype.c bugs on processortype

2012-01-18 Thread TERRY DONTJE




On 1/18/2012 12:51 PM, Samuel Thibault wrote:

Samuel Thibault, le Wed 18 Jan 2012 15:55:18 +0100, a écrit :

I'm getting an issue with the topology-solaris-chiptype.c work on
Solaris 10: in the ProcessorType case, the returned value does not look
like a readable string, I'm getting "¨", which consequently poses
problems in the xml parser. Of course, googling picl_get_propinfo
"ProcessorType" returns... topology-solaris-chiptype.c ...

Can somebody have a look?

using prtpicl -v, I'm getting

   :name  cpus
  cpu (cpu, 890735)
   :StateBegin  Wed Jan 18 07:18:41 2012
   :FPUType
   :ProcessorType \360\275^T^H

which matches what I'm getting with hwloc (0xf0 0xbd 0x14 0x8...) (yes,
it's different from my previous report because my job request got
another but similar hardware machine).
Weird, my prtpicl on my v20z (which is a Sun system) has readable fields 
FPUType and ProcessorType.
However, I am using an older S10 Generic_120012-11.  I'll see if I can 
find something newer to try.


--td

pkginfo reports picl packages as being the following version:

VERSION:  11.10.0,REV=2005.01.21.16.34
 PSTAMP:  on10ptchfeatx20080814051153

Samuel


--
Oracle
Terry D. Dontje | Principal Software Engineer
Developer Tools Engineering | +1.781.442.2631
Oracle *- Performance Technologies*
95 Network Drive, Burlington, MA 01803
Email terry.don...@oracle.com

Re: [OMPI devel] missing file when running autogen - ALSO in 1.4.5rc1

2011-12-21 Thread TERRY DONTJE

Paul is more than likely right.  The nightly runs Oracle does 
using MTT and tarballs do not run autogen.sh (which I believe is not 
expected anyway, right?).  All other builds we do using autogen.* are 
from an svn workspace.


--td

On 12/20/2011 8:21 PM, Paul H. Hargrove wrote:

Not too bizarre.
This probably just means that nobody has ever run autogen.sh from a 
tarball on a system using Sun's FORTRAN compiler.


-Paul

On 12/20/2011 5:01 PM, Ralph Castain wrote:
Bizarre - fixed it too, but no promise when that might appear in a 
release.


Thanks!

On Dec 20, 2011, at 5:10 PM, Paul H. Hargrove wrote:


You are welcome.
NOTE: the same issue exists in 1.4.5rc1

$ grep for-sun-fortran openmpi-1.4.5rc1/autogen.sh
config/modify-configure-for-sun-fortran.pl
$ tar tfj openmpi-1.4.5rc1.tar.bz2 | grep modify-configure-for-sun-fortran.pl || echo NOPE
NOPE

-Paul
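
The same membership check can be wrapped in a small helper and tried on a throwaway archive. This is a minimal sketch, not part of the release tooling: the helper name `in_tarball` and the `pkg` directory layout are made up, and an uncompressed archive is used so that no bzip2 support is needed (the commands above use `tar tfj` for the real `.tar.bz2` files):

```shell
# Exit status 0 when the archive listing contains the given file name.
in_tarball() {
    tar -tf "$1" | grep -F -- "$2" >/dev/null
}

# Build a throwaway archive that, like the release tarball, lacks the script.
work=$(mktemp -d)
mkdir -p "$work/pkg/config"
: > "$work/pkg/autogen.sh"
tar -cf "$work/pkg.tar" -C "$work" pkg

for f in autogen.sh modify-configure-for-sun-fortran.pl; do
    if in_tarball "$work/pkg.tar" "$f"; then
        echo "$f: present"
    else
        echo "$f: NOPE"
    fi
done

rm -rf "$work"
```

Running such a check against every script that autogen.sh references would catch a missing EXTRA_DIST entry before a release candidate goes out.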

On 12/20/2011 3:55 PM, Ralph Castain wrote:
You are quite correct - it was indeed missing from Makefile.am! 
Fixed - and thanks!


On Dec 20, 2011, at 4:40 PM, Paul H. Hargrove wrote:

Regardless of any other issues the referenced file does not appear 
in the tarball at all:


$ tar tfj openmpi-1.5.5rc1.tar.bz2 | grep modify-configure-for-sun-fortran.pl || echo NOPE
NOPE

-Paul

On 12/20/2011 3:30 PM, Jeff Squyres wrote:

Yeah, we've known about this one and mostly ignored it.

It occurs when autogen.sh is in a non-root directory and tries to 
run that script in a place where it doesn't exist (it does exist 
in the root directory).  We hadn't previously bothered to fix it 
because a) it's in autogen and users won't see it, b) we've 
revamped autogen on the trunk such that this doesn't happen 
anyway, and c) it's a non-fatal error (and reflects correct 
behavior anyway -- we don't need that script run anywhere except 
the root).


I'll add it to the list, but I don't know if it'll actually get 
fixed.



On Dec 20, 2011, at 6:22 PM, Paul H. Hargrove wrote:

While trying to resolve the gmake-vs-bmake problem I ran 
autogen.sh and saw:


/home/phargrov/OMPI/openmpi-1.5.5rc1/openmpi-1.5.5rc1/autogen.sh: line 701: config/modify-configure-for-sun-fortran.pl: No such file or directory


I suspect this just requires an addition to EXTRA_DIST in 
config/Makefile.am


-Paul





--
Oracle
Terry D. Dontje | Principal Software Engineer
Developer Tools Engineering | +1.781.442.2631
Oracle *- Performance Technologies*
95 Network Drive, Burlington, MA 01803
Email terry.don...@oracle.com

Re: [OMPI devel] Nodes already filled when spawning

2011-12-15 Thread TERRY DONTJE


There's an oversubscribe option I can set in my case, right?

Thanks,

--td

On 12/15/2011 1:22 PM, Ralph Castain wrote:
This is fixed, to a degree, with r25659. However, note that there is 
one big change that occurred back when we first committed the mapping 
change.


As I noted at that time, we changed the default for RM-given 
allocations to be no-oversubscribe. So your MTTs may well fail if they 
weren't updated as all those tests oversubscribe the nodes, and are 
running in RM environments.



On Dec 15, 2011, at 8:37 AM, TERRY DONTJE wrote:

Last night's MTT test results for 1.7a1r25652 from IU and Oracle are 
showing failures during some of the spawn tests; see 
http://www.open-mpi.org/mtt/index.php?do_redir=2036.


Essentially, the tests are failing with the message:
All nodes which are allocated for this job are already filled.

I wonder if this is related to some of the hostfile changes done lately.
Anyways, I am working on narrowing down the revision that introduced that,
but if someone figures this out before me that would be great.




--
Oracle
Terry D. Dontje | Principal Software Engineer
Developer Tools Engineering | +1.781.442.2631
Oracle *- Performance Technologies*
95 Network Drive, Burlington, MA 01803
Email terry.don...@oracle.com

[OMPI devel] Nodes already filled when spawning

2011-12-15 Thread TERRY DONTJE

Last night's MTT test results for 1.7a1r25652 from IU and Oracle are 
showing failures during some of the spawn tests; see 
http://www.open-mpi.org/mtt/index.php?do_redir=2036.


Essentially, the tests are failing with the message:

All nodes which are allocated for this job are already filled.

I wonder if this is related to some of the hostfile changes done lately.
Anyways, I am working on narrowing down the revision that introduced that,
but if someone figures this out before me that would be great.


Oracle
Terry D. Dontje | Principal Software Engineer
Developer Tools Engineering | +1.781.442.2631
Oracle *- Performance Technologies*
95 Network Drive, Burlington, MA 01803
Email terry.don...@oracle.com

Re: [OMPI devel] [BUG?] OpenMPI with openib on SPARC64: Signal: Bus error (10)

2011-11-23 Thread TERRY DONTJE


On 11/23/2011 1:45 PM, Lukas Razik wrote:

TERRY DONTJE<terry.don...@oracle.com>  wrote


Can you build OMPI as a 32 bit library and see if that works any better?

So you mean I shall leave the whole OFED stack as 64 bit and build only openmpi 
as 32 bit?
I believe the OFED user libraries will need to be 32 bit also or the 32 
bit MPI libraries will not be able to use them.

How must I configure openmpi so that it'll definitely be built as 32 bit?
You need to change the CFLAGS, CXXFLAGS, FFLAGS and FCFLAGS in the 
configure line such that you replace "-m64" with "-m32", or just add "-m32" 
if "-m64" is not there.
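
The flag rewrite described above can be sketched as a small helper. This is an illustration only: the function name `to_m32` and the starting flags are hypothetical, and the configure line is printed rather than executed:

```shell
# Rewrite a flag string for a 32 bit build: swap -m64 for -m32,
# leave an existing -m32 alone, or append -m32 if neither is present.
to_m32() {
    case " $1 " in
        *" -m64 "*) printf '%s\n' "$1" | sed 's/-m64/-m32/g' ;;
        *" -m32 "*) printf '%s\n' "$1" ;;
        "  ")       printf '%s\n' "-m32" ;;
        *)          printf '%s -m32\n' "$1" ;;
    esac
}

old_flags="-g -m64"    # hypothetical flags from an earlier 64 bit build
new_flags=$(to_m32 "$old_flags")

# Print the configure line instead of running it.
echo "./configure CFLAGS=\"$new_flags\" CXXFLAGS=\"$new_flags\" FFLAGS=\"$new_flags\" FCFLAGS=\"$new_flags\""
```

The same value goes into all four variables so that the C, C++ and both Fortran compilers agree on the ABI.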

Regards,
Lukas



--
Oracle
Terry D. Dontje | Principal Software Engineer
Developer Tools Engineering | +1.781.442.2631
Oracle *- Performance Technologies*
95 Network Drive, Burlington, MA 01803
Email terry.don...@oracle.com

Re: [OMPI devel] [BUG?] OpenMPI with openib on SPARC64: Signal: Bus error (10)

2011-11-23 Thread TERRY DONTJE




On 11/23/2011 11:05 AM, Lukas Razik wrote:

TERRY DONTJE<terry.don...@oracle.com>  wrote:

Nuts!!! Ok I am going to have to think about this a little more.  Do you have 
the ability to configure and remake your ompi install? I might want to have you 
add some stuff to help me track this down some more if you can recompile your 
ompi.


As I wrote you I've already built the binutils, compiler, kernel, the whole 
OFED stack and openmpi-1.4.4 (with --enable-debug) from their sources. So it's 
no problem for me to apply patches you send me on any of the source packages 
and to build them. :)


Can you build OMPI as a 32 bit library and see if that works any better?

BTW:

In the morning I've taken the newest binutils-2.22, gcc-4.6.2 etc. and built them. Now, 
with this new "SDK", I try to build the whole OFED stack. Maybe the new gcc 
handles something differently...




Note, tomorrow and Friday are holidays here in the U.S. so I probably won't
get to responding to any email until Monday after today.

Thanks for this hint!

Regards,
Lukas



--
Oracle
Terry D. Dontje | Principal Software Engineer
Developer Tools Engineering | +1.781.442.2631
Oracle *- Performance Technologies*
95 Network Drive, Burlington, MA 01803
Email terry.don...@oracle.com

Re: [OMPI devel] [BUG?] OpenMPI with openib on SPARC64: Signal: Bus error (10)

2011-11-23 Thread TERRY DONTJE




On 11/23/2011 10:11 AM, Lukas Razik wrote:

TERRY DONTJE<terry.don...@oracle.com>  wrote:

Can you try running the benchmark with coalescing off?  To do that
add the following option to your mpirun line: "-mca
btl_openib_use_message_coalescing 0".

I've tried this:

# /usr/mpi/gcc/openmpi-1.4.4/bin/mpirun -np 2
   --mca btl_openib_use_message_coalescing 0
   --mca btl_base_verbose 50
   --mca btl_openib_verbose 1
   -host cluster1,cluster2
 /usr/mpi/gcc/openmpi-1.4.3/tests/osu_benchmarks-3.1.1/osu_latency


And that's the result (which isn't different from the run without "--mca 
btl_openib_use_message_coalescing 0"):
http://net.razik.de/linux/T5120/openmpi-1.4.4-verbose_no_coalescing.txt
Nuts!!!  Ok I am going to have to think about this a little more.  Do 
you have the ability to configure and remake your ompi install?   I 
might want to have you add some stuff to help me track this down some 
more if you can recompile your ompi.


Note, tomorrow and Friday are holidays here in the U.S. so I probably 
won't get to responding to any email until Monday after today.

--
Oracle
Terry D. Dontje | Principal Software Engineer
Developer Tools Engineering | +1.781.442.2631
Oracle *- Performance Technologies*
95 Network Drive, Burlington, MA 01803
Email terry.don...@oracle.com

Re: [OMPI devel] [BUG?] OpenMPI with openib on SPARC64: Signal: Bus error (10)

2011-11-23 Thread TERRY DONTJE

On 11/23/2011 9:57 AM, Lukas Razik wrote:

TERRY DONTJE<terry.don...@oracle.com>  wrote:

On 11/22/2011 6:59 PM, Lukas Razik wrote:

Roland Dreier<rol...@purestorage.com>   wrote:

On Tue, Nov 22, 2011 at 3:05 PM, Lukas Razik<li...@razik.name>

wrote:

#0  0xf8010229ba9c in mca_pml_ob1_send_request_start_copy
(sendreq=0xb23200, bml_btl=0xb29050, size=0) at pml_ob1_sendreq.c:551
551     hdr->hdr_match.hdr_ctx =
sendreq->req_send.req_base.req_comm->c_contextid;
(gdb) backtrace

If you can get into gdb here, I guess it would be useful to print the
address of hdr->hdr_match.hdr_ctx and
sendreq->req_send.req_base.req_comm->c_contextid to see which one is
misaligned.

Not sure of the gdb syntax... does it work to just do

p &hdr->hdr_match.hdr_ctx
p &sendreq->req_send.req_base.req_comm->c_contextid

Oh, sorry that I didn't do that before...
The values are:
&hdr->hdr_match.hdr_ctx  =  (uint16_t *) 0xad7393
&sendreq->req_send.req_base.req_comm->c_contextid  =  (uint32_t *) 0x201c20

So hdr_ctx is the bad one...

PS:
I never know the syntax of gdb - hence I use the nice kdbg. *g*
http://net.razik.de/linux/T5120/kdbg-openmpi-1.4.4-osu_latency-02.png

Can you get me the value of hdr too?  I bet it is an odd value too.

You're right! :)
You can see the value of hdr in the first screenshot I sent you:
http://net.razik.de/linux/T5120/kdbg-openmpi-1.4.4-osu_latency.png

It's

hdr = (mca_pml_ob1_hdr_t*) 0xad7391

Which now leads me to wonder whether this is due to the coalescing code.  
If you can run with coalescing off (as described in my last email), that 
might be telling.
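
To make the alignment reasoning above concrete, here is a minimal C sketch
(not Open MPI code) that checks the three addresses reported in this thread
against the alignment a load/store of that width needs on a strict-alignment
CPU such as SPARC. The helper names is_aligned and report_thread_addresses
are made up for illustration; only the addresses and types come from the
gdb/kdbg output quoted above.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical helper: 1 if addr is a multiple of align, else 0.
 * align must be a power of two (2 for uint16_t, 4 for uint32_t). */
static int is_aligned(uintptr_t addr, uintptr_t align)
{
    assert(align != 0 && (align & (align - 1)) == 0);
    return (addr & (align - 1)) == 0;
}

/* Print a verdict for each address quoted in the thread. */
static void report_thread_addresses(void)
{
    /* hdr holds a uint16_t field, so it needs at least 2-byte alignment. */
    printf("hdr          0xad7391: %s\n",
           is_aligned(0xad7391u, 2) ? "aligned" : "misaligned");
    /* &hdr->hdr_match.hdr_ctx is a uint16_t *. */
    printf("&hdr_ctx     0xad7393: %s\n",
           is_aligned(0xad7393u, 2) ? "aligned" : "misaligned");
    /* &...->c_contextid is a uint32_t *. */
    printf("&c_contextid 0x201c20: %s\n",
           is_aligned(0x201c20u, 4) ? "aligned" : "misaligned");
}
```

Both hdr and &hdr->hdr_match.hdr_ctx are odd, so a 16-bit store through
either cannot be aligned; the c_contextid address is a multiple of 4 and is
fine, which matches "hdr_ctx is the bad one".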

--
Oracle
Terry D. Dontje | Principal Software Engineer
Developer Tools Engineering | +1.781.442.2631
Oracle *- Performance Technologies*
95 Network Drive, Burlington, MA 01803
Email terry.don...@oracle.com <mailto:terry.don...@oracle.com>

Re: [OMPI devel] [BUG?] OpenMPI with openib on SPARC64: Signal: Bus error (10)

2011-11-23 Thread TERRY DONTJE

On 11/22/2011 6:59 PM, Lukas Razik wrote:

Roland Dreier  wrote:

On Tue, Nov 22, 2011 at 3:05 PM, Lukas Razik  wrote:

  #0  0xf8010229ba9c in mca_pml_ob1_send_request_start_copy

(sendreq=0xb23200, bml_btl=0xb29050, size=0) at pml_ob1_sendreq.c:551

  551 hdr->hdr_match.hdr_ctx =

sendreq->req_send.req_base.req_comm->c_contextid;

  (gdb) backtrace

If you can get into gdb here, I guess it would be useful to print the
address of hdr->hdr_match.hdr_ctx and
sendreq->req_send.req_base.req_comm->c_contextid to see which one is
misaligned.

Not sure of the gdb syntax... does it work to just do

p &hdr->hdr_match.hdr_ctx
p &sendreq->req_send.req_base.req_comm->c_contextid

Oh, sorry that I didn't do that before...
The values are:
&hdr->hdr_match.hdr_ctx  =  (uint16_t *) 0xad7393
&sendreq->req_send.req_base.req_comm->c_contextid  =  (uint32_t *) 0x201c20

So hdr_ctx is the bad one...

Regards,
Lukas

PS:
I never know the syntax of gdb - hence I use the nice kdbg. *g*
http://net.razik.de/linux/T5120/kdbg-openmpi-1.4.4-osu_latency-02.png

Lukas,

Can you try running the benchmark with coalescing off?  To do that add 
the following option to your mpirun line "-mca 
btl_openib_use_message_coalescing 0".

thanks,
--
Oracle
Terry D. Dontje | Principal Software Engineer
Developer Tools Engineering | +1.781.442.2631
Oracle *- Performance Technologies*
95 Network Drive, Burlington, MA 01803
Email terry.don...@oracle.com

Re: [OMPI devel] Rename "vader" BTL to "xpmem"

2011-11-22 Thread TERRY DONTJE

So with the aliasing scheme the code for openib would still be under
ompi/mca/btl/openib but you could access it with -mca btl ofrc? OK, so
when an error happens in the openib btl, how does it identify itself?
Does it use openib or ofrc? It seems like adopting the aliasing scheme
could cause some user confusion.

--td

On 11/22/2011 9:22 AM, Jeff Squyres wrote:

Here's what Nathan and I discussed / decided:

1. Nathan shied away from the name "xpmem" in case some other shared memory scheme
basically did the same thing as XPMEM (i.e., single copy techniques). (just FYI: xpmem's setup is
a little different from KNEM, though, so they didn't merge in KNEM support to vader) Hence, he
wanted a neutral name that could apply to xpmem and others. He and Sam have some possible names
that could be suitable ("single copy ...something..."; I don't remember offhand).

2. We've long talked about having an MCA component aliasing scheme. Perhaps
now is the time to do it. Such a scheme would do two things:

- provide alias names for components. For example, both of the following
would be equivalent:

mpirun --mca btl openib,self ...
mpirun --mca btl ofrc,self ...

- automatically register alias MCA parameters. For example, both of the
following would be equivalent:

mpirun --mca btl_openib_param 1 ...
mpirun --mca btl_ofrc_param 1 ...

This would solve two problems:

2a. Finally be able to rename the "openib" module to something more sensical; "ofrc", perhaps?
("ofrc" = OpenFabrics reliable connected transport, as opposed to the existing "ofud" = OpenFabrics
unreliable datagram transport BTL).

2b. Rename vader to be xpmem, because it only supports xpmem at the moment. If that
component is expanded in the future to support other similar single-copy schemes, it can
be renamed to some neutral name and have "xpmem" as an alias.
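
As a rough illustration of the aliasing idea described above, here is a
minimal C sketch. It is not how Open MPI implements (or would implement)
this; the table, the function names, and the choice of "ofrc" as the alias
for "openib" are assumptions made only to show the two behaviors from the
list: resolving a component name and resolving an MCA parameter name.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical alias table: alias name -> real component name. */
struct alias { const char *alias; const char *real; };

static const struct alias btl_aliases[] = {
    { "ofrc", "openib" },
};
#define NUM_ALIASES (sizeof(btl_aliases) / sizeof(btl_aliases[0]))

/* Resolve a component name; names without an alias map to themselves. */
static const char *resolve_component(const char *name)
{
    size_t i;
    assert(name != NULL);
    for (i = 0; i < NUM_ALIASES; i++) {
        if (strcmp(name, btl_aliases[i].alias) == 0) {
            return btl_aliases[i].real;
        }
    }
    return name;
}

/* Rewrite "btl_<alias>_<param>" to "btl_<real>_<param>" in out.
 * Names with no alias are copied unchanged.
 * Returns 0 on success, -1 if out is too small. */
static int resolve_param(const char *param, char *out, size_t outlen)
{
    size_t i;
    int n;
    assert(param != NULL && out != NULL);
    if (strncmp(param, "btl_", 4) == 0) {
        for (i = 0; i < NUM_ALIASES; i++) {
            size_t alen = strlen(btl_aliases[i].alias);
            if (strncmp(param + 4, btl_aliases[i].alias, alen) == 0 &&
                param[4 + alen] == '_') {
                n = snprintf(out, outlen, "btl_%s%s",
                             btl_aliases[i].real, param + 4 + alen);
                return (n >= 0 && (size_t)n < outlen) ? 0 : -1;
            }
        }
    }
    n = snprintf(out, outlen, "%s", param);
    return (n >= 0 && (size_t)n < outlen) ? 0 : -1;
}
```

With this table, "ofrc" resolves to "openib" and "btl_ofrc_param" resolves
to "btl_openib_param". Which name a component reports in its own error
messages is a separate question that a lookup table like this does not
answer.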

Nathan agreed to look into a module aliasing scheme / vader->xpmem rename after
he works the hide-OB1/BTL-descriptor-lengths issue that was previously discussed
on the list. This will likely be in early/mid December.

On Nov 17, 2011, at 8:11 AM, Jeff Squyres wrote:

After having to explain to someone at SC for the umpteenth time this week that the "vader" BTL uses
the XPMEM transport under the covers, I'd like to put forth an appeal to rename the "vader" BTL to
be "xpmem."

Here's my rationale for why:

1. Although we have a history of Star Wars-related names, the "ob1" and "r2"
components got their names because they're mainly algorithms that have no obvious name that
describes what they do.

2. All other components that tie into some back-end system are named reflecting the back-end system
(e.g., tcp, mx, portals, ...etc.). "openib" is the weakest example, but we all know that
it was named way back when OFED was named "OpenIB", and the name has kinda stuck.

3. The BTL name "xpmem" follows the law of least astonishment from the user's
perspective.

4. Cute names rarely seem so after 6 months.

I'll even volunteer to do the work to rename it (a bunch of file moves and
global search-and-replaces).

--
Jeff Squyres
jsquy...@cisco.com
For corporate legal information go to:
http://www.cisco.com/web/about/doing_business/legal/cri/

___
devel mailing list
de...@open-mpi.org
http://www.open-mpi.org/mailman/listinfo.cgi/devel

--
Oracle
Terry D. Dontje | Principal Software Engineer
Developer Tools Engineering | +1.781.442.2631
Oracle *- Performance Technologies*
95 Network Drive, Burlington, MA 01803
Email terry.don...@oracle.com

Re: [OMPI devel] [BUG?] OpenMPI with openib on SPARC64: Signal: Bus error (10)

2011-11-22 Thread TERRY DONTJE




On 11/22/2011 5:49 AM, TERRY DONTJE wrote:
The error you are seeing is usually indicative of some code operating 
on memory that isn't aligned properly for the SPARC instruction being 
used.  The address that is causing the failure is odd, which is 
more than likely the culprit.  If you have a core dump and can 
disassemble the code that was being run at the time, it will probably be 
some sort of instruction requiring an alignment.  If the MPI you are 
using is something you built, can you try to build OMPI with -g and 
get the line number in the PML that is failing?


I haven't seen this type of error for some time but I do all of my 
SPARC testing on Solaris with Solaris Studio Compilers.  You may want 
to try to compile the benchmark with "-m32" to see if that helps.  
Though being an odd address I suspect it might not.  If you can use 
the Studio Compilers you could try giving the compilers the 
-xmemalign=8i option when building the benchmark and see if that 
resolves the issue.  This would help to assure the issue is just an 
alignment of data we are slicing and dicing as opposed to wrongly 
addressing memory.


After thinking about this, you probably won't be able to use the Studio 
Compilers because on Linux they only support x86 platforms, not 
Linux on SPARC.  I am not sure if gcc has anything like the 
-xmemalign option.
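
Independent of any compiler option, a common portable way to touch data at
a possibly odd address is to go through memcpy instead of dereferencing a
typed pointer. The sketch below shows that general technique only; it is
not taken from Open MPI, the function names are invented, and whether it
would be the right fix for the failing line in the PML is an open
assumption.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Store a 16-bit value at an arbitrary (possibly odd) byte address.
 * memcpy has no alignment requirement on its arguments, so no aligned
 * 16-bit store is forced on strict-alignment CPUs. */
static void store_u16_unaligned(void *dst, uint16_t value)
{
    assert(dst != NULL);
    memcpy(dst, &value, sizeof(value));
}

/* Load a 16-bit value from an arbitrary (possibly odd) byte address. */
static uint16_t load_u16_unaligned(const void *src)
{
    uint16_t value;
    assert(src != NULL);
    memcpy(&value, src, sizeof(value));
    return value;
}
```

Writing through store_u16_unaligned(buf + 1, x) and reading it back with
load_u16_unaligned(buf + 1) works regardless of whether buf + 1 is even or
odd, at the cost of possibly slower byte-wise access.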


--
Oracle
Terry D. Dontje | Principal Software Engineer
Developer Tools Engineering | +1.781.442.2631
Oracle *- Performance Technologies*
95 Network Drive, Burlington, MA 01803
Email terry.don...@oracle.com <mailto:terry.don...@oracle.com>

Re: [OMPI devel] Rename "vader" BTL to "xpmem"

2011-11-17 Thread TERRY DONTJE

On 11/17/2011 9:54 AM, Ralph Castain wrote:

On Nov 17, 2011, at 7:45 AM, TERRY DONTJE wrote:

I could possibly buy your argument Ralph if this was a one off BTL
that only Nathan (and his employer) is going to use. I am assuming
though this is a more general protocol for a vendor specific
protocol. Thus it seems that a sane naming of the BTL is within the
realm of the community.

Guess I disagree - I would hate to get to a stage where we all have to
pass every name thru a group approval process.

That's not what is happening here at all. I think there is kind of a
precedent already set here where BTLs are named after protocols (sans
openib). Actually, I think most of our components follow some sort of
unspoken naming standard. There have been relatively few (none?)
renamings that I remember, so this is clearly an exception case.

That being said, I think I would agree that Jeff should have passed
this by Nathan first before posting the RFC (which for all I know he has)

That was my point, really, and an RFC was not required - if Nathan
wants to change it, he can certainly do so.
I think it is within any community member's right to propose that a
component be renamed. That's all Jeff is requesting, and he is signing
up for the work.

just in case there is some background that would convince Jeff that
vader is appropriate.

Regardless of whether or not it convinced Jeff, it remains Nathan's
decision, IMO. I very much doubt Jeff wants to be in the position of
"naming overlord", nor do I get the impression he was suggesting such
a thing.

I don't think Jeff wants to be the naming overlord either, but he
has run into an issue with the naming of a particular component, so
it is up to him to try to convince others (first Nathan) that renaming
the component is a sane thing to do.

I guess we'll have to agree to disagree.

--td

On 11/17/2011 9:29 AM, Ralph Castain wrote:
Frankly, the only vote that counts is Nathan's - it's his btl, and
we have never forcibly made someone rename their component. I would
suggest we not set that precedent. I'm comfortable with whatever he
decides to call it.

On Nov 17, 2011, at 7:00 AM, TERRY DONTJE wrote:

Isn't there precedent with the other BTLs to name them based on the
messaging protocol they are supporting instead of some movie
character (tcp, openib, shmem, portals, ...).

--td

On 11/17/2011 8:11 AM, Jeff Squyres wrote:

Here's my rationale for why:

1. Although we have a history of Star Wars-related names, the "ob1" and "r2"
components got their names because they're mainly algorithms that have no obvious name that
describes what they do.

3. The BTL name "xpmem" follows the law of least astonishment from the user's
perspective.

4. Cute names rarely seem so after 6 months.

I'll even volunteer to do the work to rename it (a bunch of file moves and
global search-and-replaces).

Terry D. Dontje | Principal Software Engineer
Developer Tools Engineering | +1.781.442.2631
Oracle *- Performance Technologies*
95 Network Drive, Burlington, MA 01803
Email terry.don...@oracle.com <mailto:terry.don...@oracle.com>

___
devel mailing list
de...@open-mpi.org <mailto:de...@open-mpi.org>
http://www.open-mpi.org/mailman/listinfo.cgi/devel


Re: [OMPI devel] Rename "vader" BTL to "xpmem"

2011-11-17 Thread TERRY DONTJE

I could possibly buy your argument Ralph if this was a one off BTL that
only Nathan (and his employer) is going to use. I am assuming though
this is a more general protocol for a vendor specific protocol. Thus it
seems that a sane naming of the BTL is within the realm of the community.

That being said, I think I would agree that Jeff should have passed this
by Nathan first before posting the RFC (which for all I know he has)
just in case there is some background that would convince Jeff that
vader is appropriate.

--td

On 11/17/2011 9:29 AM, Ralph Castain wrote:
Frankly, the only vote that counts is Nathan's - it's his btl, and we
have never forcibly made someone rename their component. I would
suggest we not set that precedent. I'm comfortable with whatever he
decides to call it.

On Nov 17, 2011, at 7:00 AM, TERRY DONTJE wrote:

Isn't there precedent with the other BTLs to name them based on the
messaging protocol they are supporting instead of some movie
character (tcp, openib, shmem, portals, ...).

--td

On 11/17/2011 8:11 AM, Jeff Squyres wrote:

Here's my rationale for why:

1. Although we have a history of Star Wars-related names, the "ob1" and "r2"
components got their names because they're mainly algorithms that have no obvious name that
describes what they do.

3. The BTL name "xpmem" follows the law of least astonishment from the user's
perspective.

4. Cute names rarely seem so after 6 months.

I'll even volunteer to do the work to rename it (a bunch of file moves and
global search-and-replaces).

___
devel mailing list
de...@open-mpi.org <mailto:de...@open-mpi.org>
http://www.open-mpi.org/mailman/listinfo.cgi/devel


Re: [OMPI devel] Rename "vader" BTL to "xpmem"

2011-11-17 Thread TERRY DONTJE


+1

Isn't there precedent with the other BTLs to name them based on the 
messaging protocol they are supporting instead of some movie character 
(tcp, openib, shmem, portals, ...).


--td

On 11/17/2011 8:11 AM, Jeff Squyres wrote:

After having to explain to someone at SC for the umpteenth time this week that the "vader" BTL uses 
the XPMEM transport under the covers, I'd like to put forth an appeal to rename the "vader" BTL to 
be "xpmem."

Here's my rationale for why:

1. Although we have a history of Star Wars-related names, the "ob1" and "r2" 
components got their names because they're mainly algorithms that have no obvious name that 
describes what they do.

2. All other components that tie into some back-end system are named reflecting the back-end system 
(e.g., tcp, mx, portals, ...etc.).  "openib" is the weakest example, but we all know that 
it was named way back when OFED was named "OpenIB", and the name has kinda stuck.

3. The BTL name "xpmem" follows the law of least astonishment from the user's 
perspective.

4. Cute names rarely seem so after 6 months.

I'll even volunteer to do the work to rename it (a bunch of file moves and 
global search-and-replaces).



--
Oracle
Terry D. Dontje | Principal Software Engineer
Developer Tools Engineering | +1.781.442.2631
Oracle *- Performance Technologies*
95 Network Drive, Burlington, MA 01803
Email terry.don...@oracle.com

Re: [OMPI devel] r25470 (hwloc CMR) breaks v1.5

2011-11-16 Thread TERRY DONTJE




On 11/15/2011 10:16 PM, Jeff Squyres wrote:

On Nov 14, 2011, at 10:17 PM, Eugene Loh wrote:


I tried building v1.5.  r25469 builds for me, r25470 does not.  This is 
Friday's hwloc putback of CMR 2866.  I'm on Solaris11/x86.  The problem is 
basically:

Doh!


Making all in tools/ompi_info
  CC ompi_info.o
"../../../opal/include/opal/sys/ia32/atomic.h", line 173: warning: parameter in 
inline asm statement unused: %2

Have these warnings always been there for you?  r25470 should not have changed 
any of the assembly stuff.

Yes.  You can ignore these warnings; they aren't the droids you are 
looking for.

  CCLD   ompi_info
Undefined                           first referenced
 symbol                                 in file
opal_hwloc122_hwloc_bitmap_dup      components.o
opal_hwloc122_hwloc_bitmap_weight   components.o
ld: fatal: symbol referencing errors. No output written to .libs/ompi_info

I do notice some minor differences between ompi-trunk/ompi-1.5 in the 
opal/mca/hwloc/hwloc122ompi/hwloc trees.

Terry: did you add some stuff to the trunk in topology-solaris-chiptype.c, for 
example?


Yes, but they have nothing to do with the undefined symbols above.

If so, the right solution might just be to copy from 
trunk/opal/mca/hwloc/hwloc122ompi/hwloc/* to 
ompi-1.5/opal/mca/hwloc/hwloc122ompi/hwloc/.




--
Oracle
Terry D. Dontje | Principal Software Engineer
Developer Tools Engineering | +1.781.442.2631
Oracle *- Performance Technologies*
95 Network Drive, Burlington, MA 01803
Email terry.don...@oracle.com

Re: [OMPI devel] RFC: MCA param registration errors

2011-11-02 Thread TERRY DONTJE




On 11/1/2011 7:48 PM, Jeff Squyres wrote:

So this was slightly different than the opinion that was discussed on the call 
today, which was 2.  The rationale for #2 was to punish developers, but if such 
a bug did make it through to production, users wouldn't be annoyed with 
show_help messages all the time.

Does anyone have strong opinions here?  I don't.

I offer the following two points:

- this is a coding error on the part of the OMPI developer
- it's pretty rare

I think a show_help + return is very helpful in this case.  I wouldn't 
think that we'd run into this case that much; it would seem to be a 
rare occurrence that one could just fix when they run into it.  
However, since there was some opposition to having show_help messages 
possibly coming up all over the place, I thought a fallback of only 
doing the show_help on enable_debug builds was a reasonable middle 
ground.


--td


On Nov 1, 2011, at 7:30 PM, George Bosilca wrote:


1

  george.

On Nov 1, 2011, at 17:23 , Jeff Squyres wrote:


Can you clarify -- I can parse your text multiple ways.  Which are you voting 
for?

1. show_help + return error code in all cases.
2. if OPAL_ENABLE_DEBUG, show_help + exit(1), else silently return error code.
3. show_help.  if OPAL_ENABLE_DEBUG, exit(1), else return error code.
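
For readers comparing the three options above, the following C sketch
encodes each one as a set of actions. It is only an illustration of the
choices as worded in the list: the ACT_* flags and actions_for() are
invented names, debug_build stands in for OPAL_ENABLE_DEBUG, and nothing
here calls real OPAL functions.

```c
#include <assert.h>

/* Hypothetical flags describing what the registration code would do. */
enum {
    ACT_SHOW_HELP    = 1,  /* print a show_help message */
    ACT_EXIT         = 2,  /* exit(1) */
    ACT_RETURN_ERROR = 4   /* return an error code to the caller */
};

/* option is 1, 2 or 3 as numbered in the list above.
 * Returns a bitmask of ACT_* flags. */
static int actions_for(int option, int debug_build)
{
    assert(option >= 1 && option <= 3);
    switch (option) {
    case 1:
        /* show_help + return error code in all cases */
        return ACT_SHOW_HELP | ACT_RETURN_ERROR;
    case 2:
        /* debug: show_help + exit(1); otherwise silently return error */
        return debug_build ? (ACT_SHOW_HELP | ACT_EXIT) : ACT_RETURN_ERROR;
    default:
        /* option 3: always show_help; debug exits, otherwise return error */
        return ACT_SHOW_HELP |
               (debug_build ? ACT_EXIT : ACT_RETURN_ERROR);
    }
}
```

Laid out this way, options 2 and 3 differ only in non-debug builds, where
option 2 stays silent and option 3 still prints the show_help message.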



On Nov 1, 2011, at 4:50 PM, George Bosilca wrote:


This is a much saner solution. We [mostly] stayed away from calling exit deep 
into our libraries, there is no reason to add it now. I'll vote in favor of 
show_help + return code.

george.

On Nov 1, 2011, at 15:14 , Jeff Squyres wrote:


We talked about this on the call today.

A good suggestion was made: call show_help/opal_finalize/exit only when 
OPAL_ENABLE_DEBUG is true.  Otherwise, return an error code.

If no one objects to this, I'll commit this tomorrow.



On Oct 31, 2011, at 4:16 PM, Jeff Squyres wrote:


WHAT: what to do if registering an MCA param results in an error?

WHERE: opal/mca/base/mca_base_param.c

WHY: MCA param re-registration issues should be treated as OMPI developer errors

WHEN: COB Friday, 4 Nov 2011

-

Short version:

Re-registering an MCA param to be a different type (e.g., it was initially 
registered to be a string, but was later re-registered to be an int) should be 
treated as an OMPI developer error, and should opal_finalize()/exit(1).

More details:

A mistaken MCA param re-registration recently caused an orted segv.

The MCA param subsystem was fixed to avoid this segv, but silently convert the 
MCA param to the newly-registered type.  Upon reflection and some discussion, 
this seems to be a bad idea.  Instead, we should loudly complain via a 
show_help message and then exit(1).

Specifically: this kind of behavior is clearly an error and should be fixed.  
Unfortunately, in most cases, we don't actually check the return value from MCA 
param registration functions, so if we change the MCA param function to simply 
return a non OPAL_SUCCESS status, it's unlikely that anyone will notice until 
some code tries to read the param value, likely still resulting in a segv.
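
The kind of re-registration error being discussed can be sketched with a
toy registry. This is not the code in opal/mca/base/mca_base_param.c; the
types, names, and return codes below are hypothetical and only show the
check itself: registering an existing name with a different type is
refused rather than silently converted.

```c
#include <assert.h>
#include <string.h>

enum param_type { PARAM_INT, PARAM_STRING };
enum { REG_OK = 0, REG_ERR_TYPE_MISMATCH = -1, REG_ERR_FULL = -2 };

#define MAX_PARAMS 16

struct param { const char *name; enum param_type type; };

static struct param registry[MAX_PARAMS];
static int num_params = 0;

/* Register name with the given type.
 * Re-registering with the same type is harmless; re-registering with a
 * different type is reported as an error instead of being converted. */
static int register_param(const char *name, enum param_type type)
{
    int i;
    assert(name != NULL);
    for (i = 0; i < num_params; i++) {
        if (strcmp(registry[i].name, name) == 0) {
            return (registry[i].type == type) ? REG_OK
                                              : REG_ERR_TYPE_MISMATCH;
        }
    }
    if (num_params == MAX_PARAMS) {
        return REG_ERR_FULL;
    }
    registry[num_params].name = name;  /* caller keeps the string alive */
    registry[num_params].type = type;
    num_params++;
    return REG_OK;
}
```

The point made in the RFC still applies to a sketch like this: a return
code only helps if callers actually check it, which is why a louder
failure mode was proposed.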

Does anyone have heartburn if I change the error behavior to 
opal_finalize()/exit(1)?

--
Jeff Squyres
jsquy...@cisco.com
For corporate legal information go to:
http://www.cisco.com/web/about/doing_business/legal/cri/


___
devel mailing list
de...@open-mpi.org
http://www.open-mpi.org/mailman/listinfo.cgi/devel






--
Oracle
Terry D. Dontje | Principal Software Engineer
Developer Tools Engineering | +1.781.442.2631
Oracle *- Performance Technologies*
95 Network Drive, Burlington, MA 01803
Email terry.don...@oracle.com

Re: [OMPI devel] [OMPI svn] svn:open-mpi r25302

2011-10-18 Thread TERRY DONTJE


Strange - it ran fine for me on multiple tests. I'll check to see if something 
strange got into the mix and recommit.

Not sure it is the same issue but it looks like all my MTT tests on the 
trunk r25308 are timing out.

--td


On Oct 17, 2011, at 8:51 PM, George Bosilca wrote:


This commit put the mpirun process in an infinite loop for the simple case
mpirun -np 2 --mca orte_default_hostfile machinefile --bynode *my_app*

  george.

On Oct 17, 2011, at 15:49 , r...@osl.iu.edu wrote:


Author: rhc
Date: 2011-10-17 15:49:04 EDT (Mon, 17 Oct 2011)
New Revision: 25302
URL: https://svn.open-mpi.org/trac/ompi/changeset/25302

Log:
Fix the mapping algo for computing vpids - it was borked for bynode operations 
when using nperxxx directives

Text files modified:
  trunk/orte/mca/rmaps/base/rmaps_base_support_fns.c |67 
---
  1 files changed, 34 insertions(+), 33 deletions(-)

Modified: trunk/orte/mca/rmaps/base/rmaps_base_support_fns.c
==
--- trunk/orte/mca/rmaps/base/rmaps_base_support_fns.c  (original)
+++ trunk/orte/mca/rmaps/base/rmaps_base_support_fns.c  2011-10-17 15:49:04 EDT 
(Mon, 17 Oct 2011)
@@ -527,7 +527,7 @@
int orte_rmaps_base_compute_vpids(orte_job_t *jdata)
{
orte_job_map_t *map;
-orte_vpid_t vpid;
+orte_vpid_t vpid, cnt;
int i, j;
orte_node_t *node;
orte_proc_t *proc;
@@ -539,6 +539,7 @@
ORTE_MAPPING_BYSOCKET&  map->policy ||
ORTE_MAPPING_BYBOARD&  map->policy) {
/* assign the ranks sequentially */
+vpid = 0;
for (i=0; i<  map->nodes->size; i++) {
if (NULL == (node = 
(orte_node_t*)opal_pointer_array_get_item(map->nodes, i))) {
continue;
@@ -553,12 +554,10 @@
}
if (ORTE_VPID_INVALID == proc->name.vpid) {
/* find the next available vpid */
-for (vpid=0; vpid<  jdata->num_procs; vpid++) {
-if (NULL == opal_pointer_array_get_item(jdata->procs, 
vpid)) {
-break;
-}
+while (NULL != opal_pointer_array_get_item(jdata->procs, 
vpid)) {
+vpid++;
}
-proc->name.vpid = vpid;
+proc->name.vpid = vpid++;
ORTE_EPOCH_SET(proc->name.epoch,ORTE_EPOCH_INVALID);

ORTE_EPOCH_SET(proc->name.epoch,orte_ess.proc_get_epoch(&proc->name));

@@ -580,39 +579,41 @@

if (ORTE_MAPPING_BYNODE&  map->policy) {
/* assign the ranks round-robin across nodes */
-for (i=0; i<  map->nodes->size; i++) {
-if (NULL == (node = 
(orte_node_t*)opal_pointer_array_get_item(map->nodes, i))) {
-continue;
-}
-for (j=0; j<  node->procs->size; j++) {
-if (NULL == (proc = 
(orte_proc_t*)opal_pointer_array_get_item(node->procs, j))) {
+cnt = 0;
+vpid = 0;
+do {
+for (i=0; i<  map->nodes->size; i++) {
+if (NULL == (node = 
(orte_node_t*)opal_pointer_array_get_item(map->nodes, i))) {
continue;
}
-/* ignore procs from other jobs */
-if (proc->name.jobid != jdata->jobid) {
-continue;
-}
-if (ORTE_VPID_INVALID == proc->name.vpid) {
-/* find the next available vpid */
-vpid = i;
-while (NULL != opal_pointer_array_get_item(jdata->procs, 
vpid)) {
-vpid += map->num_nodes;
-if (jdata->num_procs<= vpid) {
-vpid = vpid - jdata->num_procs;
+for (j=0; j<  node->procs->size; j++) {
+if (NULL == (proc = 
(orte_proc_t*)opal_pointer_array_get_item(node->procs, j))) {
+continue;
+}
+/* ignore procs from other jobs */
+if (proc->name.jobid != jdata->jobid) {
+continue;
+}
+if (ORTE_VPID_INVALID == proc->name.vpid) {
+/* find next available vpid */
+while (NULL != 
opal_pointer_array_get_item(jdata->procs, vpid)) {
+vpid++;
+}
+proc->name.vpid = vpid++;
+ORTE_EPOCH_SET(proc->name.epoch,ORTE_EPOCH_INVALID);
+ORTE_EPOCH_SET(proc->name.epoch,orte_ess.proc_get_epoch(&proc->name));
+if (ORTE_SUCCESS != (rc = 
opal_pointer_array_set_item(jdata->procs,
+  
proc->name.vpid, proc))) {
+ORTE_ERROR_LOG(rc);
+
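
The "find the next available vpid" loop that the patch above introduces
can be looked at in isolation. The C sketch below is a simplified
stand-in, not ORTE code: a plain array of pointers replaces jdata->procs,
the function names are invented, and an explicit bound is added to the
scan, which the loop shown in the diff does not have.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for jdata->procs:
 * slots[i] != NULL means vpid i is already taken.
 * Returns the first free index at or after start, or nslots if none. */
static size_t next_free_vpid(void *const slots[], size_t nslots, size_t start)
{
    size_t vpid = start;
    assert(slots != NULL);
    while (vpid < nslots && slots[vpid] != NULL) {
        vpid++;
    }
    return vpid;
}

/* Give each proc the next free slot in order, never rescanning from 0.
 * out_vpids[k] receives the vpid chosen for procs[k].
 * Returns how many procs were assigned. */
static size_t assign_sequential(void *slots[], size_t nslots,
                                void *procs[], size_t nprocs,
                                size_t out_vpids[])
{
    size_t vpid = 0;
    size_t k;
    for (k = 0; k < nprocs; k++) {
        vpid = next_free_vpid(slots, nslots, vpid);
        if (vpid == nslots) {
            break;              /* no free slot left */
        }
        slots[vpid] = procs[k];
        out_vpids[k] = vpid++;  /* same "vpid++" step as in the patch */
    }
    return k;
}
```

With four slots where slot 1 is already occupied, three procs would get
vpids 0, 2 and 3. Nothing here says what caused the infinite loop reported
in this thread; the diff is cut off before the end of the new do-loop.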

Re: [OMPI devel] [OMPI bugs] [Open MPI] #2888: base.h inclusion breaks Solaris build

2011-10-18 Thread TERRY DONTJE

BTW, I am working on a patch for this.  Just want to validate there are 
no other loose ends.  I remember there were a couple oddities about this 
issue.


--td

Never mind; I just read your text more carefully - 2887 caused the problem.

Sent from my phone. No type good.

On Oct 18, 2011, at 6:19 AM, "Open MPI"  wrote:


#2888: base.h inclusion breaks Solaris build
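----------------------+-------------------------------
 Reporter:  tdd       |      Owner:  tdd
     Type:  defect    |     Status:  new
 Priority:  blocker   |  Milestone:  Open MPI 1.5.5
  Version:  trunk     |   Keywords:
----------------------+-------------------------------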
#2887 breaks the Solaris build because opal/sys/timer.h and
opal/mca/timer/base/base.h cause a redeclaration error for opal_timer_t.
This is similar to the issue we saw with r25157, which r25170 fixed.
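
For context on this class of error: before C11, repeating a typedef in the
same scope is not allowed, so two headers that each declare the same type
name can break a build with a strict compiler. One common way to tolerate
that is a guard macro around the typedef, sketched below. The names
demo_timer_t and DEMO_TIMER_T_DEFINED are hypothetical, and this is not
necessarily how r25170 or the eventual patch for this ticket handled it.

```c
#include <assert.h>

/* --- contents of a hypothetical first header --- */
#ifndef DEMO_TIMER_T_DEFINED
#define DEMO_TIMER_T_DEFINED
typedef long demo_timer_t;
#endif

/* --- contents of a hypothetical second header needing the same type --- */
#ifndef DEMO_TIMER_T_DEFINED
#define DEMO_TIMER_T_DEFINED
typedef long demo_timer_t;   /* skipped: the guard is already defined */
#endif

/* Any code including both "headers" sees exactly one typedef. */
static demo_timer_t demo_elapsed(demo_timer_t start, demo_timer_t end)
{
    assert(end >= start);
    return end - start;
}
```

The alternative is to make sure only one header owns the typedef and the
other includes it, which avoids the guard macro entirely.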

--
Ticket URL:
Open MPI

___
bugs mailing list
b...@open-mpi.org
http://www.open-mpi.org/mailman/listinfo.cgi/bugs



--
Oracle
Terry D. Dontje | Principal Software Engineer
Developer Tools Engineering | +1.781.442.2631
Oracle *- Performance Technologies*
95 Network Drive, Burlington, MA 01803
Email terry.don...@oracle.com

Re: [OMPI devel] [OMPI bugs] [Open MPI] #2888: base.h inclusion breaks Solaris build

2011-10-18 Thread TERRY DONTJE


Terry -

Did #2887 fix this already?


No it broke it.

--td

Sent from my phone. No type good.

On Oct 18, 2011, at 6:19 AM, "Open MPI"  wrote:


#2888: base.h inclusion breaks Solaris build
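----------------------+-------------------------------
 Reporter:  tdd       |      Owner:  tdd
     Type:  defect    |     Status:  new
 Priority:  blocker   |  Milestone:  Open MPI 1.5.5
  Version:  trunk     |   Keywords:
----------------------+-------------------------------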
#2887 breaks the Solaris build because opal/sys/timer.h and
opal/mca/timer/base/base.h cause a redeclaration error for opal_timer_t.
This is similar to the issue we saw with r25157, which r25170 fixed.

--
Ticket URL:
Open MPI



--
Oracle
Terry D. Dontje | Principal Software Engineer
Developer Tools Engineering | +1.781.442.2631
Oracle *- Performance Technologies*
95 Network Drive, Burlington, MA 01803
Email terry.don...@oracle.com

Re: [hwloc-devel] CPU Model and type

2011-09-16 Thread TERRY DONTJE




On 9/14/2011 5:54 AM, Brice Goglin wrote:

On 13/09/2011 22:06, TERRY DONTJE wrote:

On 9/13/2011 9:54 AM, Brice Goglin wrote:

On 13/09/2011 21:51, TERRY DONTJE wrote:
Both type and model are character strings.  An example of what I 
currently store in the sysinfo structures are:


type = "SPARC"
model = "SPARC64_VI"

Other values for model are "T1", "T2", "SPARC64_VII"...


What about Solaris on non-sparc machines ?


Looks like the type is an empty string and model is "i86pc" in one case.
These are basically values that come from calls to Solaris sysinfo.


Type doesn't seem that helpful then. We already have the architecture 
(taken from uname) in the machine attribute.

Yeah, I guess Solaris is a little biased :-/.


I think you should just put model in the CPUModel info attribute. I 
will do the same for Linux and add the vendor to "CPUVendor" when 
available; we'll get something like:

So you are saying to add CPUModel and CPUVendor info to a socket 
object as we discussed earlier, right?


CPUVendor=GenuineIntel
CPUModel=Intel(R) Core(TM) i7 CPU   M 620  @ 2.67GHz
or
CPUVendor=AuthenticAMD
CPUModel=AMD Opteron(tm) Processor 6174
or
CPUModel=Alpha EV68CB
or
CPUModel=POWER7 (architected), altivec supported
or
CPUModel=Cell Broadband Engine, altivec supported
or
CPUModel=ARMv7 Processor rev 1 (v71)

or
CPUVendor=SPARC
CPUModel=SPARC64_VI
...

Brice, I actually started working on the SPARC detect code and a couple 
of things became obvious to me.  First, I really meant for CPUVendor to 
be CPUType, a la SPARC, i386, Alpha, Power, and for CPUModel to be the 
fully described model or brand string like "SPARC64_VI", "AMD 
Opteron(tm) Processor 6174", etc.


I really do not want to use CPUVendor because SPARC is the processor 
type, not the vendor or manufacturer.  I could force CPUVendor to be 
SPARC, but I feel we would regret doing so if we ever wanted to truly 
key off CPUVendor for SPARC-type processors.


So can we go back to using CPUType and can you populate it with the type 
value instead of vendor?


In looking through my detect code I also recalled that it was never 
compiled on non-SPARC machines, hence the weird values I was quoting 
for CPUType and CPUModel on x86-based machines.  I am going to rework 
the code so it produces correct values when run under Solaris on both 
SPARC and i386 type processors.  For i386 I expect to have values such as:


CPUType = i386
CPUModel = Intel(R) Core(TM) i7 CPU   M 620  @ 2.67GHz
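
To show what attaching and looking up these name/value pairs amounts to,
here is a small self-contained C sketch. It deliberately does not use the
hwloc API; struct demo_obj, demo_add_info() and demo_get_info() are
made-up stand-ins for a socket object carrying info attributes, and the
attribute names CPUType and CPUModel are the ones proposed in this
message.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define MAX_INFOS 8

/* Hypothetical name/value attribute and the object that carries them. */
struct info { const char *name; const char *value; };
struct demo_obj { struct info infos[MAX_INFOS]; size_t count; };

/* Attach one attribute; returns 0 on success, -1 if the object is full. */
static int demo_add_info(struct demo_obj *obj,
                         const char *name, const char *value)
{
    assert(obj != NULL && name != NULL && value != NULL);
    if (obj->count == MAX_INFOS) {
        return -1;
    }
    obj->infos[obj->count].name = name;
    obj->infos[obj->count].value = value;
    obj->count++;
    return 0;
}

/* Look an attribute up by name; NULL means it was never attached,
 * which is how an OS backend without this data would appear. */
static const char *demo_get_info(const struct demo_obj *obj,
                                 const char *name)
{
    size_t i;
    assert(obj != NULL && name != NULL);
    for (i = 0; i < obj->count; i++) {
        if (strcmp(obj->infos[i].name, name) == 0) {
            return obj->infos[i].value;
        }
    }
    return NULL;
}
```

A Solaris/SPARC backend would attach CPUType=SPARC and
CPUModel=SPARC64_VI; a lookup of CPUVendor on that object would return
NULL, so callers have to treat every attribute as optional.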

--
Oracle
Terry D. Dontje | Principal Software Engineer
Developer Tools Engineering | +1.781.442.2631
Oracle *- Performance Technologies*
95 Network Drive, Burlington, MA 01803
Email terry.don...@oracle.com <mailto:terry.don...@oracle.com>

Re: [hwloc-devel] CPU Model and type

2011-09-14 Thread TERRY DONTJE




On 9/14/2011 5:54 AM, Brice Goglin wrote:

On 13/09/2011 22:06, TERRY DONTJE wrote:

On 9/13/2011 9:54 AM, Brice Goglin wrote:

On 13/09/2011 21:51, TERRY DONTJE wrote:
Both type and model are character strings.  An example of what I 
currently store in the sysinfo structures are:


type = "SPARC"
model = "SPARC64_VI"

Other values for model are "T1", "T2", "SPARC64_VII"...


What about Solaris on non-sparc machines ?


Looks like the type is an empty string and model is "i86pc" in one case.
These are basically values that come from calls to Solaris sysinfo.


Type doesn't seem that helpful then. We already have the architecture 
(taken from uname) in the machine attribute.

Yeah, I guess Solaris is a little biased :-/.


I think you should just put model in the CPUModel info attribute. I 
will do the same for Linux and add the vendor to "CPUVendor" when 
available; we'll get something like:

So you are saying to add CPUModel and CPUVendor info to a socket 
object as we discussed earlier, right?


CPUVendor=GenuineIntel
CPUModel=Intel(R) Core(TM) i7 CPU   M 620  @ 2.67GHz
or
CPUVendor=AuthenticAMD
CPUModel=AMD Opteron(tm) Processor 6174
or
CPUModel=Alpha EV68CB
or
CPUModel=POWER7 (architected), altivec supported
or
CPUModel=Cell Broadband Engine, altivec supported
or
CPUModel=ARMv7 Processor rev 1 (v71)

or
CPUVendor=SPARC
CPUModel=SPARC64_VI
...

thanks,

--td


Brice


___
hwloc-devel mailing list
hwloc-de...@open-mpi.org
http://www.open-mpi.org/mailman/listinfo.cgi/hwloc-devel



Re: [hwloc-devel] CPU Model and type

2011-09-13 Thread TERRY DONTJE




On 9/13/2011 9:23 AM, Brice Goglin wrote:

Le 12/09/2011 21:01, Brice Goglin a écrit :

Le 09/09/2011 13:25, TERRY DONTJE a écrit :

On 9/8/2011 3:10 PM, Brice Goglin wrote:

Hello Terry,

Indeed there's nothing like this as of today. We talked about it in 
the past but it's not very easy to implement on Linux (see below) 
so we forgot about it until somebody complained.


Adding infos would certainly be fine. I think it should rather be 
"CPUType" and "CPUModel" since existing infos have no underscore in 
their name if I remember correctly. You could also set object->name 
to a combination of type and model. Socket looks like the right 
object to put this. Maybe even use "Model" and "Type" as the info 
names then?


The reason it's not easy on Linux is that we usually take infos 
from either sysfs, or /proc/cpuinfo if sysfs isn't available, but 
not from both. Processor names are only in /proc/cpuinfo IIRC. So 
we'd need to mix sysfs and /proc/cpuinfo. Not easy with the current 
code, especially if you can't assume that all sockets are similar. 
But definitely something that I will do at some point.


Brice

Given the way info objects are attached to a Socket object, I assume 
it would be OK to attach such an object under Solaris but not 
for the other OSes, since one can look for the named object and 
it is either going to be there or not :-).


Anyway, I'll play around with this for Solaris.


Looking at the code, you might want to drop hwloc_setup_level() and 
copy it back into the caller. It will make the addition of info 
attributes much easier. I am looking at the Linux side.


I just pushed some code that will make this much easier on Linux (I 
may change the Solaris code similarly when I take the time to test on a 
real Solaris machine).


Now I have a patch that reads the CPU vendor and model in 
/proc/cpuinfo (x86 only for now) and uses them to set Socket info 
attributes (CPUVendor and CPUModel) and name (CPUVendor+CPUModel).
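
As a rough illustration of the idea (not the actual hwloc patch, which is C), the Python sketch below groups /proc/cpuinfo-style text by "physical id" and derives CPUVendor, CPUModel and a name per socket. The sample text, the function names, and joining vendor and model with a space are all assumptions made for the example.

```python
# Sample input in /proc/cpuinfo layout; the values are illustrative only.
SAMPLE_CPUINFO = """\
processor\t: 0
vendor_id\t: GenuineIntel
model name\t: Intel(R) Core(TM) i7 CPU   M 620  @ 2.67GHz
physical id\t: 0

processor\t: 1
vendor_id\t: GenuineIntel
model name\t: Intel(R) Core(TM) i7 CPU   M 620  @ 2.67GHz
physical id\t: 1
"""


def socket_infos(cpuinfo_text):
    """Return {physical id: {"CPUVendor": ..., "CPUModel": ...}}."""
    sockets = {}
    # Each processor is described by one blank-line-separated block.
    for block in cpuinfo_text.strip().split("\n\n"):
        fields = {}
        for line in block.splitlines():
            key, sep, value = line.partition(":")
            if sep:
                fields[key.strip()] = value.strip()
        # Assumption: processors without a "physical id" belong to socket 0.
        socket_id = int(fields.get("physical id", 0))
        infos = sockets.setdefault(socket_id, {})
        # Keep the first value seen for each socket.
        if "vendor_id" in fields:
            infos.setdefault("CPUVendor", fields["vendor_id"])
        if "model name" in fields:
            infos.setdefault("CPUModel", fields["model name"])
    return sockets


def socket_name(infos):
    """Build a name from vendor and model, skipping whichever is missing."""
    return " ".join(infos[k] for k in ("CPUVendor", "CPUModel") if k in infos)


for socket_id, infos in socket_infos(SAMPLE_CPUINFO).items():
    print(socket_id, socket_name(infos))
```

Keeping the attributes per socket rather than per machine matters because, as noted earlier in the thread, one cannot assume that all sockets are similar.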


Before I push this, we need to clarify what we want. You were talking 
about "CPUType" and "CPUModel". Can you give some example of what it 
would look like under Solaris? I want to compare to what I can get on 
Linux.


Both type and model are character strings.  An example of what I 
currently store in the sysinfo structures is:


type = "SPARC"
model = "SPARC64_VI"

Other values for model are "T1", "T2", "SPARC64_VII"...

--td



Brice



Re: [hwloc-devel] CPU Model and type

2011-09-09 Thread TERRY DONTJE


On 9/8/2011 3:10 PM, Brice Goglin wrote:

Hello Terry,

Indeed there's nothing like this as of today. We talked about it in 
the past but it's not very easy to implement on Linux (see below) so 
we forgot about it until somebody complained.


Adding infos would certainly be fine. I think it should rather be 
"CPUType" and "CPUModel" since existing infos have no underscore in 
their name if I remember correctly. You could also set object->name to 
a combination of type and model. Socket looks like the right object to 
put this. Maybe even use "Model" and "Type" as the info names then?


The reason it's not easy on Linux is that we usually take infos from 
either sysfs, or /proc/cpuinfo if sysfs isn't available, but not from 
both. Processor names are only in /proc/cpuinfo IIRC. So we'd need to 
mix sysfs and /proc/cpuinfo. Not easy with the current code, 
especially if you can't assume that all sockets are similar. But 
definitely something that I will do at some point.


Brice

Given the way info objects are attached to a Socket object, I assume it 
would be OK to attach such an object under Solaris but not for 
the other OSes, since one can look for the named object and it is 
either going to be there or not :-).
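
The "either there or not" lookup can be sketched as follows. This is a toy Python model of name/value info pairs, not hwloc code; the helper name and the sample data are hypothetical.

```python
def get_info_by_name(infos, name):
    """Return the value of the first info pair called `name`, or None."""
    for key, value in infos:
        if key == name:
            return value
    return None


# A socket as a Solaris backend might fill it in (values from this thread).
solaris_socket = [("CPUType", "SPARC"), ("CPUModel", "SPARC64_VI")]
# A socket from a backend that attaches no such infos.
other_socket = []

for infos in (solaris_socket, other_socket):
    model = get_info_by_name(infos, "CPUModel")
    # Callers must be prepared for the attribute to be absent.
    print(model if model is not None else "(no CPUModel info)")
```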


Anyway, I'll play around with this for Solaris.  Can I then email you 
the diff for a review?


thanks,

--td


Le 08/09/2011 20:57, TERRY DONTJE a écrit :
I wanted to verify that I am not overlooking something, but is there 
any information stored in the hwloc topology tree that contains the 
CPU Model and Type of chips in a machine?  The closest I came was the 
Machine "Architecture" info object.  Unfortunately this object is not 
specific enough so I am considering adding a couple info objects 
(CPU_Model and CPU_Type) to the HWLOC_OBJ_SOCKET objects or maybe to 
the Machine object in topology_solaris.c in the OMPI hwloc source base.


First does that make sense and secondly does this sound like it might 
be useful enough outside of OMPI that you'd want to buy back the 
changes?  There is similar data that can be gotten for Linux too.  
Though I personally only need this for Solaris/SPARC systems.


thanks,





Re: [OMPI devel] MPI_Errhandler_fatal_c failure

2011-08-18 Thread TERRY DONTJE

Thought I'd throw this out there, I retraced my MTT steps and did find 
that there were failures of this test back until r24774.  r24775 has a 
comment that looks very relevant.  I am talking to the committer of that 
change now.


Sorry for the false accusation.

--td

On 8/18/2011 2:32 PM, George Bosilca wrote:

Terry,

The test succeeded in both of your runs.

However, I rolled back before the epoch change (24814) and the output is the 
following:

MPITEST info  (0): Starting MPI_Errhandler_fatal test
MPITEST info  (0): This test will abort after printing the results message
MPITEST info  (0): If it does not, then a f.a.i.l.u.r.e will be noted
[dancer.eecs.utk.edu:16098] *** An error occurred in MPI_Send
[dancer.eecs.utk.edu:16098] *** reported by process 
[766095392769,139869904961537]
[dancer.eecs.utk.edu:16098] *** on communicator MPI COMMUNICATOR 3 DUP FROM 0
[dancer.eecs.utk.edu:16098] *** MPI_ERR_RANK: invalid rank
[dancer.eecs.utk.edu:16098] *** MPI_ERRORS_ARE_FATAL (processes in this 
communicator will now abort,
[dancer.eecs.utk.edu:16098] ***and potentially your MPI job)
MPITEST_results: MPI_Errhandler_fatal all tests PASSED (4)
[dancer.eecs.utk.edu:16096] [[24280,0],0]-[[24280,1],3] mca_oob_tcp_msg_recv: 
readv failed: Connection reset by peer (104)
[dancer.eecs.utk.edu:16096] 3 more processes have sent help message 
help-mpi-errors.txt / mpi_errors_are_fatal
[dancer.eecs.utk.edu:16096] Set MCA parameter "orte_base_help_aggregate" to 0 
to see all help / error messages

As you can see it is identical to the output in your test.

   george.


On Aug 18, 2011, at 12:29 , TERRY DONTJE wrote:


Just ran MPI_Errhandler_fatal_c with r25063 and it still fails.  Everything is the same 
except I don't see the "readv failed.." message.

Have you tried to run this code yourself?  It is pretty simple and fails with 
one node using np=4.

--td

On 8/18/2011 10:57 AM, Wesley Bland wrote:

I just checked in a fix (I hope). I think the problem was that the errmgr
was removing children from the list of odls children without using the
mutex to prevent race conditions. Let me know if the MTT is still having
problems tomorrow.
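
For readers unfamiliar with this kind of race, here is a minimal Python sketch of the pattern described: every access to a shared list of children goes through one lock. It is a toy model, not the errmgr/odls code, and all names in it are hypothetical.

```python
import threading


class ChildList:
    """Toy stand-in for a shared list of child processes."""

    def __init__(self, children):
        self._children = list(children)
        self._lock = threading.Lock()

    def remove(self, child):
        # Check and removal happen under the lock, so no other thread can
        # observe or modify the list between the two steps.
        with self._lock:
            if child in self._children:
                self._children.remove(child)
                return True
            return False

    def snapshot(self):
        # Readers take the same lock and work on a copy.
        with self._lock:
            return list(self._children)


children = ChildList(range(100))


def reap(lo, hi):
    for child in range(lo, hi):
        children.remove(child)


threads = [threading.Thread(target=reap, args=(lo, lo + 25)) for lo in (0, 25, 50)]
for t in threads:
    t.start()
for t in threads:
    t.join()

print(len(children.snapshot()))
```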

Wes



I am seeing the Intel test suite tests MPI_Errhandler_fatal_c and
MPI_Errhandler_fatal_f fail with an oob failure quite a bit.  I had not
seen this test failing under MTT until the epoch code was added, so I
suspect the epoch code might be at fault.  Could someone
familiar with the epoch changes (Wesley) take a look at this failure?

Note that this fails intermittently, but for me it fails more often than not.
Attached is a log file of a run that succeeds followed by a failing
run.  The part of concern is the messages involving
mca_oob_tcp_msg_recv and below.

thanks,




___
devel mailing list
de...@open-mpi.org
http://www.open-mpi.org/mailman/listinfo.cgi/devel



Re: [OMPI devel] MPI_Errhandler_fatal_c failure

2011-08-18 Thread TERRY DONTJE




On 8/18/2011 2:32 PM, George Bosilca wrote:

Terry,

The test succeeded in both of your runs.
Not really.  Granted, the test aborted in both cases; however, the case 
you show below has further issues while orte is trying to clean 
things up.  It certainly is not what I would call friendly.  But that is 
beside the point: orte is having issues with the 
MPI_Errhandler_fatal_c test IMO, and it looks like you have seen the same 
failure prior to the epoch changes.  Fair enough, I'll go back to the 
drawing board and see if I can narrow this down.


--td

However, I rolled back before the epoch change (24814) and the output is the 
following:

MPITEST info  (0): Starting MPI_Errhandler_fatal test
MPITEST info  (0): This test will abort after printing the results message
MPITEST info  (0): If it does not, then a f.a.i.l.u.r.e will be noted
[dancer.eecs.utk.edu:16098] *** An error occurred in MPI_Send
[dancer.eecs.utk.edu:16098] *** reported by process 
[766095392769,139869904961537]
[dancer.eecs.utk.edu:16098] *** on communicator MPI COMMUNICATOR 3 DUP FROM 0
[dancer.eecs.utk.edu:16098] *** MPI_ERR_RANK: invalid rank
[dancer.eecs.utk.edu:16098] *** MPI_ERRORS_ARE_FATAL (processes in this 
communicator will now abort,
[dancer.eecs.utk.edu:16098] ***and potentially your MPI job)
MPITEST_results: MPI_Errhandler_fatal all tests PASSED (4)
[dancer.eecs.utk.edu:16096] [[24280,0],0]-[[24280,1],3] mca_oob_tcp_msg_recv: 
readv failed: Connection reset by peer (104)
[dancer.eecs.utk.edu:16096] 3 more processes have sent help message 
help-mpi-errors.txt / mpi_errors_are_fatal
[dancer.eecs.utk.edu:16096] Set MCA parameter "orte_base_help_aggregate" to 0 
to see all help / error messages

As you can see it is identical to the output in your test.

   george.


On Aug 18, 2011, at 12:29 , TERRY DONTJE wrote:


Just ran MPI_Errhandler_fatal_c with r25063 and it still fails.  Everything is the same 
except I don't see the "readv failed.." message.

Have you tried to run this code yourself?  It is pretty simple and fails with 
one node using np=4.

--td

On 8/18/2011 10:57 AM, Wesley Bland wrote:

I just checked in a fix (I hope). I think the problem was that the errmgr
was removing children from the list of odls children without using the
mutex to prevent race conditions. Let me know if the MTT is still having
problems tomorrow.

Wes



I am seeing the Intel test suite tests MPI_Errhandler_fatal_c and
MPI_Errhandler_fatal_f fail with an oob failure quite a bit.  I had not
seen this test failing under MTT until the epoch code was added, so I
suspect the epoch code might be at fault.  Could someone
familiar with the epoch changes (Wesley) take a look at this failure?

Note that this fails intermittently, but for me it fails more often than not.
Attached is a log file of a run that succeeds followed by a failing
run.  The part of concern is the messages involving
mca_oob_tcp_msg_recv and below.

thanks,


Re: [OMPI devel] MPI_Errhandler_fatal_c failure

2011-08-18 Thread TERRY DONTJE

Just ran MPI_Errhandler_fatal_c with r25063 and it still fails.  
Everything is the same except I don't see the "readv failed.." message.


Have you tried to run this code yourself?  It is pretty simple and 
fails with one node using np=4.


--td

On 8/18/2011 10:57 AM, Wesley Bland wrote:

I just checked in a fix (I hope). I think the problem was that the errmgr
was removing children from the list of odls children without using the
mutex to prevent race conditions. Let me know if the MTT is still having
problems tomorrow.

Wes


I am seeing the Intel test suite tests MPI_Errhandler_fatal_c and
MPI_Errhandler_fatal_f fail with an oob failure quite a bit.  I had not
seen this test failing under MTT until the epoch code was added, so I
suspect the epoch code might be at fault.  Could someone
familiar with the epoch changes (Wesley) take a look at this failure?

Note that this fails intermittently, but for me it fails more often than not.
Attached is a log file of a run that succeeds followed by a failing
run.  The part of concern is the messages involving
mca_oob_tcp_msg_recv and below.

thanks,


[OMPI devel] MPI_Errhandler_fatal_c failure

2011-08-18 Thread TERRY DONTJE

I am seeing the Intel test suite tests MPI_Errhandler_fatal_c and 
MPI_Errhandler_fatal_f fail with an oob failure quite a bit.  I had not 
seen this test failing under MTT until the epoch code was added, so I 
suspect the epoch code might be at fault.  Could someone 
familiar with the epoch changes (Wesley) take a look at this failure?


Note that this fails intermittently, but for me it fails more often than not.  
Attached is a log file of a run that succeeds followed by a failing 
run.  The part of concern is the messages involving 
mca_oob_tcp_msg_recv and below.


thanks,




Script started on Thu Aug 18 09:15:10 2011
 burl-ct-x4150-1 101 =>mpirun -np 4 --mca btl tcp,self --mca coll_sm_priority 100 -- `pwd`/src/MPI_Errhandler_fatal_c

MPITEST info  (0): Starting MPI_Errhandler_fatal test
MPITEST info  (0): This test will abort after printing the results message
MPITEST info  (0): If it does not, then a f.a.i.l.u.r.e will be noted
MPITEST_results: MPI_Errhandler_fatal all tests PASSED (4)
[burl-ct-x4150-1:26951] *** An error occurred in MPI_Send
[burl-ct-x4150-1:26951] *** reported by process [2470772737,3]
[burl-ct-x4150-1:26951] *** on communicator MPI COMMUNICATOR 3 DUP FROM 0
[burl-ct-x4150-1:26951] *** MPI_ERR_RANK: invalid rank
[burl-ct-x4150-1:26951] *** MPI_ERRORS_ARE_FATAL (processes in this 
communicator will now abort,
[burl-ct-x4150-1:26951] ***and potentially your MPI job)
[burl-ct-x4150-1:26945] 3 more processes have sent help message 
help-mpi-errors.txt / mpi_errors_are_fatal
[burl-ct-x4150-1:26945] Set MCA parameter "orte_base_help_aggregate" to 0 to 
see all help / error messages
 burl-ct-x4150-1 102 =>mpirun -np 4 --mca btl tcp,self --mca coll_sm_priority 100 -- `pwd`/src/MPI_Errhandler_fatal_c

MPITEST info  (0): Starting MPI_Errhandler_fatal test
MPITEST info  (0): This test will abort after printing the results message
MPITEST info  (0): If it does not, then a f.a.i.l.u.r.e will be noted
MPITEST_results: MPI_Errhandler_fatal all tests PASSED (4)
[burl-ct-x4150-1:26955] *** An error occurred in MPI_Send
[burl-ct-x4150-1:26955] *** reported by process [2471231489,0]
[burl-ct-x4150-1:26955] *** on communicator MPI COMMUNICATOR 3 DUP FROM 0
[burl-ct-x4150-1:26955] *** MPI_ERR_RANK: invalid rank
[burl-ct-x4150-1:26955] *** MPI_ERRORS_ARE_FATAL (processes in this 
communicator will now abort,
[burl-ct-x4150-1:26955] ***and potentially your MPI job)
[burl-ct-x4150-1:26952] [[37708,0],0,0]-[[37708,1],3,0] mca_oob_tcp_msg_recv: 
readv failed: Connection reset by peer (131)
[burl-ct-x4150-1:26952] [[37708,0],0,0] ORTE_ERROR_LOG: A message is attempting 
to be sent to a process whose contact information is unknown in file 
../../../../../orte/mca/rml/oob/rml_oob_send.c at line 149
[burl-ct-x4150-1:26952] [[37708,0],0,0] attempted to send to [[37708,1],1,0]: 
tag 38
[burl-ct-x4150-1:26952] [[37708,0],0,0] ORTE_ERROR_LOG: A message is attempting 
to be sent to a process whose contact information is unknown in file 
../../../../orte/mca/odls/base/odls_base_default_fns.c at line 2345
[burl-ct-x4150-1:26952] [[37708,0],0,0]-[[37708,1],0,0] mca_oob_tcp_msg_recv: 
readv failed: Connection reset by peer (131)
[burl-ct-x4150-1:26952] 3 more processes have sent help message 
help-mpi-errors.txt / mpi_errors_are_fatal
[burl-ct-x4150-1:26952] Set MCA parameter "orte_base_help_aggregate" to 0 to 
see all help / error messages
 burl-ct-x4150-1 103 =>^Dexit

script done on Thu Aug 18 09:15:57 2011
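
One way to tell the two runs above apart mechanically is to scan the transcript for the oob messages called out in the report. The Python sketch below does that; treating these two substrings as failure markers is an assumption drawn from this message, not an MTT feature, and the sample inputs are abbreviated excerpts of the log.

```python
# Substrings singled out in the report; using them as markers is an assumption.
OOB_MARKERS = ("mca_oob_tcp_msg_recv", "ORTE_ERROR_LOG")


def classify_run(output):
    """Return (verdict, matching_lines) for one mpirun transcript."""
    hits = [line for line in output.splitlines()
            if any(marker in line for marker in OOB_MARKERS)]
    return ("oob-failure" if hits else "clean-abort"), hits


# Abbreviated excerpts of the two runs shown in the log.
run_101 = """\
MPITEST_results: MPI_Errhandler_fatal all tests PASSED (4)
[burl-ct-x4150-1:26951] *** MPI_ERR_RANK: invalid rank
"""
run_102 = """\
MPITEST_results: MPI_Errhandler_fatal all tests PASSED (4)
[burl-ct-x4150-1:26952] [[37708,0],0,0]-[[37708,1],3,0] mca_oob_tcp_msg_recv: readv failed: Connection reset by peer (131)
"""

for name, text in (("101", run_101), ("102", run_102)):
    verdict, hits = classify_run(text)
    print(name, verdict, len(hits))
```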

Re: [OMPI devel] [OMPI svn-full] svn:open-mpi r24830

2011-07-14 Thread Terry Dontje




On 7/14/2011 9:17 AM, Yevgeny Kliteynik wrote:

On 14-Jul-11 3:30 PM, Jeff Squyres wrote:

The real question is: does Solaris have the same data structures required for 
Linux's dynamic SL support?  If so, this header file inquiry is worthwhile.  If 
not, then perhaps a separate port will be required for Solaris to support the 
dynamic SL functionality.

I know for sure that at some point OpenFabrics OpenSM has forked
and was used as a basis for *some* Solaris SM, which possibly
will preserve the same headers. The MAD format has to stay the
same to provide interoperability, and I doubt that someone renamed
MAD fields just for fun.
So the question is which header and which package contain these
structures.

Note there is no SM delivered with Solaris specifically.  So there are 
no SM specific header files.
AFAIK, we rely on remote SM's whether it is one running on a switch or 
on another node (like OpenSM running on Linux).
So relying on OpenSM source headers existing on Solaris is probably a 
bad plan.


However, it sounds like the existence of ib_types.h might help us; as I 
answered in a previous email, it does exist.

I'm checking this offline with Oracle IB people.

The other question is whether Oracle folks care about IB QoS and torus/mesh
topologies w.r.t. OMPI, because otherwise the dynamic SL is irrelevant.


It is not an extreme priority of ours but we would like to support it.

--td

-- YK


On Jul 14, 2011, at 7:24 AM, Terry Dontje wrote:


I do but my machine room's power is down so I don't have access to it right 
now.  I will grope around once it comes up to see what it has.  I also have 
sent email to our IB team for some direction.

--td

On 7/14/2011 2:42 AM, Yevgeny Kliteynik wrote:

[adding Terry]

On 14-Jul-11 2:49 AM, Eugene Loh wrote:


On 7/13/2011 4:31 PM, Paul H. Hargrove wrote:


On 7/13/2011 4:20 PM, Yevgeny Kliteynik wrote:


Finally, are you sure that infiniband/complib/cl_types_osd.h exists on all 
platforms? (e.g., Solaris) I know you said you don't have any Solaris machines 
to test with, but you should ping Oracle directly for some testing -- Terry 
might not be paying attention to this specific thread...


I'll check it, but my guess would be that Solaris doesn't have it.
AFAIK Solaris doesn't use OpenFabrics OpenSM - it has a separate
subnet manager, so I can't assume that it has.
So right now the dynamic SL will probably not work on Solaris
(though it won't break the compilation).


I have a pair of old machines running Solaris 11 Express (aka "SunOS 5.11 snv_151a 
November 2010").
These have IB Verbs support, but there is no such header. In fact, 
/usr/include/infiniband has no sub-directories.


+1

(That is, no such header and not even any subdirectories on a very recent 
version of Solaris 11: snv_168.)


Makes sense. But I believe that these Solaris installations
just don't have Subnet Manager, so they are not supposed to
have these headers anyway. What I don't know is what headers
are installed as part of the SM installation.
Does anybody have a Solaris machine with Subnet Manager?

-- YK




I may be able to do some testing eventually, but now is not a good time.



Re: [OMPI devel] [OMPI svn-full] svn:open-mpi r24830

2011-07-14 Thread Terry Dontje




On 7/14/2011 9:30 AM, Yevgeny Kliteynik wrote:

On 14-Jul-11 4:21 PM, Paul H. Hargrove wrote:


On 7/13/2011 11:42 PM, Yevgeny Kliteynik wrote:

[adding Terry]

On 14-Jul-11 2:49 AM, Eugene Loh wrote:

On 7/13/2011 4:31 PM, Paul H. Hargrove wrote:

On 7/13/2011 4:20 PM, Yevgeny Kliteynik wrote:

Finally, are you sure that infiniband/complib/cl_types_osd.h exists on all 
platforms? (e.g., Solaris) I know you said you don't have any Solaris machines 
to test with, but you should ping Oracle directly for some testing -- Terry 
might not be paying attention to this specific thread...

I'll check it, but my guess would be that Solaris doesn't have it.
AFAIK Solaris doesn't use OpenFabrics OpenSM - it has a separate
subnet manager, so I can't assume that it has.
So right now the dynamic SL will probably not work on Solaris
(though it won't break the compilation).

I have a pair of old machines running Solaris 11 Express (aka "SunOS 5.11 snv_151a 
November 2010").
These have IB Verbs support, but there is no such header. In fact, 
/usr/include/infiniband has no sub-directories.

+1

(That is, no such header and not even any subdirectories on a very recent 
version of Solaris 11: snv_168.)

Makes sense. But I believe that these Solaris installations
just don't have Subnet Manager, so they are not supposed to
have these headers anyway. What I don't know is what headers
are installed as part of the SM installation.
Does anybody have a Solaris machine with Subnet Manager?

-- YK

I'll go so far as to say that this header does not exist in the package repo:

root@pcp-j-20:~# pkg search verbs.h || echo NOPE
INDEX ACTION VALUE PACKAGE
basename file usr/include/infiniband/verbs.h 
pkg:/network/open-fabrics@1.3-0.151.0.1

root@pcp-j-20:~# pkg search cl_types_osd.h || echo NOPE
NOPE

Actually cl_types_osd.h is used as a kind of hack: it works around a
bad include directive in ib_types.h.

Could you check for ib_types.h instead?
This is the header that I'm actually using.


Yes, we have ib_types.h; it is in /usr/include/sys/ib/ib_types.h
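
Since the header lives in different places on different systems, a build-time probe has to try several directories. The Python sketch below shows the shape of such a probe; the real check would live in configure. The Solaris directory comes from this thread, while the other candidate directory and the helper name are assumptions.

```python
import os
import tempfile

CANDIDATE_DIRS = [
    "/usr/include/infiniband/iba",  # assumed location on a Linux OpenSM install
    "/usr/include/sys/ib",          # Solaris location reported in this thread
]


def find_header(name, dirs, root="/"):
    """Return the first candidate path that holds `name`, or None."""
    for d in dirs:
        if os.path.isfile(os.path.join(root, d.lstrip("/"), name)):
            return os.path.join(d, name)
    return None


# Demonstrate against a throwaway directory tree laid out like Solaris.
with tempfile.TemporaryDirectory() as root:
    solaris_dir = os.path.join(root, "usr/include/sys/ib")
    os.makedirs(solaris_dir)
    open(os.path.join(solaris_dir, "ib_types.h"), "w").close()
    found = find_header("ib_types.h", CANDIDATE_DIRS, root=root)
    missing = find_header("cl_types_osd.h", CANDIDATE_DIRS, root=root)

print(found, missing)
```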

Thanks a lot!

-- YK


-Paul



Re: [OMPI devel] [OMPI svn-full] svn:open-mpi r24830

2011-07-14 Thread Terry Dontje

I do but my machine room's power is down so I don't have access to it 
right now.  I will grope around once it comes up to see what it has.  I 
also have sent email to our IB team for some direction.


--td

On 7/14/2011 2:42 AM, Yevgeny Kliteynik wrote:

[adding Terry]

On 14-Jul-11 2:49 AM, Eugene Loh wrote:

On 7/13/2011 4:31 PM, Paul H. Hargrove wrote:

On 7/13/2011 4:20 PM, Yevgeny Kliteynik wrote:

Finally, are you sure that infiniband/complib/cl_types_osd.h exists on all 
platforms? (e.g., Solaris) I know you said you don't have any Solaris machines 
to test with, but you should ping Oracle directly for some testing -- Terry 
might not be paying attention to this specific thread...

I'll check it, but my guess would be that Solaris doesn't have it.
AFAIK Solaris doesn't use OpenFabrics OpenSM - it has a separate
subnet manager, so I can't assume that it has.
So right now the dynamic SL will probably not work on Solaris
(though it won't break the compilation).

I have a pair of old machines running Solaris 11 Express (aka "SunOS 5.11 snv_151a 
November 2010").
These have IB Verbs support, but there is no such header. In fact, 
/usr/include/infiniband has no sub-directories.

+1

(That is, no such header and not even any subdirectories on a very recent 
version of Solaris 11: snv_168.)

Makes sense. But I believe that these Solaris installations
just don't have Subnet Manager, so they are not supposed to
have these headers anyway. What I don't know is what headers
are installed as part of the SM installation.
Does anybody have a Solaris machine with Subnet Manager?

-- YK



I may be able to do some testing eventually, but now is not a good time.


Re: [OMPI devel] opal_init/finalize counter --> boolean

2011-07-11 Thread Terry Dontje

Trying to uplevel this a bit so I can figure out which of these paths
makes sense to me. Is the only reason we want to make init and finalize
asymmetric, rather than symmetric, to support an abort case?
Forgive me Ralph, I know you posted this in one of the emails, but I
wanted to make sure it was that simple (I feel it probably isn't).

--td

On 7/11/2011 7:28 AM, George Bosilca wrote:

On Jul 9, 2011, at 13:43 , Jeff Squyres wrote:

Leaving out many details, I think the arguments can be summarized as:

1. Ralph's argument is that per convention of our other 2 layers,
"_finalize" should unconditionally finalize the layer. Just do it. It's
also weird that opal_finalize() may actually do *nothing* (vs. finalizing at least all of its
stuff but leave OPAL util stuff initialized) -- this is not symmetric.

2. George's argument is that for API symmetry, if you call opal_init_util, then
opal shouldn't be finalized until opal_finalize_util is invoked. Plus, we may
want to use OPAL utils after opal_finalize someday (note that we don't do this
today).

How about a compromise?

According to the English dictionary, a compromise is an agreement or a settlement of
a dispute that is reached by each side making concessions. This is not a
compromise. This is exactly what Ralph did plus name changes. Therefore, this
is a one-sided settlement.

- Take what is (essentially) in opal_init() today and rename it to be
opal_init_frameworks() -- because it's (mostly) initializing the OPAL MCA
frameworks.

- Take what is (essentially) in opal_finalize() today and rename it to be
opal_finalize_frameworks() -- because it's (mostly) finalizing the OPAL MCA
frameworks. Remove the call to opal_finalize_util() from this function.

- Remove all use of counters; calling opal_init*() will initialize (unless it
has already been initialized), and calling opal_finalize*() will finalize
(unless it has already been finalized).

- Create a new opal_init() function that is a wrapper around opal_init_util()
and opal_init_frameworks(). Create a new opal_finalize() function that is a
wrapper around opal_finalize_util() and opal_finalize_frameworks().

- orte_finalize() will call opal_finalize() -- i.e., it will unconditionally
shut down all of OPAL. This will remove the need for opal_finalize_util() in
the MPI layer.

This seems to give all desired behaviors:

- All *_finalize() functions will be unconditional. The Law of Least
Surprise is preserved.

- There are paths for split init and split finalize and combined init and
combined finalize. They can even be combined (e.g., split init and combined
finalize -- which will be a common case, actually).
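
To make the proposal concrete, here is a small Python model of it: booleans instead of counters, split and combined entry points, and finalize calls that are safe to repeat. The class and method names are stand-ins for the opal_* functions named above, and the teardown order inside the combined finalize is an assumption.

```python
class Layer:
    """Toy model of the proposal: booleans instead of counters."""

    def __init__(self):
        self.util_up = False
        self.frameworks_up = False
        self.log = []

    def init_util(self):
        if not self.util_up:  # no-op when already initialized
            self.util_up = True
            self.log.append("init_util")

    def init_frameworks(self):
        if not self.frameworks_up:
            self.frameworks_up = True
            self.log.append("init_frameworks")

    def finalize_frameworks(self):
        if self.frameworks_up:  # no-op when already finalized
            self.frameworks_up = False
            self.log.append("finalize_frameworks")

    def finalize_util(self):
        if self.util_up:
            self.util_up = False
            self.log.append("finalize_util")

    def init(self):
        # Combined init: wrapper around the two split steps.
        self.init_util()
        self.init_frameworks()

    def finalize(self):
        # Combined finalize; reverse order of init is an assumption here.
        self.finalize_frameworks()
        self.finalize_util()


opal = Layer()
opal.init_util()  # split init: utilities first
opal.init()       # combined init; the util part is a no-op
opal.finalize()   # combined finalize tears down everything
opal.finalize()   # a second call does nothing
print(opal.log)
```

Note that in this model a caller who did the split init_util() gets no say in when the utilities go away, which is the behavior the reply below objects to.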

Least surprise, you say? How surprised will one be on realizing that
orte_finalize tore down all of OPAL? At least, do not forget to add one of
those nice comments about the fact that one has to initialize the utils but
doesn't have to finalize them; ORTE will graciously do it for you.

You want to change names? OK, go for it. At the point where the code is today, I
don't think it really matters anymore. In fact, why do we need a boolean in the
code Ralph put in? If opal_finalize is supposed to clean up everything,
just go ahead and remove that useless bool. And as you made the code easy to
understand by few, put in a big comment for all the others (the ones who will try
to understand why their code breaks after a simple orte_finalize).

Moreover, Open MPI is a community project. I'm the only one against this change
and you guys are two for adding this great feature to the code base. So go
ahead and implement it with the blessing of the entire community!

george.

If we ever want to use OPAL utility behavior after orte_finalize() someday, we can. E.g., we can
pass a flag to orte_finalize() saying "use opal_finalize_frameworks() instead of
opal_finalize()", or perhaps even "don't finalize OPAL at all."

On Jul 8, 2011, at 11:57 AM, George Bosilca wrote:

On Jul 8, 2011, at 16:15 , Ralph Castain wrote:

So we have opal_init * 1 and opal_util * 2. Clearly the opal util is not a
simple ON/OFF thing. With Ralph's patch the OPAL utilities will disappear as soon
as the OMPI layer calls orte_fini. Luckily, today there is nothing between the
call to orte_fini and opal_finalize_util, so we're safe from a segfault.

The point is that you shouldn't be calling opal_finalize_util separately. We do
so now only because of the counter - there is no reason for doing it separately
otherwise.

Absolutely not, we do so for consistency. If I, as a software layer, have to
explicitly call the opal util initialization function (in order to access some
features), then I should __explicitly__ state when I don't need it anymore
(instead of relying on some other layer to do the right thing for me).
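
The pairing argued for here can be pictured as a reference counter: every layer that initializes the utilities also releases them, and teardown happens only on the last release. A minimal sketch in C, assuming nothing about the real OPAL code; `demo_util_init`, `demo_util_finalize` and `demo_util_active` are hypothetical stand-ins, not the actual OPAL functions:

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for the OPAL utility layer; not the real OPAL code. */
static int  demo_util_refcount = 0;
static bool demo_util_ready    = false;

/* Each layer that needs the utilities calls this once. */
int demo_util_init(void)
{
    if (demo_util_refcount++ == 0) {
        demo_util_ready = true;   /* real setup would happen here, once */
    }
    return 0;
}

/* Each layer explicitly states when it no longer needs the utilities. */
int demo_util_finalize(void)
{
    assert(demo_util_refcount >= 0);
    if (demo_util_refcount == 0) {
        return -1;                /* unbalanced finalize */
    }
    if (--demo_util_refcount == 0) {
        demo_util_ready = false;  /* teardown happens on the last release */
    }
    return 0;
}

/* True while at least one layer still holds a reference. */
bool demo_util_active(void)
{
    return demo_util_ready;
}
```

With this shape, a lower layer finalizing its own reference cannot pull the utilities out from under a layer that has not yet released its own.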

In other words, we created a counter, and then modified the code to make the
counter work. There is no reason for it to exist as there is no use of the

Re: [hwloc-devel] Merging the PCI branch?

2011-04-06 Thread Terry Dontje


On 04/06/2011 08:49 AM, Brice Goglin wrote:

Le 31/03/2011 18:06, Jeff Squyres a écrit :

On Mar 28, 2011, at 5:26 PM, Brice Goglin wrote:


libpci is needed to make this work. And only Linux gives you OS devices
for now (we use sysfs to translate between pci devs and os devs).


Is libpci available on all platforms?  Or is it only needed on Linux?


Our PCI stuff works on FreeBSD7, but it seems to require root access
(/dev/pci) for everything (Linux only requires root access to get the
link speeds). Also we don't have any information about OS devices there,
so we just get PCI devices and bridges.

I couldn't find a way to install libpci on our Solaris platforms.


Do you have source you can point me to?  I could try on a couple of my systems.

--td

Brice

___
hwloc-devel mailing list
hwloc-de...@open-mpi.org
http://www.open-mpi.org/mailman/listinfo.cgi/hwloc-devel



--
Oracle
Terry D. Dontje | Principal Software Engineer
Developer Tools Engineering | +1.781.442.2631
Oracle *- Performance Technologies*
95 Network Drive, Burlington, MA 01803
Email terry.don...@oracle.com

Re: [OMPI devel] trunk not compiling for btl_openib_connect_oob.c

2011-03-16 Thread Terry Dontje


On 03/16/2011 12:00 PM, Jeff Squyres wrote:

Ya, you're right -- I'm looking at my MTT right now and I see lots of broken 
installs.

But it works if I compile manually.  Weird.
So when I saw your MTT results it was not finding a header file as 
opposed to the problem I was incurring which was a redeclaration issue.  
So I am wondering if your MTT issue is something different?


--td

Mellanox -- please fix ASAP, or we'll likely back out r24507 so that people can 
keep working...


On Mar 16, 2011, at 11:58 AM, George Bosilca wrote:


The trunk is indeed broken. The reason is, as Terry pointed out, the inclusion of 
infiniband/mad.h introduced by r24507 
(https://svn.open-mpi.org/trac/ompi/changeset/24507). As long as OFED 1.4 is 
available, it will compile independent of the version of the kernel, libpthread, moon 
position or.

  george.


On Mar 16, 2011, at 09:35 , Jeff Squyres wrote:


On Mar 16, 2011, at 6:50 AM, Terry Dontje wrote:


K. When Ralph and I removed that code, it was on the educated guess that no one 
was using it (because it hasn't compiled right in a while). If we were wrong, 
it can be put back, but someone will need to update it and Ralph and I don't 
have access to machines to test that behavior.

Ok, however, the compilation issue I am running into has nothing to do with 
yours and Ralph's changes.  I would have expected not to even get as far as 
compiling the openib btl, right?

Right.

What does your configure output say when it is testing for different PIDs for 
threads?

--
Jeff Squyres
jsquy...@cisco.com
For corporate legal information go to:
http://www.cisco.com/web/about/doing_business/legal/cri/


___
devel mailing list
de...@open-mpi.org
http://www.open-mpi.org/mailman/listinfo.cgi/devel

George Bosilca
Research Assistant Professor
Innovative Computing Laboratory
Department of Electrical Engineering and Computer Science
University of Tennessee, Knoxville
http://web.eecs.utk.edu/~bosilca/







--
Oracle
Terry D. Dontje | Principal Software Engineer
Developer Tools Engineering | +1.781.442.2631
Oracle *- Performance Technologies*
95 Network Drive, Burlington, MA 01803
Email terry.don...@oracle.com <mailto:terry.don...@oracle.com>

Re: [OMPI devel] trunk not compiling for btl_openib_connect_oob.c

2011-03-16 Thread Terry Dontje


On 03/16/2011 06:34 AM, Terry Dontje wrote:

On 03/16/2011 06:21 AM, Jeff Squyres wrote:

On Mar 16, 2011, at 5:51 AM, Terry Dontje wrote:


I've seen this with the following:

RH 4.6 / OFED 1.3.6

Errr... did you look at 
http://www.open-mpi.org/community/lists/devel/2011/03/9068.php?
Yes I did, and I will be talking with my group about this, this 
afternoon.  We might be able to remove that dependency.

CentOS 5.2 / OFED 1.3.6
SLES 10.1 /  OFED 1.3.6

I know the above is pretty darn old but it would be nice to know what is the 
oldest s/w we can be using?  Note things have been building up until now.

BTW, I am now trying to compile on a system with ofed 1.4.4.

Using OFED 1.4.4 seems to work for me.

--td

I'll look at my MTT runs later this morning.







--
Oracle
Terry D. Dontje | Principal Software Engineer
Developer Tools Engineering | +1.781.442.2631
Oracle *- Performance Technologies*
95 Network Drive, Burlington, MA 01803
Email terry.don...@oracle.com <mailto:terry.don...@oracle.com>

Re: [OMPI devel] trunk not compiling for btl_openib_connect_oob.c

2011-03-16 Thread Terry Dontje


On 03/16/2011 06:38 AM, Jeff Squyres (jsquyres) wrote:
K. When Ralph and I removed that code, it was on the educated guess 
that no one was using it (because it hasn't compiled right in a 
while). If we were wrong, it can be put back, but someone will need to 
update it and Ralph and I don't have access to machines to test that 
behavior.
Ok, however, the compilation issue I am running into has nothing to do 
with yours and Ralph's changes.  I would have expected not to even get 
as far as compiling the openib btl, right?


--td


Sent from my phone. No type good.

On Mar 16, 2011, at 6:32 AM, "Terry Dontje" <terry.don...@oracle.com 
<mailto:terry.don...@oracle.com>> wrote:



On 03/16/2011 06:21 AM, Jeff Squyres wrote:

On Mar 16, 2011, at 5:51 AM, Terry Dontje wrote:


I've seen this with the following:

RH 4.6 / OFED 1.3.6

Errr... did you look at 
http://www.open-mpi.org/community/lists/devel/2011/03/9068.php?
Yes I did, and I will be talking with my group about this, this 
afternoon.  We might be able to remove that dependency.

CentOS 5.2 / OFED 1.3.6
SLES 10.1 /  OFED 1.3.6

I know the above is pretty darn old but it would be nice to know what is the 
oldest s/w we can be using?  Note things have been building up until now.

BTW, I am now trying to compile on a system with ofed 1.4.4.

I'll look at my MTT runs later this morning.







--
Oracle
Terry D. Dontje | Principal Software Engineer
Developer Tools Engineering | +1.781.442.2631
Oracle *- Performance Technologies*
95 Network Drive, Burlington, MA 01803
Email terry.don...@oracle.com <mailto:terry.don...@oracle.com>

Re: [OMPI devel] trunk not compiling for btl_openib_connect_oob.c

2011-03-16 Thread Terry Dontje


On 03/16/2011 06:21 AM, Jeff Squyres wrote:

On Mar 16, 2011, at 5:51 AM, Terry Dontje wrote:


I've seen this with the following:

RH 4.6 / OFED 1.3.6

Errr... did you look at 
http://www.open-mpi.org/community/lists/devel/2011/03/9068.php?
Yes I did, and I will be talking with my group about this, this 
afternoon.  We might be able to remove that dependency.

CentOS 5.2 / OFED 1.3.6
SLES 10.1 /  OFED 1.3.6

I know the above is pretty darn old but it would be nice to know what is the 
oldest s/w we can be using?  Note things have been building up until now.

BTW, I am now trying to compile on a system with ofed 1.4.4.

I'll look at my MTT runs later this morning.




--
Oracle
Terry D. Dontje | Principal Software Engineer
Developer Tools Engineering | +1.781.442.2631
Oracle *- Performance Technologies*
95 Network Drive, Burlington, MA 01803
Email terry.don...@oracle.com <mailto:terry.don...@oracle.com>

Re: [OMPI devel] trunk not compiling for btl_openib_connect_oob.c

2011-03-15 Thread Terry Dontje

It looks to me like r24507 is what changed btl_openib_connect_oob.c 
to include the two header files that are conflicting with each other.


--td

On 03/15/2011 01:39 PM, Terry Dontje wrote:
While compiling btl_openib_connect_oob.c I am getting identifier 
redeclared: ib_gid_t.  Looks like infiniband/mad.h defines this and 
then iba/types.h tries to redefine it.


I am on Linux compiling with gcc.  Is anyone else seeing the same 
issue or am I possibly dealing with some old s/w?




--
Oracle
Terry D. Dontje | Principal Software Engineer
Developer Tools Engineering | +1.781.442.2631
Oracle *- Performance Technologies*
95 Network Drive, Burlington, MA 01803
Email terry.don...@oracle.com <mailto:terry.don...@oracle.com>

[OMPI devel] trunk not compiling for btl_openib_connect_oob.c

2011-03-15 Thread Terry Dontje

While compiling btl_openib_connect_oob.c I am getting identifier 
redeclared: ib_gid_t.  Looks like infiniband/mad.h defines this and then 
iba/types.h tries to redefine it.
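
The error above is what a C compiler reports when two headers each define the same typedef name. A minimal sketch of the failure mode and one common workaround, using hypothetical stand-in names (`demo_gid_t`, `DEMO_GID_T_DEFINED`); the real contents of infiniband/mad.h and iba/types.h are not shown in this thread and are not reproduced here:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* --- stand-in for the first header (hypothetical) --- */
#ifndef DEMO_GID_T_DEFINED
#define DEMO_GID_T_DEFINED
typedef struct { uint8_t raw[16]; } demo_gid_t;
#endif

/* --- stand-in for the second header (hypothetical) ---
 * Without the guard, this second definition of demo_gid_t would be
 * rejected as a redeclaration. */
#ifndef DEMO_GID_T_DEFINED
#define DEMO_GID_T_DEFINED
typedef struct { uint8_t raw[16]; } demo_gid_t;
#endif

/* Code that sees both "headers" still gets exactly one definition. */
size_t demo_gid_size(void)
{
    demo_gid_t gid = { {0} };

    assert(sizeof gid.raw == 16);
    return sizeof gid;
}
```

A guard like this only helps when both headers can be edited; otherwise the usual options are to avoid including both in one file or to move to a stack version where they agree.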


I am on Linux compiling with gcc.  Is anyone else seeing the same issue 
or am I possibly dealing with some old s/w?

--
Oracle
Terry D. Dontje | Principal Software Engineer
Developer Tools Engineering | +1.781.442.2631
Oracle *- Performance Technologies*
95 Network Drive, Burlington, MA 01803
Email terry.don...@oracle.com

Re: [OMPI devel] RFC: use ISO C99 style struct initialization

2011-01-19 Thread Terry Dontje

Hopefully we'll find out tomorrow but I think I vaguely remember an 
issue with the Studio compilers and this type of initialization style.


--td

On 01/19/2011 05:22 PM, Nathan Hjelm wrote:
Done. I added the module orte/mca/debugger/dummy and I will remove it 
tomorrow.


-Nathan
HPC-3, LANL

On Wed, 19 Jan 2011, Jeff Squyres wrote:


+1 on Ralph and George's comments.

Want to make a dummy component somewhere that uses this kind of 
initialization and see what happens?  Put a test for the C99 
initialization style in configure.m4 to see if it works or not; MTT 
will then check this for all the compilers that we care about.
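
A probe of the kind suggested here could be as small as the program below. This is a sketch only: the struct and function names are hypothetical, and the real check would be wrapped in an Autoconf macro rather than called directly.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical module struct used only to probe compiler support. */
struct demo_module {
    const char *name;
    int         priority;
    int         flags;
};

/* C99 designated initializers: members may appear in any order, and
 * members that are not named (flags here) are zero-initialized. */
static const struct demo_module demo = {
    .priority = 10,
    .name     = "demo",
};

/* Returns 0 when the initializer produced the expected values. */
int demo_probe(void)
{
    assert(demo.flags == 0);
    return (strcmp(demo.name, "demo") == 0 && demo.priority == 10) ? 0 : 1;
}
```

A compiler that lacks support for this syntax fails at compile time, which is exactly the signal a configure test needs.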



On Jan 19, 2011, at 3:58 PM, Ralph Castain wrote:

I believe the majority of structs used in OMPI are actually declared 
to be opal objects of some flavor, so I'm not sure how much this 
will actually accomplish. Other than that, I have no real objection 
- either way works fine for me.



On Jan 19, 2011, at 12:29 PM, George Bosilca wrote:

I'm with you on that. Let's create a fake module using the ISO C99 
naming scheme, and leave it to MTT to figure out where it breaks!


george.

On Jan 19, 2011, at 14:23 , Nathan Hjelm wrote:

I don't know if this has been discussed before or if this will 
break Windows (or some obscure platform) support but I would like 
to start using the ISO C99 style for struct initialization (see 
section 6.7.8, example 10 in 
http://www.open-std.org/jtc1/sc22/wg14/www/docs/n1124.pdf). Using 
this style would make mca code much easier to read. Any thoughts? 
Would this break something?


Example:
struct module_foo {
    char *bar;
    int   baz;
};

struct module_foo foobar = {
    .bar = "foobar",
    .baz = 1
};

-Nathan
HPC-3, LANL



--
Jeff Squyres
jsquy...@cisco.com
For corporate legal information go to:
http://www.cisco.com/web/about/doing_business/legal/cri/





--
Oracle
Terry D. Dontje | Principal Software Engineer
Developer Tools Engineering | +1.781.442.2631
Oracle *- Performance Technologies*
95 Network Drive, Burlington, MA 01803
Email terry.don...@oracle.com

Re: [OMPI devel] OMPI 1.4.3 hangs in gather

2011-01-18 Thread Terry Dontje

On 01/18/2011 07:48 AM, Jeff Squyres wrote:
> IBCM is broken and disabled (has been for a long time).
>
> Did you mean RDMACM?
>
>
No I think I meant OMPI oob.

sorry,

-- 
Oracle
Terry D. Dontje | Principal Software Engineer
Developer Tools Engineering | +1.781.442.2631
Oracle *- Performance Technologies*
95 Network Drive, Burlington, MA 01803
Email terry.don...@oracle.com

Re: [OMPI devel] OMPI 1.4.3 hangs in gather

2011-01-18 Thread Terry Dontje

Could the issue have anything to do with how OMPI implements lazy 
connections with IBCM?  Does setting the mca parameter 
mpi_preconnect_all to 1 change things?


--td

On 01/16/2011 04:12 AM, Doron Shoham wrote:

Hi,

The gather hangs only in the linear_sync algorithm but works with the
basic_linear and binomial algorithms.
The gather algorithm is chosen dynamically depending on block size and
communicator size.
So, in the beginning, the binomial algorithm is chosen (communicator size
is larger than 60).
When increasing the message size, the linear_sync algorithm is chosen
(with small_segment_size).
When debugging on the cluster I saw that the linear_sync function is
called in an endless loop with a segment size of 1024.
This explains why the hang occurs in the middle of the run.
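
The selection described above can be pictured as a small decision function. The sketch below is illustrative only: the function and struct names are hypothetical, the thresholds are passed in rather than hard-coded because this thread quotes only the communicator-size cutoff (60) and the 1024 segment size, and it is not claimed to match the tuned collective component's source.

```c
#include <assert.h>
#include <stddef.h>

enum demo_gather_alg {
    DEMO_GATHER_BASIC_LINEAR,
    DEMO_GATHER_BINOMIAL,
    DEMO_GATHER_LINEAR_SYNC
};

/* Hypothetical tuning knobs; the values are supplied by the caller. */
struct demo_gather_limits {
    size_t sync_block_size;     /* blocks larger than this use linear_sync */
    int    large_comm_size;     /* communicators larger than this use binomial */
    size_t small_segment_size;  /* segment size handed to linear_sync */
};

struct demo_gather_choice {
    enum demo_gather_alg alg;
    size_t segment_size;        /* 0 when the algorithm is not segmented */
};

struct demo_gather_choice
demo_choose_gather(const struct demo_gather_limits *lim,
                   int comm_size, size_t block_size)
{
    struct demo_gather_choice c = { DEMO_GATHER_BASIC_LINEAR, 0 };

    assert(lim != NULL && comm_size > 0);
    if (block_size > lim->sync_block_size) {
        /* Large blocks: synchronized linear gather in fixed-size segments. */
        c.alg = DEMO_GATHER_LINEAR_SYNC;
        c.segment_size = lim->small_segment_size;
    } else if (comm_size > lim->large_comm_size) {
        /* Small blocks on a big communicator: binomial tree. */
        c.alg = DEMO_GATHER_BINOMIAL;
    }
    return c;
}
```

Because the algorithm switches as the message size crosses a threshold, a benchmark that sweeps message sizes can run cleanly for small sizes and only start failing partway through, which matches the symptom reported here.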

I still don't understand why RDMACM solves it or what causes
linear_sync to hang.

Again, in 1.5 it doesn't hang (maybe timing is different?).
I'm still trying to understand what the differences are in those areas
between 1.4.3 and 1.5.


BTW,
Choosing RDMACM fixes hangs and performance issues in all collective operations.

Thanks,
Doron


On Thu, Jan 13, 2011 at 9:44 PM, Shamis, Pavel  wrote:

RDMACM creates the same QPs with the same tunings as OOB, so I don't see how 
the CPC may affect performance.

Pavel (Pasha) Shamis
---
Application Performance Tools Group
Computer Science and Math Division
Oak Ridge National Laboratory






On Jan 13, 2011, at 2:15 PM, Jeff Squyres wrote:


+1 on what Pasha said -- if using rdmacm fixes the problem, then there's 
something else nefarious going on...

You might want to check padb with your hangs to see where all the processes are 
hung to see if anything obvious jumps out.  I'd be surprised if there's a bug 
in the oob cpc; it's been around for a long, long time; it should be pretty 
stable.

Do we create QP's differently between oob and rdmacm, such that perhaps they are 
"better" (maybe better routed, or using a different SL, or ...) when created 
via rdmacm?


On Jan 12, 2011, at 12:12 PM, Shamis, Pavel wrote:


RDMACM or OOB cannot affect the performance of this benchmark, since they are 
not involved in communication. So I'm not sure that the performance changes 
that you see are related to connection manager changes.
About oob - I'm not aware of any hang issues there; the code is very, very old, 
and we have not touched it for a long time.

Regards,

Pavel (Pasha) Shamis
---
Application Performance Tools Group
Computer Science and Math Division
Oak Ridge National Laboratory
Email: sham...@ornl.gov





On Jan 12, 2011, at 8:45 AM, Doron Shoham wrote:


Hi,

For the first problem, I can see that when using rdmacm for openib connection
setup (instead of oob) I get much better performance results (and no hangs!).

mpirun -display-map -np 64 -machinefile voltairenodes -mca btl
sm,self,openib -mca btl_openib_connect_rdmacm_priority 100
imb/src/IMB-MPI1 gather -npmin 64


#bytes    #repetitions  t_min[usec]  t_max[usec]  t_avg[usec]
0         1000          0.04         0.05         0.05
1         1000          19.64        19.69        19.67
2         1000          19.97        20.02        19.99
4         1000          21.86        21.96        21.89
8         1000          22.87        22.94        22.90
16        1000          24.71        24.80        24.76
32        1000          27.23        27.32        27.27
64        1000          30.96        31.06        31.01
128       1000          36.96        37.08        37.02
256       1000          42.64        42.79        42.72
512       1000          60.32        60.59        60.46
1024      1000          82.44        82.74        82.59
2048      1000          497.66       499.62       498.70
4096      1000          684.15       686.47       685.33
8192      519           544.07       546.68       545.85
16384     519           653.20       656.23       655.27
32768     519           704.48       707.55       706.60
65536     519           918.00       922.12       920.86
131072    320           2414.08      2422.17      2418.20
262144    160           4198.25      4227.58      4213.19
524288    80            7333.04      7503.99      7438.18
1048576   40            13692.60     14150.20     13948.75
2097152   20            30377.34     32679.15     31779.86
4194304   10            61416.70     71012.50     68380.04

How can the oob cause the hang? Isn't it only used to bring up the connection?
Does the oob play any part after the connections are made?

Thanks,
Doron

On Tue, Jan 11, 2011 at 2:58 PM, Doron Shoham  wrote:

Hi

All machines on the setup are IDataPlex with Nehalem 12 cores per node, 24GB  
memory.



· Problem 1 – OMPI 1.4.3 hangs in gather:



I’m trying to run IMB and gather operation with OMPI 1.4.3 (Vanilla).

It happens when np >= 64 and the message size exceeds 4k:

mpirun -np 64 -machinefile voltairenodes -mca btl sm,self,openib  
imb/src-1.4.2/IMB-MPI1 gather -npmin 64



voltairenodes consists of 64 machines.



#

# Benchmarking Gather

# #processes = 64

[OMPI devel] openib btl_openib_async_thread poll question

2010-12-21 Thread Terry Dontje

We're doing some testing with openib btl on a system with Solaris.  It 
looks like Solaris can return POLLIN|POLLRDNORM in revents from a poll 
call.  I looked at the manpages for Linux and it reads like Linux could 
possibly do this too.  However the code in btl_openib_async_thread that 
checks for valid revents is only checking for POLLIN and in the case it 
gets POLLIN|POLLRDNORM the btl ends up throwing an error.  I think 
erroring out on the POLLIN|POLLRDNORM case is a bug.


Does anyone feel otherwise and can explain to me why we should not 
consider POLLIN|POLLRDNORM as a valid condition?  I have the same 
question pertaining to POLLRDBAND too but I don't believe we've seen 
this set.
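
The difference between the two checks can be shown in a few lines. This is a sketch, not the code in btl_openib_async_thread: the helper names are hypothetical, and it assumes the existing check compares revents to POLLIN for equality, as the description above implies.

```c
/* Ask for the XSI poll flags (POLLRDNORM, POLLRDBAND) on strict-ISO builds. */
#ifndef _XOPEN_SOURCE
#define _XOPEN_SOURCE 700
#endif

#include <assert.h>
#include <poll.h>
#include <stdbool.h>

/* Strict form: only an exact POLLIN counts as valid. */
bool demo_revents_ok_strict(short revents)
{
    return revents == POLLIN;
}

/* Tolerant form: readable if POLLIN is set and no error bit is set.
 * Extra "readable" bits such as POLLRDNORM or POLLRDBAND are accepted. */
bool demo_revents_ok_masked(short revents)
{
    const short error_bits = POLLERR | POLLHUP | POLLNVAL;

    assert((POLLIN & error_bits) == 0);  /* POLLIN is not an error bit */
    return (revents & POLLIN) != 0 && (revents & error_bits) == 0;
}
```

With the masked form, POLLIN|POLLRDNORM is treated as ordinary readable data, while POLLERR, POLLHUP and POLLNVAL are still rejected.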


thanks,
--
Oracle
Terry D. Dontje | Principal Software Engineer
Developer Tools Engineering | +1.781.442.2631
Oracle *- Performance Technologies*
95 Network Drive, Burlington, MA 01803
Email terry.don...@oracle.com

Re: [OMPI devel] 1.5 plans

2010-11-30 Thread Terry Dontje


On 11/30/2010 10:10 AM, Joshua Hursey wrote:

(Insert jab at the definition of 'quickly' when talking about OMPI releases)

From the way I read Jeff's original email, it seems that we are trying to get 
v1.5 stable so we can start v1.7 in the next few months (3-5). The C/R 
functionality on the trunk is significantly different than that on the v1.5 (and 
more so with v1.4). So bringing these features over to the v1.5 branch will require a 
CMR that will look like re-syncing to the trunk (it requires the ORTE refresh, and 
a couple other odds and ends). Since the ORTE refresh was killed due to the size 
of the feature, so were the C/R features. So even though v1.5 is a feature 
branch, the C/R feature is locked out of it at the moment and pushed to v1.7.

Yeah, we have successfully deadlocked ourselves.  We have features that 
cannot go in because they rely on stuff we refuse to bring over because 
of stability, but at the same time we cannot force 1.5 to be 1.6 because 1.5 
isn't stable enough itself.  Quite a pickle.  I still believe a 
refresh/sync of trunk to 1.5 makes sense.  The only other solution is to 
start 1.7 and put 1.5 to bed.   Unfortunately there are some 
implications for Oracle if all the current stuff is put into 1.7 instead 
of 1.5.

So, from my perspective, there is now a push to hurry up on the v1.7 so users 
will have a release branch with the latest-n-greatest C/R functionality. 
Releasing v1.7 next summer would be fine with me, but pushing it further into 
the future seems bad to me.

Well, I think we need to really think about this carefully to make sure 
we do not end up in the same situation 6 months from now.

As a side comment:
The stable branch is a great idea for the production side of the house since it 
is more carefully crafted and maintained. The feature branch is a great idea 
for the researchers in the group to gain exposure for new features, and 
enhancements on old features (many of these require changes to internal APIs 
and data structures). From my perspective, a slow moving feature branch is no 
longer that useful to the research community since it becomes more and more 
painful to synchronize the trunk and branch the longer it takes for the feature 
branch to stabilize for release. So the question often becomes why bother. But 
this a longer discussion for another time maybe.

IMO, the problem is we ended up not stabilizing 1.5 quickly enough, thus 
causing so great a divergence that we are in the pickle we are in now.  
The whole idea was that we were to push stuff into 1.5 quickly.  If we cannot 
do that then we may want to reconsider how we handle releases again :-(.


--td

-- Josh

On Nov 30, 2010, at 9:36 AM, Terry Dontje wrote:


On 11/30/2010 09:00 AM, Jeff Squyres wrote:

On Nov 30, 2010, at 8:54 AM, Joshua Hursey wrote:



Can you make a v1.7 milestone on Trac, so I can move some of my tickets?


Done.


I have a question about Josh's recent ticket moves.  One of them mentions 1.5 
is stabilizing quickly.  Josh, can you clarify what you mean by quickly, because I 
think there will be a 1.5 release 3-6 months from now.  So does that fall into 
your quickly perspective?

--td

Some are CMRs, but a couple are defects, with fixes in development, that 
without those CMRs cannot be moved to v1.5.

Thanks,
Josh


On Nov 29, 2010, at 11:43 AM, Jeff Squyres wrote:



I'm about 2 weeks late on this email; apologies.  SC and Thanksgiving got in 
the way.

Per a discussion on the devel teleconf nearly 3 weeks ago, we have decided what 
to do with the v1.5 series:

- 1.5.1 will be a bug fix release.  There's 2 blocker bugs right now that need 
to be reviewed; those and the currently ready-to-commit major CMR are all that 
is planned for 1.5.1.  Hopefully, they could be ready by tonight.

- 1.5.2 (and successive releases) will be "normal" feature releases.  There's a 
bit of divergence between the trunk and the v1.5 branch, meaning that some porting of 
features may be required to get over to the v1.5 branch (FWIW, I think that many things 
will not require much porting at all -- but some will).  Many of the CMRs filed against 
v1.5.2 are still relevant; *some* of the features/bugs are still relevant.  We'll start 
[re-]examining the v1.5.2 tickets in more detail soon.  So feel free to apply to have 
your favorite feature brought over to the v1.5 branch.  Bigger features may be kept in 
the wings for v1.7 (e.g., the wholesale ORTE refresh for v1.5.x has been axed and will 
wait until v1.7).  There is a bunch of affinity work occurring on the trunk (and/or in hg 
branches) right now; we plan to bring all that stuff in to the v1.5 series when ready 
(probably 3+ months at the earliest -- especially with the December holidays delaying 
everything).  Once that's done, we can then probably start thinking about wrapping up the v1.5 series, converting it 
to its stable counterpart (1.6), and then branching for v1.7.

--
Jeff Squyres

jsquy...@cisco.com

For co

Re: [OMPI devel] 1.5 plans

2010-11-30 Thread Terry Dontje


On 11/30/2010 09:00 AM, Jeff Squyres wrote:

On Nov 30, 2010, at 8:54 AM, Joshua Hursey wrote:


Can you make a v1.7 milestone on Trac, so I can move some of my tickets?

Done.
I have a question about Josh's recent ticket moves.  One of them 
mentions 1.5 is stabilizing quickly.  Josh, can you clarify what you mean by 
quickly, because I think there will be a 1.5 release 3-6 months from 
now.  So does that fall into your quickly perspective?


--td

Some are CMRs, but a couple are defects, with fixes in development, that 
without those CMRs cannot be moved to v1.5.

Thanks,
Josh


On Nov 29, 2010, at 11:43 AM, Jeff Squyres wrote:


I'm about 2 weeks late on this email; apologies.  SC and Thanksgiving got in 
the way.

Per a discussion on the devel teleconf nearly 3 weeks ago, we have decided what 
to do with the v1.5 series:

- 1.5.1 will be a bug fix release.  There's 2 blocker bugs right now that need 
to be reviewed; those and the currently ready-to-commit major CMR are all that 
is planned for 1.5.1.  Hopefully, they could be ready by tonight.

- 1.5.2 (and successive releases) will be "normal" feature releases.  There's a 
bit of divergence between the trunk and the v1.5 branch, meaning that some porting of 
features may be required to get over to the v1.5 branch (FWIW, I think that many things 
will not require much porting at all -- but some will).  Many of the CMRs filed against 
v1.5.2 are still relevant; *some* of the features/bugs are still relevant.  We'll start 
[re-]examining the v1.5.2 tickets in more detail soon.  So feel free to apply to have 
your favorite feature brought over to the v1.5 branch.  Bigger features may be kept in 
the wings for v1.7 (e.g., the wholesale ORTE refresh for v1.5.x has been axed and will 
wait until v1.7).  There is a bunch of affinity work occurring on the trunk (and/or in hg 
branches) right now; we plan to bring all that stuff in to the v1.5 series when ready 
(probably 3+ months at the earliest -- especially with the December holidays delaying 
everything).  Once that's done, we can then probably start thinking about wrapping up the v1.5 series, converting it 
to its stable counterpart (1.6), and then branching for v1.7.

--
Jeff Squyres
jsquy...@cisco.com
For corporate legal information go to:
http://www.cisco.com/web/about/doing_business/legal/cri/





Joshua Hursey
Postdoctoral Research Associate
Oak Ridge National Laboratory
http://users.nccs.gov/~jjhursey







--
Oracle
Terry D. Dontje | Principal Software Engineer
Developer Tools Engineering | +1.781.442.2631
Oracle *- Performance Technologies*
95 Network Drive, Burlington, MA 01803
Email terry.don...@oracle.com

Re: [OMPI devel] [OMPI svn-full] svn:open-mpi r23998

2010-11-08 Thread Terry Dontje

Hmmm, it looks like you are right so my original change probably is the 
right thing then.


--td

On 11/08/2010 08:13 AM, Jeff Squyres wrote:

It doesn't look like <stdbool.h> is needed at all in libevent207.h.  Should it 
just be removed?


On Nov 8, 2010, at 6:18 AM, Terry Dontje wrote:


In light of pushing the event changes upstream to libevent, the changes to libevent207.h 
probably should be modified to look like event.h.  That is, wrap the 
include of <stdbool.h> with some ifdef for C++.  I did not do this in the original 
fix because everything pulling it in was also pulling in opal_config.h and getting 
stdbool.h in when it needed to.
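
The guard being discussed has the following shape. A minimal sketch, assuming a header that wants `bool` in both C and C++ translation units; `demo_is_set` is a hypothetical function, not part of libevent or OMPI:

```c
#include <assert.h>

/* C++ has a built-in bool, so <stdbool.h> is pulled in only when the file
 * is compiled as C. The commit log for r23998 says the change was needed
 * for Oracle C++ compilers. */
#if !(defined(c_plusplus) || defined(__cplusplus))
#include <stdbool.h>
#endif

/* Hypothetical API that uses bool in its signature. */
bool demo_is_set(unsigned int flags, unsigned int bit)
{
    assert(bit < 32);
    return (flags & (1u << bit)) != 0;
}
```

The same header then compiles as C (where `bool` comes from <stdbool.h>) and as C++ (where it is a keyword) without either side seeing a conflicting definition.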

Jeff, do you want me to change libevent207.h to the above?

--td

On 11/05/2010 02:58 PM, Jeff Squyres wrote:

This patch should be pushed upstream to libevent.

Terry / Ralph?



On Nov 5, 2010, at 2:54 PM, t...@osl.iu.edu wrote:



Author: tdd
Date: 2010-11-05 14:54:19 EDT (Fri, 05 Nov 2010)
New Revision: 23998
URL: https://svn.open-mpi.org/trac/ompi/changeset/23998


Log:
corrected stdbool.h inclusion to allow Oracle C++ compilers to work with OMPI
Text files modified:
   trunk/opal/mca/event/libevent207/libevent/include/event2/event.h | 4 +++-
   trunk/opal/mca/event/libevent207/libevent207.h   | 3 ---
   2 files changed, 3 insertions(+), 4 deletions(-)

Modified: trunk/opal/mca/event/libevent207/libevent/include/event2/event.h
==
--- trunk/opal/mca/event/libevent207/libevent/include/event2/event.h    (original)
+++ trunk/opal/mca/event/libevent207/libevent/include/event2/event.h    2010-11-05 14:54:19 EDT (Fri, 05 Nov 2010)
@@ -45,7 +45,9 @@
#include
#endif
#ifndef WIN32
-#include <stdbool.h>
+#if !(defined(c_plusplus) || defined(__cplusplus))
+#include <stdbool.h>
+#endif
#endif

#include

Modified: trunk/opal/mca/event/libevent207/libevent207.h
==
--- trunk/opal/mca/event/libevent207/libevent207.h  (original)
+++ trunk/opal/mca/event/libevent207/libevent207.h  2010-11-05 14:54:19 EDT (Fri, 05 Nov 2010)
@@ -42,9 +42,6 @@
#include
#include
#include
-#ifndef WIN32
-#include <stdbool.h>
-#endif

#include "opal/class/opal_object.h"
#include "opal/threads/mutex.h"
___
svn-full mailing list

svn-f...@open-mpi.org
http://www.open-mpi.org/mailman/listinfo.cgi/svn-full







--
Oracle
Terry D. Dontje | Principal Software Engineer
Developer Tools Engineering | +1.781.442.2631
Oracle *- Performance Technologies*
95 Network Drive, Burlington, MA 01803
Email terry.don...@oracle.com <mailto:terry.don...@oracle.com>

Re: [OMPI devel] [OMPI svn-full] svn:open-mpi r23998

2010-11-08 Thread Terry Dontje

In light of pushing the event changes upstream to libevent, the changes to 
libevent207.h probably should be modified to look like event.h.  That is, 
wrap the include of <stdbool.h> with some ifdef for C++.  I did not do this 
in the original fix because everything pulling it in was also pulling in 
opal_config.h and getting stdbool.h in when it needed to.


Jeff, do you want me to change libevent207.h to the above?

--td

On 11/05/2010 02:58 PM, Jeff Squyres wrote:

This patch should be pushed upstream to libevent.

Terry / Ralph?



On Nov 5, 2010, at 2:54 PM, t...@osl.iu.edu wrote:


Author: tdd
Date: 2010-11-05 14:54:19 EDT (Fri, 05 Nov 2010)
New Revision: 23998
URL: https://svn.open-mpi.org/trac/ompi/changeset/23998

Log:
corrected stdbool.h inclusion to allow Oracle C++ compilers to work with OMPI
Text files modified:
   trunk/opal/mca/event/libevent207/libevent/include/event2/event.h | 4 +++-
   trunk/opal/mca/event/libevent207/libevent207.h   | 3 ---
   2 files changed, 3 insertions(+), 4 deletions(-)

Modified: trunk/opal/mca/event/libevent207/libevent/include/event2/event.h
==
--- trunk/opal/mca/event/libevent207/libevent/include/event2/event.h    (original)
+++ trunk/opal/mca/event/libevent207/libevent/include/event2/event.h    2010-11-05 14:54:19 EDT (Fri, 05 Nov 2010)
@@ -45,7 +45,9 @@
#include
#endif
#ifndef WIN32
-#include <stdbool.h>
+#if !(defined(c_plusplus) || defined(__cplusplus))
+#include <stdbool.h>
+#endif
#endif

#include

Modified: trunk/opal/mca/event/libevent207/libevent207.h
==
--- trunk/opal/mca/event/libevent207/libevent207.h  (original)
+++ trunk/opal/mca/event/libevent207/libevent207.h  2010-11-05 14:54:19 EDT (Fri, 05 Nov 2010)
@@ -42,9 +42,6 @@
#include
#include
#include
-#ifndef WIN32
-#include <stdbool.h>
-#endif

#include "opal/class/opal_object.h"
#include "opal/threads/mutex.h"





--
Oracle
Terry D. Dontje | Principal Software Engineer
Developer Tools Engineering | +1.781.442.2631
Oracle *- Performance Technologies*
95 Network Drive, Burlington, MA 01803
Email terry.don...@oracle.com

Re: [OMPI devel] v1.5.1: one idea

2010-10-11 Thread Terry Dontje

On 10/11/2010 06:11 AM, Jeff Squyres wrote:

On Oct 10, 2010, at 7:49 AM, Terry Dontje wrote:

At first glance this sounds like a sane approach but didn't we start with this
same approach with 1.5.0? I know it was kind of required to do it for 1.5.0
but we did go off track with delivery. I believe to be successful at making a
deadline for 1.5.1 we need to consider the following. Do we think the initial
stabilization is going to take weeks or months?

I *think* weeks. The trunk is pretty stable right now. But then again, that's
why I'm asking here -- what do others think? Are there half-baked features in
the trunk that are not / nowhere near ready for the v1.5.1?

I don't have my half-baked features in the trunk, yet :-).

While we stabilize, what will be the rules for doing CMRs to 1.5.1? What will be
the rules for doing CMRs to 1.5.1 after stabilization?

I think the CMRs will be pretty much the same. However, we do reserve the right to have the more
aggressive CMRs -- e.g., something "big" can be "ok, Terry/CMR committer, you have
the v1.5 branch for 3 days. Bring your feature over to it." (might not be necessary if we
re-sync, but we still reserve the right to do it ;-) ).

Did I misunderstand the commitment we had for quick dot feature releases
earlier this year? The above sounds like we'll repeat the 1.5.0 release
schedule and possibly end up not releasing 1.5.1 until 8 months from
now. Unless the trunk doesn't have any major features/changes I'd
almost be more inclined to say the stabilized trunk merge should be
the 1.5.1 release, with a 1.5.2 planned around CMRs (no CMR, no putback
to 1.5.2). However, that really depends on the merge happening and on
how many widespread changes end up being put back into the trunk that
make backporting to the 1.5 branch difficult to impossible. I
would even be somewhat supportive of periodic trunk sync ups to
alleviate such backport pain.

--td

On 10/8/2010 5:13 PM, Jeff Squyres wrote:

As we discussed on the call last week, since there is already a bit of a
divergence between the trunk and the v1.5 branch, how's this for a wild idea:

What if we re-sync the entire trunk to the v1.5 branch, stabilize that,
and call it v1.5.1?

The assumption here is that it will be [far] easier to just re-sync the trunk
to the v1.5 branch than to try to bring over stuff in a piecemeal fashion.

There's a *bunch* of new stuff on the trunk that is not on the v1.5 branch -- there's
more than enough "meat" to call it a new release.

*** Put differently: is there anything on the trunk that is *not* ready to go
to the v1.5 series?

___
devel mailing list
de...@open-mpi.org
http://www.open-mpi.org/mailman/listinfo.cgi/devel

--
Oracle
Terry D. Dontje | Principal Software Engineer
Developer Tools Engineering | +1.781.442.2631
Oracle * - Performance Technologies*
95 Network Drive, Burlington, MA 01803
Email terry.don...@oracle.com <mailto:terry.don...@oracle.com>

Re: [hwloc-devel] Support for solaris lgrp_affinity_set

2010-08-20 Thread Terry Dontje


Samuel Thibault wrote:

Hello,

Terry Dontje, on Fri 06 Aug 2010 13:11:30 -0400, wrote:
  
Is anyone looking at replacing the Solaris processor_bind calls with 
lgrp_affinity_set calls in hwloc?



I eventually added support for using lgrp_affinity_set(), though not as a
replacement for processor_bind since, AIUI, lgrp_affinity_set() doesn't permit
specifying precise processors.

  
I believe (and I might be wrong here) that there are premade lgrps that 
correspond to precise processors.  In another life here at Oracle I've 
used the OpenSolaris plgrp command to bind processes to lgrps that contained 
specific processors.  This is what led me to believe that
using lgrp_affinity_set() might help in binding to multiple 
processors.


Unfortunately I don't have the exact particulars to give you.  If I get 
some time in the next couple weeks I'll see if I can come up with some 
example code that might be able to do the above.


--
Oracle
Terry D. Dontje | Principal Software Engineer
Developer Tools Engineering | +1.781.442.2631
Oracle * - Performance Technologies*
95 Network Drive, Burlington, MA 01803
Email terry.don...@oracle.com <mailto:terry.don...@oracle.com>

Re: [OMPI devel] Trunk Commit Heads-up: New Common Shared Memory Component

2010-08-12 Thread Terry Dontje

Sorry Rich, I didn't realize there was a graph attached at the end of 
the message.  In other words my comments are not applicable because I really 
didn't know you were asking about the graph.  I agree it would be nice 
to know what the graph was plotting.


--td
Terry Dontje wrote:

Graham, Richard L. wrote:

Stupid question:
   What is being plotted, and what are the units ?

Rich
  
MB of resident and shared memory as reported by top (on Linux).  The 
values for each of the process-count cases seem to be the same between 
posix, mmap and sysv.


--td

On 8/11/10 3:15 PM, "Samuel K. Gutierrez" <sam...@lanl.gov> wrote:

Hi Terry,

On Aug 11, 2010, at 12:34 PM, Terry Dontje wrote:


 I've done some minor testing on Linux looking at resident and shared memory 
sizes for np=4, 8 and 16 jobs.  I could not see any appreciable differences in 
sizes in the process between sysv, posix or mmap usage in the SM btl.

 So I am still somewhat nonplussed about making this the default.  It seems 
like the biggest gain out of using posix might be one doesn't need to worry 
about the location of the backing file.  This seems kind of frivolous to me 
since there is a warning that happens if the backing file is not memory based.

If I'm not mistaken, the warning is only issued if the backing file is stored 
on the following file systems: Lustre, NFS, Panasas, and GPFS  (see: 
opal_path_nfs in opal/util/path.c).  Based on the performance numbers that 
Sylvain provided on June 9th of this year (see attached),  there was a 
performance difference between mmap on /tmp and mmap on a tmpfs-like file 
system (/dev/shm in that particular case).  Using the new POSIX component 
provides us with a portable way to provide similar shared memory performance 
gains without having to worry about where the OMPI session directory is rooted.
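
As a rough illustration of the kind of check described above, a Linux-only sketch might compare the statfs() file system type of a path against a short list of network file system magic numbers. This is not the actual opal_path_nfs implementation; the function names and the SKETCH_* macros are hypothetical, and the magic values are assumptions taken from commonly published constants that should be verified before use.

```c
#include <sys/vfs.h>   /* statfs(); Linux-specific */

/* Magic numbers assumed from commonly published values; verify before use. */
#define SKETCH_NFS_MAGIC     0x6969UL
#define SKETCH_LUSTRE_MAGIC  0x0BD00BD0UL
#define SKETCH_GPFS_MAGIC    0x47504653UL
#define SKETCH_PANFS_MAGIC   0xAAD7AAEAUL

/* Hypothetical helper: 1 if the f_type value is on the network-FS list. */
int is_network_fs_magic(unsigned long f_type)
{
    switch (f_type & 0xFFFFFFFFUL) {   /* ignore any sign extension */
    case SKETCH_NFS_MAGIC:
    case SKETCH_LUSTRE_MAGIC:
    case SKETCH_GPFS_MAGIC:
    case SKETCH_PANFS_MAGIC:
        return 1;
    default:
        return 0;
    }
}

/* Hypothetical helper: 1 if path is on a listed network FS, 0 if not,
 * -1 if statfs() fails (for example, the path does not exist). */
int path_on_network_fs(const char *path)
{
    struct statfs buf;
    if (statfs(path, &buf) != 0) {
        return -1;
    }
    return is_network_fs_magic((unsigned long)buf.f_type);
}
```

Note that a check of this shape says nothing about a local, disk-backed /tmp, which is the case Sam is pointing at: the backing file can be slow without being on any of the listed file systems.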

--
Samuel K. Gutierrez
Los Alamos National Laboratory

[attached graph; image not preserved]


 I am still working on testing the code on Solaris but I don't imagine I will see 
anything that will change my mind.

 --td

 Samuel K. Gutierrez wrote:
Hi Rich,

 It's a modification to the existing common sm component.  The modifications do 
include the addition of a new POSIX shared memory facility, however.

 Sam

 On Aug 11, 2010, at 10:05 AM, Graham, Richard L. wrote:


Is this a modification of the existing component, or a new component ?

 Rich


 On 8/10/10 10:52 AM, "Samuel K. Gutierrez" <sam...@lanl.gov> wrote:

 Hi,

 I wanted to give everyone a heads-up about a new POSIX shared memory component
 that has been in the works for a while now and is ready to be pushed into the
 trunk.

 http://bitbucket.org/samuelkgutierrez/ompi_posix_sm_new

 Some highlights:
 o New posix component now the new default.
     o May address some of the shared memory performance issues users encounter
       when OMPI's session directories are inadvertently placed on a non-local
       filesystem.
 o Silent component failover.
     o In the default case, if the posix component fails initialization,
       mmap will be selected.
 o The sysv component will only be queried for selection if it is placed before
   the mmap component (for example, -mca mpi_common_sm sysv,posix,mmap).  In the
   default case, sysv will never be queried/selected.
 o Per some on-list discussion, now unlinking mmaped file in both mmap and posix
   components (see: "System V Shared Memory for Open MPI: Request for Community
   Input and Testing" thread).
 o Assuming local process homogeneity with respect to all utilized shared
   memory facilities. That is, if one local process deems a particular shared
   memory facility acceptable, then ALL local processes should be able to
   utilize that facility. As it stands, this is an important point because one
   process dictates to all other local processes which common sm component will
   be selected based on its own, local run-time test.
 o Addressed some of George's code reuse concerns.
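
 The failover behaviour listed above can be pictured with a small sketch. It assumes (this is one reading of the highlights, not the actual common sm code) that components are tried in the listed order and the first one whose initialization succeeds is selected, so anything listed after a component that succeeds is never queried. All type and function names below are hypothetical.

```c
#include <stddef.h>

/* Hypothetical component descriptor: init returns 0 on success. */
typedef int (*sketch_init_fn)(void);

struct sketch_sm_component {
    const char     *name;
    sketch_init_fn  init;
};

/* Stand-in init functions so the sketch can be exercised. */
int sketch_init_ok(void)   { return 0; }
int sketch_init_fail(void) { return -1; }

/* Walk the list in priority order and return the first component whose
 * init succeeds; failed components are skipped silently.  Returns NULL
 * if every component fails. */
const struct sketch_sm_component *
sketch_select_sm(const struct sketch_sm_component *list, size_t n)
{
    for (size_t i = 0; i < n; ++i) {
        if (list[i].init != NULL && list[i].init() == 0) {
            return &list[i];
        }
    }
    return NULL;
}
```

 Under this reading, a default order of posix then mmap falls back to mmap when posix fails, and sysv is only ever tried when the user lists it ahead of mmap.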

 If there are no major objections by August 17th, I'll commit the code after the
 Tuesday morning conference call.

 Thanks!

 --
 Samuel K. Gutierrez
 Los Alamos National Laboratory








--
Oracle
Terry D. Dontje | Principal Software Engineer
Developer Tools Engi

Re: [OMPI devel] Trunk Commit Heads-up: New Common Shared Memory Component

2010-08-12 Thread Terry Dontje


Will do.

--td
Samuel K. Gutierrez wrote:

Hi Terry,

One more thing...  Before testing on Solaris 10, could you please 
update (I just committed a Solaris 10 fix).


Thanks,

--
Samuel K. Gutierrez
Los Alamos National Laboratory 


On Aug 11, 2010, at 1:15 PM, Samuel K. Gutierrez wrote:


Hi Terry,

On Aug 11, 2010, at 12:34 PM, Terry Dontje wrote:

I've done some minor testing on Linux looking at resident and shared 
memory sizes for np=4, 8 and 16 jobs.  I could not see any 
appreciable differences in sizes in the process between sysv, posix 
or mmap usage in the SM btl.


So I am still somewhat nonplussed about making this the default.  
It seems like the biggest gain out of using posix might be one 
doesn't need to worry about the location of the backing file.  This 
seems kind of frivolous to me since there is a warning that happens 
if the backing file is not memory based.


If I'm not mistaken, the warning is only issued if the backing file 
is stored on the following file systems: Lustre, NFS, Panasas, and 
GPFS  (see: opal_path_nfs in opal/util/path.c).  Based on the 
performance numbers that Sylvain provided on June 9th of this year 
(see attached),  there was a performance difference between mmap on 
/tmp and mmap on a tmpfs-like file system (/dev/shm in that 
particular case).  Using the new POSIX component provides us with a 
portable way to provide similar shared memory performance gains 
without having to worry about where the OMPI session directory is rooted.


--
Samuel K. Gutierrez
Los Alamos National Laboratory 






I am still working on testing the code on Solaris but I don't imagine I 
will see anything that will change my mind.


--td

Samuel K. Gutierrez wrote:

Hi Rich,

It's a modification to the existing common sm component.  The 
modifications do include the addition of a new POSIX shared memory 
facility, however.


Sam

On Aug 11, 2010, at 10:05 AM, Graham, Richard L. wrote:

Is this a modification of the existing component, or a new 
component ?


Rich


On 8/10/10 10:52 AM, "Samuel K. Gutierrez" <sam...@lanl.gov> wrote:

Hi,

I wanted to give everyone a heads-up about a new POSIX shared memory component
that has been in the works for a while now and is ready to be pushed into the
trunk.

http://bitbucket.org/samuelkgutierrez/ompi_posix_sm_new

Some highlights:
o New posix component now the new default.
    o May address some of the shared memory performance issues users encounter
      when OMPI's session directories are inadvertently placed on a non-local
      filesystem.
o Silent component failover.
    o In the default case, if the posix component fails initialization,
      mmap will be selected.
o The sysv component will only be queried for selection if it is placed before
  the mmap component (for example, -mca mpi_common_sm sysv,posix,mmap).  In the
  default case, sysv will never be queried/selected.
o Per some on-list discussion, now unlinking mmaped file in both mmap and posix
  components (see: "System V Shared Memory for Open MPI: Request for Community
  Input and Testing" thread).
o Assuming local process homogeneity with respect to all utilized shared
  memory facilities. That is, if one local process deems a particular shared
  memory facility acceptable, then ALL local processes should be able to
  utilize that facility. As it stands, this is an important point because one
  process dictates to all other local processes which common sm component will
  be selected based on its own, local run-time test.
o Addressed some of George's code reuse concerns.

If there are no major objections by August 17th, I'll commit the code after the
Tuesday morning conference call.

Thanks!

--
Samuel K. Gutierrez
Los Alamos National Laboratory








--

Terry D. Dontje | Principal Software Engineer
Developer Tools Engineering | +1.650.633.7054
Oracle * - Performance Technologies*
95 Network Drive, Burlington, MA 01803
Email terry.don...@oracle.com <mailto:terry.don...@oracle.com>


[hwloc-devel] Support for solaris lgrp_affinity_set

2010-08-06 Thread Terry Dontje

Is anyone looking at replacing the Solaris processor_bind calls with 
lgrp_affinity_set calls in hwloc?


--
Oracle
Terry D. Dontje | Principal Software Engineer
Developer Tools Engineering | +1.650.633.7054
Oracle * - Performance Technologies*
95 Network Drive, Burlington, MA 01803
Email terry.don...@oracle.com

Re: [OMPI devel] RFC: Add new Solaris sysinfo component

2010-08-03 Thread Terry Dontje


Graham, Richard L. wrote:

Why do we need an RFC for this sort of component ?  Seems self contained.

  
Probably don't, just giving a heads up. 


--td


Rich


On 8/3/10 6:59 AM, "Terry Dontje" <terry.don...@oracle.com> wrote:

WHAT:  Add new Solaris sysinfo component

WHY:  To allow OPAL access to chip type and model information when running on 
Solaris OS.

WHERE: opal/mca/sysinfo/solaris

WHEN:  for 1.5.1

TIMEOUT:  Aug 10, 2010

-

MORE DETAILS:

There is a sysinfo framework that has a component for Linux to expose the chip 
type and model information to OPAL.  This RFC proposes adding a 
Solaris component to expose the same information.  The Linux component also 
exposes the number of processes and the amount of memory on a node; however, the 
first version of the Solaris component will not expose this information because 
it will be easier to do so with hwloc (which has not yet been integrated to 
provide such info).

  



--
Oracle
Terry D. Dontje | Principal Software Engineer
Developer Tools Engineering | +1.650.633.7054
Oracle * - Performance Technologies*
95 Network Drive, Burlington, MA 01803
Email terry.don...@oracle.com <mailto:terry.don...@oracle.com>

Re: [OMPI devel] RFC: Remove all other paffinity components

2010-05-18 Thread Terry Dontje

Jeff Squyres wrote:

Just chatted with Ralph about this on the phone and he came up with a slightly
better compromise...

He points out that we really don't need *all* of the hwloc API (there's a
bajillion tiny little accessor functions). We could provide a steady,
OPAL/ORTE/OMPI-specific API (probably down in opal/util or somesuch) with a
dozen or two (or whatever) functions that we really need. These functions can
either call their back-end hwloc counterparts or they could do something safe
if hwloc is not present / not supported / etc.

That would alleviate the need to put #if OPAL_HAVE_HWLOC elsewhere in the code base.
But the code calling opal_hwloc_() needs to be able to gracefully handle
the failure case where it returns OPAL_ERR_NOT_SUPPORTED (etc.).
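
A minimal sketch of what such a wrapper and a caller might look like is given below. Everything in it is an assumption made for illustration: the SKETCH_* macros stand in for OPAL_HAVE_HWLOC and the OPAL error codes, the function names are invented, and the hwloc-backed branch is deliberately left out.

```c
/* All names and values here are illustrative assumptions, not OPAL's. */
#ifndef SKETCH_HAVE_HWLOC
#define SKETCH_HAVE_HWLOC 0
#endif

#define SKETCH_SUCCESS            0
#define SKETCH_ERR_NOT_SUPPORTED  (-1)

/* Hypothetical wrapper: the only place that tests SKETCH_HAVE_HWLOC. */
int sketch_hwloc_get_num_cores(int *num_cores)
{
#if SKETCH_HAVE_HWLOC
    /* The hwloc-backed implementation would go here. */
#error "hwloc-backed path intentionally omitted from this sketch"
#else
    (void)num_cores;                    /* nothing to report */
    return SKETCH_ERR_NOT_SUPPORTED;    /* do something safe */
#endif
}

/* Hypothetical caller: no #if needed, it just handles the failure case. */
int sketch_pick_num_workers(void)
{
    int cores = 0;
    if (sketch_hwloc_get_num_cores(&cores) != SKETCH_SUCCESS || cores < 1) {
        return 1;                       /* conservative default */
    }
    return cores;
}
```

The point of the design is that the preprocessor test lives in one wrapper, and callers only need to cope with a "not supported" return value.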

The above sounds like you are replacing the whole paffinity framework
with hwloc. Is that true? Or are the hwloc accessors you are talking
about unrelated to paffinity?

--td

On May 17, 2010, at 8:25 PM, Jeff Squyres (jsquyres) wrote:

On May 17, 2010, at 7:59 PM, Barrett, Brian W wrote:

HWLOC could be extended to support Red Storm, probably, but we don't have the need or time to do such an implementation.

Fair enough.

Given that, I'm not really picky about what the method of not breaking an
existing supported platform is, but I think having HAVE_HWLOC defines
everywhere is a bad idea...

We need a mechanism to have hwloc *not* be there, particularly for embedded
environments -- where hwloc would add no value. This is apparently just like
Red Storm, but even worse because we need to keep the memory footprint down as
much as possible (libhwloc.so.0.0 on linux is 104KB -- libhwloc.a is 139KB --
both are big numbers when you only have a few MB of usable RAM). So even
leaving stubs doesn't seem like a good idea -- they'll take up space, too. And
the hwloc API is fairly large -- maintaining stubs for all the API functions
could be a daunting task.

I think embedding is the main reason I can't think of any better idea than #if
OPAL_HAVE_HWLOC.

I anticipate that hwloc usage would be fairly localized in the OMPI code base:

int btl_sm_setup_stuff(...)
{
#if OPAL_HAVE_HWLOC
    ...do interesting hwloc things...
    ...setup stuff on btl_sm_component...
    btl_sm_component.have_hwloc = 1;
#else
    btl_sm_component.have_hwloc = 0;
#endif
}

int btl_sm_other_stuff(...)
{
    if (btl_sm_component.have_hwloc) {
        ...use the hwloc info...
    }
}

But I'm certainly open to other ideas -- got any?

--
Jeff Squyres
jsquy...@cisco.com
For corporate legal information go to:
http://www.cisco.com/web/about/doing_business/legal/cri/


--
Oracle
Terry D. Dontje | Principal Software Engineer
Developer Tools Engineering | +1.650.633.7054
Oracle * - Performance Technologies*
95 Network Drive, Burlington, MA 01803
Email terry.don...@oracle.com

Re: [OMPI devel] System V Shared Memory for Open MPI: Request for Community Input and Testing

2010-05-05 Thread Terry Dontje


Jeff Squyres wrote:

On May 4, 2010, at 9:53 AM, Ashley Pittman wrote:

  

Point noted.  But actually -- can you give specific reasons as to why a user should care? 
 Keep in mind that this would be a short-lived fork'ed process -- not "spawn" 
in the MPI sense of the word.
  

You might be running the job under Valgrind or another debugger, BLCR has some 
issues with fork as I remember, and traditionally there have been IB mapping 
issues here as well.  I'm sure you could make a case against any of those 
points if you wanted to, but I think the argument stands: doing this kind of 
run-time check shouldn't be needed.



Mmm; good points (especially Valgrind).  BLCR and OpenFabrics verbs shouldn't 
be much of an issue here, but I can see that there might be unexpectedness if 
you're running under Valgrind or some other debugger.
  
Couldn't you also run into problems if a job is running under an RM that 
is enforcing a number of processes limit on the job?


--td
  

It might be possible, however, to construct the code so that if it failed to 
initialise it just wasn't used rather than aborting the job, which would have 
much the same effect as a run-time test but without having to fork new 
processes and create short-lived shared memory regions.



That's how most of the network transports are in OMPI today -- if they fail to 
init, they are just skipped.

The problem here is that you really need 2 processes to do this test.  I 
suppose it could be done with local ranks 0 and 1 instead of forking a new 
process -- they would just need to communicate via RML to sync up, I suppose.

  

I should of course have said fork where I mentioned spawn above to avoid any 
confusion; spawn has a specific meaning in the context of MPI.

I still think a better understanding of the issue is required before any 
decision here is made though. I'm surprised by Samuel's description of the 
problem because it's not how I remember it, and from what Chris says it doesn't 
reflect what is in the Linux Git code either.  I'd like to see why there is an 
apparent difference in behaviour before a decision is made to only support one.



There's no intent to only support sysv or mmap.  Samuel's work was to extend 
OMPI to support sysv in the case where it would be advantageous (e.g., 
guaranteed cleanup of the shmem segment).  The mmap stuff is definitely not 
going to be removed.

  



--
Oracle
Terry D. Dontje | Principal Software Engineer
Developer Tools Engineering | +1.650.633.7054
Oracle * - Performance Technologies*
95 Network Drive, Burlington, MA 01803
Email terry.don...@oracle.com

Re: [OMPI devel] System V Shared Memory for Open MPI: Request for Community Input and Testing

2010-05-04 Thread Terry Dontje


Ralph Castain wrote:


On May 4, 2010, at 3:45 AM, Terry Dontje wrote:

Is a configure-time test good enough?  For example, are all Linuxes 
the same in this regard?  That is, if you built OMPI on RH and it 
configured in the new SysV SM, will those bits actually run on other 
Linux systems correctly?  I think Jeff had hinted at this 
when suggesting this may need to be a runtime test. 



I don't think we have ever enforced that requirement, nor am I sure 
the current code would meet it. We have a number of components that 
test for ability to build, but don't check again at run-time.


Generally, the project has followed the philosophy of "build on the 
system you intend to run on".


There is at least one binary distribution that is built on one Linux 
and can be installed on several others.  That is the reason I 
bring up the above.  The community can take the stance that that one 
distribution does not matter for this case or needs to handle it on its 
own.  In the grand scheme of things it might not matter but I wanted to 
at least stand up and be heard.


--td

--td

Samuel K. Gutierrez wrote:

Hi All,

New configure-time test added - thanks for the suggestion, Jeff.  
Update and give it a whirl.


Ethan - could you please try again?  This time, I'm hoping sysv 
support will be disabled ;-).


Thanks!

--
Samuel K. Gutierrez
Los Alamos National Laboratory

On May 3, 2010, at 9:18 AM, Samuel K. Gutierrez wrote:


Hi Jeff,

Sounds like a plan :-).

Thanks!

--
Samuel K. Gutierrez
Los Alamos National Laboratory

On May 3, 2010, at 9:12 AM, Jeff Squyres wrote:

It might well be that you need a configure test to determine 
whether this behavior occurs or not.  Heck, it may even need to be 
a run-time test!  Hrm.


Write a small C program that does something like the following 
(this is off the top of my head):


fork a child
child goes to sleep immediately
sysv alloc a segment
attach to it
ipc rm it
parent wakes up child
child tries to attach to segment

If that succeeds, then all is good.  If not, then don't use this 
stuff.
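
The steps above could be turned into a small probe along the following lines. This is a sketch under stated assumptions rather than the test that was actually added to configure: the function name is hypothetical, the child "sleeps" by blocking on a pipe, and the segment id is passed to the child through that pipe.

```c
#include <sys/ipc.h>
#include <sys/shm.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * Returns 1 if a child can still attach to a segment after IPC_RMID,
 * 0 if the attach is refused, and -1 if the probe itself could not run.
 */
int sysv_attach_after_rmid(void)
{
    int fds[2];
    if (pipe(fds) != 0) {
        return -1;
    }

    pid_t pid = fork();
    if (pid < 0) {
        close(fds[0]);
        close(fds[1]);
        return -1;
    }

    if (pid == 0) {
        /* Child: "sleep" by blocking on the pipe until the parent sends the id. */
        int id = -1;
        close(fds[1]);
        if (read(fds[0], &id, sizeof(id)) != (ssize_t)sizeof(id) || id < 0) {
            _exit(2);                   /* parent could not set things up */
        }
        void *p = shmat(id, NULL, 0);
        if (p == (void *)-1) {
            _exit(1);                   /* attach after removal was refused */
        }
        shmdt(p);
        _exit(0);
    }

    /* Parent: allocate, attach, then mark the segment for removal. */
    close(fds[0]);
    int id = shmget(IPC_PRIVATE, 4096, IPC_CREAT | 0600);
    void *addr = (void *)-1;
    if (id >= 0) {
        addr = shmat(id, NULL, 0);
        shmctl(id, IPC_RMID, NULL);     /* always mark it, so nothing leaks */
    }

    /* Wake the child by sending the id (or -1 if setup failed). */
    int msg = (addr != (void *)-1) ? id : -1;
    ssize_t sent = write(fds[1], &msg, sizeof(msg));
    close(fds[1]);

    int status = 0;
    pid_t waited = waitpid(pid, &status, 0);
    if (addr != (void *)-1) {
        shmdt(addr);                    /* last detach frees the segment */
    }

    if (msg < 0 || sent != (ssize_t)sizeof(msg) ||
        waited != pid || !WIFEXITED(status)) {
        return -1;
    }
    if (WEXITSTATUS(status) == 0) {
        return 1;
    }
    return (WEXITSTATUS(status) == 1) ? 0 : -1;
}
```

The result depends on the system and its configuration, so a caller would treat anything other than 1 as "don't use this stuff".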



On May 3, 2010, at 10:55 AM, Samuel K. Gutierrez wrote:


Hi all,

Does anyone know of a relatively portable solution for querying a
given system for the shmctl behavior that I am relying on, or is this
going to be a nightmare?  Because, if I am reading this thread
correctly, the presence of shmget and Linux is not sufficient for
determining an adequate level of sysv support.

Thanks!

--
Samuel K. Gutierrez
Los Alamos National Laboratory

On May 2, 2010, at 7:48 AM, N.M. Maclaren wrote:


On May 2 2010, Ashley Pittman wrote:

On 2 May 2010, at 04:03, Samuel K. Gutierrez wrote:

As to performance there should be no difference in use between sys-V
shared memory and file-backed shared memory, the instructions
issued and the MMU flags for the page should both be the same so
the performance should be identical.


Not necessarily, and possibly not so even for far-future Linuces.
On at least one system I used, the poxious kernel wrote the complete
file to disk before returning - all right, it did that for System V
shared memory, too, just to a 'hidden' file!  But, if I recall, on
another it did that only for file-backed shared memory - however, it's
a decade ago now and I may be misremembering.

Of course, that's a serious issue mainly for large segments.  I was
using multi-GB ones.  I don't know how big the ones you need are.


The one area you do need to keep an eye on for performance is on
numa machines where it's important which process on a node touches
each page first, you can end up using different areas (pages, not
regions) for communicating in different directions between the same
pair of processes. I don't believe this is any different to mmap
backed shared memory though.


On some systems it may be, but in bizarre, inconsistent, undocumented
and unpredictable ways :-(  Also, there are usually several system (and
sometimes user) configuration options that change the behaviour, so you
have to allow for that.  My experience of trying to use those is that
different uses have incompatible requirements, and most of the critical
configuration parameters apply to ALL uses!

In my view, the configuration variability is the number one nightmare
for trying to write portable code that uses any form of shared memory.

ARMCI seem to agree.


Because of this, sysv support may be limited to Linux systems - that is,
until we can get a better sense of which systems provide the shmctl
IPC_RMID behavior that I am relying on.


And, I suggest, whether they have an evil gotcha on one of the areas that
Ashley Pittman noted.


Regards,
Nick Maclaren.






--
Jeff Squyres
jsquy...

Re: [OMPI devel] System V Shared Memory for Open MPI: Request for Community Input and Testing

2010-05-04 Thread Terry Dontje

Is a configure-time test good enough?  For example, are all Linuxes the 
same in this regard?  That is, if you built OMPI on RH and it configured 
in the new SysV SM, will those bits actually run on other Linux systems 
correctly?  I think Jeff had hinted at this when suggesting 
this may need to be a runtime test. 


--td

Samuel K. Gutierrez wrote:

Hi All,

New configure-time test added - thanks for the suggestion, Jeff.  
Update and give it a whirl.


Ethan - could you please try again?  This time, I'm hoping sysv 
support will be disabled ;-).


Thanks!

--
Samuel K. Gutierrez
Los Alamos National Laboratory

On May 3, 2010, at 9:18 AM, Samuel K. Gutierrez wrote:


Hi Jeff,

Sounds like a plan :-).

Thanks!

--
Samuel K. Gutierrez
Los Alamos National Laboratory

On May 3, 2010, at 9:12 AM, Jeff Squyres wrote:

It might well be that you need a configure test to determine whether 
this behavior occurs or not.  Heck, it may even need to be a 
run-time test!  Hrm.


Write a small C program that does something like the following (this 
is off the top of my head):


fork a child
child goes to sleep immediately
sysv alloc a segment
attach to it
ipc rm it
parent wakes up child
child tries to attach to segment

If that succeeds, then all is good.  If not, then don't use this stuff.


On May 3, 2010, at 10:55 AM, Samuel K. Gutierrez wrote:


Hi all,

Does anyone know of a relatively portable solution for querying a
given system for the shmctl behavior that I am relying on, or is this
going to be a nightmare?  Because, if I am reading this thread
correctly, the presence of shmget and Linux is not sufficient for
determining an adequate level of sysv support.

Thanks!

--
Samuel K. Gutierrez
Los Alamos National Laboratory

On May 2, 2010, at 7:48 AM, N.M. Maclaren wrote:


On May 2 2010, Ashley Pittman wrote:

On 2 May 2010, at 04:03, Samuel K. Gutierrez wrote:

As to performance there should be no difference in use between sys-V
shared memory and file-backed shared memory, the instructions
issued and the MMU flags for the page should both be the same so
the performance should be identical.


Not necessarily, and possibly not so even for far-future Linuces.
On at least one system I used, the poxious kernel wrote the complete
file to disk before returning - all right, it did that for System V
shared memory, too, just to a 'hidden' file!  But, if I recall, on
another it did that only for file-backed shared memory - however, it's
a decade ago now and I may be misremembering.

Of course, that's a serious issue mainly for large segments.  I was
using multi-GB ones.  I don't know how big the ones you need are.


The one area you do need to keep an eye on for performance is on
numa machines where it's important which process on a node touches
each page first, you can end up using different areas (pages, not
regions) for communicating in different directions between the same
pair of processes. I don't believe this is any different to mmap
backed shared memory though.


On some systems it may be, but in bizarre, inconsistent, undocumented
and unpredictable ways :-(  Also, there are usually several system (and
sometimes user) configuration options that change the behaviour, so you
have to allow for that.  My experience of trying to use those is that
different uses have incompatible requirements, and most of the critical
configuration parameters apply to ALL uses!

In my view, the configuration variability is the number one nightmare
for trying to write portable code that uses any form of shared memory.

ARMCI seem to agree.


Because of this, sysv support may be limited to Linux systems - that is,
until we can get a better sense of which systems provide the shmctl
IPC_RMID behavior that I am relying on.


And, I suggest, whether they have an evil gotcha on one of the areas that
Ashley Pittman noted.


Regards,
Nick Maclaren.






--
Jeff Squyres
jsquy...@cisco.com
For corporate legal information go to:
http://www.cisco.com/web/about/doing_business/legal/cri/





--
Oracle
Terry D. Dontje | Principal Software Engineer
Developer Tools Engineering | +1.650.633.7054
Oracle * - Performance Technologies*
95 Network Drive, Burlington, MA 01803
Email terry.don...@oracle.com

Re: [OMPI devel] RFC: Deprecate rankfile?

2010-04-16 Thread Terry Dontje


Ralph Castain wrote:
To be clear, I wasn't implying anyone would intentionally break 
rank_file. However, it is rarely (if ever?) tested before we release - 
AFAIK, none of the MTT tests run by the community test this feature. 
Thus, it inevitably breaks without detection as changes are made 
elsewhere in the system. We typically don't know it is broken until 
someone complains about it, which usually is several months after the 
release.



Fair enough.  I guess my yellow fever shot has made me cranky today.

So I'll stand by my "self deprecate" comment. It has been the history 
of this feature, and I don't see anything changing to improve that 
situation.


Now if you implement a replacement... :-)
I'll get right on that after you approve the RFC that I am also supposed 
to send out :-).


-td


On Apr 16, 2010, at 5:08 AM, Terry Dontje wrote:


Jeff Squyres wrote:

On Apr 16, 2010, at 6:43 AM, Terry Dontje wrote:

  

If you are suggesting that you will write code that breaks a current rankfile 
feature (note I am not talking about adding a new feature that isn't supported 
by rankfile, but something that used to work), then I think you are acting in 
poor form.  At a minimum you should give the community a heads up that 
you are borking a feature.



Er... no.

There is nothing nefarious going on here.  Ralph and I were just chatting yesterday about 
some process affinity issues and the topic of rank_file came up (again).  Remember that 
rank_file was a "throw over the wall" kind of code contribution and has 
historically been difficult to maintain.  Neither of us was excited at the prospect of 
adding hyperthreading support (once hwloc is finally released -- unfortunately, it's 
blocking on me, at the moment...) and also having to extend rank file to support it.

I asked Ralph if we should deprecate rank_file since the other binding options 
are available.  He assumed (correctly, it turns out) that no one would go for 
that.  But I figured I'd ask anyway.

I think all Ralph is saying is that we're (I'm) likely to add hyperthreading 
support in the not-distant future (and maybe Oracle will add support for 
boards).  This work is not likely to *break* rank_file, but neither of us is 
excited about extending rank_file to support hyperthreading.  If no one else 
steps up to extend it, then it may become obsolete over time because it doesn't 
support the things that people want.

  
I am ok with the above. 

Terry -- perhaps it's time to resurrect the new processor affinity proposal 
that you've been promising me for many months.  If rank_file were replaced with 
Something Better, I'd certainly be happy.  ;-)

  

Can we then have Ralph implement it :-)...  That was a joke Ralph!!!






--
Oracle
Terry D. Dontje | Principal Software Engineer
Developer Tools Engineering | +1.650.633.7054
Oracle * - Performance Technologies*
95 Network Drive, Burlington, MA 01803
Email terry.don...@oracle.com
