Re: [OMPI devel] Open MPI 1.8.6 memory leak

2015-07-01 Thread Gilles Gouaillardet
Nathan, the root cause is that your fixes were not backported to the v1.8 (nor the v1.10) branch. I made PR https://github.com/open-mpi/ompi-release/pull/357 to fix this. Could you please review it? Since there are quite a lot of differences between v1.8 and master, the backport was not

[hwloc-devel] Create success (hwloc git 1.11.0-4-g5804ed0)

2015-07-01 Thread MPI Team
Creating nightly hwloc snapshot git tarball was a success. Snapshot: hwloc 1.11.0-4-g5804ed0 Start time: Wed Jul 1 21:04:36 EDT 2015 End time: Wed Jul 1 21:06:05 EDT 2015 Your friendly daemon, Cyrador

[hwloc-devel] Create success (hwloc git 1.10.1-46-gafa7535)

2015-07-01 Thread MPI Team
Creating nightly hwloc snapshot git tarball was a success. Snapshot: hwloc 1.10.1-46-gafa7535 Start time: Wed Jul 1 21:03:13 EDT 2015 End time: Wed Jul 1 21:04:35 EDT 2015 Your friendly daemon, Cyrador

[hwloc-devel] Create success (hwloc git dev-604-g10f7097)

2015-07-01 Thread MPI Team
Creating nightly hwloc snapshot git tarball was a success. Snapshot: hwloc dev-604-g10f7097 Start time: Wed Jul 1 21:01:05 EDT 2015 End time: Wed Jul 1 21:02:58 EDT 2015 Your friendly daemon, Cyrador

Re: [OMPI devel] error in test/threads/opal_condition.c

2015-07-01 Thread Gilles Gouaillardet
In other places, initialization looks like opal_mutex_t mutex = {{0}}; Btw, opal_condition is a standalone binary (i.e., not part of the ompi library), so I do not think an uninitialized common symbol hurts here. Cheers, Gilles On Wednesday, July 1, 2015, Nathan Hjelm wrote: > > PGI no

Re: [OMPI devel] Open MPI 1.8.6 memory leak

2015-07-01 Thread Nathan Hjelm
Don't see the leak on master with OS X using the leaks command. Will see what valgrind finds on linux. -Nathan On Wed, Jul 01, 2015 at 08:48:57PM +, Rolf vandeVaart wrote: >There have been two reports on the user list about memory leaks. I have >reproduced this leak with LAMMPS.

[OMPI devel] Open MPI 1.8.6 memory leak

2015-07-01 Thread Rolf vandeVaart
There have been two reports on the user list about memory leaks. I have reproduced this leak with LAMMPS. Note that this has nothing to do with CUDA-aware features. The steps that Stefan has provided make it easy to reproduce. Here are some more specific steps to reproduce derived from

Re: [OMPI devel] opal_lifo hangs on ppc in master

2015-07-01 Thread Nathan Hjelm
Paul, can you send me the config.log for the ppc build? -Nathan On Wed, Jul 01, 2015 at 09:33:53AM -0700, Paul Hargrove wrote: >Testing last night's master tarball with "make check" I find that >opal_lifo *hangs* on every ppc/linux system I try, including both gcc and >xlc, both 32-

Re: [OMPI devel] opal_{lifo,fifo} hang on ppc in master

2015-07-01 Thread Paul Hargrove
And I just found that after I kill opal_lifo, the opal_fifo test hangs too. Output was not shown by "make check", but here is what I see manually: $ ./test/class/opal_lifo Single thread test. Time: 0 s 35084 us 35 nsec/poppush Atomics thread finished. Time: 0 s 197821 us 197 nsec/poppush ^C $

[OMPI devel] opal_lifo hangs on ppc in master

2015-07-01 Thread Paul Hargrove
Testing last night's master tarball with "make check" I find that opal_lifo *hangs* on every ppc/linux system I try, including both gcc and xlc, both 32- and 64-bit CPUs and even a little-endian POWER8. Attaching gdb to a hung process yields: (gdb) thread apply all bt full Thread 1 (Thread

Re: [OMPI devel] opal_fifo SEGV from master

2015-07-01 Thread Paul Hargrove
Nathan, Last night's master tarball is still producing a SEGV in opal_fifo on the same Scientific Linux 7.x x86-64 VM as I reported in Feb. Reproducing the SEGV under gdb yields: Program received signal SIGSEGV, Segmentation fault. [Switching to Thread 0x75bb1700 (LWP 16242)]

[OMPI devel] Master build failures w/ Studio 12.4 on Linux w/ -m32 [w/ patch]

2015-07-01 Thread Paul Hargrove
Back in February I reported ( http://www.open-mpi.org/community/lists/devel/2015/02/17073.php) that when building master on Linux with the Solaris Studio compilers and -m32 I saw the following: /bin/sh ../../../libtool --tag=CC --mode=link cc -m32 -g -mt -export-dynamic -o opal_wrapper

Re: [OMPI devel] Testing of "OMP_PROC_BIND value is invalid" errors

2015-07-01 Thread Joshua Ladd
Thanks for the clarification. Josh On Wed, Jul 1, 2015 at 9:46 AM, Paul Hargrove wrote: > Josh, > > You are grossly misinterpreting my position in this. > You use my name to support a position I do not agree with. > > First, I don't do production work at all and have

Re: [OMPI devel] error in test/threads/opal_condition.c

2015-07-01 Thread Nathan Hjelm
PGI no longer surprises me with how bad it is. The lines in question look ok to me. We can fix this (and remove the common symbols) by removing the initializers and making the variables static. I will go ahead and do this. -Nathan On Wed, Jul 01, 2015 at 05:41:59AM -0700, Paul Hargrove wrote: >

[OMPI devel] NetBSD regression on master

2015-07-01 Thread Paul Hargrove
The Open MPI README currently says - On NetBSD-6 (at least AMD64 and i386), and possibly on OpenBSD, libtool misidentifies properties of f95/g95, leading to obscure compile-time failures if used to build Open MPI. You can work around this issue by ensuring that libtool will not use f95/g95

Re: [OMPI devel] Testing of "OMP_PROC_BIND value is invalid" errors

2015-07-01 Thread Paul Hargrove
Josh, You are grossly misinterpreting my position in this. You use my name to support a position I do not agree with. First, I don't do production work at all and have actually never written an MPI application more significant than a class project. Second, anyone who wants to use "master" for

[OMPI devel] error in test/threads/opal_condition.c

2015-07-01 Thread Paul Hargrove
I find that PGI versions 9, 10, 11, 12, 13 and 14 all fail "make check" with last night's master tarball. All except 9 fail with pretty much the same message: CC opal_condition.o PGC-S-0155-Empty initializer not supported

Re: [OMPI devel] Testing of "OMP_PROC_BIND value is invalid" errors

2015-07-01 Thread Ralph Castain
Okay, I surrender - I trashed it. Was trying to help another community member, but I'm not going to waste my time arguing for someone else's requirement. I will happily confine myself to your new rules, and to things only of interest to my employer. On Wed, Jul 1, 2015 at 5:10 AM, Joshua Ladd

Re: [OMPI devel] Testing of "OMP_PROC_BIND value is invalid" errors

2015-07-01 Thread Joshua Ladd
Paul, I think your testing is extremely helpful. Even more so with this new versioning scheme. Setting OMP envars in ORTE should have been discussed. Considering both Paul and Howard (key members of our community) use OMP in production environments with Cray and PGI compilers, it seems a bit odd

Re: [OMPI devel] Testing of "OMP_PROC_BIND value is invalid" errors

2015-07-01 Thread Ralph Castain
Given the description, I suspect that any MPI application should be sufficient to test it - it appears that PGI is adding some OpenMP-specific code checks. I'm not saying it is isolated solely to PGI, nor am I pointing fingers at them - I'm only saying that is the only compiler for which we've

[OMPI devel] Testing of "OMP_PROC_BIND value is invalid" errors

2015-07-01 Thread Paul Hargrove
On Wed, Jul 1, 2015 at 12:05 AM, Ralph Castain wrote: [...] > Now that we know there is an issue with one compiler, and it is isolated > to just that compiler, we can easily use configure.m4 to protect against > it. I'll add that protection here shortly. [...] Ralph, One

Re: [OMPI devel] OMPI_PROC_BIND value is invalid errors

2015-07-01 Thread Ralph Castain
Ease up there, Howard. This is why we have a "master" branch at OMPI. It is a fairly common problem we face as this is a community that supports a very broad spectrum of environments, not just a single one where everything is known and "canned". Supporting alternative programming models is a

Re: [OMPI devel] OMPI_PROC_BIND value is invalid errors

2015-07-01 Thread Howard Pritchard
Hi Geoff, This is kind of what I suspected. I think it's a very bad design decision to have the Open MPI runtime under the hood setting OpenMP environment variables. At the very minimum, there should be an MCA parameter to override this, or alternatively, this section of code would only be