Nathan,
the root cause is that your fixes were not backported to the v1.8 (nor the
v1.10) branch.
I made PR https://github.com/open-mpi/ompi-release/pull/357 to fix this.
Could you please review it?
Since there are quite a lot of differences between v1.8 and master, the
backport was not
Creating nightly hwloc snapshot git tarball was a success.
Snapshot: hwloc 1.11.0-4-g5804ed0
Start time: Wed Jul 1 21:04:36 EDT 2015
End time: Wed Jul 1 21:06:05 EDT 2015
Your friendly daemon,
Cyrador
Creating nightly hwloc snapshot git tarball was a success.
Snapshot: hwloc 1.10.1-46-gafa7535
Start time: Wed Jul 1 21:03:13 EDT 2015
End time: Wed Jul 1 21:04:35 EDT 2015
Your friendly daemon,
Cyrador
Creating nightly hwloc snapshot git tarball was a success.
Snapshot: hwloc dev-604-g10f7097
Start time: Wed Jul 1 21:01:05 EDT 2015
End time: Wed Jul 1 21:02:58 EDT 2015
Your friendly daemon,
Cyrador
In other places, initialization looks like
opal_mutex_t mutex = {{0}};
Btw, opal_condition is a standalone binary (i.e. not part of the ompi library),
so I do not think an uninitialized common symbol hurts here.
Cheers,
Gilles
On Wednesday, July 1, 2015, Nathan Hjelm wrote:
>
> PGI no
Don't see the leak on master with OS X using the leaks command. Will see
what valgrind finds on linux.
-Nathan
On Wed, Jul 01, 2015 at 08:48:57PM +, Rolf vandeVaart wrote:
>There have been two reports on the user list about memory leaks. I have
>reproduced this leak with LAMMPS.
There have been two reports on the user list about memory leaks. I have
reproduced this leak with LAMMPS. Note that this has nothing to do with
CUDA-aware features. The steps that Stefan has provided make it easy to
reproduce.
Here are some more specific steps to reproduce derived from
Paul, can you send me the config.log for the ppc build?
-Nathan
On Wed, Jul 01, 2015 at 09:33:53AM -0700, Paul Hargrove wrote:
>Testing last night's master tarball with "make check" I find that
>opal_lifo *hangs* on every ppc/linux system I try, including both gcc and
>xlc, both 32-
And I just found that after I kill opal_lifo, the opal_fifo test hangs too.
Output was not shown by "make check", but here is what I see manually:
$ ./test/class/opal_lifo
Single thread test. Time: 0 s 35084 us 35 nsec/poppush
Atomics thread finished. Time: 0 s 197821 us 197 nsec/poppush
^C
$
Testing last night's master tarball with "make check" I find that opal_lifo
*hangs* on every ppc/linux system I try, including both gcc and xlc, both
32- and 64-bit CPUs and even a little-endian POWER8.
Attaching gdb to a hung process yields:
(gdb) thread apply all bt full
Thread 1 (Thread
Nathan,
Last night's master tarball is still producing a SEGV in opal_fifo on the
same Scientific Linux 7.x x86-64 VM as I reported in Feb.
Reproducing the SEGV under gdb yields:
Program received signal SIGSEGV, Segmentation fault.
[Switching to Thread 0x75bb1700 (LWP 16242)]
Back in February I reported (
http://www.open-mpi.org/community/lists/devel/2015/02/17073.php) that when
building master on Linux with the Solaris Studio compilers and -m32 I saw
the following:
/bin/sh ../../../libtool --tag=CC --mode=link cc -m32 -g -mt
-export-dynamic -o opal_wrapper
Thanks for the clarification.
Josh
On Wed, Jul 1, 2015 at 9:46 AM, Paul Hargrove wrote:
> Josh,
>
> You are grossly misinterpreting my position in this.
> You use my name to support a position I do not agree with.
>
> First, I don't do production work at all and have
PGI no longer surprises me with how bad it is. The lines in question look
ok to me. We can fix this (and remove the common symbols) by removing
the initializers and making the variables static. I will go ahead and do
this.
-Nathan
On Wed, Jul 01, 2015 at 05:41:59AM -0700, Paul Hargrove wrote:
>
The Open MPI README currently says
- On NetBSD-6 (at least AMD64 and i386), and possibly on OpenBSD,
libtool misidentifies properties of f95/g95, leading to obscure
compile-time failures if used to build Open MPI. You can work
around this issue by ensuring that libtool will not use f95/g95
Josh,
You are grossly misinterpreting my position in this.
You use my name to support a position I do not agree with.
First, I don't do production work at all and have actually never written an
MPI application more significant than a class project.
Second, anyone who wants to use "master" for
I find that PGI versions 9, 10, 11, 12, 13 and 14 all fail "make check" with
last night's master tarball. All except 9 fail with pretty much the same
message:
CC opal_condition.o
PGC-S-0155-Empty initializer not supported
Okay, I surrender - I trashed it. Was trying to help another community
member, but I'm not going to waste my time arguing for someone else's
requirement. I will happily confine myself to your new rules, and to things
only of interest to my employer.
On Wed, Jul 1, 2015 at 5:10 AM, Joshua Ladd
Paul,
I think your testing is extremely helpful. Even more so with this new
versioning scheme.
Setting OMP envars in ORTE should have been discussed. Considering both
Paul and Howard (key members of our community) use OMP in production
environments with Cray and PGI compilers, it seems a bit odd
Given the description, I suspect that any MPI application should be
sufficient to test it - it appears that PGI is adding some OpenMP-specific
code checks.
I'm not saying it is isolated solely to PGI, nor am I pointing fingers at
them - I'm only saying that is the only compiler for which we've
On Wed, Jul 1, 2015 at 12:05 AM, Ralph Castain wrote:
[...]
> Now that we know there is an issue with one compiler, and it is isolated
> to just that compiler, we can easily use configure.m4 to protect against
> it. I'll add that protection here shortly.
[...]
Ralph,
One
Ease up there, Howard. This is why we have a "master" branch at OMPI. It is
a fairly common problem we face as this is a community that supports a very
broad spectrum of environments, not just a single one where everything is
known and "canned".
Supporting alternative programming models is a
Hi Geoff,
This is kind of what I suspected. I think it's a very bad design decision
to have the Open MPI runtime under the hood setting OpenMP environment
variables. At the very minimum, there should be an MCA parameter to
override this, or alternatively, this section of code would only be