[OMPI users] Open mpi based program runs as root and gives SIGSEGV under unprivileged user

2014-12-10 Thread Luca Fini
I have a problem running a well-tested MPI-based application. The program has been used for years with no problems. Suddenly the executable, which had run many times with no problems, crashed with SIGSEGV. The very same executable works OK if run with root privileges. The same happens with other

Re: [OMPI users] Open mpi based program runs as root and gives SIGSEGV under unprivileged user

2014-12-10 Thread Gilles Gouaillardet
Luca, your email mentions openmpi 1.6.5, but the gdb output points to openmpi 1.8.1. Could the root cause be a mix of versions that does not occur with the root account? Which openmpi version are you expecting? You can run pmap while your binary is running and/or under gdb to confirm the openmpi
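
The pmap check suggested above can also be approximated by reading /proc/<pid>/maps, which on Linux lists the files mapped into a process. The following is only a minimal sketch: the function names and the default "libmpi" filter are assumptions made for illustration and do not come from the thread.

```python
import os


def mapped_libraries(maps_text, needle="libmpi"):
    """Return sorted, unique file paths from a /proc/<pid>/maps dump
    whose file name contains `needle` (hypothetical helper)."""
    paths = set()
    for line in maps_text.splitlines():
        # Columns: address range, permissions, offset, device, inode, path.
        fields = line.split(None, 5)
        if len(fields) != 6:
            continue  # anonymous mapping, no backing file
        path = fields[5].strip()
        if needle in os.path.basename(path):
            paths.add(path)
    return sorted(paths)


def libraries_of_pid(pid, needle="libmpi"):
    """Read /proc/<pid>/maps (Linux only) and filter it as above."""
    with open(f"/proc/{pid}/maps") as fh:
        return mapped_libraries(fh.read(), needle)
```

Calling libraries_of_pid(pid) for the running binary, once as root and once as the unprivileged user, would show whether both runs load the MPI library from the same installation directory.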

Re: [OMPI users] Oversubscribing in 1.8.3 vs 1.6.5

2014-12-10 Thread Eric Chamberland
Hi Nathan, I pulled your commit d0da29351f9 and tested it against our example. It now works perfectly. Strangely, I can even unset "OMPI_MCA_mpi_yield_when_idle=1" and it doesn't seem to take any longer. Can I apply the patch to a fresh "1.8.3", and should it work? Another question: how can I

Re: [OMPI users] Oversubscribing in 1.8.3 vs 1.6.5

2014-12-10 Thread Ralph Castain
You should be able to apply the patch - I don’t think that section of code differs from what is in the 1.8 repo. The SHA for 1.8.3 can be found on the web site (see right-most column in table): http://www.open-mpi.org/software/ompi/v1.8/

Re: [OMPI users] Open mpi based program runs as root and gives SIGSEGV under unprivileged user

2014-12-10 Thread Gus Correa
Hi Luca, another possibility that comes to mind, besides the mixed versions mentioned by Gilles, is the OS limits. Limits may vary according to the user and user privileges. Large programs tend to require a big stack size (even unlimited), and typically segfault when the stack is not large enough. Max
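
One way to compare the stack limit between the root account and the unprivileged account is to query it from inside a process started by each user. This is a minimal sketch using Python's standard resource module (Unix only); the helper names are hypothetical, and nothing here confirms that the stack limit is the actual cause in this thread.

```python
import resource


def describe_limit(value):
    """Format a limit given in bytes; RLIM_INFINITY means no limit."""
    if value == resource.RLIM_INFINITY:
        return "unlimited"
    return f"{value // 1024} KiB"


def stack_limits():
    """Return the (soft, hard) stack size limits of the current process."""
    soft, hard = resource.getrlimit(resource.RLIMIT_STACK)
    return describe_limit(soft), describe_limit(hard)
```

Running print(stack_limits()) under both accounts and comparing the two results would show whether the limits differ. The soft value corresponds to what the shell reports with ulimit -s.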

Re: [OMPI users] Oversubscribing in 1.8.3 vs 1.6.5

2014-12-10 Thread Eric Chamberland
On 12/10/2014 10:40 AM, Ralph Castain wrote: You should be able to apply the patch - I don’t think that section of code differs from what is in the 1.8 repo. It compiles and links, but gives me a segmentation violation now: #0 0x7f1827b00e91 in mca_allocator_component_lookup () from

Re: [OMPI users] Oversubscribing in 1.8.3 vs 1.6.5

2014-12-10 Thread Ralph Castain
I’ll run the tarball generator now so you can try the nightly tarball.

Re: [OMPI users] Oversubscribing in 1.8.3 vs 1.6.5

2014-12-10 Thread Ralph Castain
Tarball now available on web site http://www.open-mpi.org/nightly/v1.8/

Re: [OMPI users] Oversubscribing in 1.8.3 vs 1.6.5

2014-12-10 Thread Eric Chamberland
On 12/10/2014 12:55 PM, Ralph Castain wrote: Tarball now available on web site http://www.open-mpi.org/nightly/v1.8/ On Dec 10, 2014, at 9:40 AM, Ralph Castain wrote: I’ll run the tarball generator now so you can try the nightly tarball.

Re: [OMPI users] Oversubscribing in 1.8.3 vs 1.6.5

2014-12-10 Thread Eric Chamberland
On 12/10/2014 12:55 PM, Ralph Castain wrote: Tarball now available on web site http://www.open-mpi.org/nightly/v1.8/ I’ll run the tarball generator now so you can try the nightly tarball. OK, I retrieved openmpi-v1.8.3-236-ga21cb20, and it compiled, linked, and executed nicely when

[OMPI users] MPI_THREAD_MULTIPLE hang

2014-12-10 Thread Christopher O'Grady
Hi, I see what looks like a bug in openmpi involving MPI_THREAD_MULTIPLE. When we call MPI_Init_thread with this argument, this 61-line example hangs: http://www.slac.stanford.edu/~cpo/mpi_one_sided.cc There are some commented-out lines in that file that we can use to work around the

Re: [OMPI users] MPI_THREAD_MULTIPLE hang

2014-12-10 Thread Ralph Castain
Yes - it is being fixed for 1.8.4.

Re: [OMPI users] MPI_THREAD_MULTIPLE hang

2014-12-10 Thread Nathan Hjelm
Several things: - In 1.8.x only shared memory windows work with multiple threads. This problem will be fixed in the master branch soon. A back-port to 1.8 is unlikely given the magnitude of the changes. - I highly recommend using the MPI-3 call MPI_Win_allocate over MPI_Win_create.