Re: [OMPI devel] [OMPI users] still segmentation fault with openmpi-2.0.2rc3 on Linux

2017-01-11 Thread Gilles Gouaillardet
Ralph,

spawn_master spawns 4 spawn_slave (i attached

note master is fine with

mpirun -np 1 --slot-list 0:0-5,1:0-5 --host motomachi ./spawn_master

but v2.0.x and v2.x are not.

Cheers,
Gilles

On 1/12/2017 1:42 PM, r...@open-mpi.org wrote:
Looking at this note again: how many procs is

Re: [OMPI devel] [OMPI users] still segmentation fault with openmpi-2.0.2rc3 on Linux

2017-01-11 Thread r...@open-mpi.org
Looking at this note again: how many procs is spawn_master generating?

> On Jan 11, 2017, at 7:39 PM, r...@open-mpi.org wrote:
>
> Sigh - yet another corner case. Lovely. Will take a poke at it later this week. Thx for tracking it down
>
>> On Jan 11, 2017, at 5:27 PM, Gilles Gouaillardet <

Re: [OMPI devel] Fwd: Re: [OMPI users] still segmentation fault with openmpi-2.0.2rc3 on Linux

2017-01-11 Thread r...@open-mpi.org
Sigh - yet another corner case. Lovely. Will take a poke at it later this week. Thx for tracking it down

> On Jan 11, 2017, at 5:27 PM, Gilles Gouaillardet wrote:
>
> Ralph,
>
> so it seems the root cause is a kind of incompatibility between the --host and the --slot-list options
>

[OMPI devel] Fwd: Re: [OMPI users] still segmentation fault with openmpi-2.0.2rc3 on Linux

2017-01-11 Thread Gilles Gouaillardet
Ralph,

so it seems the root cause is a kind of incompatibility between the --host and the --slot-list options

on a single node with two six-core sockets, this works:

mpirun -np 1 ./spawn_master
mpirun -np 1 --slot-list 0:0-5,1:0-5 ./spawn_master
mpirun -np 1 --host motomachi --oversubscr

Re: [OMPI devel] rdmacm and udcm for 2.0.1 and RoCE

2017-01-11 Thread Dave Turner
The btl_openib_receive_queues parameters that Howard provided fixed our problem with getting 2.0.1 working with RoCE so thanks for all the help. However, we are seeing segfaults with this when configured with --enable-btl-openib-failover. I've included the configuration below that the packag

Re: [OMPI devel] OMPI devel] hwloc missing NUMANode object

2017-01-11 Thread r...@open-mpi.org
Should be fixed here: https://github.com/open-mpi/ompi/pull/2711

> On Jan 5, 2017, at 6:42 AM, r...@open-mpi.org wrote:
>
> I can add a check to see if we have NUMA, and if not we can fall back to socket (if present) or just “none”
>
>> On Jan 5,