Re: [OMPI devel] simple_spawn test fails using different set of btls.

2014-11-06 Thread Dave Goodell (dgoodell)
On Nov 6, 2014, at 12:44 AM, George Bosilca wrote:
> PS: Sorry Dave I also pushed a master branch merge ...
It's not the end of the world, just try to keep an eye on it and avoid doing it in the future. If you need any help avoiding it, feel free to ping me or the devel@

Re: [OMPI devel] simple_spawn test fails using different set of btls.

2014-11-06 Thread Elena Elkina
Thanks! It fixes the problem with tcp.
Best regards, Elena
On Thu, Nov 6, 2014 at 10:44 AM, George Bosilca wrote:
> I pushed a slightly better patch for the TCP BTL (54ddb0aece0892dcdb1a1293a3bd3902b5f3acdc). The correct scheme would be to OBJ_RETAIN the proc once it

Re: [OMPI devel] simple_spawn test fails using different set of btls.

2014-11-06 Thread George Bosilca
I pushed a slightly better patch for the TCP BTL (54ddb0aece0892dcdb1a1293a3bd3902b5f3acdc). The correct scheme would be to OBJ_RETAIN the proc once it is attached to the btl_proc and release it upon destruction of the btl_proc. However, for some obscure reason this doesn't quite work, as the
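The retain-on-attach, release-on-destroy scheme described in this message can be illustrated with a minimal, self-contained C sketch. All names below (proc_t, btl_proc_t, proc_retain, proc_release, btl_proc_create, btl_proc_destroy) are hypothetical stand-ins invented for illustration; this is not the OPAL object system, the OBJ_RETAIN macro, or the actual TCP BTL code.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical reference-counted process descriptor (not the real OPAL type). */
typedef struct proc {
    int refcount;
} proc_t;

/* Hypothetical per-BTL wrapper that keeps a pointer to the proc. */
typedef struct btl_proc {
    proc_t *proc;
} btl_proc_t;

/* A new proc starts with one reference, owned by the caller. */
proc_t *proc_new(void)
{
    proc_t *p = malloc(sizeof(*p));
    if (p != NULL) {
        p->refcount = 1;
    }
    return p;
}

/* Take an additional reference. */
void proc_retain(proc_t *p)
{
    assert(p->refcount > 0);
    p->refcount++;
}

/* Drop one reference; frees the proc and returns 1 when it was the last one. */
int proc_release(proc_t *p)
{
    assert(p->refcount > 0);
    if (--p->refcount == 0) {
        free(p);
        return 1;
    }
    return 0;
}

/* Attaching the proc to a btl_proc retains it. */
btl_proc_t *btl_proc_create(proc_t *p)
{
    btl_proc_t *bp = malloc(sizeof(*bp));
    if (bp == NULL) {
        return NULL;
    }
    proc_retain(p);
    bp->proc = p;
    return bp;
}

/* Destroying the btl_proc releases the reference taken at attach time.
 * Returns 1 if that release also freed the proc. */
int btl_proc_destroy(btl_proc_t *bp)
{
    int freed = proc_release(bp->proc);
    free(bp);
    return freed;
}
```

With this pairing, the proc cannot be freed while a btl_proc still points at it, which is the property the scheme is meant to guarantee.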

Re: [OMPI devel] simple_spawn test fails using different set of btls.

2014-11-06 Thread Gilles Gouaillardet
Ralph, I updated the MODEX flag to PMIX_GLOBAL https://github.com/open-mpi/ompi/commit/d542c9ff2dc57ca5d260d0578fd5c1c556c598c7
Elena, I was able to reproduce the issue (salloc -N 5 mpirun -np 2 is enough). I was "lucky" to reproduce the issue: it happened because one of the nodes was misconfigured

Re: [OMPI devel] simple_spawn test fails using different set of btls.

2014-11-05 Thread Ralph Castain
> On Nov 5, 2014, at 6:11 PM, Gilles Gouaillardet wrote:
>
> Elena,
>
> the first case (-mca btl tcp,self) crashing is a bug and I will have a look at it.
>
> the second case (-mca sm,self) is a feature: the sm btl cannot be used with tasks having

Re: [OMPI devel] simple_spawn test fails using different set of btls.

2014-11-05 Thread Gilles Gouaillardet
Elena, the first case (-mca btl tcp,self) crashing is a bug and I will have a look at it. The second case (-mca sm,self) is a feature: the sm btl cannot be used with tasks having different jobids (this is the case after a spawn), and obviously, self cannot be used either, so the behaviour and
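The restriction described in this message can be sketched as a pair of predicates. This is a toy model only: task_name_t, sm_usable and self_usable are hypothetical names, not Open MPI types or functions, and the sketch deliberately ignores every other condition a real BTL checks (for example whether the two tasks run on the same node).

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical task identifier: a job id plus a rank within that job. */
typedef struct task_name {
    uint32_t jobid;
    uint32_t rank;
} task_name_t;

/* Toy rule from the message: sm is only usable between tasks of the same job. */
bool sm_usable(const task_name_t *a, const task_name_t *b)
{
    return a->jobid == b->jobid;
}

/* Toy rule: self is only usable when a task talks to itself. */
bool self_usable(const task_name_t *a, const task_name_t *b)
{
    return a->jobid == b->jobid && a->rank == b->rank;
}
```

Under this model, a parent and a task it spawned have different jobids, so neither predicate holds between them, and a run restricted to sm and self has no transport left to connect the two jobs.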

[OMPI devel] simple_spawn test fails using different set of btls.

2014-11-05 Thread Elena Elkina
Hi, It looks like there is a problem in trunk which reproduces with the simple_spawn test (orte/test/mpi/simple_spawn.c). It seems to be an issue with pmix. It doesn't reproduce with the default set of btls, but it does reproduce when several btls are specified. For example, salloc -N5