Okay, I’ll take a look - I’ve been chasing a race condition that might be related.
> On Oct 27, 2015, at 9:54 AM, George Bosilca <bosi...@icl.utk.edu> wrote:
>
> No, it's using 2 nodes.
>
>   George.
>
>
> On Tue, Oct 27, 2015 at 12:35 PM, Ralph Castain <r...@open-mpi.org> wrote:
> Is this on a single node?
>
>> On Oct 27, 2015, at 9:25 AM, George Bosilca <bosi...@icl.utk.edu> wrote:
>>
>> I get intermittent deadlocks with the latest trunk. The smallest reproducer
>> is a shell for loop around a small (2 processes) short (20 seconds) MPI
>> application. After a few tens of iterations the MPI_Init will deadlock with
>> the following backtrace:
>>
>> #0  0x00007fa94b5d9aed in nanosleep () from /lib64/libc.so.6
>> #1  0x00007fa94b60ec94 in usleep () from /lib64/libc.so.6
>> #2  0x00007fa94960ba08 in OPAL_PMIX_PMIX1XX_PMIx_Fence (procs=0x0, nprocs=0,
>>     info=0x7ffd7934fb90, ninfo=1) at src/client/pmix_client_fence.c:100
>> #3  0x00007fa9498376a2 in pmix1_fence (procs=0x0, collect_data=1) at
>>     pmix1_client.c:305
>> #4  0x00007fa94bb39ba4 in ompi_mpi_init (argc=3, argv=0x7ffd793500a8,
>>     requested=3, provided=0x7ffd7934ff94) at runtime/ompi_mpi_init.c:645
>> #5  0x00007fa94bb77281 in PMPI_Init_thread (argc=0x7ffd7934ff8c,
>>     argv=0x7ffd7934ff80, required=3, provided=0x7ffd7934ff94) at
>>     pinit_thread.c:69
>> #6  0x000000000040150f in main (argc=3, argv=0x7ffd793500a8) at
>>     osu_mbw_mr.c:86
>>
>> On my machines this is reproducible at 100% after anywhere between 50 and
>> 100 iterations.
>>
>> Thanks,
>>   George.
>>
>> _______________________________________________
>> devel mailing list
>> de...@open-mpi.org
>> Subscription: http://www.open-mpi.org/mailman/listinfo.cgi/devel
>> Link to this post:
>> http://www.open-mpi.org/community/lists/devel/2015/10/18280.php
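
For anyone who wants to try the reproducer George describes (a loop around a short 2-process MPI run), here is a minimal sketch of such a loop with a timeout so that a hang in MPI_Init is noticed rather than blocking the loop forever. This is not George's actual script: the `mpirun` command line, the 60-second timeout, the 200-iteration cap, and the helper name `run_until_hang` are all assumptions chosen for illustration. Only the benchmark name (`osu_mbw_mr`) and the 2-process, roughly 20-second run come from the report.

```python
import subprocess
import sys


def run_until_hang(cmd, max_iters=200, timeout_s=60.0):
    """Run cmd repeatedly.

    Returns the 1-based iteration at which the command exceeded timeout_s
    (treated here as a suspected deadlock), or None if every run finished.
    """
    for i in range(1, max_iters + 1):
        try:
            # subprocess.run kills the direct child when the timeout expires.
            # Processes launched by that child (e.g. remote MPI ranks) may
            # survive and need to be cleaned up by hand.
            subprocess.run(
                cmd,
                timeout=timeout_s,
                stdout=subprocess.DEVNULL,
                stderr=subprocess.DEVNULL,
            )
        except subprocess.TimeoutExpired:
            return i
    return None


if __name__ == "__main__":
    # Assumed command line: 2 processes running the OSU benchmark seen in the
    # backtrace. Pass your own command as arguments to override it, including
    # whatever host options you need to spread the ranks over 2 nodes.
    cmd = sys.argv[1:] or ["mpirun", "-np", "2", "./osu_mbw_mr"]
    hung_at = run_until_hang(cmd)
    if hung_at is None:
        print("no hang observed")
    else:
        print(f"run {hung_at} exceeded the timeout (possible deadlock)")
```

Note that the timeout kills the hung `mpirun`, so to grab a backtrace like the one above you would want a much larger timeout and attach a debugger to the stuck rank while it is still alive.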