Is this on a single node?
> On Oct 27, 2015, at 9:25 AM, George Bosilca <bosi...@icl.utk.edu> wrote:
>
> I get intermittent deadlocks with the latest trunk. The smallest reproducer is
> a shell for loop around a small (2 processes), short (20 seconds) MPI
> application. After a few tens of iterations, MPI_Init deadlocks with the
> following backtrace:
>
> #0  0x00007fa94b5d9aed in nanosleep () from /lib64/libc.so.6
> #1  0x00007fa94b60ec94 in usleep () from /lib64/libc.so.6
> #2  0x00007fa94960ba08 in OPAL_PMIX_PMIX1XX_PMIx_Fence (procs=0x0, nprocs=0, info=0x7ffd7934fb90, ninfo=1) at src/client/pmix_client_fence.c:100
> #3  0x00007fa9498376a2 in pmix1_fence (procs=0x0, collect_data=1) at pmix1_client.c:305
> #4  0x00007fa94bb39ba4 in ompi_mpi_init (argc=3, argv=0x7ffd793500a8, requested=3, provided=0x7ffd7934ff94) at runtime/ompi_mpi_init.c:645
> #5  0x00007fa94bb77281 in PMPI_Init_thread (argc=0x7ffd7934ff8c, argv=0x7ffd7934ff80, required=3, provided=0x7ffd7934ff94) at pinit_thread.c:69
> #6  0x000000000040150f in main (argc=3, argv=0x7ffd793500a8) at osu_mbw_mr.c:86
>
> On my machines this reproduces 100% of the time, after anywhere between 50 and
> 100 iterations.
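>
> A minimal sketch of the loop I am using (the mpirun invocation and binary
> path are illustrative; any short two-process MPI program should show the
> same hang):
>
>     # rerun the benchmark until MPI_Init hangs in the PMIx fence
>     i=0
>     while true; do
>         i=$((i + 1))
>         echo "iteration $i"
>         mpirun -np 2 ./osu_mbw_mr
>     done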
>
> Thanks,
> George.
>