Okay, I’ll take a look - I’ve been chasing a race condition that might be
related.

> On Oct 27, 2015, at 9:54 AM, George Bosilca <bosi...@icl.utk.edu> wrote:
> 
> No, it's using 2 nodes.
>   George.
> 
> 
> On Tue, Oct 27, 2015 at 12:35 PM, Ralph Castain <r...@open-mpi.org> wrote:
> Is this on a single node?
> 
>> On Oct 27, 2015, at 9:25 AM, George Bosilca <bosi...@icl.utk.edu> wrote:
>> 
>> I get intermittent deadlocks with the latest trunk. The smallest reproducer
>> is a shell for loop around a small (2 processes), short (20 seconds) MPI
>> application. After a few tens of iterations, MPI_Init deadlocks with
>> the following backtrace:
>> 
>> #0  0x00007fa94b5d9aed in nanosleep () from /lib64/libc.so.6
>> #1  0x00007fa94b60ec94 in usleep () from /lib64/libc.so.6
>> #2  0x00007fa94960ba08 in OPAL_PMIX_PMIX1XX_PMIx_Fence (procs=0x0, nprocs=0, info=0x7ffd7934fb90, ninfo=1) at src/client/pmix_client_fence.c:100
>> #3  0x00007fa9498376a2 in pmix1_fence (procs=0x0, collect_data=1) at pmix1_client.c:305
>> #4  0x00007fa94bb39ba4 in ompi_mpi_init (argc=3, argv=0x7ffd793500a8, requested=3, provided=0x7ffd7934ff94) at runtime/ompi_mpi_init.c:645
>> #5  0x00007fa94bb77281 in PMPI_Init_thread (argc=0x7ffd7934ff8c, argv=0x7ffd7934ff80, required=3, provided=0x7ffd7934ff94) at pinit_thread.c:69
>> #6  0x000000000040150f in main (argc=3, argv=0x7ffd793500a8) at osu_mbw_mr.c:86
>> 
>> On my machines this reproduces every time, after anywhere between 50 and
>> 100 iterations.
>> 
>>   Thanks,
>>     George.
>> 
>> _______________________________________________
>> devel mailing list
>> de...@open-mpi.org
>> Subscription: http://www.open-mpi.org/mailman/listinfo.cgi/devel
>> Link to this post:
>> http://www.open-mpi.org/community/lists/devel/2015/10/18280.php
