Looking at it, I think I see what was happening. The listener thread would 
start, immediately see that the active flag was still false, and exit. This 
left the server without any listening thread, and it had no way of detecting 
that this had happened. It was therefore a race between the thread checking 
the flag and the server setting it.
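
To make the ordering concrete, here is a minimal standalone sketch of the 
pattern (not the actual PMIx code - the atomic flag, the helper names, and the 
main() scaffolding are purely illustrative): if the flag is set only after 
pthread_create(), the new thread's first check can observe "false" and the 
thread exits without ever reaching accept(); setting the flag before creating 
the thread removes that race.

#include <pthread.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stdio.h>
#include <unistd.h>

/* Stands in for pmix_server_globals.listen_thread_active; atomic (or at
 * least volatile) so the listener thread reliably sees updates. */
static atomic_bool listen_thread_active;

static void *listen_thread(void *arg)
{
    (void)arg;
    /* With the buggy ordering, this first check can run before the parent
     * sets the flag, so the loop body is never entered. */
    while (atomic_load(&listen_thread_active)) {
        /* ... select() on the listen socket and accept() connections ... */
        usleep(1000);
    }
    return NULL;
}

static int start_listening(pthread_t *engine)
{
    /* Reordering along the lines of Nysal's observation: mark the thread
     * active *before* creating it, so its first flag check cannot race
     * with this store. */
    atomic_store(&listen_thread_active, true);

    if (0 != pthread_create(engine, NULL, listen_thread, NULL)) {
        atomic_store(&listen_thread_active, false);  /* roll back on failure */
        return -1;
    }
    return 0;
}

int main(void)
{
    pthread_t engine;

    if (0 != start_listening(&engine)) {
        fprintf(stderr, "failed to start listener\n");
        return 1;
    }
    sleep(1);                                   /* pretend to serve clients */
    atomic_store(&listen_thread_active, false); /* tell the listener to stop */
    pthread_join(engine, NULL);
    return 0;
}

In the real code the flag is a plain bool checked around select(); moving the 
assignment ahead of pthread_create(), plus marking the flag volatile (or 
atomic) as Nysal suggests, should address both the ordering and the visibility 
concerns.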

Thanks Nysal - I believe this should indeed fix the problem!


> On Nov 9, 2015, at 9:04 AM, George Bosilca <bosi...@icl.utk.edu> wrote:
> 
> Clearly Nysal has a valid point there. I launched a stress test with Nysal's 
> suggestion in the code, and so far it's up to a few hundred iterations 
> without deadlock. I would not claim victory yet; I launched a 10k-iteration 
> run to see where we stand (btw this never passed before).
> I'll let you know the outcome.
> 
>   George.
> 
> 
> On Mon, Nov 9, 2015 at 11:55 AM, Artem Polyakov <artpo...@gmail.com> wrote:
> 
> 
> 2015-11-09 22:42 GMT+06:00 Artem Polyakov <artpo...@gmail.com>:
> This is a very good point, Nysal!
> 
> This is definitely a problem, and I can say even more: on average 3 out of 
> every 10 tasks were affected by this bug. Once the PR 
> (https://github.com/pmix/master/pull/8) was applied, I was able to run 100 
> test tasks without any hangs.
> 
> Here is some more information on my symptoms. I was observing this without 
> OMPI, just running the pmix_client test binary from the PMIx test suite with 
> the SLURM PMIx plugin.
> Periodically the application was hanging. Investigation shows that not all 
> processes are able to initialize correctly.
> Here is what such a client's backtrace looks like:
> 
> P.S. I think this backtrace may be relevant to George's problem as well. In 
> my case not all of the processes were hanging in connect_to_server; most of 
> them were able to move forward and reach Fence.
> George, was the backtrace you posted the same on both processes, or was it a 
> "random" one from one of them?
>  
> (gdb) bt
> #0  0x00007f1448f1b7eb in recv () from /lib/x86_64-linux-gnu/libpthread.so.0
> #1  0x00007f144914c191 in pmix_usock_recv_blocking (sd=9, data=0x7fff367f7c64 "", size=4) at src/usock/usock.c:166
> #2  0x00007f1449152d18 in recv_connect_ack (sd=9) at src/client/pmix_client.c:837
> #3  0x00007f14491546bf in usock_connect (addr=0x7fff367f7d60) at src/client/pmix_client.c:1103
> #4  0x00007f144914f94c in connect_to_server (address=0x7fff367f7d60, cbdata=0x7fff367f7dd0) at src/client/pmix_client.c:179
> #5  0x00007f1449150421 in PMIx_Init (proc=0x7fff367f81d0) at src/client/pmix_client.c:355
> #6  0x0000000000401b97 in main (argc=9, argv=0x7fff367f83d8) at pmix_client.c:62
> 
> 
> The server-side debug has the following lines at the end of the file:
> [cn33:00482] pmix:server register client slurm.pmix.22.0:10
> [cn33:00482] pmix:server _register_client for nspace slurm.pmix.22.0 rank 10
> [cn33:00482] pmix:server setup_fork for nspace slurm.pmix.22.0 rank 10
> 
> In normal operation, the following lines should appear after the ones above:
> ....
> [cn33:00188] listen_thread: new connection: (26, 0)
> [cn33:00188] connection_handler: new connection: 26
> [cn33:00188] RECV CONNECT ACK FROM PEER ON SOCKET 26
> [cn33:00188] waiting for blocking recv of 16 bytes
> [cn33:00188] blocking receive complete from remote
> ....
> 
> On the client side I see the following lines:
> [cn33:00491] usock_peer_try_connect: attempting to connect to server
> [cn33:00491] usock_peer_try_connect: attempting to connect to server on socket 10
> [cn33:00491] pmix: SEND CONNECT ACK
> [cn33:00491] sec: native create_cred
> [cn33:00491] sec: using credential 1000:1000
> [cn33:00491] send blocking of 54 bytes to socket 10
> [cn33:00491] blocking send complete to socket 10
> [cn33:00491] pmix: RECV CONNECT ACK FROM SERVER
> [cn33:00491] waiting for blocking recv of 4 bytes
> [cn33:00491] blocking_recv received error 11:Resource temporarily unavailable from remote - cycling
> [cn33:00491] blocking_recv received error 11:Resource temporarily unavailable from remote - cycling
> [... repeated many times ...]
> 
> With the fix for the problem highlighted by Nysal, everything runs cleanly.
> 
> 
> 2015-11-09 10:53 GMT+06:00 Nysal Jan K A <jny...@gmail.com>:
> In listen_thread():
> 194     while (pmix_server_globals.listen_thread_active) {
> 195         FD_ZERO(&readfds);
> 196         FD_SET(pmix_server_globals.listen_socket, &readfds);
> 197         max = pmix_server_globals.listen_socket;
> 
> Is it possible that pmix_server_globals.listen_thread_active can be false, in 
> which case the thread just exits and never calls accept()?
> 
> In pmix_start_listening():
> 147         /* fork off the listener thread */
> 148         if (0 > pthread_create(&engine, NULL, listen_thread, NULL)) {
> 149             return PMIX_ERROR;
> 150         }
> 151         pmix_server_globals.listen_thread_active = true;
> 
> pmix_server_globals.listen_thread_active is set to true only after the thread 
> is created - could this cause a race?
> listen_thread_active might also need to be declared volatile.
> 
> Regards
> --Nysal
> 
> On Sun, Nov 8, 2015 at 10:38 PM, George Bosilca <bosi...@icl.utk.edu> wrote:
> We had a power outage last week and the local disks on our cluster were wiped 
> out. My tester was in there. But, I can rewrite it after SC.
> 
>   George.
> 
> On Sat, Nov 7, 2015 at 12:04 PM, Ralph Castain <r...@open-mpi.org> wrote:
> Could you send me your stress test? I’m wondering if it is just something 
> about how we set socket options
> 
> 
>> On Nov 7, 2015, at 8:58 AM, George Bosilca <bosi...@icl.utk.edu> wrote:
>> 
>> I had to postpone this until after SC. However, I ran a stress test of UDS 
>> for 3 days, reproducing the opening and sending of data (what Ralph described 
>> in his email), and I could never get a deadlock.
>> 
>>   George.
>> 
>> 
>> On Sat, Nov 7, 2015 at 11:26 AM, Ralph Castain <r...@open-mpi.org> wrote:
>> George was looking into it, but I don’t know if he has had time recently to 
>> continue the investigation. We understand “what” is happening (accept 
>> sometimes ignores the connection), but we don’t yet know “why”. I’ve done 
>> some digging around the web, and found that sometimes you can try to talk to 
>> a Unix Domain Socket too quickly - i.e., you open it and then send to it, 
>> but the OS hasn’t yet set it up. In those cases, you can hang the socket. 
>> However, I’ve tried adding some artificial delay, and while it helped, it 
>> didn’t completely solve the problem.
>> 
>> I have an idea for a workaround (set a timer and retry after a while), but 
>> would obviously prefer a real solution. I'm not even sure it will work, as it 
>> is unclear whether the server (which is the one hung in accept) will break 
>> free if the client closes the socket and retries.
>> 
>> 
>>> On Nov 6, 2015, at 10:53 PM, Artem Polyakov <artpo...@gmail.com> wrote:
>>> 
>>> Hello, is there any progress on this topic? This affects our PMIx 
>>> measurements.
>>> 
>>> 2015-10-30 21:21 GMT+06:00 Ralph Castain <r...@open-mpi.org>:
>>> I’ve verified that the orte/util/listener thread is not being started, so I 
>>> don’t think it should be involved in this problem.
>>> 
>>> HTH
>>> Ralph
>>> 
>>>> On Oct 30, 2015, at 8:07 AM, Ralph Castain <r...@open-mpi.org> wrote:
>>>> 
>>>> Hmmm…there is a hook that would allow the PMIx server to utilize that 
>>>> listener thread, but we aren’t currently using it. Each daemon plus mpirun 
>>>> will call orte_start_listener, but nothing is currently registering and so 
>>>> the listener in that code is supposed to just return without starting the 
>>>> thread.
>>>> 
>>>> So the only listener thread that should exist is the one inside the PMIx 
>>>> server itself. If something else is happening, then that would be a bug. I 
>>>> can look at the orte listener code to ensure that the thread isn’t 
>>>> incorrectly starting.
>>>> 
>>>> 
>>>>> On Oct 29, 2015, at 10:03 PM, George Bosilca <bosi...@icl.utk.edu> wrote:
>>>>> 
>>>>> Some progress that puzzles me but might help you understand: once the 
>>>>> deadlock appears, if I manually kill the MPI process on the node where 
>>>>> the deadlock occurred, the local orte daemon doesn't notice and just 
>>>>> keeps waiting.
>>>>> 
>>>>> Quick question: I am under the impression that the issue is not in the 
>>>>> PMIx server but somewhere around listener_thread_fn in 
>>>>> orte/util/listener.c. Possible?
>>>>> 
>>>>>   George.
>>>>> 
>>>>> 
>>>>> On Wed, Oct 28, 2015 at 3:56 AM, Ralph Castain <r...@open-mpi.org> wrote:
>>>>> Should have also clarified: the prior fixes are indeed in the current 
>>>>> master.
>>>>> 
>>>>>> On Oct 28, 2015, at 12:42 AM, Ralph Castain <r...@open-mpi.org> wrote:
>>>>>> 
>>>>>> Nope - I was wrong. The correction on the client side consisted of 
>>>>>> attempting to time out if the blocking recv failed. We then modified the 
>>>>>> blocking send/recv so they would handle errors.
>>>>>> 
>>>>>> So that problem occurred -after- the server had correctly called accept. 
>>>>>> The listener code is in 
>>>>>> opal/mca/pmix/pmix1xx/pmix/src/server/pmix_server_listener.c
>>>>>> 
>>>>>> It looks to me like the only way we could drop the accept (assuming the 
>>>>>> OS doesn’t lose it) is if the file descriptor lies outside the expected 
>>>>>> range once we fall out of select:
>>>>>> 
>>>>>> 
>>>>>>         /* Spin accepting connections until all active listen sockets
>>>>>>          * do not have any incoming connections, pushing each connection
>>>>>>          * onto the event queue for processing
>>>>>>          */
>>>>>>         do {
>>>>>>             accepted_connections = 0;
>>>>>>             /* according to the man pages, select replaces the given descriptor
>>>>>>              * set with a subset consisting of those descriptors that are ready
>>>>>>              * for the specified operation - in this case, a read. So we need to
>>>>>>              * first check to see if this file descriptor is included in the
>>>>>>              * returned subset
>>>>>>              */
>>>>>>             if (0 == FD_ISSET(pmix_server_globals.listen_socket, &readfds)) {
>>>>>>                 /* this descriptor is not included */
>>>>>>                 continue;
>>>>>>             }
>>>>>> 
>>>>>>             /* this descriptor is ready to be read, which means a connection
>>>>>>              * request has been received - so harvest it. All we want to do
>>>>>>              * here is accept the connection and push the info onto the event
>>>>>>              * library for subsequent processing - we don't want to actually
>>>>>>              * process the connection here as it takes too long, and so the
>>>>>>              * OS might start rejecting connections due to timeout.
>>>>>>              */
>>>>>>             pending_connection = PMIX_NEW(pmix_pending_connection_t);
>>>>>>             event_assign(&pending_connection->ev, pmix_globals.evbase, -1,
>>>>>>                          EV_WRITE, connection_handler, pending_connection);
>>>>>>             pending_connection->sd = accept(pmix_server_globals.listen_socket,
>>>>>>                                             (struct sockaddr*)&(pending_connection->addr),
>>>>>>                                             &addrlen);
>>>>>>             if (pending_connection->sd < 0) {
>>>>>>                 PMIX_RELEASE(pending_connection);
>>>>>>                 if (pmix_socket_errno != EAGAIN ||
>>>>>>                     pmix_socket_errno != EWOULDBLOCK) {
>>>>>>                     if (EMFILE == pmix_socket_errno) {
>>>>>>                         PMIX_ERROR_LOG(PMIX_ERR_OUT_OF_RESOURCE);
>>>>>>                     } else {
>>>>>>                         pmix_output(0, "listen_thread: accept() failed: %s (%d).",
>>>>>>                                     strerror(pmix_socket_errno), pmix_socket_errno);
>>>>>>                     }
>>>>>>                     goto done;
>>>>>>                 }
>>>>>>                 continue;
>>>>>>             }
>>>>>> 
>>>>>>             pmix_output_verbose(8, pmix_globals.debug_output,
>>>>>>                                 "listen_thread: new connection: (%d, %d)",
>>>>>>                                 pending_connection->sd, pmix_socket_errno);
>>>>>>             /* activate the event */
>>>>>>             event_active(&pending_connection->ev, EV_WRITE, 1);
>>>>>>             accepted_connections++;
>>>>>>         } while (accepted_connections > 0);
>>>>>> 
>>>>>> 
>>>>>>> On Oct 28, 2015, at 12:25 AM, Ralph Castain <r...@open-mpi.org> wrote:
>>>>>>> 
>>>>>>> Looking at the code, it appears that a fix was committed for this 
>>>>>>> problem, and that we correctly resolved the issue found by Paul. The 
>>>>>>> problem is that the fix didn’t get upstreamed, and so it was lost the 
>>>>>>> next time we refreshed PMIx. Sigh.
>>>>>>> 
>>>>>>> Let me try to recreate the fix and have you take a gander at it.
>>>>>>> 
>>>>>>> 
>>>>>>>> On Oct 28, 2015, at 12:22 AM, Ralph Castain <r...@open-mpi.org> wrote:
>>>>>>>> 
>>>>>>>> Here is the discussion - I'm afraid it is fairly lengthy. Ignore the hwloc 
>>>>>>>> references in it, as that was a separate issue:
>>>>>>>> 
>>>>>>>> http://www.open-mpi.org/community/lists/devel/2015/09/18074.php
>>>>>>>> 
>>>>>>>> It definitely sounds like the same issue creeping in again. I’d 
>>>>>>>> appreciate any thoughts on how to correct it. If it helps, you could 
>>>>>>>> look at the PMIx master - there are standalone tests in the 
>>>>>>>> test/simple directory that fork/exec a child and just do the 
>>>>>>>> connection.
>>>>>>>> 
>>>>>>>> https://github.com/pmix/master
>>>>>>>> 
>>>>>>>> The test server is simptest.c - it will spawn a single copy of 
>>>>>>>> simpclient.c by default.
>>>>>>>> 
>>>>>>>> 
>>>>>>>>> On Oct 27, 2015, at 10:14 PM, George Bosilca <bosi...@icl.utk.edu> wrote:
>>>>>>>>> 
>>>>>>>>> Interesting. Do you have a pointer to the commit (and/or to the 
>>>>>>>>> discussion)?
>>>>>>>>> 
>>>>>>>>> I looked at the PMIx code and identified a few issues, but 
>>>>>>>>> unfortunately none of them seems to fix the problem for good. However, 
>>>>>>>>> now I need more than 1000 runs to get a deadlock (instead of a few 
>>>>>>>>> tens).
>>>>>>>>> 
>>>>>>>>> Looking with "netstat -ax" at the status of the UDS while the 
>>>>>>>>> processes are deadlocked, I see 2 UDS with the same name: one from 
>>>>>>>>> the server, which is in LISTEN state, and one for the client, which is 
>>>>>>>>> in CONNECTING state (while the client has already sent a message into 
>>>>>>>>> the socket and is now waiting in a blocking receive). This somehow 
>>>>>>>>> suggests that the server has not yet called accept on the UDS. 
>>>>>>>>> Unfortunately, there are 3 threads all doing different flavors of 
>>>>>>>>> event_base and select, so I have a hard time tracking the path of the 
>>>>>>>>> UDS on the server side.
>>>>>>>>> 
>>>>>>>>> So, in order to validate my assumption, I wrote a minimalistic UDS 
>>>>>>>>> client and server application and tried different scenarios. The 
>>>>>>>>> conclusion is that in order to see the same type of output from 
>>>>>>>>> "netstat -ax" I have to call listen on the server, connect on the 
>>>>>>>>> client, and not call accept on the server.
>>>>>>>>> 
>>>>>>>>> On the same occasion I also confirmed that the UDS holds the data 
>>>>>>>>> sent, so there is no need for further synchronization in the case 
>>>>>>>>> where the data is sent first. We only need to find out how the 
>>>>>>>>> server forgets to call accept.
>>>>>>>>> 
>>>>>>>>>   George.
>>>>>>>>> 
>>>>>>>>> 
>>>>>>>>> 
>>>>>>>>> On Tue, Oct 27, 2015 at 7:52 PM, Ralph Castain <r...@open-mpi.org> wrote:
>>>>>>>>> Hmmm…this looks like it might be that problem we previously saw where 
>>>>>>>>> the blocking recv hangs in a proc when the blocking send tries to 
>>>>>>>>> send before the domain socket is actually ready, and so the send 
>>>>>>>>> fails on the other end. As I recall, it was something to do with the 
>>>>>>>>> socket options - and then Paul had a problem on some of his machines, 
>>>>>>>>> and we backed it out?
>>>>>>>>> 
>>>>>>>>> I wonder if that’s what is biting us here again, and what we need is 
>>>>>>>>> to either remove the blocking send/recv’s altogether, or figure out a 
>>>>>>>>> way to wait until the socket is really ready.
>>>>>>>>> 
>>>>>>>>> Any thoughts?
>>>>>>>>> 
>>>>>>>>> 
>>>>>>>>>> On Oct 27, 2015, at 4:11 PM, George Bosilca <bosi...@icl.utk.edu> wrote:
>>>>>>>>>> 
>>>>>>>>>> It appears the branch solves the problem at least partially. I asked 
>>>>>>>>>> one of my students to hammer it pretty badly, and he reported that 
>>>>>>>>>> the deadlocks still occur. He also graciously provided some 
>>>>>>>>>> stacktraces:
>>>>>>>>>> 
>>>>>>>>>> #0  0x00007f4bd5274aed in nanosleep () from /lib64/libc.so.6
>>>>>>>>>> #1  0x00007f4bd52a9c94 in usleep () from /lib64/libc.so.6
>>>>>>>>>> #2  0x00007f4bd2e42b00 in OPAL_PMIX_PMIX1XX_PMIx_Fence (procs=0x0, nprocs=0, info=0x7fff3c561960, ninfo=1) at src/client/pmix_client_fence.c:100
>>>>>>>>>> #3  0x00007f4bd306e6d2 in pmix1_fence (procs=0x0, collect_data=1) at pmix1_client.c:306
>>>>>>>>>> #4  0x00007f4bd57d5cc3 in ompi_mpi_init (argc=3, argv=0x7fff3c561ea8, requested=3, provided=0x7fff3c561d84) at runtime/ompi_mpi_init.c:644
>>>>>>>>>> #5  0x00007f4bd5813399 in PMPI_Init_thread (argc=0x7fff3c561d7c, argv=0x7fff3c561d70, required=3, provided=0x7fff3c561d84) at pinit_thread.c:69
>>>>>>>>>> #6  0x0000000000401516 in main (argc=3, argv=0x7fff3c561ea8) at osu_mbw_mr.c:86
>>>>>>>>>> 
>>>>>>>>>> And another process:
>>>>>>>>>> 
>>>>>>>>>> #0  0x00007f7b9d7d8bdc in recv () from /lib64/libpthread.so.0
>>>>>>>>>> #1  0x00007f7b9b0aa42d in opal_pmix_pmix1xx_pmix_usock_recv_blocking (sd=13, data=0x7ffd62139004 "", size=4) at src/usock/usock.c:168
>>>>>>>>>> #2  0x00007f7b9b0af5d9 in recv_connect_ack (sd=13) at src/client/pmix_client.c:844
>>>>>>>>>> #3  0x00007f7b9b0b085e in usock_connect (addr=0x7ffd62139330) at src/client/pmix_client.c:1110
>>>>>>>>>> #4  0x00007f7b9b0acc24 in connect_to_server (address=0x7ffd62139330, cbdata=0x7ffd621390e0) at src/client/pmix_client.c:181
>>>>>>>>>> #5  0x00007f7b9b0ad569 in OPAL_PMIX_PMIX1XX_PMIx_Init (proc=0x7f7b9b4e9b60) at src/client/pmix_client.c:362
>>>>>>>>>> #6  0x00007f7b9b2dbd9d in pmix1_client_init () at pmix1_client.c:99
>>>>>>>>>> #7  0x00007f7b9b4eb95f in pmi_component_query (module=0x7ffd62139490, priority=0x7ffd6213948c) at ess_pmi_component.c:90
>>>>>>>>>> #8  0x00007f7b9ce70ec5 in mca_base_select (type_name=0x7f7b9d20e059 "ess", output_id=-1, components_available=0x7f7b9d431eb0, best_module=0x7ffd621394d0, best_component=0x7ffd621394d8, priority_out=0x0) at mca_base_components_select.c:77
>>>>>>>>>> #9  0x00007f7b9d1a956b in orte_ess_base_select () at base/ess_base_select.c:40
>>>>>>>>>> #10 0x00007f7b9d160449 in orte_init (pargc=0x0, pargv=0x0, flags=32) at runtime/orte_init.c:219
>>>>>>>>>> #11 0x00007f7b9da4377a in ompi_mpi_init (argc=3, argv=0x7ffd621397f8, requested=3, provided=0x7ffd621396d4) at runtime/ompi_mpi_init.c:488
>>>>>>>>>> #12 0x00007f7b9da81399 in PMPI_Init_thread (argc=0x7ffd621396cc, argv=0x7ffd621396c0, required=3, provided=0x7ffd621396d4) at pinit_thread.c:69
>>>>>>>>>> #13 0x0000000000401516 in main (argc=3, argv=0x7ffd621397f8) at osu_mbw_mr.c:86
>>>>>>>>>> 
>>>>>>>>>>   George.
>>>>>>>>>> 
>>>>>>>>>> 
>>>>>>>>>> 
>>>>>>>>>> On Tue, Oct 27, 2015 at 2:36 PM, Ralph Castain <r...@open-mpi.org> wrote:
>>>>>>>>>> I haven’t been able to replicate this when using the branch in this 
>>>>>>>>>> PR:
>>>>>>>>>> 
>>>>>>>>>> https://github.com/open-mpi/ompi/pull/1073
>>>>>>>>>> 
>>>>>>>>>> Would you mind giving it a try? It fixes some other race conditions 
>>>>>>>>>> and might pick this one up too.
>>>>>>>>>> 
>>>>>>>>>> 
>>>>>>>>>>> On Oct 27, 2015, at 10:04 AM, Ralph Castain <r...@open-mpi.org> wrote:
>>>>>>>>>>> 
>>>>>>>>>>> Okay, I’ll take a look - I’ve been chasing a race condition that 
>>>>>>>>>>> might be related
>>>>>>>>>>> 
>>>>>>>>>>>> On Oct 27, 2015, at 9:54 AM, George Bosilca <bosi...@icl.utk.edu> wrote:
>>>>>>>>>>>> 
>>>>>>>>>>>> No, it's using 2 nodes.
>>>>>>>>>>>>   George.
>>>>>>>>>>>> 
>>>>>>>>>>>> 
>>>>>>>>>>>> On Tue, Oct 27, 2015 at 12:35 PM, Ralph Castain <r...@open-mpi.org> wrote:
>>>>>>>>>>>> Is this on a single node?
>>>>>>>>>>>> 
>>>>>>>>>>>>> On Oct 27, 2015, at 9:25 AM, George Bosilca <bosi...@icl.utk.edu> wrote:
>>>>>>>>>>>>> 
>>>>>>>>>>>>> I get intermittent deadlocks with the latest trunk. The smallest 
>>>>>>>>>>>>> reproducer is a shell for loop around a small (2-process), short 
>>>>>>>>>>>>> (20-second) MPI application. After a few tens of iterations, 
>>>>>>>>>>>>> MPI_Init deadlocks with the following backtrace:
>>>>>>>>>>>>> 
>>>>>>>>>>>>> #0  0x00007fa94b5d9aed in nanosleep () from /lib64/libc.so.6
>>>>>>>>>>>>> #1  0x00007fa94b60ec94 in usleep () from /lib64/libc.so.6
>>>>>>>>>>>>> #2  0x00007fa94960ba08 in OPAL_PMIX_PMIX1XX_PMIx_Fence (procs=0x0, nprocs=0, info=0x7ffd7934fb90, ninfo=1) at src/client/pmix_client_fence.c:100
>>>>>>>>>>>>> #3  0x00007fa9498376a2 in pmix1_fence (procs=0x0, collect_data=1) at pmix1_client.c:305
>>>>>>>>>>>>> #4  0x00007fa94bb39ba4 in ompi_mpi_init (argc=3, argv=0x7ffd793500a8, requested=3, provided=0x7ffd7934ff94) at runtime/ompi_mpi_init.c:645
>>>>>>>>>>>>> #5  0x00007fa94bb77281 in PMPI_Init_thread (argc=0x7ffd7934ff8c, argv=0x7ffd7934ff80, required=3, provided=0x7ffd7934ff94) at pinit_thread.c:69
>>>>>>>>>>>>> #6  0x000000000040150f in main (argc=3, argv=0x7ffd793500a8) at osu_mbw_mr.c:86
>>>>>>>>>>>>> 
>>>>>>>>>>>>> On my machines this is reproducible at 100% after anywhere 
>>>>>>>>>>>>> between 50 and 100 iterations.
>>>>>>>>>>>>> 
>>>>>>>>>>>>>   Thanks,
>>>>>>>>>>>>>     George.
>>>>>>>>>>>>> 
>>>>>>>>>>>> 
>>>>>>>>>>>> 
>>>>>>>>>> 
>>>>>>>>>> 
>>>>>>>>>> 
>>>>>>>>> 
>>>>>>>>> 
>>>>>>> 
>>>>>> 
>>>>> 
>>>>> 
>>>>> 
>>> 
>>> 
>>> 
>>> 
>>> 
>>> -- 
>>> С Уважением, Поляков Артем Юрьевич
>>> Best regards, Artem Y. Polyakov
>> 
>> 
> 
> 
> 
> 
> 
> 
> 
> 
> -- 
> С Уважением, Поляков Артем Юрьевич
> Best regards, Artem Y. Polyakov
> 
> 
> 
> -- 
> С Уважением, Поляков Артем Юрьевич
> Best regards, Artem Y. Polyakov
> 
> 
