Clearly Nysal has a valid point there. I launched a stress test with Nysal's
suggestion in the code, and so far it's up to a few hundred iterations
without deadlock. I would not claim victory yet; I launched a 10k-iteration run to
see where we stand (btw, this never passed before).
I'll let you know the outcome.

  George.


On Mon, Nov 9, 2015 at 11:55 AM, Artem Polyakov <[email protected]> wrote:

>
>
> 2015-11-09 22:42 GMT+06:00 Artem Polyakov <[email protected]>:
>
>> This is a very good point, Nysal!
>>
>> This is definitely a problem, and I can say even more: on average 3 out of
>> every 10 tasks were affected by this bug. Once the PR
>> (https://github.com/pmix/master/pull/8) was applied, I was able to run 100
>> testing tasks without any hangs.
>>
>> Here is some more information on my symptoms. I was observing this without
>> OMPI, just running the pmix_client test binary from the PMIx test suite with
>> the SLURM PMIx plugin.
>> Periodically the application was hanging. Investigation showed that not all
>> processes were able to initialize correctly.
>> Here is what such a client's backtrace looks like:
>>
>
> P.S. I think this backtrace may be relevant to George's problem as
> well. In my case not all of the processes were hanging in
> connect_to_server; most of them were able to move forward and reach Fence.
> George, was the backtrace that you've posted the same on both processes, or
> was it a "random" one from one of them?
>
>
>> (gdb) bt
>> #0  0x00007f1448f1b7eb in recv () from
>> /lib/x86_64-linux-gnu/libpthread.so.0
>> #1  0x00007f144914c191 in pmix_usock_recv_blocking (sd=9,
>> data=0x7fff367f7c64 "", size=4) at src/usock/usock.c:166
>> #2  0x00007f1449152d18 in recv_connect_ack (sd=9) at
>> src/client/pmix_client.c:837
>> #3  0x00007f14491546bf in usock_connect (addr=0x7fff367f7d60) at
>> src/client/pmix_client.c:1103
>> #4  0x00007f144914f94c in connect_to_server (address=0x7fff367f7d60,
>> cbdata=0x7fff367f7dd0) at src/client/pmix_client.c:179
>> #5  0x00007f1449150421 in PMIx_Init (proc=0x7fff367f81d0) at
>> src/client/pmix_client.c:355
>> #6  0x0000000000401b97 in main (argc=9, argv=0x7fff367f83d8) at
>> pmix_client.c:62
>>
>>
>> The server-side debug has the following lines at the end of the file:
>> [cn33:00482] pmix:server register client slurm.pmix.22.0:10
>> [cn33:00482] pmix:server _register_client for nspace slurm.pmix.22.0 rank
>> 10
>> [cn33:00482] pmix:server setup_fork for nspace slurm.pmix.22.0 rank 10
>>
>> In normal operation, the following lines should appear after the lines above:
>> ....
>> [cn33:00188] listen_thread: new connection: (26, 0)
>> [cn33:00188] connection_handler: new connection: 26
>> [cn33:00188] RECV CONNECT ACK FROM PEER ON SOCKET 26
>> [cn33:00188] waiting for blocking recv of 16 bytes
>> [cn33:00188] blocking receive complete from remote
>> ....
>>
>> On the client side I see the following lines:
>> [cn33:00491] usock_peer_try_connect: attempting to connect to server
>> [cn33:00491] usock_peer_try_connect: attempting to connect to server on
>> socket 10
>> [cn33:00491] pmix: SEND CONNECT ACK
>> [cn33:00491] sec: native create_cred
>> [cn33:00491] sec: using credential 1000:1000
>> [cn33:00491] send blocking of 54 bytes to socket 10
>> [cn33:00491] blocking send complete to socket 10
>> [cn33:00491] pmix: RECV CONNECT ACK FROM SERVER
>> [cn33:00491] waiting for blocking recv of 4 bytes
>> [cn33:00491] blocking_recv received error 11:Resource temporarily
>> unavailable from remote - cycling
>> [cn33:00491] blocking_recv received error 11:Resource temporarily
>> unavailable from remote - cycling
>> [... repeated many times ...]
>>
>> With the fix for the problem highlighted by Nysal, everything runs cleanly.
>>
>>
>> 2015-11-09 10:53 GMT+06:00 Nysal Jan K A <[email protected]>:
>>
>>> In listen_thread():
>>> 194     while (pmix_server_globals.listen_thread_active) {
>>> 195         FD_ZERO(&readfds);
>>> 196         FD_SET(pmix_server_globals.listen_socket, &readfds);
>>> 197         max = pmix_server_globals.listen_socket;
>>>
>>> Is it possible that pmix_server_globals.listen_thread_active can be
>>> false, in which case the thread just exits and never calls accept()?
>>>
>>> In pmix_start_listening():
>>> 147         /* fork off the listener thread */
>>> 148         if (0 > pthread_create(&engine, NULL, listen_thread, NULL)) {
>>> 149             return PMIX_ERROR;
>>> 150         }
>>> 151         pmix_server_globals.listen_thread_active = true;
>>>
>>> pmix_server_globals.listen_thread_active is set to true after the thread
>>> is created; could this cause a race?
>>> listen_thread_active might also need to be declared volatile.
>>>
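>>> If that ordering is indeed the culprit, a minimal sketch of the reordering
>>> (an illustration reusing the identifiers above, not the actual PMIx patch)
>>> could look like:
>>>
>>>         /* set the flag *before* spawning the thread, so listen_thread()
>>>          * cannot observe it as false and exit before ever calling accept() */
>>>         pmix_server_globals.listen_thread_active = true;
>>>         if (0 > pthread_create(&engine, NULL, listen_thread, NULL)) {
>>>             pmix_server_globals.listen_thread_active = false;
>>>             return PMIX_ERROR;
>>>         }
>>>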
>>> Regards
>>> --Nysal
>>>
>>> On Sun, Nov 8, 2015 at 10:38 PM, George Bosilca <[email protected]>
>>> wrote:
>>>
>>>> We had a power outage last week and the local disks on our cluster were
>>>> wiped out. My tester was in there. But I can rewrite it after SC.
>>>>
>>>>   George.
>>>>
>>>> On Sat, Nov 7, 2015 at 12:04 PM, Ralph Castain <[email protected]>
>>>> wrote:
>>>>
>>>>> Could you send me your stress test? I’m wondering if it is just
>>>>> something about how we set socket options.
>>>>>
>>>>>
>>>>> On Nov 7, 2015, at 8:58 AM, George Bosilca <[email protected]>
>>>>> wrote:
>>>>>
>>>>> I had to postpone this until after SC. However, I ran for 3 days a
>>>>> stress test of UDS reproducing the opening and sending of data (what Ralph
>>>>> said in his email), and I could never get a deadlock.
>>>>>
>>>>>   George.
>>>>>
>>>>>
>>>>> On Sat, Nov 7, 2015 at 11:26 AM, Ralph Castain <[email protected]>
>>>>> wrote:
>>>>>
>>>>>> George was looking into it, but I don’t know if he has had time
>>>>>> recently to continue the investigation. We understand “what” is happening
>>>>>> (accept sometimes ignores the connection), but we don’t yet know “why”.
>>>>>> I’ve done some digging around the web, and found that sometimes you can try
>>>>>> to talk to a Unix Domain Socket too quickly - i.e., you open it and then
>>>>>> send to it, but the OS hasn’t yet set it up. In those cases, you can hang
>>>>>> the socket. However, I’ve tried adding some artificial delay, and while it
>>>>>> helped, it didn’t completely solve the problem.
>>>>>>
>>>>>> I have an idea for a workaround (set a timer and retry after a while),
>>>>>> but would obviously prefer a real solution. I’m not even sure it will work,
>>>>>> as it is unclear that the server (which is the one hung in accept) will
>>>>>> break free if the client closes the socket and retries.
>>>>>>
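>>>>>> For illustration, a client-side retry of that flavor could look roughly like
>>>>>> the sketch below (generic socket code with made-up names, not the PMIx
>>>>>> client):
>>>>>>
>>>>>> /* Hypothetical "timer and retry" sketch: attempt the connection, wait a
>>>>>>  * bounded time for the server to answer, otherwise close and start over. */
>>>>>> #include <poll.h>
>>>>>> #include <string.h>
>>>>>> #include <sys/socket.h>
>>>>>> #include <sys/un.h>
>>>>>> #include <unistd.h>
>>>>>>
>>>>>> static int connect_with_retry(const char *path, int max_tries, int timeout_ms)
>>>>>> {
>>>>>>     struct sockaddr_un addr;
>>>>>>     memset(&addr, 0, sizeof(addr));
>>>>>>     addr.sun_family = AF_UNIX;
>>>>>>     strncpy(addr.sun_path, path, sizeof(addr.sun_path) - 1);
>>>>>>
>>>>>>     for (int attempt = 0; attempt < max_tries; attempt++) {
>>>>>>         int sd = socket(AF_UNIX, SOCK_STREAM, 0);
>>>>>>         if (sd < 0) {
>>>>>>             return -1;
>>>>>>         }
>>>>>>         if (0 == connect(sd, (struct sockaddr *)&addr, sizeof(addr))) {
>>>>>>             /* the real code would send its connect-ack here, then wait a
>>>>>>              * bounded time for the reply instead of blocking in recv() */
>>>>>>             struct pollfd pfd = { .fd = sd, .events = POLLIN };
>>>>>>             if (poll(&pfd, 1, timeout_ms) > 0 && (pfd.revents & POLLIN)) {
>>>>>>                 return sd;   /* server answered - proceed with the handshake */
>>>>>>             }
>>>>>>         }
>>>>>>         close(sd);           /* timed out - drop the socket and retry */
>>>>>>     }
>>>>>>     return -1;
>>>>>> }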
>>>>>>
>>>>>> On Nov 6, 2015, at 10:53 PM, Artem Polyakov <[email protected]>
>>>>>> wrote:
>>>>>>
>>>>>> Hello, is there any progress on this topic? This affects our PMIx
>>>>>> measurements.
>>>>>>
>>>>>> 2015-10-30 21:21 GMT+06:00 Ralph Castain <[email protected]>:
>>>>>>
>>>>>>> I’ve verified that the orte/util/listener thread is not being
>>>>>>> started, so I don’t think it should be involved in this problem.
>>>>>>>
>>>>>>> HTH
>>>>>>> Ralph
>>>>>>>
>>>>>>> On Oct 30, 2015, at 8:07 AM, Ralph Castain <[email protected]> wrote:
>>>>>>>
>>>>>>> Hmmm…there is a hook that would allow the PMIx server to utilize
>>>>>>> that listener thread, but we aren’t currently using it. Each daemon plus
>>>>>>> mpirun will call orte_start_listener, but nothing is currently 
>>>>>>> registering
>>>>>>> and so the listener in that code is supposed to just return without
>>>>>>> starting the thread.
>>>>>>>
>>>>>>> So the only listener thread that should exist is the one inside the
>>>>>>> PMIx server itself. If something else is happening, then that would be a
>>>>>>> bug. I can look at the orte listener code to ensure that the thread 
>>>>>>> isn’t
>>>>>>> incorrectly starting.
>>>>>>>
>>>>>>>
>>>>>>> On Oct 29, 2015, at 10:03 PM, George Bosilca <[email protected]>
>>>>>>> wrote:
>>>>>>>
>>>>>>> Some progress that puzzles me but might help you understand: once the
>>>>>>> deadlock appears, if I manually kill the MPI process on the node where
>>>>>>> the deadlock was created, the local orte daemon doesn't notice and just
>>>>>>> keeps waiting.
>>>>>>>
>>>>>>> Quick question: I am under the impression that the issue is not in
>>>>>>> the PMIx server but somewhere around the listener_thread_fn in
>>>>>>> orte/util/listener.c. Possible?
>>>>>>>
>>>>>>>   George.
>>>>>>>
>>>>>>>
>>>>>>> On Wed, Oct 28, 2015 at 3:56 AM, Ralph Castain <[email protected]>
>>>>>>> wrote:
>>>>>>>
>>>>>>>> Should have also clarified: the prior fixes are indeed in the
>>>>>>>> current master.
>>>>>>>>
>>>>>>>> On Oct 28, 2015, at 12:42 AM, Ralph Castain <[email protected]>
>>>>>>>> wrote:
>>>>>>>>
>>>>>>>> Nope - I was wrong. The correction on the client side consisted of
>>>>>>>> attempting to timeout if the blocking recv failed. We then modified the
>>>>>>>> blocking send/recv so they would handle errors.
>>>>>>>>
>>>>>>>> So that problem occurred -after- the server had correctly called
>>>>>>>> accept. The listener code is in
>>>>>>>> opal/mca/pmix/pmix1xx/pmix/src/server/pmix_server_listener.c
>>>>>>>>
>>>>>>>> It looks to me like the only way we could drop the accept (assuming
>>>>>>>> the OS doesn’t lose it) is if the file descriptor lies outside the 
>>>>>>>> expected
>>>>>>>> range once we fall out of select:
>>>>>>>>
>>>>>>>>
>>>>>>>>         /* Spin accepting connections until all active listen sockets
>>>>>>>>          * do not have any incoming connections, pushing each connection
>>>>>>>>          * onto the event queue for processing
>>>>>>>>          */
>>>>>>>>         do {
>>>>>>>>             accepted_connections = 0;
>>>>>>>>             /* according to the man pages, select replaces the given
>>>>>>>>              * descriptor set with a subset consisting of those descriptors
>>>>>>>>              * that are ready for the specified operation - in this case,
>>>>>>>>              * a read. So we need to first check to see if this file
>>>>>>>>              * descriptor is included in the returned subset
>>>>>>>>              */
>>>>>>>>             if (0 == FD_ISSET(pmix_server_globals.listen_socket, &readfds)) {
>>>>>>>>                 /* this descriptor is not included */
>>>>>>>>                 continue;
>>>>>>>>             }
>>>>>>>>
>>>>>>>>             /* this descriptor is ready to be read, which means a connection
>>>>>>>>              * request has been received - so harvest it. All we want to do
>>>>>>>>              * here is accept the connection and push the info onto the event
>>>>>>>>              * library for subsequent processing - we don't want to actually
>>>>>>>>              * process the connection here as it takes too long, and so the
>>>>>>>>              * OS might start rejecting connections due to timeout.
>>>>>>>>              */
>>>>>>>>             pending_connection = PMIX_NEW(pmix_pending_connection_t);
>>>>>>>>             event_assign(&pending_connection->ev, pmix_globals.evbase, -1,
>>>>>>>>                          EV_WRITE, connection_handler, pending_connection);
>>>>>>>>             pending_connection->sd = accept(pmix_server_globals.listen_socket,
>>>>>>>>                                             (struct sockaddr*)&(pending_connection->addr),
>>>>>>>>                                             &addrlen);
>>>>>>>>             if (pending_connection->sd < 0) {
>>>>>>>>                 PMIX_RELEASE(pending_connection);
>>>>>>>>                 if (pmix_socket_errno != EAGAIN ||
>>>>>>>>                     pmix_socket_errno != EWOULDBLOCK) {
>>>>>>>>                     if (EMFILE == pmix_socket_errno) {
>>>>>>>>                         PMIX_ERROR_LOG(PMIX_ERR_OUT_OF_RESOURCE);
>>>>>>>>                     } else {
>>>>>>>>                         pmix_output(0, "listen_thread: accept() failed: %s (%d).",
>>>>>>>>                                     strerror(pmix_socket_errno), pmix_socket_errno);
>>>>>>>>                     }
>>>>>>>>                     goto done;
>>>>>>>>                 }
>>>>>>>>                 continue;
>>>>>>>>             }
>>>>>>>>
>>>>>>>>             pmix_output_verbose(8, pmix_globals.debug_output,
>>>>>>>>                                 "listen_thread: new connection: (%d, %d)",
>>>>>>>>                                 pending_connection->sd, pmix_socket_errno);
>>>>>>>>             /* activate the event */
>>>>>>>>             event_active(&pending_connection->ev, EV_WRITE, 1);
>>>>>>>>             accepted_connections++;
>>>>>>>>         } while (accepted_connections > 0);
>>>>>>>>
>>>>>>>>
>>>>>>>> On Oct 28, 2015, at 12:25 AM, Ralph Castain <[email protected]>
>>>>>>>> wrote:
>>>>>>>>
>>>>>>>> Looking at the code, it appears that a fix was committed for this
>>>>>>>> problem, and that we correctly resolved the issue found by Paul. The
>>>>>>>> problem is that the fix didn’t get upstreamed, and so it was lost the 
>>>>>>>> next
>>>>>>>> time we refreshed PMIx. Sigh.
>>>>>>>>
>>>>>>>> Let me try to recreate the fix and have you take a gander at it.
>>>>>>>>
>>>>>>>>
>>>>>>>> On Oct 28, 2015, at 12:22 AM, Ralph Castain <[email protected]>
>>>>>>>> wrote:
>>>>>>>>
>>>>>>>> Here is the discussion - afraid it is fairly lengthy. Ignore the
>>>>>>>> hwloc references in it as that was a separate issue:
>>>>>>>>
>>>>>>>> http://www.open-mpi.org/community/lists/devel/2015/09/18074.php
>>>>>>>>
>>>>>>>> It definitely sounds like the same issue creeping in again. I’d
>>>>>>>> appreciate any thoughts on how to correct it. If it helps, you could 
>>>>>>>> look
>>>>>>>> at the PMIx master - there are standalone tests in the test/simple
>>>>>>>> directory that fork/exec a child and just do the connection.
>>>>>>>>
>>>>>>>> https://github.com/pmix/master
>>>>>>>>
>>>>>>>> The test server is simptest.c - it will spawn a single copy of
>>>>>>>> simpclient.c by default.
>>>>>>>>
>>>>>>>>
>>>>>>>> On Oct 27, 2015, at 10:14 PM, George Bosilca <[email protected]>
>>>>>>>> wrote:
>>>>>>>>
>>>>>>>> Interesting. Do you have a pointer to the commit (and/or to the
>>>>>>>> discussion)?
>>>>>>>>
>>>>>>>> I looked at the PMIx code and identified a few issues, but
>>>>>>>> unfortunately none of them seems to fix the problem for good. However,
>>>>>>>> I now need more than 1000 runs to get a deadlock (instead of a few tens).
>>>>>>>>
>>>>>>>> Looking with "netstat -ax" at the status of the UDS while the
>>>>>>>> processes are deadlocked, I see 2 UDS with the same name: one from the
>>>>>>>> server, which is in LISTEN state, and one from the client, which is in
>>>>>>>> CONNECTING state (while the client has already sent a message on the socket
>>>>>>>> and is now waiting in a blocking receive). This somehow suggests that the
>>>>>>>> server has not yet called accept on the UDS. Unfortunately, there are 3
>>>>>>>> threads all doing different flavors of event_base and select, so I have a
>>>>>>>> hard time tracking the path of the UDS on the server side.
>>>>>>>>
>>>>>>>> So in order to validate my assumption I wrote a minimalistic UDS
>>>>>>>> client and server application and tried different scenarios. The conclusion
>>>>>>>> is that in order to see the same type of output from "netstat -ax" I have
>>>>>>>> to call listen on the server, connect on the client, and not call accept
>>>>>>>> on the server.
>>>>>>>>
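>>>>>>>> For reference, a minimalistic reproducer along those lines (an illustrative
>>>>>>>> rewrite, not my original tester) can be as small as:
>>>>>>>>
>>>>>>>> /* The server binds and listens but deliberately never calls accept();
>>>>>>>>  * the client connects, sends, and then blocks in recv(). "netstat -ax"
>>>>>>>>  * then shows two sockets with the same name: the server's in LISTEN
>>>>>>>>  * state and the client's in CONNECTING. */
>>>>>>>> #include <string.h>
>>>>>>>> #include <sys/socket.h>
>>>>>>>> #include <sys/un.h>
>>>>>>>> #include <unistd.h>
>>>>>>>>
>>>>>>>> #define SOCK_PATH "/tmp/uds_accept_test"
>>>>>>>>
>>>>>>>> int main(void)
>>>>>>>> {
>>>>>>>>     struct sockaddr_un addr;
>>>>>>>>     memset(&addr, 0, sizeof(addr));
>>>>>>>>     addr.sun_family = AF_UNIX;
>>>>>>>>     strncpy(addr.sun_path, SOCK_PATH, sizeof(addr.sun_path) - 1);
>>>>>>>>     unlink(SOCK_PATH);
>>>>>>>>
>>>>>>>>     int lsd = socket(AF_UNIX, SOCK_STREAM, 0);
>>>>>>>>     bind(lsd, (struct sockaddr *)&addr, sizeof(addr));
>>>>>>>>     listen(lsd, 8);
>>>>>>>>
>>>>>>>>     if (0 == fork()) {                  /* the child plays the client */
>>>>>>>>         int sd = socket(AF_UNIX, SOCK_STREAM, 0);
>>>>>>>>         connect(sd, (struct sockaddr *)&addr, sizeof(addr));
>>>>>>>>         send(sd, "ack", 4, 0);          /* queued even without accept */
>>>>>>>>         char buf[4];
>>>>>>>>         recv(sd, buf, sizeof(buf), 0);  /* blocks forever - nobody answers */
>>>>>>>>         return 0;
>>>>>>>>     }
>>>>>>>>
>>>>>>>>     pause();                            /* parent never calls accept() */
>>>>>>>>     return 0;
>>>>>>>> }
>>>>>>>>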
>>>>>>>> On the same occasion I also confirmed that the UDS holds the data
>>>>>>>> sent, so there is no need for further synchronization for the case where
>>>>>>>> the data is sent first. We only need to find out how the server forgets
>>>>>>>> to call accept.
>>>>>>>>
>>>>>>>>   George.
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>> On Tue, Oct 27, 2015 at 7:52 PM, Ralph Castain <[email protected]>
>>>>>>>> wrote:
>>>>>>>>
>>>>>>>>> Hmmm…this looks like it might be that problem we previously saw
>>>>>>>>> where the blocking recv hangs in a proc when the blocking send tries to
>>>>>>>>> send before the domain socket is actually ready, and so the send fails on
>>>>>>>>> the other end. As I recall, it was something to do with the socket options -
>>>>>>>>> and then Paul had a problem on some of his machines, and we backed it out?
>>>>>>>>>
>>>>>>>>> I wonder if that’s what is biting us here again, and what we need
>>>>>>>>> is to either remove the blocking send/recv’s altogether, or figure out a
>>>>>>>>> way to wait until the socket is really ready.
>>>>>>>>>
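>>>>>>>>> One generic way to "wait until the socket is really ready" (a sketch of
>>>>>>>>> the idea only, not a claim about what the PMIx code should do) would be a
>>>>>>>>> non-blocking connect, then polling for writability and checking SO_ERROR
>>>>>>>>> before the first blocking send:
>>>>>>>>>
>>>>>>>>> #include <errno.h>
>>>>>>>>> #include <fcntl.h>
>>>>>>>>> #include <poll.h>
>>>>>>>>> #include <sys/socket.h>
>>>>>>>>>
>>>>>>>>> /* returns 0 once the connection is established, -1 on timeout/failure */
>>>>>>>>> static int connect_when_ready(int sd, const struct sockaddr *addr,
>>>>>>>>>                               socklen_t addrlen, int timeout_ms)
>>>>>>>>> {
>>>>>>>>>     /* make the connect non-blocking so the wait can be bounded */
>>>>>>>>>     fcntl(sd, F_SETFL, fcntl(sd, F_GETFL, 0) | O_NONBLOCK);
>>>>>>>>>
>>>>>>>>>     if (connect(sd, addr, addrlen) < 0 &&
>>>>>>>>>         errno != EINPROGRESS && errno != EAGAIN) { /* UDS may report EAGAIN */
>>>>>>>>>         return -1;
>>>>>>>>>     }
>>>>>>>>>
>>>>>>>>>     /* the socket becomes writable once the connection has been queued */
>>>>>>>>>     struct pollfd pfd = { .fd = sd, .events = POLLOUT };
>>>>>>>>>     if (poll(&pfd, 1, timeout_ms) <= 0) {
>>>>>>>>>         return -1;                  /* not ready within the timeout */
>>>>>>>>>     }
>>>>>>>>>
>>>>>>>>>     int err = 0;
>>>>>>>>>     socklen_t len = sizeof(err);
>>>>>>>>>     if (getsockopt(sd, SOL_SOCKET, SO_ERROR, &err, &len) < 0 || 0 != err) {
>>>>>>>>>         return -1;                  /* the connect failed on the other end */
>>>>>>>>>     }
>>>>>>>>>     return 0;                       /* now safe to do the blocking handshake */
>>>>>>>>> }
>>>>>>>>>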
>>>>>>>>> Any thoughts?
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> On Oct 27, 2015, at 4:11 PM, George Bosilca <[email protected]>
>>>>>>>>> wrote:
>>>>>>>>>
>>>>>>>>> It appears the branch solves the problem at least partially. I asked
>>>>>>>>> one of my students to hammer it pretty hard, and he reported that the
>>>>>>>>> deadlocks still occur. He also graciously provided some stack traces:
>>>>>>>>>
>>>>>>>>> #0  0x00007f4bd5274aed in nanosleep () from /lib64/libc.so.6
>>>>>>>>> #1  0x00007f4bd52a9c94 in usleep () from /lib64/libc.so.6
>>>>>>>>> #2  0x00007f4bd2e42b00 in OPAL_PMIX_PMIX1XX_PMIx_Fence (procs=0x0,
>>>>>>>>> nprocs=0, info=0x7fff3c561960,
>>>>>>>>>     ninfo=1) at src/client/pmix_client_fence.c:100
>>>>>>>>> #3  0x00007f4bd306e6d2 in pmix1_fence (procs=0x0, collect_data=1)
>>>>>>>>> at pmix1_client.c:306
>>>>>>>>> #4  0x00007f4bd57d5cc3 in ompi_mpi_init (argc=3,
>>>>>>>>> argv=0x7fff3c561ea8, requested=3,
>>>>>>>>>     provided=0x7fff3c561d84) at runtime/ompi_mpi_init.c:644
>>>>>>>>> #5  0x00007f4bd5813399 in PMPI_Init_thread (argc=0x7fff3c561d7c,
>>>>>>>>> argv=0x7fff3c561d70, required=3,
>>>>>>>>>     provided=0x7fff3c561d84) at pinit_thread.c:69
>>>>>>>>> #6  0x0000000000401516 in main (argc=3, argv=0x7fff3c561ea8) at
>>>>>>>>> osu_mbw_mr.c:86
>>>>>>>>>
>>>>>>>>> And another process:
>>>>>>>>>
>>>>>>>>> #0  0x00007f7b9d7d8bdc in recv () from /lib64/libpthread.so.0
>>>>>>>>> #1  0x00007f7b9b0aa42d in
>>>>>>>>> opal_pmix_pmix1xx_pmix_usock_recv_blocking (sd=13, 
>>>>>>>>> data=0x7ffd62139004 "",
>>>>>>>>>     size=4) at src/usock/usock.c:168
>>>>>>>>> #2  0x00007f7b9b0af5d9 in recv_connect_ack (sd=13) at
>>>>>>>>> src/client/pmix_client.c:844
>>>>>>>>> #3  0x00007f7b9b0b085e in usock_connect (addr=0x7ffd62139330) at
>>>>>>>>> src/client/pmix_client.c:1110
>>>>>>>>> #4  0x00007f7b9b0acc24 in connect_to_server
>>>>>>>>> (address=0x7ffd62139330, cbdata=0x7ffd621390e0)
>>>>>>>>>     at src/client/pmix_client.c:181
>>>>>>>>> #5  0x00007f7b9b0ad569 in OPAL_PMIX_PMIX1XX_PMIx_Init
>>>>>>>>> (proc=0x7f7b9b4e9b60)
>>>>>>>>>     at src/client/pmix_client.c:362
>>>>>>>>> #6  0x00007f7b9b2dbd9d in pmix1_client_init () at pmix1_client.c:99
>>>>>>>>> #7  0x00007f7b9b4eb95f in pmi_component_query
>>>>>>>>> (module=0x7ffd62139490, priority=0x7ffd6213948c)
>>>>>>>>>     at ess_pmi_component.c:90
>>>>>>>>> #8  0x00007f7b9ce70ec5 in mca_base_select
>>>>>>>>> (type_name=0x7f7b9d20e059 "ess", output_id=-1,
>>>>>>>>>     components_available=0x7f7b9d431eb0,
>>>>>>>>> best_module=0x7ffd621394d0, best_component=0x7ffd621394d8,
>>>>>>>>>     priority_out=0x0) at mca_base_components_select.c:77
>>>>>>>>> #9  0x00007f7b9d1a956b in orte_ess_base_select () at
>>>>>>>>> base/ess_base_select.c:40
>>>>>>>>> #10 0x00007f7b9d160449 in orte_init (pargc=0x0, pargv=0x0,
>>>>>>>>> flags=32) at runtime/orte_init.c:219
>>>>>>>>> #11 0x00007f7b9da4377a in ompi_mpi_init (argc=3,
>>>>>>>>> argv=0x7ffd621397f8, requested=3,
>>>>>>>>>     provided=0x7ffd621396d4) at runtime/ompi_mpi_init.c:488
>>>>>>>>> #12 0x00007f7b9da81399 in PMPI_Init_thread (argc=0x7ffd621396cc,
>>>>>>>>> argv=0x7ffd621396c0, required=3,
>>>>>>>>>     provided=0x7ffd621396d4) at pinit_thread.c:69
>>>>>>>>> #13 0x0000000000401516 in main (argc=3, argv=0x7ffd621397f8) at
>>>>>>>>> osu_mbw_mr.c:86
>>>>>>>>>
>>>>>>>>>   George.
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> On Tue, Oct 27, 2015 at 2:36 PM, Ralph Castain <[email protected]>
>>>>>>>>> wrote:
>>>>>>>>>
>>>>>>>>>> I haven’t been able to replicate this when using the branch in
>>>>>>>>>> this PR:
>>>>>>>>>>
>>>>>>>>>> https://github.com/open-mpi/ompi/pull/1073
>>>>>>>>>>
>>>>>>>>>> Would you mind giving it a try? It fixes some other race
>>>>>>>>>> conditions and might pick this one up too.
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> On Oct 27, 2015, at 10:04 AM, Ralph Castain <[email protected]>
>>>>>>>>>> wrote:
>>>>>>>>>>
>>>>>>>>>> Okay, I’ll take a look - I’ve been chasing a race condition that
>>>>>>>>>> might be related.
>>>>>>>>>>
>>>>>>>>>> On Oct 27, 2015, at 9:54 AM, George Bosilca <[email protected]>
>>>>>>>>>> wrote:
>>>>>>>>>>
>>>>>>>>>> No, it's using 2 nodes.
>>>>>>>>>>   George.
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> On Tue, Oct 27, 2015 at 12:35 PM, Ralph Castain <[email protected]
>>>>>>>>>> > wrote:
>>>>>>>>>>
>>>>>>>>>>> Is this on a single node?
>>>>>>>>>>>
>>>>>>>>>>> On Oct 27, 2015, at 9:25 AM, George Bosilca <[email protected]>
>>>>>>>>>>> wrote:
>>>>>>>>>>>
>>>>>>>>>>> I get intermittent deadlocks with the latest trunk. The smallest
>>>>>>>>>>> reproducer is a shell for loop around a small (2 processes), short
>>>>>>>>>>> (20 seconds) MPI application. After a few tens of iterations, MPI_Init
>>>>>>>>>>> will deadlock with the following backtrace:
>>>>>>>>>>>
>>>>>>>>>>> #0  0x00007fa94b5d9aed in nanosleep () from /lib64/libc.so.6
>>>>>>>>>>> #1  0x00007fa94b60ec94 in usleep () from /lib64/libc.so.6
>>>>>>>>>>> #2  0x00007fa94960ba08 in OPAL_PMIX_PMIX1XX_PMIx_Fence
>>>>>>>>>>> (procs=0x0, nprocs=0, info=0x7ffd7934fb90,
>>>>>>>>>>>     ninfo=1) at src/client/pmix_client_fence.c:100
>>>>>>>>>>> #3  0x00007fa9498376a2 in pmix1_fence (procs=0x0,
>>>>>>>>>>> collect_data=1) at pmix1_client.c:305
>>>>>>>>>>> #4  0x00007fa94bb39ba4 in ompi_mpi_init (argc=3,
>>>>>>>>>>> argv=0x7ffd793500a8, requested=3,
>>>>>>>>>>>     provided=0x7ffd7934ff94) at runtime/ompi_mpi_init.c:645
>>>>>>>>>>> #5  0x00007fa94bb77281 in PMPI_Init_thread (argc=0x7ffd7934ff8c,
>>>>>>>>>>> argv=0x7ffd7934ff80, required=3,
>>>>>>>>>>>     provided=0x7ffd7934ff94) at pinit_thread.c:69
>>>>>>>>>>> #6  0x000000000040150f in main (argc=3, argv=0x7ffd793500a8) at
>>>>>>>>>>> osu_mbw_mr.c:86
>>>>>>>>>>>
>>>>>>>>>>> On my machines this is reproducible at 100% after anywhere
>>>>>>>>>>> between 50 and 100 iterations.
>>>>>>>>>>>
>>>>>>>>>>>   Thanks,
>>>>>>>>>>>     George.
>>>>>>>>>>>
>>>>>>
>>>>>> --
>>>>>> С Уважением, Поляков Артем Юрьевич
>>>>>> Best regards, Artem Y. Polyakov
>>
>> --
>> С Уважением, Поляков Артем Юрьевич
>> Best regards, Artem Y. Polyakov
>>
>
>
>
> --
> С Уважением, Поляков Артем Юрьевич
> Best regards, Artem Y. Polyakov
>
>
