Hi Gilles

I took a crack at solving this in r32744 - CMR'd it for 1.8.3 and assigned it to you for review. Give it a try and let me know if I (hopefully) got it.
The approach we have used in the past is to have both sides close their connections, and then have the higher vpid retry while the lower one waits. The logic for that was still in place, but it looks like you are hitting a different code path, and I found another potential one as well. So I think I plugged the holes, but will wait to hear if you confirm. (A rough sketch of that tie-break is included below the quoted message.)

Thanks
Ralph

On Sep 16, 2014, at 6:27 AM, Gilles Gouaillardet <gilles.gouaillar...@gmail.com> wrote:

> Ralph,
>
> here is the full description of a race condition in oob/tcp that I very briefly mentioned in a previous post:
>
> the race condition can occur when two not-yet-connected orted daemons try to send a message to each other for the first time and at the same time.
>
> that can occur when running an MPI hello world on 4 nodes with the grpcomm/rcd module.
>
> here is a scenario in which the race condition occurs:
>
> orted vpid 2 and vpid 3 enter the allgather
> /* they are not yet orte oob/tcp connected */
> and they call orte.send_buffer_nb to each other.
> from a libevent point of view, vpid 2 and 3 will call mca_oob_tcp_peer_try_connect
>
> vpid 2 calls mca_oob_tcp_send_handler
>
> vpid 3 calls connection_event_handler
>
> depending on the value returned by random() in libevent, vpid 3 will either call mca_oob_tcp_send_handler (likely) or recv_handler (unlikely).
> if vpid 3 calls recv_handler, it will close the two sockets to vpid 2
>
> then vpid 2 will call mca_oob_tcp_recv_handler
> (peer->state is MCA_OOB_TCP_CONNECT_ACK)
> that will invoke mca_oob_tcp_recv_connect_ack
> tcp_peer_recv_blocking will fail
> /* zero bytes are recv'ed since vpid 3 previously closed the socket before writing a header */
> and this is handled by mca_oob_tcp_recv_handler as a fatal error
> /* ORTE_FORCED_TERMINATE(1) */
>
> could you please have a look at it?
>
> if you are too busy, could you please advise where this scenario should be handled differently?
> - should vpid 3 keep one socket instead of closing both and retrying?
> - should vpid 2 handle the failure as a non-fatal error?
>
> Cheers,
>
> Gilles
> _______________________________________________
> devel mailing list
> de...@open-mpi.org
> Subscription: http://www.open-mpi.org/mailman/listinfo.cgi/devel
> Link to this post:
> http://www.open-mpi.org/community/lists/devel/2014/09/15836.php
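
For reference, here is a minimal sketch (plain C, not the actual oob/tcp sources) of the tie-break described at the top of this message: when both daemons detect a cross-connect, each side drops its duplicated sockets, then only the higher vpid re-initiates the connection while the lower one waits to be connected. The peer_t type and the start_connect()/wait_for_accept() helpers are invented placeholders for the real connect/accept machinery.

/* hypothetical sketch only - not the Open MPI oob/tcp implementation */
#include <stdint.h>
#include <unistd.h>

typedef uint32_t vpid_t;

typedef struct {
    vpid_t vpid;     /* remote peer's vpid                */
    int    out_sd;   /* our outgoing socket, or -1        */
    int    in_sd;    /* the peer's incoming socket, or -1 */
} peer_t;

/* placeholders for the real connect/accept machinery */
extern void start_connect(peer_t *peer);    /* re-initiate the outgoing connect */
extern void wait_for_accept(peer_t *peer);  /* passively wait for the peer      */

static void close_both(peer_t *peer)
{
    if (peer->out_sd >= 0) { close(peer->out_sd); peer->out_sd = -1; }
    if (peer->in_sd  >= 0) { close(peer->in_sd);  peer->in_sd  = -1; }
}

/* called when both an outgoing and an incoming connection to the
 * same peer exist at the same time (the "cross-connect" race) */
void resolve_connection_race(peer_t *peer, vpid_t my_vpid)
{
    /* step 1: both sides throw away the duplicated sockets */
    close_both(peer);

    /* step 2: deterministic tie-break so exactly one side reconnects */
    if (my_vpid > peer->vpid) {
        start_connect(peer);     /* higher vpid retries            */
    } else {
        wait_for_accept(peer);   /* lower vpid waits for the retry */
    }
}

The point of the vpid comparison is that both daemons reach the same decision without exchanging any extra messages, so the race cannot repeat: one side always retries and the other always waits.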