> -----Original Message-----
> From: Sean Hefty [mailto:[EMAIL PROTECTED]
> Sent: Wednesday, April 11, 2007 4:50 PM
> To: Tang, Changqing
> Cc: [email protected]
> Subject: Re: How fast to get RDMA_CM_EVENT_DISCONNECTED ?
>
> > A question about rdmacm library. I use rdma_connect/accept to wire
> > the IB connection between A and B. Somehow the IB connection is broken
> > by either process B dies, or a bad cable. If process A just receives
> > messages from process B, can process A get a
> > RDMA_CM_EVENT_DISCONNECTED event ? if yes, how fast A can get such
> > event ?
>
> If the process B dies, the kernel IB CM on B's system will
> automatically disconnect. Process A should get this fairly
> close to when process B dies.
>
> I'm not as sure about the timing for a bad cable.
>
> Slightly off topic, but how do you handle flow control
> between process A and B if process A only receives?

Yes. Internally in A, if the number of receives exceeds the low-water mark (4), an ack is sent back. I assume the ACK is not triggered at the moment. When A is trying to receive a message from B and the message never shows up, A actually sends a heartbeat back to B. However, it takes several seconds for this heartbeat to complete with an error (we configure the timeout at ~1 sec and the retry count at 7).

Several seconds to detect a connection failure is not acceptable for us, so if I use rdmacm, I want to know whether I can detect the connection failure faster than with the heartbeat message.

Again, if there is a cable issue, is a DISCONNECT event still generated eventually?

--CQ

>
> - Sean
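
For reference, here is a rough sketch of why a heartbeat with these settings takes several seconds to fail. It assumes the "~1 sec" timeout corresponds to a local ACK timeout exponent of 18 (my guess, not stated above) and uses the nominal InfiniBand formula of 4.096 us * 2^exponent per attempt. The function names are made up for illustration, and real HCAs may wait longer than the nominal value, so treat the result as an estimate only.

```python
# Sketch only: estimates how long a reliable-connection send may take to
# complete in error when the peer is unreachable. Function names are
# hypothetical; exponent 18 is an assumed mapping for "~1 sec".

def ack_timeout_seconds(exponent: int) -> float:
    """Nominal local ACK timeout: 4.096 us * 2**exponent.

    The field is 5 bits wide; 0 means "wait forever", so only 1..31
    give a finite timeout.
    """
    if not 1 <= exponent <= 31:
        raise ValueError("exponent must be in 1..31 for a finite timeout")
    return 4.096e-6 * (2 ** exponent)


def worst_case_detection_seconds(exponent: int, retry_count: int) -> float:
    """One initial attempt plus retry_count retries, each waiting a full timeout."""
    if not 0 <= retry_count <= 7:
        raise ValueError("retry_count is a 3-bit field (0..7)")
    return (retry_count + 1) * ack_timeout_seconds(exponent)


if __name__ == "__main__":
    per_try = ack_timeout_seconds(18)
    total = worst_case_detection_seconds(18, 7)
    print(f"per-attempt timeout: {per_try:.3f} s")
    print(f"estimated time to error completion: {total:.3f} s")
```

With these assumed values the estimate comes out at roughly 8 to 9 seconds, which matches the "several seconds" seen with the heartbeat. Lowering either the timeout exponent or the retry count shortens it, at the cost of more spurious failures on a congested fabric.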
