2014-10-24 13:23 GMT+02:00 Sagi Grimberg <[email protected]>:
> > Thanks Roland to clarify our confusion.
> > So looks ping-pong mechanism is the way to go.
> >
>
> Not sure if it will work for your solution, but you can also register to SM
> traps.

Hi Sagi,

Could you elaborate a bit more on how to register for SM traps in a kernel module?

> > Regards,
> > Jack
> >
> > 2014-10-23 20:43 GMT+02:00 Roland Dreier <[email protected]>:
> > > On Thu, Oct 23, 2014 at 6:50 AM, Jack Wang <[email protected]> wrote:
> > > > I expected that RDMA-Write operations will fail if the other crashes.
> > > > Also I hoped that an event is generated when a host is crashed. The subnet
> > > > manager should notice it and notify every other device in the network.
> > > >
> > > > Are we missing something in our modules?
> > > > Is there a way to determine that a RC peer crashed without implementing a
> > > > ping-pong mechanism?
> > >
> > > If the remote system crashes then any memory regions, QPs, etc. are
> > > still valid with the remote HCA, and RDMA read/write operations will
> > > continue to succeed. (Unless the system reboots and reinitializes the
> > > adapter or something like that).
> > >
> > > There isn't a way to detect a remote crash unless that remote crash
> > > disconnects your QP or otherwise affects the HCA on the crashed
> > > system.
> >
> --
> To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
> the body of a message to [email protected]
> More majordomo info at http://vger.kernel.org/majordomo-info.html
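
For reference, the ping-pong approach mentioned in this thread could look roughly like the sketch below. It is not kernel code and does not use any verbs API. It only shows the timeout bookkeeping, and the names `PeerLiveness`, `on_pong` and `is_alive` are made up for illustration. The assumption is that the pong is produced by software on the remote host (for example a reply to a message that the remote CPU has to process). As Roland notes, plain RDMA read/write keeps succeeding as long as the remote HCA is up, so those operations say nothing about whether the remote host is still running.

```python
import time
from typing import Callable


class PeerLiveness:
    """Tracks whether a peer has answered a ping recently enough.

    Hypothetical helper: the caller sends pings on its own schedule and
    calls on_pong() whenever a reply generated by remote software arrives.
    """

    def __init__(self, timeout_s: float,
                 clock: Callable[[], float] = time.monotonic):
        if timeout_s <= 0:
            raise ValueError("timeout_s must be positive")
        self.timeout_s = timeout_s
        self._clock = clock
        # Treat connection setup as the first sign of life.
        self._last_pong = clock()

    def on_pong(self) -> None:
        """Record that the peer just answered a ping."""
        self._last_pong = self._clock()

    def is_alive(self) -> bool:
        """Return False once no pong has been seen for longer than timeout_s."""
        return (self._clock() - self._last_pong) <= self.timeout_s
```

The clock is passed in so the timeout logic can be checked without sleeping. In a real module the same idea would be driven by a timer or delayed work item, and the timeout would typically cover several ping intervals so that one delayed reply does not tear down the connection.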
