Or Gerlitz wrote:
> My guess is that this is related to the CM, not the SM.
>
> I think there is a chance that the CM on node B does not treat the REQ
> sent by A after the reboot as a "stale connection" situation and hence
> just **silently** drops it, that is, no REJ is sent.

I agree. This sounds like an issue where the CM is treating the REQ as an old REQ for the established connection, versus a REQ for a new connection. The desired behavior in this situation would be to reject the new request, and force the remote side to disconnect.

You can try initializing next_id in cm_alloc_id() (cm.c) to a random value and see if that helps. I will also try to reproduce the problem here.

- Sean
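
To illustrate why a random starting value for next_id might help, here is a rough, self-contained sketch. It is not the code in cm.c. The names id_alloc, peer_table, peer_handle_req and reconnect_after_reboot are hypothetical, and it assumes the receiving side identifies a REQ only by the sender's communication ID, which is a simplification of what the real CM looks at. Under that assumption, a counter that restarts from the same value after a reboot produces an ID that node B still has on file, so the new REQ looks like a repeat of the old one.

```c
#include <assert.h>
#include <stdint.h>

#define MAX_CONNS 16

/* Hypothetical stand-in for the ID allocator state on node A. */
struct id_alloc {
	uint32_t next_id;
};

static void id_alloc_init(struct id_alloc *a, uint32_t start)
{
	a->next_id = start;
}

static uint32_t id_alloc_get(struct id_alloc *a)
{
	return a->next_id++;
}

/* Hypothetical table on node B: remote IDs of connections it knows about. */
struct peer_table {
	uint32_t remote_id[MAX_CONNS];
	int count;
};

enum req_result { REQ_NEW, REQ_DUPLICATE };

/* Assumption: B matches an incoming REQ on the sender's ID alone. */
static enum req_result peer_handle_req(struct peer_table *t, uint32_t remote_id)
{
	for (int i = 0; i < t->count; i++)
		if (t->remote_id[i] == remote_id)
			return REQ_DUPLICATE;	/* looks like an old REQ */

	assert(t->count < MAX_CONNS);
	t->remote_id[t->count++] = remote_id;
	return REQ_NEW;
}

/* A connects to B, A reboots (allocator is reinitialized), A connects again. */
static enum req_result reconnect_after_reboot(uint32_t first_start,
					      uint32_t second_start)
{
	struct peer_table b = { .count = 0 };
	struct id_alloc a;

	id_alloc_init(&a, first_start);
	(void)peer_handle_req(&b, id_alloc_get(&a));

	id_alloc_init(&a, second_start);	/* state lost across the reboot */
	return peer_handle_req(&b, id_alloc_get(&a));
}
```

With the same start value on both boots, reconnect_after_reboot(0, 0) reports REQ_DUPLICATE. With two different start values, as a random seed would usually give, the second REQ is seen as new. A random start only makes a collision unlikely, so it is a way to test the theory rather than a complete fix.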
