Arlin Davis wrote:
> Fix some timeout and long disconnect delay issues discovered during
> scale-out testing. Added support to retry rdma_cm address and route
> resolution with configuration options and provide a disconnect call when
> receiving the disconnect request to force an immediate disconnect reply
> to the remote side.

It would be very nice if you shared with the community the IB stack issues revealed under scale-out testing. Basically, what was the testbed?

From what the patch does, I understand you are attempting to handle timeouts on address and route resolution, and a long disconnect delay.

Was the issue with address resolution that ARP request or reply messages were getting lost? Was the issue with route resolution a timeout on SA path queries? Please note that for the first two, you want to retry if the event status is -ETIMEDOUT; the patch ignores the status field.

Was the issue with the disconnect delay that peer A called dat_ep_disconnect() (i.e. sent a DREQ) and the DREP was sent only when peer B got the disconnect event and called dat_ep_disconnect() itself? So now the DREP is sent from within the provider code when it gets the DREQ?

Or.
