> > > If a CM REQ actually gets to the remote side, it could be rejected
> > > as an invalid GID, and the client could retry the request.
> >
> > Retrying the request is practically calling
> >
> > 1. rdma_resolve_addr
> > 2. rdma_resolve_route
> > 3. rdma_connect
>
> Yep - the entire setup is broken. If the wrong remote GID was resolved,
> then the wrong local GID _may_ have been selected. There's no easy
> guaranteed recovery here.

On second thought, I don't think this is true. The SGID is selected
based on the IP address, not the DGID. If the remote CM rejects the
connection, the RDMA CM may be able to recover by restarting at step 2:
query for a new path record (PR), then re-issue the connect request. It
may be possible to do this without the application's involvement. I'm
not sure how librdmacm would handle this, since the initial
resolve_route would have to be redone.
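
To make the proposed recovery concrete, here is a rough sketch of the
control flow only. It does not use the librdmacm API: struct cm_ops, the
conn_result values, and the simulated fabric (struct sim, sim_ops) are
all hypothetical names made up for illustration, and whether the real
RDMA CM can redo step 2 on an existing id is exactly the open question
above. The point is just that step 1 runs once, while steps 2 and 3 are
repeated when the REQ is rejected for an invalid GID.

```c
#include <stddef.h>

/* Outcome of one connect attempt (hypothetical, not librdmacm types). */
enum conn_result { CONN_OK, CONN_REJ_INVALID_GID, CONN_FAILED };

/* Hypothetical wrappers around the three setup steps. */
struct cm_ops {
    int (*resolve_addr)(void *ctx);          /* step 1: IP -> device and SGID */
    int (*resolve_route)(void *ctx);         /* step 2: query a path record   */
    enum conn_result (*connect)(void *ctx);  /* step 3: send the CM REQ       */
};

/*
 * Resolve the address once, then loop over steps 2 and 3. An invalid-GID
 * reject only sends us back to step 2, because the SGID was chosen from
 * the IP address and is assumed to still be valid.
 */
static enum conn_result connect_with_route_retry(const struct cm_ops *ops,
                                                 void *ctx,
                                                 int max_route_retries)
{
    if (ops->resolve_addr(ctx) != 0)
        return CONN_FAILED;

    for (int attempt = 0; attempt <= max_route_retries; attempt++) {
        if (ops->resolve_route(ctx) != 0)
            return CONN_FAILED;

        enum conn_result res = ops->connect(ctx);
        if (res != CONN_REJ_INVALID_GID)
            return res;   /* connected, or failed for another reason */
        /* Stale DGID: fall through and query a fresh path record. */
    }
    return CONN_REJ_INVALID_GID;  /* retries exhausted */
}

/* Simulated fabric: rejects the first 'rejects_left' REQs, counts calls. */
struct sim {
    int rejects_left;
    int addr_calls, route_calls, connect_calls;
};

static int sim_resolve_addr(void *ctx)
{
    ((struct sim *)ctx)->addr_calls++;
    return 0;
}

static int sim_resolve_route(void *ctx)
{
    ((struct sim *)ctx)->route_calls++;
    return 0;
}

static enum conn_result sim_connect(void *ctx)
{
    struct sim *s = ctx;

    s->connect_calls++;
    if (s->rejects_left > 0) {
        s->rejects_left--;
        return CONN_REJ_INVALID_GID;
    }
    return CONN_OK;
}

static const struct cm_ops sim_ops = {
    .resolve_addr  = sim_resolve_addr,
    .resolve_route = sim_resolve_route,
    .connect       = sim_connect,
};
```

With a simulated peer that rejects the first REQ, calling
connect_with_route_retry(&sim_ops, &s, 2) ends up invoking resolve_addr
once and resolve_route and connect twice each. The retry bound is there
so that a GID that stays invalid is eventually reported to the caller
rather than retried forever.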
