Michael S. Tsirkin wrote:
> Quoting Or Gerlitz <[EMAIL PROTECTED]>:
>> the CMA API addressing is based on **IP** addresses, so when
>> a client connects it provides the IP address of the server and
>> optionally the its src IP as well.
>
> Once we get the GID that matches the IP, we can locate an extra path and arm
> it.
> Read up on how SDP does this in "A4.5.2 Automatic Path Migration": there are
> several strategies there.

Thanks for the pointer. I was somewhat aware of the method of locating the
node GUID from the GID/LID and then another GID associated with this node,
but the text provides a very good description of it. I think it does not
address how you get a second SGID, but this is trivial to implement...
Basically I agree that if you are willing to do it out-of-CMA-band, it is
possible to get an alt <SGID,DGID> pair (it was even deployed in production
in the IB stack of another OS...).

>> Again, even with the second point being somehow solved, the apm nature
>> makes it a very limited in power feature of IB.

> Protocols that rely on RC ACK for reliability guarantees (like SDP), basically
> do not make it possible to address the hca failure case: you got an ACK, but
> remote hca could have failed without committing data to memory. So APM failover
> is a requirement for these. It could be iser does not need APM, fine.

This is news to me. Does your HCA first send an ACK and only then do the DMA
transaction and, if needed, generate the CQE!? If this is not the case (thank
god), on what systems does this issue exist, where you (i.e. the HCA) issue a
DMA, get a "bus completion", generate the CQE, send an ACK, and yet the data
somehow was not committed to memory? And how come APM is the solution to this
crazy problem?

Putting this aside, my basic assumption, and this is something you need to
check with the SDP customers, is that apps coded for a reliable-connection
infrastructure (e.g. TCP, IB RC) are willing to ***reconnect*** when a failure
occurs. Moreover, this means that the infrastructure does not need to take
care of housekeeping for unACKed messages: once the reconnect succeeds, the
app retransmits the unACKed data.

For those CMA apps, handling IPoIB failure is the only HA requirement, since
the IB ARP following the failure would return a valid DGID, and from this
point on it is business as usual (the listener also must not bind its ID to a
specific port, but this is trivial to do).

Or.
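
The reconnect model argued for above (the application, not the infrastructure, keeps track of unACKed messages and resends them after a successful reconnect) can be illustrated with a small simulation. This is only a sketch in Python under stated assumptions: `Peer`, `Connection`, `Sender`, `LinkDown` and `demo` are hypothetical names invented for the illustration, the "connection" is an in-process object rather than an IB RC QP or a CMA ID, and the cumulative ACK plus sequence-number de-duplication on the receiver is one possible way to make retransmission safe, not something the thread specifies.

```python
from collections import deque


class LinkDown(Exception):
    """Raised by the simulated connection when the link has failed."""


class Peer:
    """Hypothetical receiver: accepts messages in order, drops duplicates."""

    def __init__(self):
        self.next_seq = 0
        self.data = []

    def deliver(self, seq, payload):
        if seq == self.next_seq:
            self.data.append(payload)
            self.next_seq += 1
        # Cumulative ACK: every sequence number below this value was received.
        return self.next_seq


class Connection:
    """Hypothetical stand-in for one reliable connection to the peer.

    If sends_before_failure is a number, the connection fails after that
    many successful sends; None means it never fails.
    """

    def __init__(self, peer, sends_before_failure=None):
        self.peer = peer
        self.budget = sends_before_failure

    def send(self, seq, payload):
        if self.budget is not None:
            if self.budget == 0:
                raise LinkDown("connection lost")
            self.budget -= 1
        return self.peer.deliver(seq, payload)


class Sender:
    """Application-side logic: queue unACKed data, reconnect, retransmit."""

    def __init__(self, connect):
        self.connect = connect      # callable returning a fresh Connection
        self.conn = connect()
        self.unacked = deque()      # (seq, payload) pairs not yet ACKed
        self.next_seq = 0
        self.reconnects = 0

    def send(self, payload):
        self.unacked.append((self.next_seq, payload))
        self.next_seq += 1
        self._flush()

    def _flush(self):
        while self.unacked:
            seq, payload = self.unacked[0]
            try:
                acked = self.conn.send(seq, payload)
            except LinkDown:
                # Reconnect; the loop then resends the oldest unACKed message.
                self.conn = self.connect()
                self.reconnects += 1
                continue
            while self.unacked and self.unacked[0][0] < acked:
                self.unacked.popleft()


def demo():
    peer = Peer()
    budgets = iter([2, None])  # first connection dies after 2 sends
    sender = Sender(lambda: Connection(peer, next(budgets)))
    for msg in ["a", "b", "c", "d"]:
        sender.send(msg)
    return peer.data, sender.reconnects
```

In `demo()`, the first connection fails when the third message is sent; the sender reconnects once and resends that message, so the peer still ends up with all four messages in order.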

_______________________________________________
openib-general mailing list
[email protected]
http://openib.org/mailman/listinfo/openib-general

To unsubscribe, please visit http://openib.org/mailman/listinfo/openib-general
