On Thu, 17 Jan 2008, Or Gerlitz wrote:
> move a little up the code that checks for a situation where the remote GID
> stored in the ipoib_neigh is different than the one present in the neighbour
> (handle Gratuitous ARP) or that a bonding fail over has happened but the
> neighbour still has a pointer to an ipoib_neigh created not by the current
> slave. This will cause the driver to apply the check also for connected mode
> neighbours.
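
For reference, the check in question boils down to roughly the following. This is a user-space sketch, not the driver code: the struct and function names are made up for illustration, and it assumes the usual IPoIB layout of a 20-byte hardware address with the GID in the last 16 bytes.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define SKETCH_HA_LEN     20  /* IPoIB hardware address: 4 bytes flags/QPN + 16 bytes GID */
#define SKETCH_GID_LEN    16
#define SKETCH_GID_OFFSET  4  /* assumed offset of the GID inside the hardware address */

/* Hypothetical stand-in for the stack's neighbour entry. */
struct sketch_neighbour {
    uint8_t ha[SKETCH_HA_LEN];       /* updated when a gratuitous ARP arrives */
};

/* Hypothetical stand-in for the driver's per-neighbour state. */
struct sketch_ipoib_neigh {
    uint8_t dgid[SKETCH_GID_LEN];    /* remote GID cached when the entry was created */
    const void *dev;                 /* device (slave) that created this entry */
};

/*
 * Returns 1 when the cached entry no longer matches the neighbour and
 * should be freed and rebuilt, 0 when it can still be used:
 *  - the entry was created by a slave other than the current one
 *    (a bonding fail-over happened), or
 *  - the cached GID differs from the GID now in the neighbour's
 *    hardware address (a gratuitous ARP was received).
 */
int sketch_neigh_is_stale(const struct sketch_ipoib_neigh *in,
                          const struct sketch_neighbour *n,
                          const void *cur_dev)
{
    assert(in != NULL && n != NULL);

    if (in->dev != cur_dev)
        return 1;

    return memcmp(in->dgid, n->ha + SKETCH_GID_OFFSET, SKETCH_GID_LEN) != 0;
}
```

The point of the patch is only where this test runs: it has to happen before the connected mode branch in the xmit path, so that cm neighbours are checked as well.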

OK, Roland, I am now confident that this patch is needed; see the reasoning below. Please apply to 2.6.25, later I will send it also to -stable. Here goes:

Basically, ipoib-cm is not totally broken wrt bonding AND connected mode --without-- this patch being applied, but OTOH it does not function as it should.

My setup has a client node xmitting udp unicast to a server node, where the server node is bonded (ib0 and ib1 are enslaved by bond0). I tried three types of fail-over, each of which causes the bonding at the server node to send a gratuitous ARP; without this patch, ipoib at the client side takes no action on it:

A) using "primary slave up" (*)
B) taking an interface down
C) taking a port down

In the "primary slave up" fail-over case, since the non-active slave interface is up and running, the traffic keeps going through it, so forever at the client side there's a neighbour pointing to GID X while the traffic goes to (the QP associated with) GID Y.

In the interface down fail-over case, the going down code closes the RX QP. Since the connected mode (cm) is implemented over RC (...), this causes a send completion with IB_WC_RETRY_EXC_ERR error to be generated by the HCA. ipoib_cm_handle_tx_wc calls ipoib_neigh_free, and when the next xmit is called from the stack, ipoib creates a new ipoib_neigh, this time against the correct GID.

In the port going down case, again the RC implementation causes the retry exceeded error to take place, and from here it's the same as in the previous case.

Other than all the above, gratuitous ARP is used in other HA schemes, such as a floating IP address between I/O targets. Since the connected mode ignores it, such a scheme will not work without the patch.

Or

(*) the bonding HA mode enables you to select a primary slave which, once up, is moved to be the active slave.
So to cause this fail-over, I take the primary (e.g. ib0) down, and a fail-over happens to the second slave (e.g. ib1); then I bring the primary back up and a second fail-over happens.
