On Thu, 17 Jan 2008, Or Gerlitz wrote:
> move a little up the code that checks for a situation where the remote
> GID stored in the ipoib_neigh is different than the one present in the
> neighbour (handle Gratuitous ARP) or that a bonding fail over has
> happened but the neighbour still has a pointer to an ipoib_neigh created
> not by the current slave. This will cause the driver to apply the check
> also for connected mode neighbours.

OK, Roland, I am now confident that this patch is needed; see the
reasoning below. Please apply to 2.6.25; later I will also send it to
-stable. Here goes:

Basically, ipoib-cm is not totally broken wrt bonding AND connected mode
--without-- this patch being applied, but OTOH it does not function as it
should. My setup has a client node xmitting UDP unicast to a server node,
where the server node is bonded (ib0 and ib1 are enslaved by bond0). I
tried three types of fail-over, each of which causes the bonding at the
server node to send a gratuitous ARP; without this patch, no action is
taken by ipoib at the client side:

A) using "primary slave up" (*)
B) taking an interface down
C) taking a port down

In the "primary slave up" fail-over case, since the non-active slave
interface is up and running, the traffic keeps going through it. So at
the client side the neighbour points to GID X forever, while the traffic
goes to (the QP associated with) GID Y.
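
To make the stale-neighbour condition concrete, here is a minimal
userspace sketch of the kind of comparison the quoted patch description
refers to. It is not the driver code: struct cached_neigh, dev_id and
cached_neigh_is_stale() are hypothetical names used only for
illustration. The one layout assumption is that the 20-byte IPoIB
hardware address carries 4 bytes of flags and QPN followed by the
16-byte GID.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

#define GID_LEN        16  /* an InfiniBand GID is 128 bits */
#define QPN_PREFIX_LEN  4  /* flags + QPN precede the GID in the address */
#define IPOIB_ALEN     20  /* full IPoIB hardware address length */

/* Hypothetical stand-in for the state the driver caches per neighbour. */
struct cached_neigh {
    uint8_t dgid[GID_LEN]; /* remote GID recorded when the entry was created */
    int     dev_id;        /* slave device that created the entry */
};

/*
 * Returns true when the cached entry should be dropped and rebuilt:
 * either the neighbour's hardware address now carries a different GID
 * (e.g. after a gratuitous ARP), or the entry was created by a device
 * other than the one transmitting now (bonding fail-over).
 */
static bool cached_neigh_is_stale(const struct cached_neigh *cached,
                                  const uint8_t ha[IPOIB_ALEN],
                                  int xmit_dev_id)
{
    assert(cached != NULL && ha != NULL);
    return memcmp(cached->dgid, ha + QPN_PREFIX_LEN, GID_LEN) != 0 ||
           cached->dev_id != xmit_dev_id;
}
```

In this model, running the check on every xmit regardless of transport
mode is what lets a connected mode sender notice that GID X and GID Y
have diverged, even when no send error is ever reported.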

In the interface down fail-over case, the going-down code closes the RX
QP. Since connected mode (cm) is implemented over RC (...), this causes a
send completion with an IB_WC_RETRY_EXC_ERR error to be generated by the
HCA; ipoib_cm_handle_tx_wc calls ipoib_neigh_free, and when the next xmit
is called from the stack, ipoib creates a new ipoib_neigh, this time
against the correct GID.

In the port going down case, again the RC implementation causes the retry
exceeded error to take place, and from here it's the same as in the
previous case.
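
The difference between the error-driven cases and the "primary slave up"
case can be shown with a toy userspace model. This is only a sketch of
the behaviour described above, not the driver: struct neigh_model, enum
wc_status, handle_tx_completion() and xmit_target() are hypothetical
names, and the model assumes that any failed send completion frees the
cached entry.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

#define GID_LEN 16

/* Hypothetical reduction of the completion statuses to the two we need. */
enum wc_status { WC_SUCCESS, WC_RETRY_EXC_ERR };

struct neigh_model {
    uint8_t current_gid[GID_LEN]; /* GID in the neighbour after the latest ARP */
    bool    has_cached;           /* is there a cached per-neighbour entry? */
    uint8_t cached_gid[GID_LEN];  /* GID the cached entry was created against */
};

/* A failed send completion drops the cached entry; success leaves it alone. */
static void handle_tx_completion(struct neigh_model *n, enum wc_status status)
{
    assert(n != NULL);
    if (status != WC_SUCCESS)
        n->has_cached = false;
}

/* Returns the GID the next packet is sent to, creating the entry if needed. */
static const uint8_t *xmit_target(struct neigh_model *n)
{
    assert(n != NULL);
    if (!n->has_cached) {
        memcpy(n->cached_gid, n->current_gid, GID_LEN);
        n->has_cached = true;
    }
    return n->cached_gid;
}
```

With this model, changing current_gid alone (a gratuitous ARP with no
send error) leaves xmit_target() returning the old GID indefinitely,
which matches case A; only a WC_RETRY_EXC_ERR completion makes the next
xmit pick up the new GID, which matches cases B and C.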

Other than all the above, gratuitous ARP is used in other HA schemes,
such as a floating IP address between I/O targets. Since connected mode
ignores it, this scheme will not work without the patch.

Or

(*) The bonding HA mode enables you to select a primary slave which, once
up, is moved to be the active slave. So to cause this fail-over, I take
the primary (e.g. ib0) down, and fail-over happens to the second slave
(e.g. ib1); then I take the primary up and a second fail-over happens.
