On Fri, Dec 02, 2016 at 03:31:24PM +0100, Simon Horman wrote:
> On Thu, Dec 01, 2016 at 02:01:59PM +0100, Torgny Lindberg wrote:
> > A reboot of one switch in an MC-LAG bond makes all bond links
> > to go down, causing a total connectivity loss for 3 seconds.
> >
> > Packet capture shows that spurious LACP PDUs are sent to OVS with
> > a different MAC address (partner system id) during the final
> > stages of the MC-LAG switch reboot.
> >
> > The current code selects a lead interface based on information
> > in the LACP PDU, regardless of its synchronization state. If a
> > non-synchronized interface is selected as the OVS lead interface
> > then all other interfaces are forced down as their stored partner
> > system id differs and the bond ends up with no working interface.
> > The bond recovers within three seconds after the last spurious
> > message.
> >
> > To avoid the problem, this commit requires a lead interface
> > to be synchronized. In case no synchronized interface exists,
> > the selection of lead interface is done as in the current code.
> >
> > Signed-off-by: Torgny Lindberg <[email protected]>
>
> This patch looks good to me and it appears to me that it is applicable all
> the way back to branch-2.0. I am however wondering what the views of others
> are to applying it to branches all the way back to there as it is arguably
> a behavioural change. In particular I wonder if there any chance people
> could be relying on this behaviour for some reason?

Usually, we don't apply patches beyond the last long-term stable release,
unless it's a severe bug. Sometimes I go a bit farther and for me, lately,
that means back to branch-2.3. It's hard to ever get any testing for any
releases older than that, anyway.

I don't understand LACP well enough to judge this change, but if you
understand it and believe that it is correct then I'd suggest applying it
back to branch-2.3.
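
For anyone following the thread, the selection rule described in the quoted
commit message can be modelled roughly as below. This is only an illustrative
Python sketch, not the OVS C code: the Member class, its field names, and the
(priority, system id) ordering key are assumptions made for the example. The
commit message says only that the lead is chosen from information in the LACP
PDU.

```python
from dataclasses import dataclass
from typing import List, Optional, Sequence


@dataclass(frozen=True)
class Member:
    """Hypothetical model of one bond member (not an OVS structure)."""
    name: str
    partner_sys_id: str    # partner system id (MAC) from the last LACP PDU
    partner_priority: int  # assumed ordering key; lower is preferred here
    synchronized: bool     # whether the partner reported itself in sync


def select_lead(members: Sequence[Member]) -> Optional[Member]:
    """Prefer synchronized members; fall back to all members if none are."""
    if not members:
        return None
    synced = [m for m in members if m.synchronized]
    # If no member is synchronized, select exactly as before the change.
    candidates = synced or list(members)
    return min(candidates, key=lambda m: (m.partner_priority, m.partner_sys_id))


def attached(members: Sequence[Member]) -> List[str]:
    """Members whose stored partner system id matches the lead's stay up."""
    lead = select_lead(members)
    if lead is None:
        return []
    return [m.name for m in members if m.partner_sys_id == lead.partner_sys_id]


# Two healthy links plus one link carrying a spurious, unsynchronized PDU
# that would win on the ordering key alone.
bond = [
    Member("eth0", "aa:aa:aa:aa:aa:01", 100, True),
    Member("eth1", "aa:aa:aa:aa:aa:01", 100, True),
    Member("eth2", "00:00:00:00:00:99", 1, False),
]
lead = select_lead(bond)
up = attached(bond)
```

In this model eth2 would be chosen as lead without the synchronization
filter, and eth0 and eth1 would then be forced down because their partner
system id differs, which is the failure the patch describes.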
