Based on my (long-ago) reading of the LACP spec, supporting only a single aggregator is a valid configuration. Furthermore, it's what makes the most sense given the structure of the OVS bonding configuration. I'd really rather not make a non-standard change to the protocol to support a buggy upstream MLAG implementation, because I don't know how it could affect other, less buggy switches. My preference is to shelve this for now, FWIW.
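For reference, this is the configuration model in question: an OVS bond maps one-to-one onto a LAG, so each bond has exactly one implicit aggregator. A minimal LACP bond looks like this (the bridge, bond, and interface names are illustrative):

    ovs-vsctl add-bond br0 bond0 eth0 eth1 lacp=active

There is no per-bond knob for creating or selecting among multiple aggregators, which is why supporting several of them would not fall out of the existing configuration structure.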
Ethan

On Tue, Aug 5, 2014 at 2:16 PM, Flavio Leitner <f...@redhat.com> wrote:
> On Mon, Aug 04, 2014 at 12:08:48PM -0700, Andy Zhou wrote:
>> Zoltan,
>>
>> Sorry it took a while to get back to you. I am just coming up to speed
>> on the OVS LACP implementation, so my understanding may not be correct.
>> Please feel free to correct me if I am wrong.
>>
>> According to the Wikipedia MC-LAG entry, there is no standard for it;
>> implementations are mostly designed by individual vendors.
>>
>> After reading through the commit message and comparing it with the
>> 802.1AX spec, this looks like a bug or configuration issue in the
>> MC-LAG implementation. When the partner on port A comes back again,
>> should it wait for the MC-LAG sync before using the default profile to
>> exchange states with OVS?
>
> I agree that it sounds like a problem in the MC-LAG. However, I also
> agree that OVS could do better.
>
> The aggregation selection policy is somewhat of a gray area not defined
> in any spec. The bonding driver offers the ad_select= parameter, which
> allows switching to a new aggregator only if, for instance, all the
> ports in the active aggregator are down.
>
> The Team driver implementing 802.3ad also provides a policy selection
> parameter. The default is to consider the priority in the LACPDU, but
> you can also tell it not to select another aggregator while the current
> one is still usable, or to select by bandwidth or by the number of
> available ports.
>
> My suggestion, if we want to change something, is to stick with the
> bonding driver's default behavior for selecting a new aggregator:
> """
> stable or 0
>
>     The active aggregator is chosen by largest aggregate bandwidth.
>
>     Reselection of the active aggregator occurs only when all slaves
>     of the active aggregator are down or the active aggregator has no
>     slaves.
>
>     This is the default value.
> """
> Documentation/networking/bonding.txt
>
> That would avoid problems with transient states like the reported one.
>
> fbl
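To make those policies concrete: the bonding driver selects among them with the ad_select module parameter (stable, bandwidth, or count, with stable being the default), for example:

    modprobe bonding mode=802.3ad ad_select=stable

and, if I remember the teamd configuration correctly, the Team driver's equivalent is runner.agg_select_policy in its JSON config, where lacp_prio_stable keeps the current aggregator for as long as it remains usable (the device name here is illustrative):

    {
        "device": "team0",
        "runner": { "name": "lacp", "agg_select_policy": "lacp_prio_stable" }
    }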
>> On Mon, Jul 14, 2014 at 3:11 PM, Ben Pfaff <b...@nicira.com> wrote:
>> > On Tue, Jul 08, 2014 at 05:35:57PM +0100, Zoltan Kiss wrote:
>> >> This patch modifies the LACP selection logic by preferring slaves
>> >> with up-and-running partners when looking for a lead.
>> >>
>> >> That fixes the following scenario:
>> >> - the bond has 2 ports, A and B; their other ends are in separate
>> >>   chassis with MC-LAG sync
>> >> - the partner of port A is restarted
>> >> - port B is still working
>> >> - the partner on port A comes back, but it is temporarily using a
>> >>   default config, as MC-LAG hasn't synced yet
>> >> - apparently that default config has a sys_priority smaller than
>> >>   that of the other, still-running port, plus a completely
>> >>   different sys_id
>> >> - therefore OVS chooses port A even though it will never come up
>> >>   into collecting-distributing state
>> >> - and port B is disabled, causing the whole bond to go down
>> >>
>> >> Checking through the 802.1AX standard, when port A comes up again
>> >> the two links fall apart due to the different LAG IDs. They should
>> >> be attached to different Aggregators, and the Aggregators should
>> >> live separately. In OVS there is no such concept as an Aggregator,
>> >> but it could be said that OVS has only one Aggregator, with a
>> >> unique policy to choose which ports can join.
>> >> Although changing the chassis' default config can also fix this,
>> >> detecting such problems is quite hard, therefore I think it is
>> >> still valid to improve things on the OVS side.
>> >> Btw. the Linux kernel bonding driver's LACP implementation allows
>> >> multiple aggregators, and therefore it can handle this situation
>> >> properly.
>> >>
>> >> Signed-off-by: Zoltan Kiss <zoltan.k...@citrix.com>
>> >
>> > I verified that the unit tests still pass with this applied.
>> >
>> > Andy Zhou said he'd review the patch.
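To make the proposed selection change concrete, here is a minimal C sketch; this is not the actual lacp.c code, and the structs and helpers are invented for illustration. The point is the ordering: a slave whose partner is up and exchanging LACPDUs beats one whose partner is absent or running a transient default config, and the 802.1AX system priority/ID comparison only breaks ties after that:

    #include <stdbool.h>
    #include <stdint.h>
    #include <string.h>

    struct partner {
        bool up;                /* Fresh LACPDUs received from this partner? */
        uint16_t sys_priority;  /* Lower value wins, per 802.1AX. */
        uint8_t sys_id[6];
    };

    struct slave {
        struct partner partner;
        struct slave *next;
    };

    /* Returns true if 'a' is a better lead candidate than 'b'. */
    static bool
    better_lead(const struct slave *a, const struct slave *b)
    {
        /* Prefer a slave with a live partner, so a not-yet-synced MC-LAG
         * peer advertising a low default sys_priority cannot steal the
         * lead from a working link. */
        if (a->partner.up != b->partner.up) {
            return a->partner.up;
        }
        if (a->partner.sys_priority != b->partner.sys_priority) {
            return a->partner.sys_priority < b->partner.sys_priority;
        }
        return memcmp(a->partner.sys_id, b->partner.sys_id, 6) < 0;
    }

    static const struct slave *
    choose_lead(const struct slave *slaves)
    {
        const struct slave *lead = NULL;

        for (const struct slave *s = slaves; s; s = s->next) {
            if (!lead || better_lead(s, lead)) {
                lead = s;
            }
        }
        return lead;
    }

In the reported scenario this would keep port B, whose partner is live, as the lead while port A's partner is still running its pre-sync default profile, so the bond stays up.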