On Tue, Oct 02, 2018 at 10:28:52AM -0600, Daniel Leaberry via discuss wrote:
> I have Centos 7 with openvswitch 2.9.0. The server has 4 ports in an lacp
> bond (called allbond) connected to a set of mlagged arista switches. Here's
> the config
>
> ovs-vsctl list port allbond
> _uuid : 9f224f2d-8bb1-4cfd-84e2-d60c6d973a7a
> bond_active_slave : "90:e2:ba:d6:1c:44"
> bond_downdelay : 0
> bond_fake_iface : false
> bond_mode : balance-tcp
> bond_updelay : 40000
> cvlans : []
> external_ids : {}
> fake_bridge : false
> interfaces : [61b9a345-2f3d-4127-b9cd-eaca8a749574,
> 89ce3480-d62d-4291-9a84-bdf711016793, 941c9393-1021-490c-84ac-311250ba0343,
> dc49ffd3-c259-43b6-8072-2ce12c52d1b1]
> lacp : active
> mac : []
> name : allbond
> other_config : {}
> protected : false
> qos : []
> rstp_statistics : {}
> rstp_status : {}
> statistics : {}
> status : {}
> tag : []
> trunks : []
> vlan_mode : []
>
>
> ---- allbond ----
> bond_mode: balance-tcp
> bond may use recirculation: yes, Recirc-ID : 3
> bond-hash-basis: 0
> updelay: 40000 ms
> downdelay: 0 ms
> next rebalance: 3229 ms
> lacp_status: negotiated
> lacp_fallback_ab: false
> active slave mac: 90:e2:ba:d6:1c:44(eth5)
>
> slave eth3: enabled
> may_enable: true
> hash 50: 1 kB load
> hash 162: 1 kB load
> hash 170: 1 kB load
>
> slave eth4: enabled
> may_enable: true
> hash 123: 4 kB load
> hash 221: 12 kB load
>
> slave eth5: enabled
> active slave
> may_enable: true
> hash 94: 1 kB load
> hash 177: 1 kB load
> hash 245: 1 kB load
>
> slave eth6: enabled
> may_enable: true
> hash 97: 46 kB load
>
> As you can see updelay is set to 40 seconds. I go to the switch and shutdown
> the port for eth6. It's immediately pulled from the bond. I then clear the
> switch counters and wait a few minutes. I would expect when the port is "no
> shutdown" that 40 seconds will go by before openvswitch brings it back into
> the bond. But that doesn't happen.
>
> 2018-10-02T15:31:32.885Z|00349|bond|INFO|interface eth6: link state down
> 2018-10-02T15:31:32.885Z|00350|bond|INFO|interface eth6: disabled
> 2018-10-02T15:35:45.861Z|00352|bond|INFO|interface eth6: link state up
> 2018-10-02T15:35:45.861Z|00353|bond|INFO|interface eth6: enabled
> 2018-10-02T15:35:51.286Z|00354|bond|INFO|bond allbond: shift 93kB of load
> (with hash 97) from eth3 to eth6 (now carrying 6kB and 93kB load,
> respectively)
>
> Immediately after link is re-established the port (eth6) is enabled again and
> traffic as shown in the switch counters begins to flow again. It feels like
> I'm doing something wrong but I've googled for hours and can't find anything
> that explains why the bond_updelay is being ignored.

I spent some time looking through the history here. Ethan (CCed) added
LACP support to OVS in January 2011. From that point forward, OVS has
always ignored updelay and downdelay for a bond when LACP is enabled. I
don't know why, exactly. Maybe Ethan remembers.

It would be easy to enable updelay and downdelay for LACP bonds:

diff --git a/ofproto/bond.c b/ofproto/bond.c
index f87cdba7908f..8a90ba2686af 100644
--- a/ofproto/bond.c
+++ b/ofproto/bond.c
@@ -1717,8 +1717,7 @@ bond_link_status_update(struct bond_slave *slave)
             VLOG_INFO_RL(&rl, "interface %s: will not be %s",
                          slave->name, up ? "disabled" : "enabled");
         } else {
-            int delay = (bond->lacp_status != LACP_DISABLED ? 0
-                         : up ? bond->updelay : bond->downdelay);
+            int delay = up ? bond->updelay : bond->downdelay;
             slave->delay_expires = time_msec() + delay;
             if (delay) {
                 VLOG_INFO_RL(&rl, "interface %s: will be %s if it stays %s "
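
To make the effect of that one-line change concrete, here is a minimal
standalone sketch of the delay selection before and after the patch. It is
not OVS code: struct bond_model, enum lacp_status_model, and the two
function names are hypothetical stand-ins for illustration only, and the
real logic lives in bond_link_status_update() in ofproto/bond.c.

```c
#include <stdbool.h>

/* Hypothetical stand-in for the LACP state that the real code reads from
 * bond->lacp_status.  Only "disabled" versus "anything else" matters here. */
enum lacp_status_model {
    MODEL_LACP_DISABLED,
    MODEL_LACP_NEGOTIATED
};

/* Hypothetical stand-in for the few 'struct bond' fields the patch uses. */
struct bond_model {
    enum lacp_status_model lacp_status;
    int updelay;                /* Milliseconds to wait before enabling. */
    int downdelay;              /* Milliseconds to wait before disabling. */
};

/* Delay selection as in the removed lines: any bond with LACP in use gets
 * a delay of 0, so the configured updelay and downdelay are ignored. */
static int
delay_before_patch(const struct bond_model *bond, bool up)
{
    return (bond->lacp_status != MODEL_LACP_DISABLED ? 0
            : up ? bond->updelay : bond->downdelay);
}

/* Delay selection as in the added line: the configured delay applies
 * whether or not LACP is in use. */
static int
delay_after_patch(const struct bond_model *bond, bool up)
{
    return up ? bond->updelay : bond->downdelay;
}
```

With the values from the report (LACP negotiated, updelay 40000, downdelay
0) and a link coming up, delay_before_patch() returns 0, which matches the
immediate "enabled" in the log above, while delay_after_patch() returns
40000.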