On 8 Oct 2023, at 7:26, Mike Pattrick wrote:
> Currently a bond will not always revalidate when an active member > changes. This can result in counter-intuitive behaviors like the fact > that using ovs-appctl bond/set-active-member will cause the bond to > revalidate but changing other_config:bond-primary will not trigger a > revalidate in the bond. > > When revalidation is not set but the active member changes in an > unbalanced bond, OVS may send traffic out of previously active member > instead of the new active member. > > This change will always mark unbalanced bonds for revalidation if the > active member changes. > > Reported-at: https://bugzilla.redhat.com/show_bug.cgi?id=2214979 > Signed-off-by: Mike Pattrick <m...@redhat.com> Hi Mike, Thanks for the patch. Some comments below, and the subject needs to end with a dot. //Eelco > --- > v2: Added a test > v3: Made the test more reliable > --- > ofproto/bond.c | 8 +++++-- > tests/system-traffic.at | 50 +++++++++++++++++++++++++++++++++++++++++ > 2 files changed, 56 insertions(+), 2 deletions(-) > > diff --git a/ofproto/bond.c b/ofproto/bond.c > index cfdf44f85..fb108d30a 100644 > --- a/ofproto/bond.c > +++ b/ofproto/bond.c > @@ -193,6 +193,7 @@ static void bond_update_post_recirc_rules__(struct bond > *, bool force) > static bool bond_is_falling_back_to_ab(const struct bond *); > static void bond_add_lb_output_buckets(const struct bond *); > static void bond_del_lb_output_buckets(const struct bond *); > +static bool bond_is_balanced(const struct bond *bond) OVS_REQ_RDLOCK(rwlock); > > > /* Attempts to parse 's' as the name of a bond balancing mode. If > successful, > @@ -552,11 +553,15 @@ bond_find_member_by_mac(const struct bond *bond, const > struct eth_addr mac) > > static void > bond_active_member_changed(struct bond *bond) > + OVS_REQ_WRLOCK(rwlock) > { > if (bond->active_member) { > struct eth_addr mac; > netdev_get_etheraddr(bond->active_member->netdev, &mac); > bond->active_member_mac = mac; > + if (!bond_is_balanced(bond)) { > + bond->bond_revalidate = true; > + } > } else { > bond->active_member_mac = eth_addr_zero; > } > @@ -1124,7 +1129,7 @@ bond_get_recirc_id_and_hash_basis(struct bond *bond, > uint32_t *recirc_id, > /* Rebalancing. */ > > static bool > -bond_is_balanced(const struct bond *bond) OVS_REQ_RDLOCK(rwlock) Any reason why the annotation was removed? clang would be smart enough to see that a write lock is fine for a readlock. It seems to compile fine here. Or were you trying to clean this up and move it to the next line :) > +bond_is_balanced(const struct bond *bond) > { > return bond->rebalance_interval > && (bond->balance == BM_SLB || bond->balance == BM_TCP) > @@ -1728,7 +1733,6 @@ bond_unixctl_set_active_member(struct unixctl_conn > *conn, > } > > if (bond->active_member != member) { > - bond->bond_revalidate = true; > bond->active_member = member; > VLOG_INFO("bond %s: active member is now %s", > bond->name, member->name); > diff --git a/tests/system-traffic.at b/tests/system-traffic.at > index 945037ec0..7075c35ea 100644 > --- a/tests/system-traffic.at > +++ b/tests/system-traffic.at > @@ -291,6 +291,56 @@ NS_CHECK_EXEC([at_ns0], [ping -s 3200 -q -c 3 -i 0.3 -w > 2 10.1.1.2 | FORMAT_PING > OVS_TRAFFIC_VSWITCHD_STOP > AT_CLEANUP > > +AT_SETUP([datapath - ping over active-backup bond]) This does not really represent the test case. What about ‘datapath - bond active-backup failover’ > +OVS_TRAFFIC_VSWITCHD_START([_ADD_BR([br1])]) > + > +AT_CHECK([ovs-ofctl add-flow br0 "priority=1 actions=drop"]) > +AT_CHECK([ovs-ofctl add-flow br1 "priority=1 actions=drop"]) > +AT_CHECK([ovs-ofctl add-flow br0 "ip,icmp actions=normal"]) > +AT_CHECK([ovs-ofctl add-flow br1 "ip,icmp actions=normal"]) > +AT_CHECK([ovs-ofctl add-flow br0 "arp actions=normal"]) > +AT_CHECK([ovs-ofctl add-flow br1 "arp actions=normal"]) Do we need this ‘complex’ set of rules, will the following not work? AT_CHECK([ovs-ofctl add-flow br0 "actions=normal"]) AT_CHECK([ovs-ofctl add-flow br1 "actions=normal"]) > +ADD_NAMESPACES(at_ns0, at_ns1) > + > +ADD_VETH(p0, at_ns0, br0, "10.1.1.1/24") > +ADD_VETH(p1, at_ns1, br1, "10.1.1.2/24") > +on_exit 'ip link del link0a' > +on_exit 'ip link del link0b' > +AT_CHECK([ip link add link0a type veth peer name link1a]) > +AT_CHECK([ip link add link0b type veth peer name link1b]) > + > +AT_CHECK([ip link set dev link0a up]) > +AT_CHECK([ip link set dev link1a up]) > +AT_CHECK([ip link set dev link0b up]) > +AT_CHECK([ip link set dev link1b up]) > + > +AT_CHECK([ovs-vsctl add-bond br0 bond0 link0a link0b > bond_mode=active-backup]) > +AT_CHECK([ovs-vsctl add-bond br1 bond1 link1a link1b > bond_mode=active-backup]) > + > +dnl Set primary dnl Set the primary bond member. > +AT_CHECK([ovs-vsctl set port bond0 other_config:bond-primary=link0a -- \ > + set port bond1 other_config:bond-primary=link1a]) > + > +OVS_WAIT_UNTIL([ip netns exec at_ns0 ping -c 1 10.1.1.2]) > + > +NS_CHECK_EXEC([at_ns0], [ping -q -c 6 -i 0.3 -w 6 10.1.1.2 | FORMAT_PING], > [0], [dnl > +6 packets transmitted, 6 received, 0% packet loss, time 0ms > +]) > + > +dnl Change primary dnl Change the primary bond member. > +AT_CHECK([ovs-vsctl set port bond0 other_config:bond-primary=link0b -- \ > + set port bond1 other_config:bond-primary=link1b]) > + > +sleep 0 This short sleep makes it work, however, it could result in a flaky test. Is there something else we could wait for to be 100% sure we are ready? > + > +NS_CHECK_EXEC([at_ns0], [ping -q -c 6 -i 0.5 -w 6 10.1.1.2 | FORMAT_PING], > [0], [dnl > +6 packets transmitted, 6 received, 0% packet loss, time 0ms > +]) Should we not add some code to verify traffic goes over the correct link? > + > +OVS_TRAFFIC_VSWITCHD_STOP > +AT_CLEANUP > + > AT_SETUP([datapath - ping over vxlan tunnel]) > OVS_CHECK_TUNNEL_TSO() > OVS_CHECK_VXLAN() Can you also check this test on all datapaths? It’s failing for me once out of ten on userspace; @@ -1,2 +1,2 @@ -6 packets transmitted, 6 received, 0% packet loss, time 0ms +7 packets transmitted, 6 received, 14.2857% packet loss, time 0ms sudo make check-kernel TESTSUITEFLAGS="-k 'ping over active-backup bond'" && sudo make check-offloads TESTSUITEFLAGS="-k 'ping over active-backup bond'" && sudo make check-system-userspace TESTSUITEFLAGS="-k 'ping over active-backup bond'" _______________________________________________ dev mailing list d...@openvswitch.org https://mail.openvswitch.org/mailman/listinfo/ovs-dev