In order to properly balance bond traffic, ofproto/bond periodically reads usage statistics of the post-recirculation rules (which are added to a hidden internal table).
To do that, each "struct bond_entry" (which represents a hash within a bond) stores the last seen statistics for its rule. When a hash is moved to another member (due to a bond rebalance or the previous member going down), the rule is typically just modified, i.e: same match different actions. In this case, statistics are preserved and accounting continues to work. However, if the rule gets completely deleted (e.g: when all bond members go down) and then re-created, the new rule will have 0 tx_bytes but its associated entry will still store a non-zero last-seen value. This situation leads to an overflow of the delta calculation (computed as [current_stats_value - last_seen_value]), which can affect traffic as the hash will be considered to carry a lot of traffic and rebalancing will kick in. In order to fix this situation, reset the value of last seen statistics on rule deletion. Implementation notes: Modifying pr_tx_bytes requires write-locking the global rwlock but a lockless version of update_recirc_rules was being maintained to avoid locking on bon_unref(). Considering the small impact of locking during bond removal, removing the lockless version and relying on clang's thread safety analysis is preferred. Also, folding Ilya's [1], i.e: fixing thread safety annotation in update_recirc_rules() to require holding write-lock. [1] https://patchwork.ozlabs.org/project/openvswitch/patch/[email protected]/ Reported-at: https://github.com/openvswitch/ovs-issues/issues/319 Co-authored-by: Ilya Maximets <[email protected]> Signed-off-by: Ilya Maximets <[email protected]> Signed-off-by: Adrian Moreno <[email protected]> --- ofproto/bond.c | 33 +++++++++++++++------------------ tests/ofproto-dpif.at | 16 ++++++++++++++++ 2 files changed, 31 insertions(+), 18 deletions(-) diff --git a/ofproto/bond.c b/ofproto/bond.c index cfdf44f85..84f59335f 100644 --- a/ofproto/bond.c +++ b/ofproto/bond.c @@ -186,7 +186,7 @@ static struct bond_member *choose_output_member(const struct bond *, struct flow_wildcards *, uint16_t vlan) OVS_REQ_RDLOCK(rwlock); -static void update_recirc_rules__(struct bond *); +static void update_recirc_rules(struct bond *) OVS_REQ_WRLOCK(rwlock); static bool bond_may_recirc(const struct bond *); static void bond_update_post_recirc_rules__(struct bond *, bool force) OVS_REQ_WRLOCK(rwlock); @@ -299,7 +299,10 @@ bond_unref(struct bond *bond) } free(bond->hash); bond->hash = NULL; - update_recirc_rules__(bond); + + ovs_rwlock_wrlock(&rwlock); + update_recirc_rules(bond); + ovs_rwlock_unlock(&rwlock); hmap_destroy(&bond->pr_rule_ops); free(bond->primary); @@ -331,17 +334,8 @@ add_pr_rule(struct bond *bond, const struct match *match, hmap_insert(&bond->pr_rule_ops, &pr_op->hmap_node, hash); } -/* This function should almost never be called directly. - * 'update_recirc_rules()' should be called instead. Since - * this function modifies 'bond->pr_rule_ops', it is only - * safe when 'rwlock' is held. - * - * However, when the 'bond' is the only reference in the system, - * calling this function avoid acquiring lock only to satisfy - * lock annotation. Currently, only 'bond_unref()' calls - * this function directly. */ static void -update_recirc_rules__(struct bond *bond) +update_recirc_rules(struct bond *bond) OVS_REQ_WRLOCK(rwlock) { struct match match; struct bond_pr_rule_op *pr_op; @@ -407,6 +401,15 @@ update_recirc_rules__(struct bond *bond) VLOG_ERR("failed to remove post recirculation flow %s", err_s); free(err_s); + } else if (bond->hash) { + /* If the flow deletion failed, a subsecuent call to + * ofproto_dpif_add_internal_flow() would just modify the flow + * preserving its statistics. Therefore, only reset the entry's + * byte counter if it succeeds. */ + uint32_t hash = pr_op->match.flow.dp_hash & BOND_MASK; + struct bond_entry *entry = &bond->hash[hash]; + + entry->pr_tx_bytes = 0; } hmap_remove(&bond->pr_rule_ops, &pr_op->hmap_node); @@ -421,12 +424,6 @@ update_recirc_rules__(struct bond *bond) ofpbuf_uninit(&ofpacts); } -static void -update_recirc_rules(struct bond *bond) - OVS_REQ_RDLOCK(rwlock) -{ - update_recirc_rules__(bond); -} /* Updates 'bond''s overall configuration to 's'. * diff --git a/tests/ofproto-dpif.at b/tests/ofproto-dpif.at index e305e7b9c..71737db8c 100644 --- a/tests/ofproto-dpif.at +++ b/tests/ofproto-dpif.at @@ -547,6 +547,22 @@ ovs-appctl time/warp 1000 100 ovs-appctl bond/show > bond3.txt AT_CHECK([sed -n '/member p2/,/^$/p' bond3.txt | grep 'hash'], [0], [ignore]) +AT_CHECK([ovs-appctl netdev-dummy/set-admin-state p1 down], 0, [OK +]) +AT_CHECK([ovs-appctl netdev-dummy/set-admin-state p2 down], 0, [OK +]) +ovs-appctl time/warp 1000 100 +AT_CHECK([ovs-appctl netdev-dummy/set-admin-state p1 up], 0, [OK +]) +AT_CHECK([ovs-appctl netdev-dummy/set-admin-state p2 up], 0, [OK +]) +ovs-appctl time/warp 1000 100 + +AT_CHECK([SEND_TCP_BOND_PKTS([p5], [5], [65500])]) +# We sent 49125 KB of data total in 3 batches. No hash should have more +# than that amount of load. Just checking that it is within 5 digits. +AT_CHECK([ovs-appctl bond/show | grep -E '[[0-9]]{6}'], [1]) + OVS_VSWITCHD_STOP() AT_CLEANUP -- 2.43.2 _______________________________________________ dev mailing list [email protected] https://mail.openvswitch.org/mailman/listinfo/ovs-dev
