On 2/13/24 09:28, Adrian Moreno wrote: > > > On 2/12/24 18:49, Ilya Maximets wrote: >> On 2/12/24 09:43, Adrian Moreno wrote: >>> >>> >>> On 2/9/24 17:15, Ilya Maximets wrote: >>>> On 2/9/24 08:06, Adrian Moreno wrote: >>>>> In order to properly balance bond traffic, ofproto/bond periodically >>>>> reads usage statistics of the post-recirculation rules (which are added >>>>> to a hidden internal table). >>>>> >>>>> To do that, each "struct bond_entry" (which represents a hash within a >>>>> bond) stores the last seen statistics for its rule. When a hash is moved >>>>> to another member (due to a bond rebalance or the previous member going >>>>> down), the rule is typically just modified, i.e: same match different >>>>> actions. In this case, statistics are preserved and accounting continues >>>>> to work. >>>>> >>>>> However, if the rule gets completely deleted (e.g: when all bond members >>>>> go down) and then re-created, the new rule will have 0 tx_bytes but its >>>>> associated entry will still store a non-zero last-seen value. >>>>> This situation leads to an overflow of the delta calculation (computed >>>>> as [current_stats_value - last_seen_value]), which can affect traffic >>>>> as the hash will be considered to carry a lot of traffic and rebalancing >>>>> will kick in. >>>>> >>>>> In order to fix this situation, reset the value of last seen statistics >>>>> on rule deletion. >>>>> >>>>> Implementation note: >>>>> The field ((struct bond_entry *) -> pr_tx_bytes) is guarded by rwlock >>>>> but, as the comment above the function indicates, it's only called >>>>> without having locked it when the bond is being deleted. Given that >>>>> bond->hash is free()ed before deleting the rules (which makes writing to >>>>> the entries a use-after-free anyway), we can use that to avoid >>>>> double-locking. >>>> >>>> clang is not happy about this. All the clang jobs failed in CI. >>>> We have to either remove the guarding annotation or just take the >>>> lock in the unref() function. The latter probably makes more sense. >>>> Is there a good reason to not take the lock there? >>>> >>> >>> Yes, I was kind of expecting that one. TBH, I assumed there was a good >>> reason >>> based on the comment above the function. Given it's a function that frees >>> resources, I assumed, it was handing some kind of corner-case. >>> >>> If you don't remember any context, then I'll take some more time and look >>> at all >>> possible implications of locking in bond_unref(). >> >> I wasn't part of the discussion when this code was introduced, and >> I don't really see any discussion about that part in the list archives. >> So, I'm just assuming that the logic was "there is only one reference >> to an object while it is being unrefed, so there is no need to take >> a mutex, i.e. we can avoid blocking some other threads on a global lock." >> But I don't have any evidence supporting this. At the same time, I don't >> see a clear reason to not lock, except for potentially locking some other >> threads while the destruction is ongoing. >> > > I don't see any strong reason either. It could slow down rule updates while > some > other bond is being deleted (probably not a high-frequency event) but > considering that the expensive task, the OFP table update, is serialized > anyway > (via global ofproto_mutex) impact should not be very high. > > I'll resend the patch locking at bond_unref() and adding the proper clang > flags. > > Thanks. > >>> >>>>> >>>>> Reported-at: https://github.com/openvswitch/ovs-issues/issues/319 >>>>> Signed-off-by: Adrian Moreno <[email protected]> >>>>> --- >>>>> ofproto/bond.c | 4 ++++ >>>>> 1 file changed, 4 insertions(+) >>>>> >>>>> diff --git a/ofproto/bond.c b/ofproto/bond.c >>>>> index cfdf44f85..ad56bb96b 100644 >>>>> --- a/ofproto/bond.c >>>>> +++ b/ofproto/bond.c >>>>> @@ -407,6 +407,10 @@ update_recirc_rules__(struct bond *bond) >>>>> >>>>> VLOG_ERR("failed to remove post recirculation flow >>>>> %s", err_s); >>>>> free(err_s); >>>>> + } else if (bond->hash) { >>>>> + uint32_t hash = pr_op->hmap_node.hash & BOND_MASK; >>>>> + struct bond_entry *entry = &bond->hash[hash];
An empty line here would be nice. >>>>> + entry->pr_tx_bytes = 0; >>>> >>>> A few lines below we will be leaning up the rule pointer regardless >>>> of the error. Should this code be moved there? Potentially re-ordered >>>> with hmap_remove to be sure that the hash is kept intact? >>>> >>> >>> I figured if there is an error deleting the internal flow, we should not >>> reset >>> the byte counter. Not that I can imagine a situation where this would >>> happen but >>> if we fail to delete the rule and we then account for it, we'll report a >>> potentially big rate increase that could lead to an unnecessary rebalancing. >> >> But the rule reference will be cleared from the hash entry anyway, >> so we will not get the stats from this rule again. Or am I missing >> something here? >> > > Right, it will be removed from the hash, but it'll remain in OFP. If the rule > is > then added back (i.e: one bond member comes up), the OFP rule stats will > remain > and on the first rebalance round we'll account for a big increase. OK, makes sense. Maybe add a small comment explaining that in case of a rule deletion failure the subsequent ofproto_dpif_add_internal_flow() will replace the old rule preserving the stats? Best regards, Ilya Maximets. _______________________________________________ dev mailing list [email protected] https://mail.openvswitch.org/mailman/listinfo/ovs-dev
