Ilya Maximets <i.maxim...@ovn.org> writes: > On 8/1/25 10:55 AM, Adrián Moreno wrote: >> On Thu, Jul 31, 2025 at 10:48:08PM +0200, Ilya Maximets wrote: >>> On 7/31/25 8:41 PM, Aaron Conole wrote: >>>> Ilya Maximets <i.maxim...@ovn.org> writes: >>>> >>>>> On 7/30/25 4:47 PM, Aaron Conole via dev wrote: >>>>>> When adjusting bond parameters, any adjustment is considered sufficient >>>>>> for triggering a rebalance. This is a very simplistic config update >>>>>> scheme that triggers complete rebalancing even if the time adjustment >>>>>> would move the next expiration out beyond the last calculated expiration. >>>>>> >>>>>> For the interval parameter only, we can simply recalculate the expiry >>>>>> deadline and let the next bond_run() event do the rebalance if needed. >>>>>> Even if the recalculation would cause the deadline to have occurred in >>>>>> the past, it should execute on the next bond_run() anyway. This is >>>>>> still okay, as the rebalance interval timeout may not result in a >>>>>> full rebalance anyway. >>>>>> >>>>>> Reported-at: >>>>>> https://www.mail-archive.com/ovs-discuss@openvswitch.org/msg10409.html >>>>>> Signed-off-by: Aaron Conole <acon...@redhat.com> >>>>>> --- >>>>>> ofproto/bond.c | 8 +++++++- >>>>>> 1 file changed, 7 insertions(+), 1 deletion(-) >>>>>> >>>>>> diff --git a/ofproto/bond.c b/ofproto/bond.c >>>>>> index 3859ddca08..86e21607e5 100644 >>>>>> --- a/ofproto/bond.c >>>>>> +++ b/ofproto/bond.c >>>>>> @@ -459,8 +459,14 @@ bond_reconfigure(struct bond *bond, const struct >>>>>> bond_settings *s) >>>>>> } >>>>>> >>>>>> if (bond->rebalance_interval != s->rebalance_interval) { >>>>>> + /* Recompute the next rebalance interval by moving the >>>>>> next_rebalance >>>>>> + * to be offset by the new interval. Then let the rebalance >>>>>> code >>>>>> + * trigger a rebalance based on the new details. >>>>> >>>>> This sentence doesn't seem necessary. >>>> >>>> Removed. >>>> >>>>>> In this case, if >>>>>> + * all that was updated is the rebalance interval, we can skip >>>>>> + * triggering the rest of the port reconfigure mechanism. */ >>>>> >>>>> s/reconfigure mechanism/reconfiguration/ >>>> >>>> Ack. >>>> >>>>>> + int old_start_time = bond->next_rebalance - >>>>>> bond->rebalance_interval; >>>>>> bond->rebalance_interval = s->rebalance_interval; >>>>>> - revalidate = true; >>>>>> + bond->next_rebalance = old_start_time + >>>>>> bond->rebalance_interval; >>>>>> } >>>>> >>>>> We should likely still revalidate when the value goes to or from zero, >>>>> because that impacts if the bond considered balanced or not. Even if >>>>> that doesn't cause any problems today, it will inevitably be forgotten >>>>> in the future. >>>> >>>> That makes sense to me - I'll add a case for it. >>>> >>>>> Also, the subtraction may technically overflow if the next_rebalance is >>>>> not initialized. E.g. if it is not initialized yet when the bond is >>>>> going from active-backup to balance-tcp. Which may be fine, given those >>>>> are signed integers, but still doesn't feel right. >>>> >>>> I'm trying to envision a scenario that would make this happen. For >>>> example, it gets reset when going from ab -> balance-{tcp,slb}. And >>>> that will be reconfigured later in this very code path as well (since we >>>> will call the bond_entry_reset if the config got changed for any config >>>> option other than the interval). I can put a guard on !next_rebalance >>>> and tell it to trigger revalidation then (since the very first execution >>>> or something should be initialized to '0' and obviously that could >>>> probably find a trigger). But still struggling to see a place where >>>> changing bond mode won't also force the whole thing to be reinitialized >>>> in bond_entry_reset() since revalidate will be 'true' via other means. >>> >>> It doesn't cause any issues, I agree, but it just didn't feel right. :) >>> Feel free to keep this code as-is. >>> >>>> >>>>> And this patch needs a Fixes tag. >>>> >>>> I didn't put one because I'm not sure if it's really a bugfix or an >>>> improvement. If you consider it one, I'll add the fixes tag. >>> >>> Looking closer to the code, there are two separate issues here: >>> >>> 1. Traffic is re-shuffled among bond members on the rebalance interval >>> change. >>> >>> 2. Traffic is being dropped for some time after the bond is reconfigured >>> until the next bond_run(). This happens if other config options >>> are changed as well. >>> >> >> Exactly, that's what I was trying to say in the other thread, issue #2 >> is more problematic. >> >>> And, I think, the latter is what the original reporter was concerned about. >>> This is a behavior introduced in: >>> >>> 586adfd047cb ("bond: Avoid deadlock while updating post recirculation >>> rules.") >>> >>> Before that change we would not update post-recirc rules while the >>> configuration is cleared and not re-constructed yet. But after the >>> 586adfd047cb fix, we're now calling the update after we zeroed out the >>> configuration, effectively just removing all the post recirculation rules >>> or the lb-output mappings. >>> >>> Saying that, maybe the real fix is to use bond_update_post_recirc_rules__ >>> with force instead of direct call to update_recirc_rules(). That will >>> ensure that the configuration is not blank and we actually have real >>> post-recirc rules / lb-output mapping at all times. I do not remember >>> why I went with the direct call in that fix though. >>> >>> What do you think? >> >> I agree but I think this is not the only problem here. >> >> AFAICS, another issue is that, when traffic is moved from one member to >> another during rebalancing, we set bond->bond_revalidate to true. >> Soon after that we call bond_update_post_recirc_rules__ so we update the >> rules. > > It feels like setting bond_revalidate to true can just be removed from > the bond_shift_load(). Not sure what's the purpose of it.
I was trying to understand that area, and doing git archeology eventually found me the bond tag infrastructure. It even predates the original commit, and seems it was preserved after the conversion away from the old tag design. The tags looked to associate a pool of ports together during revalidation. I'm wondering if this old design had some restriction imposed on the revalidation, or if there really was some consideration that isn't readily apparent (this is a long-winded way of saying '+1 - also not sure what the purpose of flipping this revalidation flag here'). >> However, if something in the configuration changes (even with this >> patch), and bond_reconfigure is called, we will unnecessarily call >> bond_entry_reset() which blanks the hashes we have just carefully rebalanced. >> >> In addition to wiping the result of the rebalance, as Ilya points out, >> we leave the zeroed hashes until the next bond_run(), which would be >> fixed if we called bond_update_post_recirc_rules__ instead of >> update_recirc_rules directly. I do plan to work on something like that as a different patch. >> >> Thanks. >> Adrián >> >>> >>> Not re-setting the config on timeout change is still needed, as that >>> avoids re-shuffling the traffic among bond members (issue 1). >>> >>> The behavior in point 1 was always this way, AFAIU, so that can be treated >>> as just an enhancement, I suppose. >>> >>> Best regards, Ilya Maximets. >>> >> _______________________________________________ dev mailing list d...@openvswitch.org https://mail.openvswitch.org/mailman/listinfo/ovs-dev