On 9/2/25 11:20 AM, Dumitru Ceara wrote: > On 9/2/25 9:47 AM, Ales Musil wrote: >> >> >> On Mon, Sep 1, 2025 at 9:20 PM Dumitru Ceara <dce...@redhat.com >> <mailto:dce...@redhat.com>> wrote: >> >> On 9/1/25 9:15 PM, Dumitru Ceara wrote: >> > On 9/1/25 5:13 PM, Ilya Maximets wrote: >> >> On 9/1/25 4:48 PM, Dumitru Ceara wrote: >> >>> On 9/1/25 4:25 PM, Ilya Maximets wrote: >> >>>> On 9/1/25 4:19 PM, Dumitru Ceara wrote: >> >>>>> On 9/1/25 4:17 PM, Dumitru Ceara wrote: >> >>>>>> On 9/1/25 4:12 PM, Ilya Maximets wrote: >> >>>>>>> On 9/1/25 3:33 PM, Dumitru Ceara wrote: >> >>>>>>>> Hi Ilya, >> >>>>>>>> >> >>>>>>>> Thanks for the review! >> >>>>>>>> >> >>>>>>>> On 9/1/25 3:23 PM, Ilya Maximets wrote: >> >>>>>>>>> On 9/1/25 1:22 PM, Dumitru Ceara wrote: >> >>>>>>>>>> Since 6919992d8781 ("Datapath_Binding: Separate type >> column and sync >> >>>>>>>>>> NB.UUID to SB."), a new type column is added to SB >> datapaths to >> >>>>>>>>>> differentiate between datapath types (switch vs router). >> The same >> >>>>>>>>>> commit also changed northd such that newly created SB >> datapaths have as >> >>>>>>>>>> UUID the same value as the NB switch/router UUID. All >> SB.Datapath >> >>>>>>>>>> records were also updated and their 'type' field populated >> >>>>>>>>>> >> >>>>>>>>>> It also added a helper function, >> datapath_get_nb_uuid_and_type(), to >> >>>>>>>>>> abstract out the extraction of a SB datapath's >> corresponding NB UUID and >> >>>>>>>>>> type in the following way: >> >>>>>>>>>> - if the SB.Datapath.type field has a value, the helper >> assumed that the >> >>>>>>>>>> SB.Datapath.UUID is equal to the NB.Logical_Switch/ >> Router.UUID so it >> >>>>>>>>>> returned the 'type' value and the SB UUID >> >>>>>>>>>> - else (for "old style") datapaths it extracted the NB >> UUID and type >> >>>>>>>>>> from the SB.Datapath.external_ids (as used to be done >> before the >> >>>>>>>>>> commit). >> >>>>>>>>>> >> >>>>>>>>>> This creates an upgrade issue thoughi: older (already >> existing before >> >>>>>>>>>> upgrade) SB datapaths are not immediately recreated and >> their 'type' >> >>>>>>>>>> is set; so the NB UUID value returned by >> datapath_get_nb_uuid_and_type() >> >>>>>>>>>> for these records is incorrect. >> >>>>>>>>>> >> >>>>>>>>>> That stays like that until datapaths are finally >> recreated (due to other >> >>>>>>>>>> events, e.g., full recompute of the datapath-sync >> nodes). All that time >> >>>>>>>>>> the lflow manager will incorrectly determine that the >> existing lflows >> >>>>>>>>>> that are stale. That's because the lflow manager uses >> >>>>>>>>>> ovn_datapath_from_sbrec() which in turn relied on >> >>>>>>>>>> datapath_get_nb_uuid_and_type() to determine the in- >> memory ovn_datapath >> >>>>>>>>>> corresponding to the logical flow. >> >>>>>>>>>> >> >>>>>>>>>> There's no need to avoid re-creating SB.Datapath bindings >> as soon as >> >>>>>>>>>> ovn-northd determines that they're not using the new >> scheme. In order >> >>>>>>>>>> to achieve that we add two sets of helpers: >> >>>>>>>>>> a. one to be used by ovn-northd, >> ovn_datapath_get_nb_uuid_and_type() >> >>>>>>>>>> which only parses the new type of datapaths >> (everything else, i.e. >> >>>>>>>>>> old-style datapaths, will be considered stale) >> >>>>>>>>>> b. one to be used by all other readers of SB datapaths (e.g., >> >>>>>>>>>> ovn-controller, ovn-ic), >> datapath_get_nb_uuid_and_type_legacy() >> >>>>>>>>>> which relies on parsing external IDs (even in the new >> type of >> >>>>>>>>>> datapaths external IDs are still populated with the NB >> UUID in order >> >>>>>>>>>> to maintain backwards compatibility). >> >>>>>>>>> >> >>>>>>>>> Can we ever switch all users to the non-leagacy? I'm not >> sure... >> >>>>>>>>> Should probably be a TODO entry for this. >> >>>>>>>>> >> >>>>>>>> >> >>>>>>>> We can, it should happen in the next LTS (26.03) though. >> And it should >> >>>>>>>> be the plan. >> >>>>>>> >> >>>>>>> How the transition is done? i.e. is that guaranteed that >> the database >> >>>>>>> is fully translated to the new format after the next >> upgrade? Or was it >> >>>>>>> decided that the full database rewrite on upgrade is not an >> issue and it >> >>>>>>> works this way? >> >>>>>>> >> >>>>>> >> >>>>>> Yes, it's guaranteed that after upgrade to 25.09.0 (or more >> recent) the >> >>>>>> database is fully translated. It was decided that it's not a >> concern to >> >>>>>> do that. Discussed here: >> >>>>>> >> >>>>>> https://mail.openvswitch.org/pipermail/ovs-dev/2025- >> August/425327.html <https://mail.openvswitch.org/pipermail/ovs- >> dev/2025-August/425327.html> >> >>>> >> >>>> OK. FWIW, in comparison with the schema upgrade situation, >> this datapath >> >>>> binding re-creation business may cause actual temporary traffic >> disruption >> >>>> as we'll delete all the dynamic MAC bindings. Is that the >> case? I didn't >> >>>> read it very carefully, but I don't see this being discussed in >> the linked >> >>>> thread. >> >>>> >> >>> >> >>> You're right, dynamic MAC bindings have not been discussed. I >> guess we >> >>> have a few options: >> >>> a. leave it as is, dynamic MAC bindings will be recreated with the >> >>> correct datapaths when needed, traffic will be buffered (for a >> bit) by >> >>> ovn-controller >> >> >> >> The main problem here is that all that traffic is going to ovn- >> controller >> >> and, as practice shows (we had an issue like this earlier this year), >> >> CoPP starts dropping a fair share of this traffic causing a >> noticeable >> >> amount of lost packets and visible delays on TCP connections. >> >> >> > >> > True, this sounds like the riskiest option. >> > >> >>> b. try to update existing dynamic MAC bindings when their >> datapath is >> >>> not accurate anymore (similar to what we do for static mac bindings) >> >> >> >> Sounds risky in way that it's not that simple to track. We may >> have a >> >> chance of moving entries to a wrong datapath. Do you have ideas >> on how >> >> to do the transition cleanly in northd? >> >> >> > >> > I was thinking of tracking the set of SB.datapath_bindings that were >> > recreated (due to old-style format) and if we have any of those >> iterate >> > through other types SB records that might need update (MAC_Binding, >> > IGMP_Group and Learned_Route), update them and potentially schedule a >> > forced full recompute for the next ovn-northd run (and trigger one). >> > It's a bit compute heavy but only when we recreate old-style datapaths >> > and it should cover all dependencies. >> > >> > It's my impression that if we do that, the only potential traffic >> impact >> > might be on IP multicast traffic. That's because we might generate >> > slightly different logical flows and SB.multicast_groups next time we >> > run northd. In my opinion, an acceptable compromise. >> > >> > The other two types of records (dynamic mac bindings and learned >> routes) >> > don't actually need the second (forced recompute) run of northd. >> > >> >> After a closer look we might not even need the forced recompute. >> >> >>> c. revert all the Datpath_Binding schema changes (type and UUID >> syncing) >> >> >> >> Maybe the safest option, but I'm not sure how much reverting this >> involves. >> >> >> > >> > I agree this is the safest option. But that also means that we'll >> > likely make it practically impossible to sync UUIDs from NB to SB >> in the >> > future. I'm not against that, I think the current incremental >> > processing infrastructure in northd doesn't really need the UUIDs >> to be >> > in sync. But there were opinions from others that it would >> simplify the >> > work if they're in sync. >> > >> > In conclusion I think I'd try "b" update SB records that need update. >> > I'm going to try to prepare something along those lines but it >> would be >> > nice to hear opinions from Han, Mark, Ales and others too.\ >> >> >> >> Based on the discussion I have merged only 3/4 and 4/4 as they >> shouldn't affect the end result no matter what option we will choose. >> > > Thanks, Ales! > >> Now for the options, none of them sounds good to be honest. >> "b" has the advantage that it's one time only thing, we can remove >> the code in 26.09. The "c" is unfortunate because there some changes >> on review already counting on that approach it might not be that straight >> forward to do the revert though. "a" I guess is out of the question >> there would >> be a lot of potential disruptions. >> >> So option "b" IMO. >> > > FWIW, I think this would be a correct implementation of "b": > https://github.com/dceara/ovn/commit/a9a882b > > It has minimal impact on performance (it does an additional datapath > lookup on datapath binding deletion) and only when we detect old-style > datapaths (in practice that's only exactly once, at upgrade) we update > datapath references of all relevant SB table records (mac_binding, > igmp_group, learned_route). Because igmp_group and learned_route also > store references to Port_Binding, I decided to also include > Port_Binding.Datapath updates in the list. > > I'm going to add a small "upgrade"-like test case for all these tables > and if people are OK with this approach, I can post it as formal v3.
And with the test included, confirmed that it works: https://github.com/dceara/ovn/commit/ebd4263 > I'll give Ilya, Mark, Han and others some more time to reply before that > though. > > Regards, > Dumitru > >> > >> >>> >> >>> Other SB records created by ovn-controller and referring to >> >>> Datapath_bindings are: >> >>> >> >>> - IGMP Group - ovn-controller will recreate this from memory >> immediately >> >>> after the datapath binding update so there might be some >> disruption but >> >>> minimal >> >> >> >> We should also remember that ovn-controller is handling the whole >> database >> >> rewrite here as well. So, it may take a bit longer than usual. >> >> >> >>> - Learned_Route - ovn-controller will recreate them too but we don't >> >>> really expect any customer deployments pre 25.09 with BGP enabled. >> >> >> >> The feature was there since 25.03, so we may have some early >> users, I guess. >> >> >> > >> > Yes, possible but unlikely. In any case, if we go with "b" or "c" we >> > can we won't impact such users. >> > >> >>> >> >>>>>> >> >>>>>>>> Good point about adding a TODO. I can post a v3 or if >> >>>>>>>> this is the only change that needs to happen maybe whoever >> merges this >> >>>>>>>> series can add the following incremental: >> >>>>>>>> >> >>>>>>>> diff --git a/TODO.rst b/TODO.rst >> >>>>>>>> index a9fe3ec4e8..4c7aa0c772 100644 >> >>>>>>>> --- a/TODO.rst >> >>>>>>>> +++ b/TODO.rst >> >>>>>>>> @@ -99,6 +99,12 @@ OVN To-do List >> >>>>>>>> * Implement I-P for datapath groups. >> >>>>>>>> * Implement I-P for route exchange relevant ports. >> >>>>>>>> >> >>>>>>>> +* ovn-northd Incremental processing >> >>>>>>>> + >> >>>>>>>> + * Remove datapath_get_nb_uuid_and_type_legacy() in the >> next LTS (26.03) as >> >>>>> >> >>>>> On second thought, this should be "after the next LTS (in 26.09)". >> >>>> >> >>>> Yep. >> >>>> >> >>>>> >> >>>>>>>> + all deployments should be using the new >> SB.Datapath_Binding.type column >> >>>>>>>> + at that point. >> >>>>>>>> + >> >>>>>>>> * ovn-northd parallel logical flow processing >> >>>>>>>> >> >>>>>>>> * Multi-threaded logical flow computation was optimized >> for the case >> >>>>>>>> --- >> >>>>>>>> >> >>>>>>>>> If not, then we should just drop the "new" way and stop >> duplicating >> >>>>>>>>> the same information in two places. >> >>>>>>>>> >> >>>>>>>>> Best regards, Ilya Maximets. >> >>>>>>>>> >> >>>>>>>> >> >>>>>>>> Regards, >> >>>>>>>> Dumitru >> >>>>>>>> >> >>>>>>> >> >>>>> >> >>>> >> >>> >> >> >> >> >> Regards, >> Ales _______________________________________________ dev mailing list d...@openvswitch.org https://mail.openvswitch.org/mailman/listinfo/ovs-dev