On 9/2/25 9:47 AM, Ales Musil wrote:
> 
> 
> On Mon, Sep 1, 2025 at 9:20 PM Dumitru Ceara <dce...@redhat.com
> <mailto:dce...@redhat.com>> wrote:
> 
>     On 9/1/25 9:15 PM, Dumitru Ceara wrote:
>     > On 9/1/25 5:13 PM, Ilya Maximets wrote:
>     >> On 9/1/25 4:48 PM, Dumitru Ceara wrote:
>     >>> On 9/1/25 4:25 PM, Ilya Maximets wrote:
>     >>>> On 9/1/25 4:19 PM, Dumitru Ceara wrote:
>     >>>>> On 9/1/25 4:17 PM, Dumitru Ceara wrote:
>     >>>>>> On 9/1/25 4:12 PM, Ilya Maximets wrote:
>     >>>>>>> On 9/1/25 3:33 PM, Dumitru Ceara wrote:
>     >>>>>>>> Hi Ilya,
>     >>>>>>>>
>     >>>>>>>> Thanks for the review!
>     >>>>>>>>
>     >>>>>>>> On 9/1/25 3:23 PM, Ilya Maximets wrote:
>     >>>>>>>>> On 9/1/25 1:22 PM, Dumitru Ceara wrote:
>     >>>>>>>>>> Since 6919992d8781 ("Datapath_Binding: Separate type
>     column and sync
>     >>>>>>>>>> NB.UUID to SB."), a new type column is added to SB
>     datapaths to
>     >>>>>>>>>> differentiate between datapath types (switch vs router). 
>     The same
>     >>>>>>>>>> commit also changed northd such that newly created SB
>     datapaths have as
>     >>>>>>>>>> UUID the same value as the NB switch/router UUID.  All
>     SB.Datapath
>     >>>>>>>>>> records were also updated and their 'type' field populated
>     >>>>>>>>>>
>     >>>>>>>>>> It also added a helper function,
>     datapath_get_nb_uuid_and_type(), to
>     >>>>>>>>>> abstract out the extraction of a SB datapath's
>     corresponding NB UUID and
>     >>>>>>>>>> type in the following way:
>     >>>>>>>>>> - if the SB.Datapath.type field has a value, the helper
>     assumed that the
>     >>>>>>>>>>   SB.Datapath.UUID is equal to the NB.Logical_Switch/
>     Router.UUID so it
>     >>>>>>>>>>   returned the 'type' value and the SB UUID
>     >>>>>>>>>> - else (for "old style") datapaths it extracted the NB
>     UUID and type
>     >>>>>>>>>>   from the SB.Datapath.external_ids (as used to be done
>     before the
>     >>>>>>>>>>   commit).
>     >>>>>>>>>>
>     >>>>>>>>>> This creates an upgrade issue thoughi: older (already
>     existing before
>     >>>>>>>>>> upgrade) SB datapaths are not immediately recreated and
>     their 'type'
>     >>>>>>>>>> is set; so the NB UUID value returned by
>     datapath_get_nb_uuid_and_type()
>     >>>>>>>>>> for these records is incorrect.
>     >>>>>>>>>>
>     >>>>>>>>>> That stays like that until datapaths are finally
>     recreated (due to other
>     >>>>>>>>>> events, e.g., full recompute of the datapath-sync
>     nodes).  All that time
>     >>>>>>>>>> the lflow manager will incorrectly determine that the
>     existing lflows
>     >>>>>>>>>> that are stale.  That's because the lflow manager uses
>     >>>>>>>>>> ovn_datapath_from_sbrec() which in turn relied on
>     >>>>>>>>>> datapath_get_nb_uuid_and_type() to determine the in-
>     memory ovn_datapath
>     >>>>>>>>>> corresponding to the logical flow.
>     >>>>>>>>>>
>     >>>>>>>>>> There's no need to avoid re-creating SB.Datapath bindings
>     as soon as
>     >>>>>>>>>> ovn-northd determines that they're not using the new
>     scheme.  In order
>     >>>>>>>>>> to achieve that we add two sets of helpers:
>     >>>>>>>>>> a. one to be used by ovn-northd,
>     ovn_datapath_get_nb_uuid_and_type()
>     >>>>>>>>>>    which only parses the new type of datapaths
>     (everything else, i.e.
>     >>>>>>>>>>    old-style datapaths, will be considered stale)
>     >>>>>>>>>> b. one to be used by all other readers of SB datapaths (e.g.,
>     >>>>>>>>>>    ovn-controller, ovn-ic),
>     datapath_get_nb_uuid_and_type_legacy()
>     >>>>>>>>>>    which relies on parsing external IDs (even in the new
>     type of
>     >>>>>>>>>>    datapaths external IDs are still populated with the NB
>     UUID in order
>     >>>>>>>>>>    to maintain backwards compatibility).
>     >>>>>>>>>
>     >>>>>>>>> Can we ever switch all users to the non-leagacy?  I'm not
>     sure...
>     >>>>>>>>> Should probably be a TODO entry for this.
>     >>>>>>>>>
>     >>>>>>>>
>     >>>>>>>> We can, it should happen in the next LTS (26.03) though. 
>     And it should
>     >>>>>>>> be the plan.
>     >>>>>>>
>     >>>>>>> How the transition is done?  i.e. is that guaranteed that
>     the database
>     >>>>>>> is fully translated to the new format after the next
>     upgrade?  Or was it
>     >>>>>>> decided that the full database rewrite on upgrade is not an
>     issue and it
>     >>>>>>> works this way?
>     >>>>>>>
>     >>>>>>
>     >>>>>> Yes, it's guaranteed that after upgrade to 25.09.0 (or more
>     recent) the
>     >>>>>> database is fully translated.  It was decided that it's not a
>     concern to
>     >>>>>> do that.  Discussed here:
>     >>>>>>
>     >>>>>> https://mail.openvswitch.org/pipermail/ovs-dev/2025-
>     August/425327.html <https://mail.openvswitch.org/pipermail/ovs-
>     dev/2025-August/425327.html>
>     >>>>
>     >>>> OK.  FWIW, in comparison with the schema upgrade situation,
>     this datapath
>     >>>> binding re-creation business may cause actual temporary traffic
>     disruption
>     >>>> as we'll delete all the dynamic MAC bindings.  Is that the
>     case?  I didn't
>     >>>> read it very carefully, but I don't see this being discussed in
>     the linked
>     >>>> thread.
>     >>>>
>     >>>
>     >>> You're right, dynamic MAC bindings have not been discussed.  I
>     guess we
>     >>> have a few options:
>     >>> a. leave it as is, dynamic MAC bindings will be recreated with the
>     >>> correct datapaths when needed, traffic will be buffered (for a
>     bit) by
>     >>> ovn-controller
>     >>
>     >> The main problem here is that all that traffic is going to ovn-
>     controller
>     >> and, as practice shows (we had an issue like this earlier this year),
>     >> CoPP starts dropping a fair share of this traffic causing a
>     noticeable
>     >> amount of lost packets and visible delays on TCP connections.
>     >>
>     >
>     > True, this sounds like the riskiest option.
>     >
>     >>> b. try to update existing dynamic MAC bindings when their
>     datapath is
>     >>> not accurate anymore (similar to what we do for static mac bindings)
>     >>
>     >> Sounds risky in way that it's not that simple to track.  We may
>     have a
>     >> chance of moving entries to a wrong datapath.  Do you have ideas
>     on how
>     >> to do the transition cleanly in northd?
>     >>
>     >
>     > I was thinking of tracking the set of SB.datapath_bindings that were
>     > recreated (due to old-style format) and if we have any of those
>     iterate
>     > through other types SB records that might need update (MAC_Binding,
>     > IGMP_Group and Learned_Route), update them and potentially schedule a
>     > forced full recompute for the next ovn-northd run (and trigger one).
>     > It's a bit compute heavy but only when we recreate old-style datapaths
>     > and it should cover all dependencies.
>     >
>     > It's my impression that if we do that, the only potential traffic
>     impact
>     > might be on IP multicast traffic.  That's because we might generate
>     > slightly different logical flows and SB.multicast_groups next time we
>     > run northd.  In my opinion, an acceptable compromise.
>     >
>     > The other two types of records (dynamic mac bindings and learned
>     routes)
>     > don't actually need the second (forced recompute) run of northd.
>     >
> 
>     After a closer look we might not even need the forced recompute.
> 
>     >>> c. revert all the Datpath_Binding schema changes (type and UUID
>     syncing)
>     >>
>     >> Maybe the safest option, but I'm not sure how much reverting this
>     involves.
>     >>
>     >
>     > I agree this is the safest option.  But that also means that we'll
>     > likely make it practically impossible to sync UUIDs from NB to SB
>     in the
>     > future.  I'm not against that, I think the current incremental
>     > processing infrastructure in northd doesn't really need the UUIDs
>     to be
>     > in sync.  But there were opinions from others that it would
>     simplify the
>     > work if they're in sync.
>     >
>     > In conclusion I think I'd try "b" update SB records that need update.
>     > I'm going to try to prepare something along those lines but it
>     would be
>     > nice to hear opinions from Han, Mark, Ales and others too.\
> 
> 
> 
> Based on the discussion I have merged only 3/4 and 4/4 as they
> shouldn't affect the end result no matter what option we will choose.
> 

Thanks, Ales!

> Now for the options, none of them sounds good to be honest.
> "b" has the advantage that it's one time only thing, we can remove
> the code in 26.09. The "c" is unfortunate because  there some changes
> on review already counting on that approach it might not be that straight
> forward to do the revert though. "a" I guess is out of the question
> there would
> be a lot of potential disruptions.
> 
> So option "b" IMO.
> 

FWIW, I think this would be a correct implementation of "b":
https://github.com/dceara/ovn/commit/a9a882b

It has minimal impact on performance (it does an additional datapath
lookup on datapath binding deletion) and only when we detect old-style
datapaths (in practice that's only exactly once, at upgrade) we update
datapath references of all relevant SB table records (mac_binding,
igmp_group, learned_route).  Because igmp_group and learned_route also
store references to Port_Binding, I decided to also include
Port_Binding.Datapath updates in the list.

I'm going to add a small "upgrade"-like test case for all these tables
and if people are OK with this approach, I can post it as formal v3.
I'll give Ilya, Mark, Han and others some more time to reply before that
though.

Regards,
Dumitru

>     >
>     >>>
>     >>> Other SB records created by ovn-controller and referring to
>     >>> Datapath_bindings are:
>     >>>
>     >>> - IGMP Group - ovn-controller will recreate this from memory
>     immediately
>     >>> after the datapath binding update so there might be some
>     disruption but
>     >>> minimal
>     >>
>     >> We should also remember that ovn-controller is handling the whole
>     database
>     >> rewrite here as well.  So, it may take a bit longer than usual.
>     >>
>     >>> - Learned_Route - ovn-controller will recreate them too but we don't
>     >>> really expect any customer deployments pre 25.09 with BGP enabled.
>     >>
>     >> The feature was there since 25.03, so we may have some early
>     users, I guess.
>     >>
>     >
>     > Yes, possible but unlikely.  In any case, if we go with "b" or "c" we
>     > can we won't impact such users.
>     >
>     >>>
>     >>>>>>
>     >>>>>>>> Good point about adding a TODO.  I can post a v3 or if
>     >>>>>>>> this is the only change that needs to happen maybe whoever
>     merges this
>     >>>>>>>> series can add the following incremental:
>     >>>>>>>>
>     >>>>>>>> diff --git a/TODO.rst b/TODO.rst
>     >>>>>>>> index a9fe3ec4e8..4c7aa0c772 100644
>     >>>>>>>> --- a/TODO.rst
>     >>>>>>>> +++ b/TODO.rst
>     >>>>>>>> @@ -99,6 +99,12 @@ OVN To-do List
>     >>>>>>>>    * Implement I-P for datapath groups.
>     >>>>>>>>    * Implement I-P for route exchange relevant ports.
>     >>>>>>>> 
>     >>>>>>>> +* ovn-northd Incremental processing
>     >>>>>>>> +
>     >>>>>>>> +  * Remove datapath_get_nb_uuid_and_type_legacy() in the
>     next LTS (26.03) as
>     >>>>>
>     >>>>> On second thought, this should be "after the next LTS (in 26.09)".
>     >>>>
>     >>>> Yep.
>     >>>>
>     >>>>>
>     >>>>>>>> +    all deployments should be using the new
>     SB.Datapath_Binding.type column
>     >>>>>>>> +    at that point.
>     >>>>>>>> +
>     >>>>>>>>  * ovn-northd parallel logical flow processing
>     >>>>>>>> 
>     >>>>>>>>    * Multi-threaded logical flow computation was optimized
>     for the case
>     >>>>>>>> ---
>     >>>>>>>>
>     >>>>>>>>> If not, then we should just drop the "new" way and stop
>     duplicating
>     >>>>>>>>> the same information in two places.
>     >>>>>>>>>
>     >>>>>>>>> Best regards, Ilya Maximets.
>     >>>>>>>>>
>     >>>>>>>>
>     >>>>>>>> Regards,
>     >>>>>>>> Dumitru
>     >>>>>>>>
>     >>>>>>>
>     >>>>>
>     >>>>
>     >>>
>     >>
> 
> 
> Regards,
> Ales 

_______________________________________________
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev

Reply via email to