On 9/2/25 11:20 AM, Dumitru Ceara wrote:
> On 9/2/25 9:47 AM, Ales Musil wrote:
>>
>>
>> On Mon, Sep 1, 2025 at 9:20 PM Dumitru Ceara <dce...@redhat.com
>> <mailto:dce...@redhat.com>> wrote:
>>
>>     On 9/1/25 9:15 PM, Dumitru Ceara wrote:
>>     > On 9/1/25 5:13 PM, Ilya Maximets wrote:
>>     >> On 9/1/25 4:48 PM, Dumitru Ceara wrote:
>>     >>> On 9/1/25 4:25 PM, Ilya Maximets wrote:
>>     >>>> On 9/1/25 4:19 PM, Dumitru Ceara wrote:
>>     >>>>> On 9/1/25 4:17 PM, Dumitru Ceara wrote:
>>     >>>>>> On 9/1/25 4:12 PM, Ilya Maximets wrote:
>>     >>>>>>> On 9/1/25 3:33 PM, Dumitru Ceara wrote:
>>     >>>>>>>> Hi Ilya,
>>     >>>>>>>>
>>     >>>>>>>> Thanks for the review!
>>     >>>>>>>>
>>     >>>>>>>> On 9/1/25 3:23 PM, Ilya Maximets wrote:
>>     >>>>>>>>> On 9/1/25 1:22 PM, Dumitru Ceara wrote:
>>     >>>>>>>>>> Since 6919992d8781 ("Datapath_Binding: Separate type
>>     column and sync
>>     >>>>>>>>>> NB.UUID to SB."), a new type column is added to SB
>>     datapaths to
>>     >>>>>>>>>> differentiate between datapath types (switch vs router). 
>>     The same
>>     >>>>>>>>>> commit also changed northd such that newly created SB
>>     datapaths have as
>>     >>>>>>>>>> UUID the same value as the NB switch/router UUID.  All
>>     SB.Datapath
>>     >>>>>>>>>> records were also updated and their 'type' field populated
>>     >>>>>>>>>>
>>     >>>>>>>>>> It also added a helper function,
>>     datapath_get_nb_uuid_and_type(), to
>>     >>>>>>>>>> abstract out the extraction of a SB datapath's
>>     corresponding NB UUID and
>>     >>>>>>>>>> type in the following way:
>>     >>>>>>>>>> - if the SB.Datapath.type field has a value, the helper
>>     assumed that the
>>     >>>>>>>>>>   SB.Datapath.UUID is equal to the NB.Logical_Switch/
>>     Router.UUID so it
>>     >>>>>>>>>>   returned the 'type' value and the SB UUID
>>     >>>>>>>>>> - else (for "old style") datapaths it extracted the NB
>>     UUID and type
>>     >>>>>>>>>>   from the SB.Datapath.external_ids (as used to be done
>>     before the
>>     >>>>>>>>>>   commit).
>>     >>>>>>>>>>
>>     >>>>>>>>>> This creates an upgrade issue thoughi: older (already
>>     existing before
>>     >>>>>>>>>> upgrade) SB datapaths are not immediately recreated and
>>     their 'type'
>>     >>>>>>>>>> is set; so the NB UUID value returned by
>>     datapath_get_nb_uuid_and_type()
>>     >>>>>>>>>> for these records is incorrect.
>>     >>>>>>>>>>
>>     >>>>>>>>>> That stays like that until datapaths are finally
>>     recreated (due to other
>>     >>>>>>>>>> events, e.g., full recompute of the datapath-sync
>>     nodes).  All that time
>>     >>>>>>>>>> the lflow manager will incorrectly determine that the
>>     existing lflows
>>     >>>>>>>>>> that are stale.  That's because the lflow manager uses
>>     >>>>>>>>>> ovn_datapath_from_sbrec() which in turn relied on
>>     >>>>>>>>>> datapath_get_nb_uuid_and_type() to determine the in-
>>     memory ovn_datapath
>>     >>>>>>>>>> corresponding to the logical flow.
>>     >>>>>>>>>>
>>     >>>>>>>>>> There's no need to avoid re-creating SB.Datapath bindings
>>     as soon as
>>     >>>>>>>>>> ovn-northd determines that they're not using the new
>>     scheme.  In order
>>     >>>>>>>>>> to achieve that we add two sets of helpers:
>>     >>>>>>>>>> a. one to be used by ovn-northd,
>>     ovn_datapath_get_nb_uuid_and_type()
>>     >>>>>>>>>>    which only parses the new type of datapaths
>>     (everything else, i.e.
>>     >>>>>>>>>>    old-style datapaths, will be considered stale)
>>     >>>>>>>>>> b. one to be used by all other readers of SB datapaths (e.g.,
>>     >>>>>>>>>>    ovn-controller, ovn-ic),
>>     datapath_get_nb_uuid_and_type_legacy()
>>     >>>>>>>>>>    which relies on parsing external IDs (even in the new
>>     type of
>>     >>>>>>>>>>    datapaths external IDs are still populated with the NB
>>     UUID in order
>>     >>>>>>>>>>    to maintain backwards compatibility).
>>     >>>>>>>>>
>>     >>>>>>>>> Can we ever switch all users to the non-leagacy?  I'm not
>>     sure...
>>     >>>>>>>>> Should probably be a TODO entry for this.
>>     >>>>>>>>>
>>     >>>>>>>>
>>     >>>>>>>> We can, it should happen in the next LTS (26.03) though. 
>>     And it should
>>     >>>>>>>> be the plan.
>>     >>>>>>>
>>     >>>>>>> How the transition is done?  i.e. is that guaranteed that
>>     the database
>>     >>>>>>> is fully translated to the new format after the next
>>     upgrade?  Or was it
>>     >>>>>>> decided that the full database rewrite on upgrade is not an
>>     issue and it
>>     >>>>>>> works this way?
>>     >>>>>>>
>>     >>>>>>
>>     >>>>>> Yes, it's guaranteed that after upgrade to 25.09.0 (or more
>>     recent) the
>>     >>>>>> database is fully translated.  It was decided that it's not a
>>     concern to
>>     >>>>>> do that.  Discussed here:
>>     >>>>>>
>>     >>>>>> https://mail.openvswitch.org/pipermail/ovs-dev/2025-
>>     August/425327.html <https://mail.openvswitch.org/pipermail/ovs-
>>     dev/2025-August/425327.html>
>>     >>>>
>>     >>>> OK.  FWIW, in comparison with the schema upgrade situation,
>>     this datapath
>>     >>>> binding re-creation business may cause actual temporary traffic
>>     disruption
>>     >>>> as we'll delete all the dynamic MAC bindings.  Is that the
>>     case?  I didn't
>>     >>>> read it very carefully, but I don't see this being discussed in
>>     the linked
>>     >>>> thread.
>>     >>>>
>>     >>>
>>     >>> You're right, dynamic MAC bindings have not been discussed.  I
>>     guess we
>>     >>> have a few options:
>>     >>> a. leave it as is, dynamic MAC bindings will be recreated with the
>>     >>> correct datapaths when needed, traffic will be buffered (for a
>>     bit) by
>>     >>> ovn-controller
>>     >>
>>     >> The main problem here is that all that traffic is going to ovn-
>>     controller
>>     >> and, as practice shows (we had an issue like this earlier this year),
>>     >> CoPP starts dropping a fair share of this traffic causing a
>>     noticeable
>>     >> amount of lost packets and visible delays on TCP connections.
>>     >>
>>     >
>>     > True, this sounds like the riskiest option.
>>     >
>>     >>> b. try to update existing dynamic MAC bindings when their
>>     datapath is
>>     >>> not accurate anymore (similar to what we do for static mac bindings)
>>     >>
>>     >> Sounds risky in way that it's not that simple to track.  We may
>>     have a
>>     >> chance of moving entries to a wrong datapath.  Do you have ideas
>>     on how
>>     >> to do the transition cleanly in northd?
>>     >>
>>     >
>>     > I was thinking of tracking the set of SB.datapath_bindings that were
>>     > recreated (due to old-style format) and if we have any of those
>>     iterate
>>     > through other types SB records that might need update (MAC_Binding,
>>     > IGMP_Group and Learned_Route), update them and potentially schedule a
>>     > forced full recompute for the next ovn-northd run (and trigger one).
>>     > It's a bit compute heavy but only when we recreate old-style datapaths
>>     > and it should cover all dependencies.
>>     >
>>     > It's my impression that if we do that, the only potential traffic
>>     impact
>>     > might be on IP multicast traffic.  That's because we might generate
>>     > slightly different logical flows and SB.multicast_groups next time we
>>     > run northd.  In my opinion, an acceptable compromise.
>>     >
>>     > The other two types of records (dynamic mac bindings and learned
>>     routes)
>>     > don't actually need the second (forced recompute) run of northd.
>>     >
>>
>>     After a closer look we might not even need the forced recompute.
>>
>>     >>> c. revert all the Datpath_Binding schema changes (type and UUID
>>     syncing)
>>     >>
>>     >> Maybe the safest option, but I'm not sure how much reverting this
>>     involves.
>>     >>
>>     >
>>     > I agree this is the safest option.  But that also means that we'll
>>     > likely make it practically impossible to sync UUIDs from NB to SB
>>     in the
>>     > future.  I'm not against that, I think the current incremental
>>     > processing infrastructure in northd doesn't really need the UUIDs
>>     to be
>>     > in sync.  But there were opinions from others that it would
>>     simplify the
>>     > work if they're in sync.
>>     >
>>     > In conclusion I think I'd try "b" update SB records that need update.
>>     > I'm going to try to prepare something along those lines but it
>>     would be
>>     > nice to hear opinions from Han, Mark, Ales and others too.\
>>
>>
>>
>> Based on the discussion I have merged only 3/4 and 4/4 as they
>> shouldn't affect the end result no matter what option we will choose.
>>
> 
> Thanks, Ales!
> 
>> Now for the options, none of them sounds good to be honest.
>> "b" has the advantage that it's one time only thing, we can remove
>> the code in 26.09. The "c" is unfortunate because  there some changes
>> on review already counting on that approach it might not be that straight
>> forward to do the revert though. "a" I guess is out of the question
>> there would
>> be a lot of potential disruptions.
>>
>> So option "b" IMO.
>>
> 
> FWIW, I think this would be a correct implementation of "b":
> https://github.com/dceara/ovn/commit/a9a882b
> 
> It has minimal impact on performance (it does an additional datapath
> lookup on datapath binding deletion) and only when we detect old-style
> datapaths (in practice that's only exactly once, at upgrade) we update
> datapath references of all relevant SB table records (mac_binding,
> igmp_group, learned_route).  Because igmp_group and learned_route also
> store references to Port_Binding, I decided to also include
> Port_Binding.Datapath updates in the list.
> 
> I'm going to add a small "upgrade"-like test case for all these tables
> and if people are OK with this approach, I can post it as formal v3.

And with the test included, confirmed that it works:
https://github.com/dceara/ovn/commit/ebd4263

> I'll give Ilya, Mark, Han and others some more time to reply before that
> though.
> 
> Regards,
> Dumitru
> 
>>     >
>>     >>>
>>     >>> Other SB records created by ovn-controller and referring to
>>     >>> Datapath_bindings are:
>>     >>>
>>     >>> - IGMP Group - ovn-controller will recreate this from memory
>>     immediately
>>     >>> after the datapath binding update so there might be some
>>     disruption but
>>     >>> minimal
>>     >>
>>     >> We should also remember that ovn-controller is handling the whole
>>     database
>>     >> rewrite here as well.  So, it may take a bit longer than usual.
>>     >>
>>     >>> - Learned_Route - ovn-controller will recreate them too but we don't
>>     >>> really expect any customer deployments pre 25.09 with BGP enabled.
>>     >>
>>     >> The feature was there since 25.03, so we may have some early
>>     users, I guess.
>>     >>
>>     >
>>     > Yes, possible but unlikely.  In any case, if we go with "b" or "c" we
>>     > can we won't impact such users.
>>     >
>>     >>>
>>     >>>>>>
>>     >>>>>>>> Good point about adding a TODO.  I can post a v3 or if
>>     >>>>>>>> this is the only change that needs to happen maybe whoever
>>     merges this
>>     >>>>>>>> series can add the following incremental:
>>     >>>>>>>>
>>     >>>>>>>> diff --git a/TODO.rst b/TODO.rst
>>     >>>>>>>> index a9fe3ec4e8..4c7aa0c772 100644
>>     >>>>>>>> --- a/TODO.rst
>>     >>>>>>>> +++ b/TODO.rst
>>     >>>>>>>> @@ -99,6 +99,12 @@ OVN To-do List
>>     >>>>>>>>    * Implement I-P for datapath groups.
>>     >>>>>>>>    * Implement I-P for route exchange relevant ports.
>>     >>>>>>>> 
>>     >>>>>>>> +* ovn-northd Incremental processing
>>     >>>>>>>> +
>>     >>>>>>>> +  * Remove datapath_get_nb_uuid_and_type_legacy() in the
>>     next LTS (26.03) as
>>     >>>>>
>>     >>>>> On second thought, this should be "after the next LTS (in 26.09)".
>>     >>>>
>>     >>>> Yep.
>>     >>>>
>>     >>>>>
>>     >>>>>>>> +    all deployments should be using the new
>>     SB.Datapath_Binding.type column
>>     >>>>>>>> +    at that point.
>>     >>>>>>>> +
>>     >>>>>>>>  * ovn-northd parallel logical flow processing
>>     >>>>>>>> 
>>     >>>>>>>>    * Multi-threaded logical flow computation was optimized
>>     for the case
>>     >>>>>>>> ---
>>     >>>>>>>>
>>     >>>>>>>>> If not, then we should just drop the "new" way and stop
>>     duplicating
>>     >>>>>>>>> the same information in two places.
>>     >>>>>>>>>
>>     >>>>>>>>> Best regards, Ilya Maximets.
>>     >>>>>>>>>
>>     >>>>>>>>
>>     >>>>>>>> Regards,
>>     >>>>>>>> Dumitru
>>     >>>>>>>>
>>     >>>>>>>
>>     >>>>>
>>     >>>>
>>     >>>
>>     >>
>>
>>
>> Regards,
>> Ales 

_______________________________________________
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev

Reply via email to