On 8/14/25 7:40 AM, Ales Musil wrote: > On Thu, Aug 14, 2025 at 1:18 AM Han Zhou <hz...@ovn.org> wrote: > >> >> >> On Wed, Aug 13, 2025 at 3:45 PM Dumitru Ceara <dce...@redhat.com> wrote: >>> >>> Hi Han, Mark, >>> >>> On 8/13/25 10:18 PM, Han Zhou wrote: >>>> >>>> >>>> On Wed, Aug 13, 2025 at 12:36 PM Mark Michelson <mmich...@redhat.com >>>> <mailto:mmich...@redhat.com>> wrote: >>>>> >>>>> On 8/13/25 3:10 PM, Dumitru Ceara wrote: >>>>>> On Wednesday, August 13, 2025, Han Zhou <hz...@ovn.org >>>> <mailto:hz...@ovn.org> >>>>>> <mailto:hz...@ovn.org <mailto:hz...@ovn.org>>> wrote: >>>>>> >>>>>> >>>>>> >>>>>> On Wed, Aug 13, 2025 at 1:20 AM Dumitru Ceara < >> dce...@redhat.com >>>> <mailto:dce...@redhat.com> >>>>>> <mailto:dce...@redhat.com <mailto:dce...@redhat.com>>> wrote: >>>>>> > >>>>>> > Hi Mark, Han, Numan, >>>>>> > >>>>>> > On 8/12/25 8:32 PM, Mark Michelson via dev wrote: >>>>>> > > On 8/11/25 2:36 PM, Han Zhou wrote: >>>>>> > >> >>>>>> > >> >>>>>> > >> On Mon, Aug 11, 2025 at 10:22 AM Numan Siddique >>>>>> <num...@ovn.org <mailto:num...@ovn.org> <mailto:num...@ovn.org >>>> <mailto:num...@ovn.org>> >>>>>> > >> <mailto:num...@ovn.org <mailto:num...@ovn.org> >>>> <mailto:num...@ovn.org <mailto:num...@ovn.org>>>> wrote: >>>>>> > >> > >>>>>> > >> > On Mon, Aug 11, 2025 at 10:00 PM Han Zhou >>>> <hz...@ovn.org <mailto:hz...@ovn.org> >>>>>> <mailto:hz...@ovn.org <mailto:hz...@ovn.org>> >>>>>> > >> <mailto:hz...@ovn.org <mailto:hz...@ovn.org> >>>> <mailto:hz...@ovn.org <mailto:hz...@ovn.org>>>> wrote: >>>>>> > >> > > >>>>>> > >> > > On Mon, Aug 11, 2025 at 7:59 AM Mark Michelson via >> dev < >>>>>> > >> > > ovs-dev@openvswitch.org <mailto:ovs- >>>> d...@openvswitch.org> <mailto:ovs-dev@openvswitch.org <mailto:ovs- >>>> d...@openvswitch.org>> >>>>>> <mailto:ovs-dev@openvswitch.org <mailto:ovs-dev@openvswitch.org >>> >>>> <mailto:ovs-dev@openvswitch.org <mailto:ovs-dev@openvswitch.org>>>> >>>>>> wrote: >>>>>> > >> > > > >>>>>> > >> > > > Prior to this commit, if you wanted to find the >>>>>> corresponding >>>>>> > >> northbound >>>>>> > >> > > > UUID for a southbound Logical Datapath, you could >> find >>>>>> it in >>>>>> > >> one of two >>>>>> > >> > > > places: >>>>>> > >> > > > >>>>>> > >> > > > For logical switches, it was in >> external-ids:logical- >>>>>> switch. >>>>>> > >> > > > For logical routers, it was in external- >>>> ids:logical-router. >>>>>> > >> > > > >>>>>> > >> > > > With this commit, we are separating the type and >>>> UUID into >>>>>> > >> separate >>>>>> > >> > > > fields. This way, no matter the type of the >> datapath, >>>>>> you can >>>>>> > >> find the >>>>>> > >> > > > UUID. And you can find the type of the datapath >> without >>>>>> having to >>>>>> > >> > > > potentially check multiple external-ids keys to do >> so. >>>>>> > >> > > > >>>>>> > >> > > > These fields are going to be used pretty heavily by >>>>>> northd in >>>>>> > >> upcoming >>>>>> > >> > > > patches, so instead of making them external-ids, >> these >>>>>> are now >>>>>> > >> > > > full-fledged columns on the southbound >>>>>> Datapath_Binding. The >>>>>> > >> UUID of the >>>>>> > >> > > > northbound logical datapath is in a column called >>>>>> "nb_uuid" and >>>>>> > >> the type >>>>>> > >> > > > of the logical datapath is stored in an enumerated >>>> column >>>>>> > >> called "type". >>>>>> > >> > > >>>>>> > >> > > Thanks Mark. Sorry for reviewing this late. While >> still >>>>>> reviewing >>>>>> > >> the rest >>>>>> > >> > > of the patches, I want to raise a question regarding >> this >>>>>> patch. >>>>>> > >> Would it >>>>>> > >> > > be more straightforward and efficient to use NB UUIDs >>>>>> directly in >>>>>> > >> SB for >>>>>> > >> > > both LS and LR? I saw that there was an old patch >> from >>>>>> Ales which >>>>>> > >> mentioned >>>>>> > >> > > this potential change [0]. It mentioned the >> necessity of >>>>>> "collision >>>>>> > >> > > detection", but I don't think it is necessary, >> because >>>>>> UUIDs are >>>>>> > >> just >>>>>> > >> > > designed to be globally unique. Did I miss any other >>>>>> discussions >>>>>> > >> for this? >>>>>> > >> > > >>>>>> > >> > > [0] https://www.mail-archive.com/ovs- <https:// >>>> www.mail-archive.com/ovs-> >>>>>> dev%40openvswitch.org/ <http://40openvswitch.org/> <https:// >>>> www.mail-archive.com/ovs- <https://www.mail-archive.com/ovs-> >>>>>> dev%40openvswitch.org/ <http://40openvswitch.org/>> >>>>>> > >> msg89347.html <https://www.mail-archive.com/ovs- >> <https:// >>>> www.mail-archive.com/ovs-> >>>>>> dev%40openvswitch.org/ <http://40openvswitch.org/> <https:// >>>> www.mail-archive.com/ovs- <https://www.mail-archive.com/ovs-> >>>>>> dev%40openvswitch.org/ <http://40openvswitch.org/>> >>>>>> > >> msg89347.html> >>>>>> > >> > > >>>>>> > >> > >>>>>> > >> > I had included this change of using the same NB UUIDs >>>> in SB for >>>>>> > >> > Datapath-bindings here [1] >>>>>> > >> > as part of this series - [2]. Please see the comments >>>> in [1]. >>>>>> > >> > >>>>>> > >> > Later Lorenzo took over this patch series and dropped >> the >>>>>> patch [1] >>>>>> > >> from it. >>>>>> > >> > >>>>>> > >> > [1] - https://patchwork.ozlabs.org/project/ovn/ >>>> <https://patchwork.ozlabs.org/project/ovn/> <https:// >>>>>> patchwork.ozlabs.org/project/ovn/ <http://patchwork.ozlabs.org/ >>>> project/ovn/>> >>>>>> > >> patch/20250110162652.3550775-1-num...@ovn.org/ >>>> <http://20250110162652.3550775-1-num...@ovn.org/> >>>>>> <http://20250110162652.3550775-1-num...@ovn.org/ >>>> <http://20250110162652.3550775-1-num...@ovn.org/>> <https:// >>>>>> > >> patchwork.ozlabs.org/project/ovn/ <http:// >>>> patchwork.ozlabs.org/project/ovn/> >>>>>> patch/20250110162652.3550775-1- <http://patchwork.ozlabs.org/ >>>> <http://patchwork.ozlabs.org/> >>>>>> project/ovn/patch/20250110162652.3550775-1-> >>>>>> > >> num...@ovn.org/ <http://num...@ovn.org/> <http:// >>>> num...@ovn.org/ <http://num...@ovn.org/>>> >>>>>> > >> > [2] - https://patchwork.ozlabs.org/project/ovn/list/ >>>> <https://patchwork.ozlabs.org/project/ovn/list/> >>>>>> <https://patchwork.ozlabs.org/project/ovn/list/ <https:// >>>> patchwork.ozlabs.org/project/ovn/list/>>? >>>>>> > >> series=439760&state=* <https://patchwork.ozlabs.org/ >>>> project/ <https://patchwork.ozlabs.org/project/> >>>>>> ovn/list/ <https://patchwork.ozlabs.org/project/ovn/list/ >>>> <https://patchwork.ozlabs.org/project/ovn/list/>>? >>>>>> > >> series=439760&state=*> >>>>>> > >> > >>>>>> > >> > The main concern is that both LS and LR can have the >>>> same uuid. >>>>>> > >> > IMO this can happen only if LS and LR were created by >>>> passing >>>>>> > >> > "--id=<SAME_UUID>". >>>>>> > >> > >>>>>> > >> Using the same explicit UUID for LS and LR creation is >>>> clearly a >>>>>> > >> misconfiguration. >>>>>> > >> >>>>>> > >> > I personally think we can overlook this corner case and >>>>>> document it >>>>>> > >> > and suggest the users to recreate such logical >> switches/ >>>>>> routers before >>>>>> > >> > upgrading. >>>>>> > >> >>>>>> > >> I agree, and I assume legitimate deployments would not >>>> have such >>>>>> > >> conflicts. >>>>>> > > >>>>>> > > I have been looking at the datapath syncing code *a lot* >>>> lately. >>>>>> > > Existing code in northd/northd.c in the main branch already >>>>>> complains if >>>>>> > > UUIDs are shared between logical switches and logical >> routers. >>>>>> > > >>>>>> > > This is from join_datapaths() in northd/northd.c in main: >>>>>> > > >>>>>> > > SBREC_DATAPATH_BINDING_TABLE_FOR_EACH_SAFE (sb, >>>>>> sbrec_dp_table) { >>>>>> > > struct uuid key; >>>>>> > > if (!smap_get_uuid(&sb->external_ids, "logical- >>>>>> switch", &key) && >>>>>> > > !smap_get_uuid(&sb->external_ids, "logical- >>>>>> router", &key)) { >>>>>> > > ovsdb_idl_txn_add_comment( >>>>>> > > ovnsb_txn, >>>>>> > > "deleting Datapath_Binding "UUID_FMT" >> that >>>>>> lacks " >>>>>> > > "external-ids:logical-switch and " >>>>>> > > "external-ids:logical-router", >>>>>> > > UUID_ARGS(&sb->header_.uuid)); >>>>>> > > sbrec_datapath_binding_delete(sb); >>>>>> > > continue; >>>>>> > > } >>>>>> > > >>>>>> > > if (ovn_datapath_find_(datapaths, &key)) { >>>>>> > > static struct vlog_rate_limit rl = >>>>>> VLOG_RATE_LIMIT_INIT(5, >>>>>> > > 1); >>>>>> > > VLOG_INFO_RL( >>>>>> > > &rl, "deleting Datapath_Binding >>>> "UUID_FMT" with " >>>>>> > > "duplicate >> external-ids:logical-switch/router >>>>>> "UUID_FMT, >>>>>> > > UUID_ARGS(&sb->header_.uuid), >>>> UUID_ARGS(&key)); >>>>>> > > sbrec_datapath_binding_delete(sb); >>>>>> > > continue; >>>>>> > > } >>>>>> > > >>>>>> > > struct ovn_datapath *od = >>>>>> ovn_datapath_create(datapaths, &key, >>>>>> > > >> NULL, >>>>>> NULL, sb); >>>>>> > > ovs_list_push_back(sb_only, &od->list); >>>>>> > > } >>>>>> > > >>>>>> > > >>>>>> > > Notice the call to `ovn_datapath_find_()` here. If you >> have a >>>>>> logical >>>>>> > > switch and logical router with the same UUID, then northd >> will >>>>>> complain >>>>>> > > and not create an ovn_datapath for the southbound datapath >>>>>> binding with >>>>>> > > the duplicated UUID. >>>>>> > > >>>>>> > > Therefore, I think it's perfectly fine to propagate the >>>>>> northbound UUID >>>>>> > > to the southbound, since the UUID collision would already >> be >>>>>> treated as >>>>>> > > an error, and we would not break anything by doing so. >>>>>> > > >>>>>> > >>>>>> > Unfortunately, while looking at your v16 where we actually >>>>>> propagate the >>>>>> > NB UUID to the SB, I realized this might introduce a >> significant >>>>>> problem >>>>>> > during upgrades. >>>>>> > >>>>>> > Due to the "UUID sync" we now have to recreate _all_ >>>>>> SB.Datapath_binding >>>>>> > records when ovn-northd upgrades to a version that includes >> that >>>>>> change. >>>>>> > That essentially means we need to recreate most of the SB >>>> contents >>>>>> > (datapaths, lflows, datapath groups). >>>>>> > >>>>>> > That SB change will likely be a single transaction >> (ovn-northd -> >>>>>> SB) so >>>>>> > it will also be sent as a single jsonrpc update to ovn- >>>> controllers. >>>>>> > >>>>>> > In large(r) setups, e.g., with a few hundred chassis, this >> will >>>>>> likely >>>>>> > create a performance issue as the SB ovsdb-server will have >> to >>>>>> send that >>>>>> > huge transaction to all ovn-controllers. While that could be >>>>>> mitigated >>>>>> > by using ovsdb-relay for example, it's quite likely that our >>>>>> users don't >>>>>> > have that in place. >>>>>> > >>>>>> > So my recommendation is to _not_ sync NB.UUID to >>>>>> > SB.Datapath_Binding.UUID. I don't think the extra column >>>> (the v14) >>>>>> > approach was too much overhead, it's just one UUID per >> datapath >>>>>> record >>>>>> > and the northd code doesn't seem to become way more complex. >>>>>> > >>>>>> >>>>>> Hi Dumitru, >>>>>> >>>>>> >>>>>> Hi Han, >>>>>> >>>>>> Thanks for sharing your concern. However, we did have similar >>>>>> changes in the past that resulted in large SB updates, for >> example, >>>>>> "northd.c: Generate and maintain SB lflow uuid in ovn_lflow." >> that >>>>>> changed UUIDs of all logical flows, which is the biggest part >> of SB >>>>>> DB. In addition, whenever there is a SB DB schema change, >> during the >>>>>> upgrade it would anyway resend all the DB data to all ovn- >>>>>> controllers, and this time we are changing the >>>>>> >>>>>> schema as well. >>>>>> >>>>>> >>>>>> Thanks for the pointer to the precedent. I’d argue that datapath >> binding >>>>>> resync is slightly more costly than just lflow but you’re right it’s >>>>>> probably comparable. >>>>> >>>>> Updating datapath binding UUIDs has a knock-on effect since it would >>>>> update all >>>>> * Datapath_Bindings >>>>> * Logical_DP_Groups >>>>> * Logical_Flows >>>>> * Multicast_Groups >>>>> * Load_Balancers >>>>> >>>>> So I agree with Dumitru that updating datapath bindings is going to be >>>>> more costly. >>>>> >>>>>> >>>>>> So I think we shouldn't worry too much about this. Upgrades for >>>>>> large scale needs to be taken care of by operators regardless. >> And I >>>>>> think it is very >>>>>> >>>>>> >>>>>> I think we’ve all been involved in complex debugging sessions for >> issues >>>>>> that appeared when upgrading ovn at scale. I’d rather we avoid >> adding >>>>>> more potential issues. >>>>> > > >>>>>> valuable to sync UUIDs between NB and SB. It would make code >> more >>>>>> efficient, simpler, especially for incremental processing, and >> maybe >>>>>> also easier for debugging. >>>>>> >>>>>> >>>>>> However, I guess at this point we have a time issue.. we’re >> supposed to >>>>>> branch on Friday and this large series is still not merged. >>>>>> >>>>>> Both approaches have been posted, v16 which resyncs uuids, and v17 >> which >>>>>> doesn’t. If Mark is ok with it I can take the responsibility of >> merging >>>>>> either of those (or a combination of both) tomorrow. >>>>>> >>>>>> But we need some sort of agreement. Mark, Han, Numan, Ales, it >> would be >>>>>> great to hear more opinions on this. >>>>> >>>>> Above I agreed that this would be a bigger performance hit. However, >>>>> this is also a one-time thing. Once the UUIDs are updated once, and >> all >>>>> other SB tables are updated, then it doesn't happen any more. We're >> not >>>>> talking about a repeated performance hit. I think the biggest >> potential >>>>> issue here is if the ovn-controller transaction is so large that >>>>> ovn-controller ends up missing an OVSDB ECHO message and we end up in >> a >>>>> loop trying to process the large transaction over and over. >>>>> >>>>> As Han has pointed out, making this change has some nice benefits for >> us >>>>> as developers. >>>>> >>>>> My problem is that I can be persuaded either way :) >>>> >>>> I'd recommend going the "same UUID" way, because, if our only concern >>>> here is the SB DB resync performance, I'd argue that because we already >>>> have a SB schema change in the same patch, it would anyway cause the >>>> whole DB data be resynced during the DB conversion, in one jsonRPC. >>> >>> You're right and, thinking more about it, if on upgrade the SB server is >>> restarted, there's no fast-resync anyway and all ovn-controllers in the >>> cluster will get the whole SB content (or at least the part that matches >>> the monitor condition). >>> >> >> (Sorry Dumitru, I replied to Mark just before this one.) >> >> I'd like to clarify that for regular OVSDB server restart (e.g. during >> maintenance), fast-resync will work, because ovn-controller has the local >> cache. >> Fast-resync just doesn't work when there is a schema change, because the >> server will inform the clients to cancel the existing monitor and drop the >> cache, which is the case in this new release of OVN and also most of the >> releases in the past. >> >>>> Please correct me if I am wrong with the latest OVSDB schema change >>>> behavior. If the whole DB resync can be tolerated by operators, the >> UUID >>>> change should also be ok because the size of data change would be >>>> slightly smaller, although almost at the same level. Of course we >>>> understand that the DB conversion happens most likely at a separate >> step >>>> during the upgrade, so we'd expect two large updates instead of one. >> But >>>> if the bigger one can be handled, the smaller one should be fine. >>>> >>> >>> Not necessarily, the whole system might be just on the limit of hitting >>> a liveness probe timeout or similar and the extra time to do the >>> "initial processing" might get it over the limit. But I'm just >>> speculating without a real example. >> >> I believe between schema conversion and the data change from northd there >> is a chance to respond to liveness probes, but I am not 100% sure. >> On the other hand, in our practice we always do DB upgrade and northd >> upgrade in separate steps, for which I assumed the same for any large scale >> operators. Probably we should document this as suggestions in our upgrade >> guide? Again, I am not 100% sure if this is necessary. In our practice with >> separate steps it is just more controllable. >> >>> >>>> Thanks, >>>> Han >>>> >>>>> >>>>> I think a good compromise is that we merge v17 for ovn25.09. In the >>>>> ovn26.03 development cycle, we can coordinate with ovn-kubernetes >>>>> developers to try to fix the bug that causes integer enums not to work >>>>> with libovsdb. Then we can put out an update that essentially does >> what >>>>> v16 does. However, we can try to mitigate the potential performance >> hit >>>>> in certain ways, such as >>>>> * Batching the updates to datapath bindings so that we don't update >>>>> them all at once, but instead just update a few at a time each >>>>> transaction. Or, >>>>> * Keep backwards compatibility in the code and only sync UUIDs on >>>>> datapaths that are added after upgrade. Existing ones can still use >> the >>>>> "old" way of using the nb_uuid column or >>>>> external-ids:logical-switch/logical-router. Getting the nb_uuid from a >>>>> southbound datapath binding is done via a helper function, so we only >>>>> have to update the one function in order to get the nb_uuid properly. >>>>> >>>>> Either way, having months of development time to plan this instead of >> 2 >>>>> days would be better for everyone, I think. >>>>> >>> >>> I'd say, also based on the discussion above, I'm now inclining towards >>> getting the version that syncs the NB -> SB UUID in before branching. >>> >>> Another reason for me changing my mind about this is that we still have >>> some time until the actual release. So we can change the schema back >>> even after branching if we decide that's better performance-wise. >>> >>> However, we'd need a slightly different version of the series that also >>> syncs NB -> SB UUID (currently v15), one that doesn't add the integer >>> enum type. In order to work on the most recent submission I adapted the >>> v16 patch 1/8 slightly, here's the incremental diff: >>> >>> https://github.com/dceara/ovn/commit/b4345db >>> >>> It's essentially a port of the uuid syncing from v15 1/8 (skipping the >>> enum change). >>> >>> Mark, does this look correct to you? If so, I can take another look at >>> v16 with the incremental change squashed in tomorrow morning and try to >>> get it merged into main. >>> >>> In case others want to look at my current branch that's: >>> >> https://github.com/dceara/ovn/commits/refs/heads/review-datapath-i-p-v17-sync-uuid/ >>> >> >> Thanks Mark, Dumitru and Ales for the quick revisions! >> >> Han >> > > Hi Mark, Dumitru and Han, > > looking through the conversation I agree with the NB->SB uuid sync. > There is only one minor thing I have found in the diff provided by Dumitru > with the below fix applied it should be fine: > > diff --git a/ic/ovn-ic.c b/ic/ovn-ic.c > index 232578d64..bf9a56928 100644 > --- a/ic/ovn-ic.c > +++ b/ic/ovn-ic.c > @@ -2579,8 +2579,7 @@ main(int argc, char *argv[]) > ovsdb_idl_add_column(ovnsb_idl_loop.idl, &sbrec_encap_col_options); > > ovsdb_idl_add_table(ovnsb_idl_loop.idl, &sbrec_table_datapath_binding); > - ovsdb_idl_add_column(ovnsb_idl_loop.idl, > - &sbrec_datapath_binding_col_external_ids); > + ovsdb_idl_add_column(ovnsb_idl_loop.idl, > &sbrec_datapath_binding_col_type); > > ovsdb_idl_add_table(ovnsb_idl_loop.idl, &sbrec_table_port_binding); > ovsdb_idl_add_column(ovnsb_idl_loop.idl, >
Hi all, Thank you, everyone, for the discussion! I'll be folding the above change in too, along with all the other minor comments Ales had on v16 and will merge it to main today. Han, I know you mentioned you hadn't finished the review yet but I hope it's fine to address any issues you might find as follow ups. Best regards, Dumitru _______________________________________________ dev mailing list d...@openvswitch.org https://mail.openvswitch.org/mailman/listinfo/ovs-dev