On Wed, Apr 2, 2025, 7:45 PM Numan Siddique <num...@ovn.org> wrote:
> On Mon, Mar 31, 2025 at 6:33 PM Han Zhou <hz...@ovn.org> wrote:
> >
> > On Thu, Mar 27, 2025 at 10:16 AM Numan Siddique <num...@ovn.org> wrote:
> > >
> > > On Tue, Mar 25, 2025 at 9:06 AM Han Zhou <hz...@ovn.org> wrote:
> > > >
> > > > On Fri, Mar 14, 2025 at 12:35 AM Numan Siddique <num...@ovn.org> wrote:
> > > > >
> > > > > On Thu, Mar 13, 2025 at 12:21 PM Han Zhou <hz...@ovn.org> wrote:
> > > > > >
> > > > > > On Sat, Mar 1, 2025 at 6:38 AM <num...@ovn.org> wrote:
> > > > > >>
> > > > > >> From: Numan Siddique <num...@ovn.org>
> > > > > >>
> > > > > >> This patch series optimizes ovn-controller to add only relevant
> > > > > >> datapaths to its "local datapaths" map if the logical topology
> > > > > >> has a single flat provider logical switch connected to multiple
> > > > > >> routers with a distributed gateway port.
> > > > > >>
> > > > > >> (Patch 3 has a detailed commit message.)
> > > > > >>
> > > > > >> v1 -> v2
> > > > > >> -----
> > > > > >>   * Rebased to resolve the conflicts.
> > > > > >>
> > > > > >> Numan Siddique (3):
> > > > > >>   controller: Store local binding lports in local_datapath.
> > > > > >>   northd: Add a flag 'only_dgp_peer_ports' to the SB Datapath.
> > > > > >>   controller: Optimize adding 'dps' to the local datapaths.
> > > > > >>
> > > > > >>  controller/binding.c        | 324 +++++++++++---
> > > > > >>  controller/binding.h        |   2 +
> > > > > >>  controller/local_data.c     |  98 ++++-
> > > > > >>  controller/local_data.h     |  10 +-
> > > > > >>  controller/lport.c          |  12 +
> > > > > >>  controller/lport.h          |   4 +
> > > > > >>  controller/ovn-controller.c |  38 ++
> > > > > >>  northd/northd.c             |  31 +-
> > > > > >>  northd/northd.h             |   4 +
> > > > > >>  tests/multinode.at          | 185 +++++++-
> > > > > >>  tests/ovn-northd.at         |  59 +++
> > > > > >>  tests/ovn-performance.at    |   6 +-
> > > > > >>  tests/ovn.at                | 853 ++++++++++++++++++++++++++++++++++++
> > > > > >>  13 files changed, 1545 insertions(+), 81 deletions(-)
> > > > > >>
> > > > > >> --
> > > > > >> 2.48.1
> > > > > >
> > > > > > Hi Numan,
> > > > > >
> > > > > > Sorry for the slow review. I have a general comment for this series. I believe the exact same performance optimization is supposed to be achieved by the below commit:
> > > > > >
> > > > > > 22298fd37 ovn-controller: Don't flood fill local datapaths beyond DGP boundary.
> > > > > >
> > > > > > The scenarios are mentioned in detail in the commit message of that patch, which is very similar to what you described here.
> > > > > >
> > > > > > The only exception was when there are distributed NATs or when the redirection type is set to "bridged". In these cases it falls back to adding the peer datapath as local. See SB:Port_Binding:options:always-redirect. Is this something that your patch is targeting? Otherwise I wonder why it didn't work without your patch.
> > > > >
> > > > > Hi Han,
> > > > >
> > > > > I wanted to mention in the commit message and in the cover letter that this patch series is an extension of commit 22298fd37. I somehow missed it.
> > > > >
> > > > > Yes, I'm targeting the scenario of distributed NAT.
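For context while reading the rest of the thread: the fallback Han describes is keyed off the option he names above. A minimal check of that flag on a chassisredirect Port_Binding could look like the fragment below (illustrative only; the helper name is made up, this is not the code of commit 22298fd37, and the fragment assumes the OVN tree's usual SB IDL and smap headers).

    /* Illustrative fragment, not code from commit 22298fd37.
     * northd sets options:always-redirect on the chassisredirect port only
     * when nothing (distributed NAT, redirect-type=bridged) forces the peer
     * datapath to be handled locally, so the flood fill may skip the peer. */
    static bool
    peer_datapath_can_be_skipped(const struct sbrec_port_binding *cr_pb)
    {
        return smap_get_bool(&cr_pb->options, "always-redirect", false);
    }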
> > > > I'd request to update the commit message and focus on how the patch addresses the distributed NAT scenario. Otherwise it would be misleading, since the problem doesn't exist for the scenario described in the current commit message.
> > >
> > > Hi Han,
> > >
> > > Thanks for the comments. Sure, I'll update the commit message.
> > >
> > > > In addition, could you explain a little more about the solution? How does has_only_dgp_peer_ports address the distributed NAT scenario? To my understanding, when there is a distributed NAT associated with the LR, the chassis needs to know the provider LS so that the NAT pipeline can work on the chassis. If the peer LS (provider LS) is not added just because of has_only_dgp_peer_ports, how would the distributed NAT work?
> > >
> > > [Numan] The provider LS is added to the local datapaths, but the traversal doesn't go beyond it.
> > >
> > > Let's take the topology in the commit message as an example, and let's say all the routers lr0, lr1, lr2, ... have distributed NATs. Say the ovn-controller running on chassis 'A' claims a logical port of sw0. It starts by adding 'sw0' to the local_datapaths. Since add_local_datapath() traverses the topology through the connected peers, it will add lr0 to its local_datapaths. Since lr0 is connected to the provider LS "public", it will also add "public" to its local_datapaths. The main thing added by this patch is that it stops at the "public" LS if the option "pure_provider_switch" is set on that datapath.
> > >
> > > Note: I'll be renaming the option to "pure_provider_switch" in the next version.
> > >
> > > So after claiming sw0's port, the local_datapaths map will have [sw0, lr0, public]. Having the logical flows of these datapaths processed into OpenFlow flows is enough for the distributed NATs associated with sw0's logical ports to work.
> > >
> > > If this option were not set, then by the time add_local_datapath() finally returns to its original caller, the local_datapaths map would have [sw0, lr0, public, lr1, sw1, lr2, sw2, ...].
> > >
> > > Now let's say the ovn-controller of chassis 'A' claims sw3's logical port. It will first call add_local_datapath(sw3), and this will add [sw3, lr3] to the local_datapaths map. Note that "public" is already present in the local_datapaths. After this the local_datapaths map will look like [sw0, lr0, public, sw3, lr3]. If there are no more port bindings of sw3, then the patch also makes sure to remove sw3 and lr3 from the local_datapaths map.
> > >
> > > If you look at the topology, I'm visualising the "public" LS as a root node: ovn-controller starts traversing from a leaf and stops at the root. It will not traverse the other branches of the root once it reaches it.
> > >
> > > Suppose ovn-controller claims a logical port of "public" itself. It adds "public" to the local datapaths and stops there. If the ovn-controller has not claimed logical ports of any other datapath, then there is no need to add them to the local datapaths. If the logical port of "public" wants to reach other logical ports, the packet can traverse through the localnet port to the fabric and enter the other chassis where the DNAT IP resides.
> > >
> > > The "pure_provider_switch" option indicates to ovn-controller that it's a "root" node.
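To make the traversal Numan describes concrete, here is a small self-contained sketch of a flood fill that stops at a datapath flagged as a pure provider switch. It is a toy model of the behaviour discussed above, not code from the patch; all names are illustrative.

    /* Toy model of the flood fill described above -- not ovn-controller code.
     * Each "datapath" knows its peers; a pure provider switch is treated as
     * the root: it is added to the local set, but its peers are not walked. */
    #include <stdbool.h>
    #include <stdio.h>

    #define MAX_PEERS 8

    struct dp {
        const char *name;
        bool pure_provider_switch;   /* the flag discussed above */
        struct dp *peers[MAX_PEERS];
        int n_peers;
        bool local;                  /* stands in for local_datapaths membership */
    };

    static void
    add_local_datapath(struct dp *dp)
    {
        if (dp->local) {
            return;                  /* already in local_datapaths */
        }
        dp->local = true;
        printf("added %s\n", dp->name);

        if (dp->pure_provider_switch) {
            return;                  /* stop at the "root": don't cross to other LRs */
        }
        for (int i = 0; i < dp->n_peers; i++) {
            add_local_datapath(dp->peers[i]);
        }
    }

    static void
    link_dps(struct dp *a, struct dp *b)
    {
        a->peers[a->n_peers++] = b;
        b->peers[b->n_peers++] = a;
    }

    int
    main(void)
    {
        struct dp pub = { .name = "public", .pure_provider_switch = true };
        struct dp lr0 = { .name = "lr0" }, sw0 = { .name = "sw0" };
        struct dp lr1 = { .name = "lr1" }, sw1 = { .name = "sw1" };

        link_dps(&sw0, &lr0);
        link_dps(&lr0, &pub);
        link_dps(&sw1, &lr1);
        link_dps(&lr1, &pub);

        /* Claiming a port of sw0 adds only sw0, lr0 and public; lr1/sw1 stay
         * out because the walk stops at the provider switch. */
        add_local_datapath(&sw0);
        return 0;
    }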
> > > > I didn't see any handling in your change related to distributed NAT. I believe I missed something,
> > >
> > > [Numan] It doesn't require any special handling for distributed NATs in ovn-controller. Your commit 22298fd37 adds the option "always-redirect" for non-distributed NATs on the logical router datapath, and add_local_datapath() will NOT add "public" to its local datapaths map. However, it will add "public" to its local datapaths map if other logical routers connected to "public" have distributed NATs.
> > >
> > > The main difference between the two scenarios (i.e. distributed NATs vs non-distributed NATs) is that with distributed NATs, this patch adds the "public" LS to the local datapaths map and doesn't traverse further, whereas with the other scenario, the "public" LS is not added at all.
> >
> > Hi Numan, Thanks for the explanation. Now I see the difference between the two scenarios.
> >
> > > > because I saw your test case purposely used two LRs, one with distributed NAT and one without, and the one with distributed NAT did have the provider LS added to the chassis, although I didn't find how that worked. It would be helpful for code review if you could highlight the related logic, and probably also helpful for maintaining the code by adding related comments in the code for this specific consideration.
> > >
> > > Sure. I'll add more comments.
> > >
> > > > On the other hand, I also have doubts about the criteria of deciding not to add peers according to has_only_dgp_peer_ports. When the peer is a provider LS and connected by DGP, we shouldn't add the peer, regardless of whether there is another LR connected to the provider LS with a distributed LRP, right? The provider LS (and the other LR, say, LR-X) should be added to the chassis if there is a port-binding on the chassis that is connected (directly or indirectly) to that LR-X, right? But before this, I think I first need to understand how the distributed NAT scenario is addressed by your patch.
> > >
> > > Sorry, I didn't get this question. Can you please see if my above comments addressed your questions? If not, can you please elaborate a bit further?

I think I understood what you mean here. I'm thinking along those lines and will see if I can come up with something using this approach.

Thanks
Numan

> > My question was wrong because I didn't distinguish the two scenarios until I saw your explanation above. But I still have a related question about the criteria for adding peers of the root (provider LS). In your implementation the peer will be excluded (not flood-filled) only if the LS satisfies the pure_provider_switch requirement:
> >
> > criteria:
> >   - has one or more localnet ports.
> >   - has one or more router ports and all its peers are distributed gateway ports.
> >
> > I think these criteria are not very generic and are hard to reason about.
>
> I'd disagree. That is the case in our deployments: all the VPC logical routers are connected to just one provider logical switch. Even if you take the example of OpenStack Neutron, generally all the tenant Neutron routers are connected to an admin provider Neutron network which provides the external connectivity.
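A compact restatement of the criteria Han quotes above, as a self-contained toy check (illustrative only; this is not the northd code from the patch, and the types are made up):

    /* Toy version of the "pure provider switch" criteria discussed above:
     * at least one localnet port, at least one router port, and every
     * router port peers with a distributed gateway port (DGP). A single
     * fully distributed LRP peer disqualifies the switch -- the case Han
     * raises next. */
    #include <stdbool.h>
    #include <stddef.h>

    enum port_type { PT_LOCALNET, PT_ROUTER, PT_VIF };

    struct ls_port {
        enum port_type type;
        bool peer_is_dgp;            /* meaningful only for PT_ROUTER */
    };

    bool
    ls_is_pure_provider_switch(const struct ls_port *ports, size_t n_ports)
    {
        bool has_localnet = false;
        bool has_router_port = false;

        for (size_t i = 0; i < n_ports; i++) {
            switch (ports[i].type) {
            case PT_LOCALNET:
                has_localnet = true;
                break;
            case PT_ROUTER:
                has_router_port = true;
                if (!ports[i].peer_is_dgp) {
                    return false;
                }
                break;
            case PT_VIF:
            default:
                break;
            }
        }
        return has_localnet && has_router_port;
    }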
> > Let's say in your example topology one of the LRPs connected to the public LS is fully distributed instead of a DGP, but all the other LRPs are still DGPs. Then, according to your criteria, every chassis will add all the DPs. However, in theory only the DPs connected by the distributed LR should be added, and the flood fill should stop at all the DGPs when considering peers of the "public" LS, right?
>
> Correct.
>
> > Can we change it to be more generic, so that when considering a peer DP, if we are traversing from an LS to an LR we handle it just like "always_redirect == true", while when traversing from an LR to an LS we honestly examine "always_redirect" to account for distributed NAT? Should this work?
>
> I don't completely understand what you mean by handling it "just like always_redirect == true". From what I understand of the present code, we don't consider the peer datapath of the chassisredirect port if "always-redirect" == true. For distributed NAT, we don't set "always-redirect" to true for the chassisredirect port.
>
> Can you explain with an example topology?
>
> For the scenario I mentioned, let's say "lr40" is a fully distributed router (with no DGP) and it's connected to the public switch. And if a compute node C1 claims a logical port of sw1, ideally I'd like to add [sw1, lr1, public, lr40, sw40] to its local datapaths, i.e. lr40 and sw40 would be added to the local datapaths of all the compute nodes that claim a logical port of any other switch. But adding the code to handle this would be very complicated, and ideally I'd love to get rid of this "pure_provider_switch" option. I think it can be a future improvement, but I want to start with this.
>
> Thanks
> Numan
>
> > Thanks,
> > Han
> >
> > > Thanks
> > > Numan
> > >
> > > > Thanks,
> > > > Han
> > > >
> > > > > Numan
> > > > >
> > > > > > Could you help clarify?
> > > > > >
> > > > > > Thanks,
> > > > > > Han
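One way to read the direction-dependent rule Han proposes above is sketched below. This is an interpretation of the suggestion, not code from either series, and the names are made up.

    /* Toy illustration of the suggested generic rule: crossing from an LS
     * into an LR over a DGP never pulls the LR in (as if always-redirect
     * were true); crossing from an LR into an LS over a DGP honours the
     * real always-redirect hint, so distributed NAT still pulls in the
     * provider switch. A fully distributed LRP is always crossed. */
    #include <stdbool.h>

    enum dp_kind { DP_SWITCH, DP_ROUTER };

    bool
    should_add_peer(enum dp_kind from, bool via_dgp, bool always_redirect)
    {
        if (!via_dgp) {
            return true;          /* fully distributed LRP: keep walking */
        }
        if (from == DP_SWITCH) {
            return false;         /* LS -> LR across a DGP: never cross */
        }
        /* LR -> LS across a DGP: cross only when "always-redirect" is not
         * set, i.e. when distributed NAT needs the peer LS locally. */
        return !always_redirect;
    }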