On Tue, Aug 3, 2021 at 11:57 AM Numan Siddique <[email protected]> wrote:
>
> On Tue, Aug 3, 2021 at 2:34 PM Han Zhou <[email protected]> wrote:
> >
> > On Tue, Aug 3, 2021 at 11:09 AM Numan Siddique <[email protected]> wrote:
> > >
> > > On Fri, Jul 30, 2021 at 3:22 AM Han Zhou <[email protected]> wrote:
> > > >
> > > > Note: This patch series is on top of a pending patch that is
> > > > still under review:
> > > > http://patchwork.ozlabs.org/project/ovn/patch/[email protected]/
> > > >
> > > > It is RFC because: a) it is based on the unmerged patch. b) DDlog
> > > > changes are not done yet. Below is a copy of the commit message
> > > > of the last patch in this series:
> > > >
> > > > For a fully distributed virtual network dataplane, ovn-controller
> > > > flood-fills datapaths that are connected through patch ports.
> > > > This creates scale problems in ovn-controller when the connected
> > > > datapaths are too many.
> > > >
> > > > In particular, when distributed gateway ports are used to connect
> > > > logical routers to logical switches, and there is no need for
> > > > distributed processing of those gateway ports (e.g. no
> > > > dnat_and_snat configured), the datapaths on the other side of the
> > > > gateway ports are not needed locally on the current chassis. This
> > > > patch avoids pulling those datapaths to the local chassis in
> > > > those scenarios.
> > > >
> > > > There are two scenarios that can greatly benefit from this
> > > > optimization.
> > > >
> > > > 1) When there are multiple tenants, each with its own logical
> > > > topology, sharing the same external/provider networks, which are
> > > > connected to their own logical routers with DGPs.
> > > > Without this optimization, each ovn-controller would process the
> > > > logical topology of every tenant and program flows for all of
> > > > them, even if workloads of only a few tenants run on the node
> > > > where the ovn-controller is running, because the shared external
> > > > network connects all tenants. With this change, only the logical
> > > > topologies relevant to the node are processed and programmed on
> > > > the node.
> > > >
> > > > 2) In some deployments, such as ovn-kubernetes, logical switches
> > > > are bound to chassis instead of being distributed, because each
> > > > chassis is assigned dedicated subnets. With the current
> > > > implementation, ovn-controller on each node processes all logical
> > > > switches and all ports on them, without knowing that they are not
> > > > distributed at all. At large scale with N nodes (N = hundreds or
> > > > even more), roughly N times the necessary processing power is
> > > > wasted on the logical-connectivity related flows. With this
> > > > change, those deployments can use DGPs to connect the node-level
> > > > logical switches to distributed router(s), with the gateway
> > > > chassis (or HA chassis without real HA) of each DGP set to the
> > > > chassis where the logical switch is bound. This inherently tells
> > > > OVN the mapping between logical switch and chassis, and
> > > > ovn-controller can smartly avoid processing the topologies of
> > > > other node-level logical switches, which hugely saves compute
> > > > cost in each ovn-controller.
> > > >
> > > > For 2), the test result for an ovn-kubernetes-like deployment
> > > > shows significant improvement in ovn-controller, both CPU (>90%
> > > > reduced) and memory.
> > > >
> > > > Topology:
> > > >
> > > > - 1000 nodes, 1 LS with 10 LSPs per node, connected to a
> > > >   distributed router.
> > > > - 2 large port-groups PG1 and PG2, each with 2000 LSPs
> > > >
> > > > - 10 stateful ACLs: 5 from PG1 to PG2, 5 from PG2 to PG1
> > > >
> > > > - 1 GR per node, connected to the distributed router through a
> > > >   join switch. Each GR also connects to an external logical
> > > >   switch per node. (This part is to keep the test environment
> > > >   close to a real ovn-kubernetes setup but shouldn't make much
> > > >   difference for the comparison.)
> > > >
> > > > ==== Before the change ====
> > > > OVS flows per node: 297408
> > > > ovn-controller memory: 772696 KB
> > > > ovn-controller recompute: 13s
> > > > ovn-controller restart (recompute + reinstall OVS flows): 63s
> > > >
> > > > ==== After the change (also use DGP to connect node-level LSes) ====
> > > > OVS flows per node: 81139 (~70% reduced)
> > > > ovn-controller memory: 163464 KB (~80% reduced)
> > > > ovn-controller recompute: 0.86s (>90% reduced)
> > > > ovn-controller restart (recompute + reinstall OVS flows): 5s
> > > > (>90% reduced)
> > >
> > > Hi Han,
> > >
> > > Thanks for these RFC patches. The improvements are significant.
> > > That's awesome.
> > >
> > > If I understand this RFC correctly, ovn-k8s will set the
> > > gateway_chassis for each logical router port of the cluster router
> > > (ovn_cluster_router) connecting to the node logical switch, right?
> > >
> > > If so, instead of using the multiple gw port feature, why can't
> > > ovn-k8s just set chassis=<node_chassis_name> in the logical switch
> > > other_config option?
> > >
> > > ovn-controllers can exclude the logical switches from the
> > > local_datapaths if they don't belong to the local chassis.
> > >
> > > I'm not entirely sure if this would work. Any thoughts? If the
> > > same can be achieved using the chassis option instead of multiple
> > > gw router ports, perhaps the chassis option seems better to me, as
> > > it would be less work for ovn-k8s and there would be fewer
> > > resources in the SB DB.
> > > What do you think? Otherwise +1 from me for this RFC series.
> >
> > Thanks Numan for the feedback!
> > The reasons for not introducing a new option in the LS are:
> > 1) The multiple-DGP support is a valuable feature regardless of the
> > use case of this RFC.
> > 2) Not flood-filling beyond the DGP is also valuable regardless of
> > the ovn-k8s use case. As mentioned, it would also help OpenStack
> > scalability when multiple tenants share the same provider networks.
> > 3) If 1) and 2) are both implemented, there is no need for an extra
> > mechanism to "bind logical switches to chassis", because the outcome
> > of 1) and 2) is sufficient. The changes in ovn-k8s would be the
> > same, i.e. set the chassis somewhere, either on a LRP or on a LS. I
> > have sent a WIP PR to the ovn-k8s repo and it appears to be a very
> > small change:
> > https://github.com/ovn-org/ovn-kubernetes/pull/2388
> >
> > In addition, a separate option on the LS seems unnatural to me,
> > because the end user must understand what they are doing by setting
> > that option. In contrast, the DGP more flexibly and accurately tells
> > OVN what to do. Maybe the name "Distributed Gateway Port" is
> > somewhat confusing, but the chassis-redirect port behind it tells
> > OVN that the user wants the traffic for the LRP to be redirected to
> > a chassis. There can be different scenarios, such as a single LS
> > connecting to multiple DGPs and vice versa; all are valid setups
> > that can be supported by this feature. Setting a chassis option on a
> > LS, on the other hand, is arbitrary, and it is easy to create
> > conflicting setups, e.g. setting such an option on LS-join. Of
> > course we can say the user is responsible for what they are setting,
> > but I just don't see it as necessary for now.
> >
> > Does this make sense?
>
> Thanks for the detailed explanation. Makes sense to me. As you
> mentioned, the name "gateway" is a bit odd, since much of the traffic
> would be E-W.
>
> Would you be fine rebasing the DGP patch and this RFC series and
> reposting? I'd like to test it out in our setup, which would also
> help in understanding the feature better.
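[For readers following along: a minimal sketch of the scenario-2 setup
discussed above, using standard ovn-nbctl commands. The names lr0,
ls-node1, lrp-node1, and chassis-node1 are placeholders for
illustration, not taken from the thread or the patches.]

```shell
# Distributed router shared by all nodes.
ovn-nbctl lr-add lr0

# Per-node logical switch, attached to lr0 via a router port pair.
ovn-nbctl ls-add ls-node1
ovn-nbctl lrp-add lr0 lrp-node1 00:00:00:00:01:01 10.0.1.1/24
ovn-nbctl lsp-add ls-node1 lsp-node1-to-lr0
ovn-nbctl lsp-set-type lsp-node1-to-lr0 router
ovn-nbctl lsp-set-addresses lsp-node1-to-lr0 router
ovn-nbctl lsp-set-options lsp-node1-to-lr0 router-port=lrp-node1

# Pin the LRP to the node's chassis (priority 100), making it a
# distributed gateway port; with this series, other chassis can then
# avoid pulling ls-node1 into their local_datapaths.
ovn-nbctl lrp-set-gateway-chassis lrp-node1 chassis-node1 100
```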
Thanks Numan. Yes, I am working on the rebasing. While resolving
conflicts I found a problem with the commit:

1c9e46ab5 northd: add check_pkt_larger lflows for ingress traffic

I will reply to the original patch email to discuss.

_______________________________________________
dev mailing list
[email protected]
https://mail.openvswitch.org/mailman/listinfo/ovs-dev
