On Wed, Oct 5, 2022 at 8:24 AM Dumitru Ceara <[email protected]> wrote:
>
> On 10/5/22 08:19, Han Zhou wrote:
> > On Fri, Sep 30, 2022 at 7:01 AM Dumitru Ceara <[email protected]> wrote:
> >>
> >> Sometimes network components are compute node-specific.  Sometimes such
> >> components are replicated, almost identically, for multiple nodes
> >> in the cluster.
> >>
> >> One such example is the case of Kubernetes NodePort services which
> >> translate (in the ovn-kubernetes case) to Load_Balancer
> >> objects being applied to each and every node's logical gateway router.
> >> These load balancers are almost identical, the main difference being
> >> the fact that they use different VIPs (the node's IP).
> >>
> >> With the current OVN load balancer design, this becomes a problem at
> >> scale because the number of load balancers that must be configured is
> >> N x M (N nodes times M services).
> >>
> >> This series proposes a new concept in OVN: virtual network component
> >> templates.  The goal of the templates is to help reduce resource
> >> consumption in the OVN central components in specific cases like the
> >> one described above.
> >>
> >> To achieve that, the CMS will instead configure a "templated" load
> >> balancer for every service and apply that single template record to
> >> the cluster-wide load balancer group.  This template is then
> >> instantiated differently on different compute nodes.  This translation
> >> is controlled through per-chassis "template variables" configured by
> >> the CMS in the new NB.Template_Var table.
> >>
> > Thanks Dumitru for the great improvement!
> >
>
> Thanks for reviewing this!
>
> >> A synthetic benchmark simulating what an OpenShift router (using Node
> >> Port services) scale test would do shows the following preliminary
> >> results:
> >> A. 120 nodes, 2K NodePort services:
> >> - before:
> >>   - Southbound DB size on disk (compacted): ~385MB
> >>   - Southbound DB memory usage (RSS): ~3GB
> >>   - Southbound DB logical flows: 720K
> >>
> >> - after:
> >>   - Southbound DB size on disk (compacted): ~100MB
> >>   - Southbound DB memory usage (RSS): ~250MB
> >>   - Southbound DB logical flows: 6K
> >>
> >> B. 250 nodes, 2K NodePort services:
> >> - after (didn't run the "before" test as it was taking way too long):
> >>   - Southbound DB size on disk (compacted): ~155MB
> >>   - Southbound DB memory usage (RSS): ~760MB
> >>   - Southbound DB logical flows: 6K
>
> I'll add the (hacky) benchmark script below just for clarity.
>
Thanks for sharing the script. So I see that it is one LSP + LS per node,
which is fair for this test. Even so, I would expect more lflows than 6k.
It doesn't look right that 120 nodes and 250 nodes have exactly the same
number of lflows. Maybe you are counting the lflows related to LBs only
(i.e. from certain tables)?
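
As a rough sanity check on the figures quoted above, the arithmetic can be
sketched in a few lines of bash. This assumes the rounded counts from the
cover letter (720K and 6K) and assumes they cover LB-derived lflows only; the
variable names are just for illustration.

```shell
#!/bin/bash
# Rounded figures from the cover letter, scenario A (120 nodes, 2K services).
nodes=120
services=2000
before_lflows=720000   # "720K", without templates
after_lflows=6000      # "6K", with templates

# Without templates: one load balancer per (node, service) pair, N x M.
lbs_before=$((nodes * services))
# With templates: one templated load balancer per service.
lbs_after=$services

per_lb_before=$((before_lflows / lbs_before))
per_lb_after=$((after_lflows / lbs_after))

echo "LBs before: $lbs_before, lflows per LB: $per_lb_before"
echo "LBs after:  $lbs_after, lflows per LB: $per_lb_after"
```

Both cases come out at 3 lflows per load balancer, which would be consistent
with a count that depends only on the number of services and not on the
number of nodes.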

Han

> >
> > A quick question about the test. How many LSPs per node? I am just
> > wondering how the number of lflows could be the same (6k) when the
> > number of nodes increased from 120 to 250. For some of my scale tests,
> > the number of lflows is far more than this even if I don't create any
> > LBs. (Also consider that an ovn-k8s deployment has at least an ext-LS
> > and a GR per node.)
>
> I really only focused on logical flows (and SB.Load_Balancers) created
> due to NB.Load_Balancers provisioned like ovn-k8s provisions them today.
> So the test doesn't add a lot of LSPs.  However, in the "OpenShift
> router" scenario I was trying to fix, the load due to LSPs is also
> minimal.  It's exactly the huge number of (very similar)
> load balancers that causes issues.
>
> > I have no doubt of the effectiveness of this improvement, but just
> > need to understand the numbers better since I am also doing scale tests
> > and measurements on top of this patch series.
>
> Sure, makes complete sense.  And if we can find even more use cases for
> component templates, even better!
>
> >
> > Thanks,
> > Han
>
> Thanks,
> Dumitru
>
> ---
> diff --git a/tutorial/node-template-lb-stress.sh b/tutorial/node-template-lb-stress.sh
> new file mode 100755
> index 0000000000..e1a051182a
> --- /dev/null
> +++ b/tutorial/node-template-lb-stress.sh
> @@ -0,0 +1,57 @@
> +#!/bin/bash
> +
> +nrtr=$1
> +nlb=$2
> +nbackends=$3
> +
> +echo "ROUTERS        : $nrtr"
> +echo "LBS            : $nlb"
> +echo "BACKENDS PER LB: $nbackends"
> +
> +export OVN_NB_DAEMON=$(ovn-nbctl --detach)
> +export OVN_SB_DAEMON=$(ovn-sbctl --detach)
> +trap "killall -9 ovn-nbctl; killall -9 ovn-sbctl" EXIT
> +
> +lbg=$(ovn-nbctl create load_balancer_group name=lbg)
> +for i in $(seq $nrtr); do
> +    r=lr-$i
> +    lrp=lrp-$i
> +    echo Router $r
> +    ovn-nbctl lr-add $r -- set logical_router $r load_balancer_group=$lbg
> +    ovn-nbctl lrp-add $r $lrp 00:00:00:00:01:00 88.88.88.88
> +    s=ls-$i
> +    echo Switch $s
> +    ovn-nbctl ls-add $s -- set logical_switch $s load_balancer_group=$lbg
> +    lsp=lsp-$i
> +    echo LSP $lsp
> +    ovn-nbctl lsp-add $s $lsp
> +    ovs-vsctl add-port br-int $lsp -- set interface $lsp external_ids:iface-id=$lsp
> +done
> +
> +for l in $(seq $nlb); do
> +    lb=lb-$l
> +    ovn-nbctl --template lb-add $lb "^vip:$l" "^backends$l" tcp
> +    lb_uuid=$(ovn-nbctl --columns _uuid --bare find load_balancer name=$lb)
> +    ovn-nbctl add load_balancer_group $lbg load_balancer $lb_uuid
> +done
> +
> +for i in $(seq $nrtr); do
> +    ovn-nbctl create chassis_template_var name=vip value=42.42.42.$i chassis_name="chassis-$i"
> +
> +    cmd=
> +    for j in $(seq $nlb); do
> +        echo "CREATING TEMPLATE VARS for RTR $i LB $j"
> +        backends=""
> +        for k in $(seq $nbackends); do
> +            j1=$(expr $j / 250)
> +            j2=$(expr $j % 250)
> +            backends="42.$k.$j1.$j2:$j,$backends"
> +        done
> +        cmd="$cmd -- create chassis_template_var name=backends$j value=\"$backends\" chassis_name=\"chassis-$i\""
> +        if [ $(expr $j % 1000) -eq "0" ]; then
> +            ovn-nbctl $cmd
> +            cmd=
> +        fi
> +    done
> +    [ -n "$cmd" ] && ovn-nbctl $cmd
> +done
>
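
The per-chassis instantiation that the cover letter describes, and that the
script above sets up, can be pictured with a small bash sketch. This is only
an illustration of the idea and not how ovn-controller implements it:
expand_template and the chassis1/chassis2 arrays are hypothetical names, and
the values are modeled on what the script generates. It needs bash 4.3 or
later for namerefs.

```shell
#!/bin/bash
# Illustration only: expand "^name" template tokens with per-chassis values.
# expand_template and the chassis arrays are hypothetical, not OVN code.
expand_template() {
    local template=$1
    local -n tvars=$2          # nameref to an associative array (bash 4.3+)
    local out=$template name
    # Substitute longer names first so "^backends1" cannot clobber
    # the prefix of "^backends10".
    for name in $(printf '%s\n' "${!tvars[@]}" |
                  awk '{ print length($0), $0 }' | sort -rn | cut -d' ' -f2-); do
        out=${out//"^$name"/${tvars[$name]}}
    done
    printf '%s\n' "$out"
}

# One set of template variables per chassis, as the CMS would configure them.
declare -A chassis1=([vip]=42.42.42.1 [backends1]=42.1.0.1:1)
declare -A chassis2=([vip]=42.42.42.2 [backends1]=42.1.0.1:1)

# The same templated load balancer ("^vip:1" -> "^backends1") resolves to a
# different VIP on each chassis.
for c in chassis1 chassis2; do
    echo "$c: $(expand_template '^vip:1' "$c") -> $(expand_template '^backends1' "$c")"
done
```

The point of the sketch is that only one load balancer record exists
centrally, while the node-specific VIP appears only when the template is
expanded for a given chassis.
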
_______________________________________________
dev mailing list
[email protected]
https://mail.openvswitch.org/mailman/listinfo/ovs-dev
