Hi Dumitru,
I read the patch series, and I think the idea of chassis-specific
variables is a good idea to reduce the number of DB records for certain
things. Aside from load balancers, I suspect this could have a positive
impact for other structures as well.
Rather than criticizing the individual lines of code, I'll focus instead
on some higher-level questions/ideas.
First, one question I had was what happens when a template variable name
is used in a load balancer, but there is no appropriate value to
substitute? For instance, what if a load balancer applies to chassis-3,
but you only have template variables for chassis-1 and chassis-2? This
might be addressed in the code but I didn't notice if it was.
Second, it seems like template variables are a natural extension of
existing concepts like address sets and port groups. In those cases,
they were an unconditional collection of IP addresses or ports. For
template variables, they're a collection of untyped values with the
condition of only applying on certain Chassis. I wonder if this could
all be reconciled with a single table that uses untyped values with
user-specified conditions. Right now template variables have a "Chassis"
column, but maybe this could be replaced with a broader "condition",
"when", or "match" column. To get something integrated quickly, this
column could just accept the syntax of "chassis.name == <blah>" or
"chassis.uuid == <blah>" to allow for chassis-specific application of
the values. With this foundation, we could eventually allow
unconditional application of the value, or more complex conditions (e.g.
only apply to logical switch ports that are connected to a router with a
distributed gateway port). Doing this, we could deprecate address sets
and port groups eventually in favor of template variables.
Third, I was wondering if there could be some layer that exists between
the IDL and the application that expands the template variables as early
as possible. I'm thinking the application could inject some callback in
the IDL layer that might allow for the values to be substituted. This
way, the variable substitution is taken care of in a single place, and
by the time the application gets the data, it knows that all
substitutions have been made and there is no need to special case
template variable names vs. plain tokens. They should all be plain
tokens. I don't think construction of such a layer should be a barrier
to merging the code, but it's something worth considering as a later
improvement.
Anyway, those were my high-level thoughts on the topic. Let me know what
you think.
On 8/5/22 12:26, Dumitru Ceara wrote:
Sometimes network components are compute node-specific. Sometimes such
components are replicated, almost identically, for multiple nodes
in the cluster.
One such example is the case of Kubernetes NodePort services which
translate (in the ovn-kubernetes case) to Load_Balancer
objects being applied to each and every node's logical gateway router.
These load balancers are almost identical, the main difference being
the fact that they use different VIPs (the node's IP).
With the current OVN load balancer design, this becomes a problem at
scale because the number of load balancers that must be configured is
N x M (N nodes times M services).
This series proposes a new concept in OVN: virtual network component
templates. The goal of the templates is to help reduce resource
consumption in the OVN central components in specific cases like the one
described above.
To achieve that, the CMS will instead configure a "templated" load
balancer for every service and apply that single template record to
the cluster-wide load balancer group. This template is then
instantiated differently on different compute nodes. This translation
is controlled through per-chassis "template variables" configured by
the CMS in the new NB.Template_Var table.
A syntetic benchmark simulating what an OpenShift router (using Node
Port services) scale test would do shows the following preliminary
results:
A. 120 node, 2K NodePort services:
- before:
- Southbound DB size on disk (compacted): ~385MB
- Southbound DB memory usage (RSS): ~3GB
- Southbound DB logical flows: 720K
- after:
- Southbound DB size on disk (compacted): ~100MB
- Southbound DB memory usage (RSS): ~250MB
- Southbound DB logical flows: 6K
B. 250 node, 2K NodePort services:
- after (didn't run the "before" test as it was taking way too long):
- Southbound DB size on disk (compacted): ~155MB
- Southbound DB memory usage (RSS): ~760MB
- Southbound DB logical flows: 6K
The series is sent as RFC because there's still the need to add
some template specific unit tests and the "ovn-nbctl lb-*" helper
utilities need to be adapted to support templated load balancers.
With these two items addressed the code is self can likely qualify
for acceptance as a new feature in the upcoming release.
There also exists a more extensive TODO list (also listed in the commit
log of every patch in the series for now) but these are mainly load
balancer related functionalities that are not yet implemented for
templated load balancers but can definitely be implemented as follow ups:
- No support for LB health check if the LB is templated.
- No support for VIP ARP responder if the LB is templated.
- No support for routed VIPs if the LB is templated.
- Figure out a way to deal with templates in ovn-trace
- Determine if we need to allow Template_Var to match against chassis
hostname or other IDs.
- Make ofctrl_inject_pkt() work with template_vars.
- Make test-ovn work with template_vars.
A basic example of how to configure a templated load balancer follows:
$ ovn-nbctl create load_balancer name=lb-test \
protocol=tcp options:template=true \
vips:\"^vip:4200\"="^backends"
$ ovn-nbctl ls-add ls
$ ovn-nbctl ls-lb-add ls lb-test
# Instantiate the load balancer on chassis-1
$ ovn-nbctl create template_var name=vip value=80.80.80.1 chassis=chassis-1
$ ovn-nbctl create template_var name=backends value='"42.42.42.1:1000"'
chassis=chassis-1
# Instantiate the load balancer on chassis-2
$ ovn-nbctl create template_var name=vip value=80.80.80.2 chassis=chassis-2
$ ovn-nbctl create template_var name=backends value='"42.42.42.2:1000"'
chassis=chassis-2
Dumitru Ceara (5):
Add NB and SB Template_Var tables.
controller: Add support for templated actions and matches.
controller: Make resource references more generic.
lb: Support using templates.
controller: Add Template_Var <- LB references.
controller/lflow.c | 248 ++++++++++++++++++++++++--------
controller/lflow.h | 98 +++++++------
controller/ofctrl.c | 9 +-
controller/ofctrl.h | 3 +-
controller/ovn-controller.c | 277 ++++++++++++++++++++++++++++++++++--
include/ovn/expr.h | 4 +-
include/ovn/lex.h | 14 +-
lib/actions.c | 9 +-
lib/expr.c | 14 +-
lib/lb.c | 201 ++++++++++++++++++++++----
lib/lb.h | 36 +++--
lib/lex.c | 55 +++++++
northd/northd.c | 64 +++++----
tests/ovn.at | 2 +-
tests/test-ovn.c | 16 ++-
utilities/ovn-trace.c | 26 +++-
16 files changed, 869 insertions(+), 207 deletions(-)
_______________________________________________
dev mailing list
[email protected]
https://mail.openvswitch.org/mailman/listinfo/ovs-dev
_______________________________________________
dev mailing list
[email protected]
https://mail.openvswitch.org/mailman/listinfo/ovs-dev