Re: [ovs-discuss] OVN: scaling L2 networks beyond 10k chassis - issues

Robin Jarry via discuss Sun, 01 Oct 2023 09:06:40 -0700

Hi Han,

thanks a lot for your detailed answer.


Han Zhou, Sep 30, 2023 at 01:03:
> > I think ovn-controller only consumes the logical flows. The chassis and
> > port bindings tables are used by northd to update these logical flows.
>
> Felix was right. For example, port-binding is firstly a configuration from
> north-bound, but the states such as its physical location (the chassis
> column) are populated by ovn-controller of the owning chassis and consumed
> by other ovn-controllers that are interested in that port-binding.

I was not aware of this. Thanks.

> > Exactly, but was the signaling between the nodes ever an issue?
>
> I am not an expert of BGP, but at least for what I am aware of, there are
> scaling issues in things like BGP full mesh signaling, and there are
> solutions such as route reflector (which is again centralized) to solve
> such issues.

I am not familiar with BGP full mesh signaling. But from what I can
tell, it looks like the same concept as the full mesh of GENEVE tunnels,
except that a tunnel is only used when the same logical switch is
implemented on both nodes.
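
As a back-of-the-envelope illustration of the comparison (a sketch only,
not a model of BGP or OVN internals; the function names and the sample
placement are hypothetical), a full mesh grows quadratically with the
number of nodes, a centralized design grows linearly, and the tunnels
actually in use depend on which nodes share a logical switch:

```python
from itertools import combinations


def full_mesh_links(n):
    """Number of node pairs in a full mesh of n nodes."""
    return n * (n - 1) // 2


def central_sessions(n):
    """Sessions needed if each of n nodes talks to one separate central point."""
    return n


def active_tunnel_pairs(placement):
    """Count node pairs that share at least one logical switch.

    placement maps a node name to the set of logical switches it implements.
    """
    return sum(
        1 for a, b in combinations(placement, 2) if placement[a] & placement[b]
    )


# Hypothetical placement: 4 nodes, 3 logical switches.
placement = {
    "hv1": {"ls1"},
    "hv2": {"ls1", "ls2"},
    "hv3": {"ls2"},
    "hv4": {"ls3"},
}

mesh_10k = full_mesh_links(10_000)       # quadratic growth
central_10k = central_sessions(10_000)   # linear growth
used_pairs = active_tunnel_pairs(placement)
possible_pairs = full_mesh_links(len(placement))
```

In this toy placement only the hv1/hv2 and hv2/hv3 pairs share a switch,
so 2 of the 6 possible pairs would carry traffic.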

> > So you have enabled monitor_all=true as well? Or did you test at scale
> > with monitor_all=false?
> >
> We do use monitor_all=false, primarily to reduce memory footprint (and also
> CPU cost of IDL processing) on each chassis. There are trade-offs to the SB
> DB server performance:
>
> - On one hand it increases the cost of conditional monitoring, which
>   is expensive for sure
> - On the other hand, it reduces the total amount of data for the
>   server to propagate to clients
>
> It really depends on your topology for making the choice. If most of the
> nodes would anyway monitor most of the DB data (something similar to a
> full-mesh), it is more reasonable to use monitor_all=true. Otherwise, in
> a topology like ovn-kubernetes where each node has its dedicated part of the
> data, or in topologies where you have lots of small "islands" such as a
> cloud with many small tenants that never talk to each other, using
> monitor_all=false could make sense (but still needs to be carefully
> evaluated and tested for your own use cases).

I haven't seen any recent scale testing for OpenStack, but in past
testing we had to set monitor_all=true because the CPU usage of the SB
ovsdb was a bottleneck.
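
To make the data-volume side of that trade-off concrete, here is a toy
model (an assumption-laden sketch: row counts, chassis names and the
`rows_synced` helper are made up, and the real conditional monitoring
logic in ovn-controller is more involved). It only counts how many rows
the server would have to send in each mode; it says nothing about the
CPU cost of evaluating the conditions:

```python
def rows_synced(rows_by_datapath, chassis_datapaths, monitor_all):
    """Total rows sent to all chassis in this toy model.

    rows_by_datapath:  datapath name -> number of rows tied to it
    chassis_datapaths: chassis name -> set of datapaths with local ports
    monitor_all:       if True, every chassis receives every row
    """
    total_rows = sum(rows_by_datapath.values())
    sent = 0
    for datapaths in chassis_datapaths.values():
        if monitor_all:
            sent += total_rows
        else:
            sent += sum(rows_by_datapath[dp] for dp in datapaths)
    return sent


# Hypothetical data: 4 logical switches with 100 rows each.
rows_by_datapath = {f"ls{i}": 100 for i in range(4)}

# "Island" topology: 8 chassis, each attached to a single switch.
islands = {f"hv{i}": {f"ls{i // 2}"} for i in range(8)}

# Full-mesh-like topology: every chassis is attached to every switch.
mesh = {f"hv{i}": set(rows_by_datapath) for i in range(8)}

island_all = rows_synced(rows_by_datapath, islands, monitor_all=True)
island_cond = rows_synced(rows_by_datapath, islands, monitor_all=False)
mesh_all = rows_synced(rows_by_datapath, mesh, monitor_all=True)
mesh_cond = rows_synced(rows_by_datapath, mesh, monitor_all=False)
```

With islands, conditional monitoring sends a quarter of the rows in this
example. In the full-mesh-like case both modes send the same amount, so
the conditions would only add server-side cost.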

> > The memory usage would be reduced but I don't know to what extent. One
> > of the main consumers is the logical flows table which is required
> > everywhere. Unless there is a way to only sync a portion of this table
> > depending on the chassis, disabling monitor_all would save syncing the
> > unneeded tables for ovn-controller: chassis, port bindings, etc.
>
> Probably it wasn't what you meant, but I'd like to clarify that it is not
> about unneeded tables, but unneeded rows in those tables (mainly
> logical_flow and port_binding).
> It indeed syncs only a portion of the tables. It is not depending directly
> on chassis, but depending on what port-bindings are on the chassis and what
> logical connectivity those port-bindings have. So, again, the choice really
> depends on your use cases.

What about the FDB (mac-port) and MAC binding (ip-mac) tables? I thought
ovn-controller did not need them. If that is the case, I would expect
the whole tables (not only some of their rows) to be excluded from the
synchronized data by default.

Thanks!

_______________________________________________
discuss mailing list
disc...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-discuss
