On 9/20/24 04:22, Ilya Maximets wrote:
On 9/20/24 07:26, Han Zhou wrote:
On Tue, Sep 17, 2024 at 8:54 AM Ilya Maximets <[email protected]> wrote:

ovn-northd copies external IDs from Logical Switch, Router and their
Port records to corresponding Southbound Datapath and Port Binding
records.  IDs in other tables are not used by northd in any way, so
there is no point in monitoring them.

CMSes tend to create a huge number of external IDs for every record
to the point where they can take literally half of the database data.
In high scale clusters that can be several hundreds of MB.  Not
monitoring them saves a lot of time and memory while downloading
initial database snapshots on the first connection and should also
reduce the ongoing cost while new resources are being created.

This will also help avoid unnecessary re-computes when external
IDs are updated without changing any other data.

Tested on a 500 MB Northbound DB that contains 1M ACLs created by
ovn-kubernetes in a test cluster mimicking a real world setup.

Just curious, is the ovn-k8s in the test cluster configured with OVN
IC mode or central mode?

It is an IC cluster.  However, AFAIU, ovn-kubernetes duplicates most
ACLs among AZs, so the database from a central mode setup wouldn't
be much different in size.  It'll have more ports and some other things,
but not nearly enough to compete with the amount of space occupied
by ACLs.


Before the change it took 20 seconds for the ovsdb-server to send out
an initial database snapshot and 19 seconds for ovn-northd to receive
it, parse and run a full recompute, consuming 5.4 GB of RAM.  With the
change it takes 15 seconds on the database side and 11 seconds for the
ovn-northd, consuming 2.9 GB of RAM.  (Note: the test was performed in
a sandbox with no OVN chassis connected, so northd didn't generate a
lot of logical flows for those ACLs.)

So, we saved:
  - 25% of CPU time on the database side.
  - 42% of CPU time on the ovn-northd side.
  - 2.5 GB (46%) of RAM on ovn-northd.

Signed-off-by: Ilya Maximets <[email protected]>
---
  northd/ovn-northd.c | 36 ++++++++++++++++++++++++++++++++++++
  1 file changed, 36 insertions(+)

diff --git a/northd/ovn-northd.c b/northd/ovn-northd.c
index d71114f35..89ef4e870 100644
--- a/northd/ovn-northd.c
+++ b/northd/ovn-northd.c
@@ -820,6 +820,42 @@ main(int argc, char *argv[])
      ovsdb_idl_omit_alert(ovnnb_idl_loop.idl,
                           &nbrec_nb_global_col_hv_cfg_timestamp);

+    /* Ignore northbound external IDs, except for logical switch, router and
+     * their ports, for which the external IDs are propagated to corresponding
+     * southbound datapath and port binding records. */
+    const struct ovsdb_idl_column *external_ids[] = {
+        &nbrec_acl_col_external_ids,
+        &nbrec_address_set_col_external_ids,
+        &nbrec_bfd_col_external_ids,
+        &nbrec_chassis_template_var_col_external_ids,
+        &nbrec_connection_col_external_ids,
+        &nbrec_copp_col_external_ids,
+        &nbrec_dhcp_options_col_external_ids,
+        &nbrec_dhcp_relay_col_external_ids,
+        &nbrec_dns_col_external_ids,
+        &nbrec_forwarding_group_col_external_ids,
+        &nbrec_gateway_chassis_col_external_ids,
+        &nbrec_ha_chassis_col_external_ids,
+        &nbrec_ha_chassis_group_col_external_ids,
+        &nbrec_load_balancer_col_external_ids,
+        &nbrec_load_balancer_health_check_col_external_ids,
+        &nbrec_logical_router_policy_col_external_ids,
+        &nbrec_logical_router_static_route_col_external_ids,
+        &nbrec_meter_col_external_ids,
+        &nbrec_meter_band_col_external_ids,
+        &nbrec_mirror_col_external_ids,
+        &nbrec_nat_col_external_ids,
+        &nbrec_nb_global_col_external_ids,
+        &nbrec_port_group_col_external_ids,
+        &nbrec_qos_col_external_ids,
+        &nbrec_ssl_col_external_ids,
+        &nbrec_sample_collector_col_external_ids,
+        &nbrec_sampling_app_col_external_ids,
+    };
+    for (size_t i = 0; i < ARRAY_SIZE(external_ids); i++) {
+        ovsdb_idl_omit(ovnnb_idl_loop.idl, external_ids[i]);
+    }
+
      unixctl_command_register("nb-connection-status", "", 0, 0,
                               ovn_conn_show, ovnnb_idl_loop.idl);

--
2.46.0

_______________________________________________
dev mailing list
[email protected]
https://mail.openvswitch.org/mailman/listinfo/ovs-dev
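
For readers without the OVS tree at hand, the pattern used in the patch
can be sketched in isolation: collect pointers to the columns that are
never consumed, then loop over the array and mark each one as omitted.
This is only an illustration under stated assumptions.  `struct
demo_column`, `demo_omit()` and `demo_omit_unused_external_ids()` are
hypothetical stand-ins, not OVS APIs; the real code calls
ovsdb_idl_omit() on the generated nbrec_*_col_external_ids descriptors,
as shown in the diff above.

```c
#include <stdbool.h>
#include <stddef.h>

/* Same idea as the ARRAY_SIZE() helper used in the patch. */
#define ARRAY_SIZE(ARRAY) (sizeof (ARRAY) / sizeof (ARRAY)[0])

/* Hypothetical stand-in for a column descriptor. */
struct demo_column {
    const char *table;
    const char *name;
    bool monitored;     /* true: the client asks the server for this column. */
};

/* Hypothetical stand-in for ovsdb_idl_omit(): stop monitoring 'column'. */
static void
demo_omit(struct demo_column *column)
{
    column->monitored = false;
}

static struct demo_column acl_external_ids =
    { "ACL", "external_ids", true };
static struct demo_column qos_external_ids =
    { "QoS", "external_ids", true };
/* Stays monitored: its IDs are propagated to the southbound database. */
static struct demo_column ls_external_ids =
    { "Logical_Switch", "external_ids", true };

/* Omits the external_ids columns that are never read and returns how
 * many columns were omitted. */
static size_t
demo_omit_unused_external_ids(void)
{
    struct demo_column *unused[] = {
        &acl_external_ids,
        &qos_external_ids,
    };

    for (size_t i = 0; i < ARRAY_SIZE(unused); i++) {
        demo_omit(unused[i]);
    }
    return ARRAY_SIZE(unused);
}
```

Keeping the list as a flat array means that excluding another table
later is a one-line addition, which is presumably why the patch is
structured this way.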

Thanks Ilya. The result numbers are impressive and the change is
simple and straightforward.
Acked-by: Han Zhou <[email protected]>
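
The savings listed in the commit message follow directly from the
before/after measurements quoted there (20 s vs. 15 s, 19 s vs. 11 s,
5.4 GB vs. 2.9 GB).  A minimal sketch of that arithmetic, where
`percent_saved()` is a hypothetical helper name introduced only for
this illustration:

```c
/* Returns the relative saving, in percent, when a cost drops from
 * 'before' to 'after'.  'before' must be non-zero. */
static double
percent_saved(double before, double after)
{
    return 100.0 * (before - after) / before;
}

/* Measurements taken from the commit message:
 *   percent_saved(20.0, 15.0)  database side, seconds
 *   percent_saved(19.0, 11.0)  ovn-northd side, seconds
 *   percent_saved(5.4, 2.9)    ovn-northd memory, GB
 */
```

Rounded to whole percent, these give the 25%, 42% and 46% figures
reported in the commit message.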

Thanks!

Best regards, Ilya Maximets.

Thank you Ilya and Han. I pushed this to main. I didn't push to any
other branches since this is a performance improvement. If you think
this should be pushed to other branches, let me know.
