On 7/3/25 7:54 AM, Dumitru Ceara wrote:
On 7/2/25 6:26 PM, Mark Michelson via dev wrote:
This is similar to a recent change that refactored datapath syncing.
This works in a very similar way.

* Input nodes create type-agnostic ovn_unpaired_port_bindings. These are
   fed into the en-port-binding-pair node.
* The en-port-binding-pair node ensures that a southbound Port_Binding
   is created for each unpaired port binding. Any remaining soutbound
   Port_Bindings are deleted.
* Type-specific nodes then convert the paired port bindings into
   type-specific paired port bindings that can be consumed by other
   nodes.

However, there are some important differences to note between this and
the datapath sync patch:

* This patch opts for the word "pair" instead of "sync". This is because
   en-port-binding-pair ensures that all southbound Port_Bindings are
   paired with some northbound structure. However, the columns in the
   southobund Port_Binding are not all synced with their northd
   counterpart. Other engine nodes are responsible for fully syncing the
   data.
* Not all southbound Port_Bindings have a corresponding northbound row.
   Therefore, the engine nodes that create unpaired port bindings pass an
   opaque cookie pointer to the pairing node instead of a database row.
* Port bindings tend to be identified by name. This means the code has
   to have several safeguards in place to catch scenarios such as a port
   "moving" from one datapath to another, or a port being deleted and
   re-added quickly.
* Unlike with the datapath syncing code, this required changes to
   northd's incremental processing since northd was already incrementally
   processing some northbound logical switch port changes.

Signed-off-by: Mark Michelson <mmich...@redhat.com>
---

Hi Mark,

Thanks for this new revision!  This is not the actual review though, just an
update with some scale testing results we got with ovn-heater.

v9 -> v10:
  * Rebased.
  * Used designated initializers after allocating structures.
  * Added a new "free_cookie" callback to the
    ovn_unsynced_port_binding_map callbacks. All port binding engine
    nodes now provide this callback. This means there is no longer a
    fallback to the default callbacks if none are provided.
  * Some formatting fixes.

v8 -> v9:
  * Rebased.
  * Changed some file names to be in alignment with others in the
    directory.
  * Fixed memory leak of port tunnel ids.
  * Added a missing "static" to a structure in
    en-port-binding-logical-switch-port.c.

v7 -> v8:
  * Rebased.

v6 -> v7:
  * Rebased.
  * Made some recommended changes to formatting (combining if statements,
    fixing indentation, etc.)
  * Switched from an array of router DPGs in the chassisredirect code
    to a vector.
  * Fixed several memory leaks in the chassisredirect code.
  * Fixed a crash in northd.c in the incremental case where a northbound
    logical switch port did not have a corresponding paired logical
    switch port to find due to tunnel key issues.
  * Ensured that we only compare a requested tunnel key against the
    VXLAN multicast minimum when VXLAN is actually enabled.

v5 -> v6:
  * Rebased.
  * All calls to xzalloc() in this patch have been changed to xmalloc()
    or xcalloc().
  * Candidate paired port bindings use a vector instead of a list now.
  * Chassisredirect ports have been simplified a great deal.
  * This version introduces an enum for port binding types to make type
    checking more efficient.
  * This patch abandons the attempt to blindly pull in unpaired port
    binding maps based on their positions in the engine node inputs
    array. Instead, we now explicitly pull them in by input name and
    assign them to array positions based on their port binding type.

v4 -> v5:
  * Rebased.
  * Fixed some formatting anomalies in mirror port syncing.
  * Fixed checkpatch warnings.

v3 -> v4:
  * Rebased.
  * Addressed newly-added mirror port functionality.
  * Fixed many memory leaks.

v3: This is the first version with this patch present.
---

[...]


diff --git a/TODO.rst b/TODO.rst
index 78962bb92..60ae155c5 100644
--- a/TODO.rst
+++ b/TODO.rst
@@ -168,3 +168,15 @@ OVN To-do List
      ovn\_synced\_logical_router and ovn\_synced\_logical\_switch. This will
      allow for the eventual removal of the ovn\_datapath structure from the
      codebase.
+
+* Port Binding sync nodes
+
+  * Southbound Port bindings are synced across three engine nodes:
+    - en_port_binding_pair
+    - en_northd
+    - en_sync_to_sb
+    It would be easier to work with if these were combined into a
+    single node instead.
+
+  * Add incremental processing to the en-port-binding-pair node, as
+    well as derivative nodes.

Unfortunately, it seems that the lack of incremental processing for these
northd nodes introduces a measurable performance regression.
With ovn-heater, I ran a 500 node density-heavy (simulating realistic
ovn-kubernetes workloads) test.  That showed an increase of ~50% in
time it takes to run the simulated workloads.

Most of that seems to be due to these new nodes performing recomputation
on every input change.

For ease of debugging I uploaded the NB database extracted from that
test here:
https://people.redhat.com/~dceara/ovn-northd-datapath-port-binding-refactor-v10/ovnnb_db.db.gz

To make it easier to debug directly on my machine I hacked the ovn-sandbox
script in the following way:

diff --git a/tutorial/ovn-sandbox b/tutorial/ovn-sandbox
index ed334d1c31..a64bb930dd 100755
--- a/tutorial/ovn-sandbox
+++ b/tutorial/ovn-sandbox
@@ -661,14 +661,6 @@ else
      run ovs-vsctl set open . external-ids:ovn-remote=$OVN_SB_DB
      OVN_CTRLR_PKI=""
  fi
-for i in $(seq $n_ics); do
-    if [ $i -eq 1 ]; then inst=""; else inst=$i; fi
-    rungdb $gdb_ovn_ic $gdb_ovn_ic_ex ovn-ic --detach \
-            --no-chdir --pidfile=ovn-ic${inst}.pid -vconsole:off \
-            --log-file=ovn-ic${inst}.log -vsyslog:off \
-            --ovnsb-db="$OVN_SB_DB" --ovnnb-db="$OVN_NB_DB" \
-            --ic-sb-db="$OVN_IC_SB_DB" --ic-nb-db="$OVN_IC_NB_DB"
-done
northd_args=
  OVN_NORTHD=ovn-northd
@@ -680,17 +672,11 @@ for i in $(seq $n_northds); do
              --log-file=$OVN_NORTHD$inst.log -vsyslog:off \
              --ovnsb-db="$OVN_SB_DB" --ovnnb-db="$OVN_NB_DB" $northd_args
  done
-for i in $(seq $n_controllers); do
-    if [ $i -eq 1 ]; then inst=""; else inst=$i; fi
-    rungdb $gdb_ovn_controller $gdb_ovn_controller_ex ovn-controller \
-            $OVN_CTRLR_PKI --detach --no-chdir -vsyslog:off \
-            --log-file=ovn-controller${inst}.log \
-            --pidfile=ovn-controller${inst}.pid -vconsole:off
-done
-rungdb $gdb_ovn_controller_vtep $gdb_ovn_controller_vtep_ex \
-    ovn-controller-vtep --detach --no-chdir --pidfile -vconsole:off \
-    $OVN_CTRLR_PKI --log-file -vsyslog:off \
-    --ovnsb-db="$OVN_SB_DB"
+
+run sleep 10
+run ovn-sbctl set connection . inactivity_probe=180000
+run ovn-nbctl set connection . inactivity_probe=180000
+run ovn-nbctl set NB_GLOBAL . options:northd_probe_interval=180000
cat <<EOF ---

Then ran a sandbox with the test NB database:

$ make -j4 sandbox SANDBOXFLAGS="--nbdb-source=/tmp/ovnnb_db.db"

Decreased the threshold for logging I-P node compute times:
$ ovn-appctl -t ovn-northd inc-engine/compute-log-timeout 10

I waited for things to settle down and then I added a new logical switch
port:
$ ovn-nbctl lsp-add lswitch-ovn-scale-99 lsp-foo \
   -- lsp-set-addresses lsp-foo "00:00:00:00:00:01 42.42.42.42"

ovn-northd.log shows:
2025-07-03T11:49:59.869Z|00113|inc_proc_eng|INFO|node: 
port_binding_logical_switch_port, recompute (missing handler for input 
datapath_synced_logical_switch) took 101ms
2025-07-03T11:49:59.895Z|00114|inc_proc_eng|INFO|node: port_binding_mirror, 
recompute (missing handler for input datapath_synced_logical_switch) took 15ms
2025-07-03T11:50:00.059Z|00115|inc_proc_eng|INFO|node: port_binding_pair, 
recompute (missing handler for input port_binding_logical_switch_port) took 
164ms
2025-07-03T11:50:00.126Z|00116|inc_proc_eng|INFO|node: 
port_binding_paired_logical_switch_port, recompute (missing handler for input 
port_binding_pair) took 62ms
2025-07-03T11:50:00.255Z|00117|inc_proc_eng|INFO|node: 
port_binding_logical_switch_port, recompute (missing handler for input 
datapath_synced_logical_switch) took 99ms
2025-07-03T11:50:00.290Z|00118|inc_proc_eng|INFO|node: port_binding_mirror, 
recompute (missing handler for input datapath_synced_logical_switch) took 24ms
2025-07-03T11:50:00.480Z|00119|inc_proc_eng|INFO|node: port_binding_pair, 
recompute (missing handler for input port_binding_logical_switch_port) took 
190ms
2025-07-03T11:50:00.573Z|00120|inc_proc_eng|INFO|node: 
port_binding_paired_logical_switch_port, recompute (missing handler for input 
port_binding_pair) took 85ms
2025-07-03T11:50:00.609Z|00121|inc_proc_eng|INFO|node: sync_from_sb, recompute 
(missing handler for input SB_port_binding) took 27ms

While on main most of this is done incrementally (only the sync_from_sb
node takes some time).

I plan to still review the series but I'm afraid that we cannot merge
it until we figure out how to reduce performance impact due to these
these new nodes.

Thanks for looking into this Dumitru. I was aware there could be some performance regression due to the lack of incremental processing in the new engine nodes, but I wasn't aware it was going to be so severe. I think this particular test shows the issues in the port binding nodes, but other tests could be crafted to show a similar issue in the datapath nodes.

I also agree that merging this with such a performance regression would be ill-advised, especially since we don't have much time to fix it up before soft freeze for ovn 25.09 (18 July). I think I have two choices here: try to get incremental patches developed and posted before the soft freeze, or defer this series until ovn 26.03 and take the time to develop incremental patches during that period.


Regards,
Dumitru


_______________________________________________
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev

Reply via email to