On 7/2/25 6:26 PM, Mark Michelson via dev wrote:
This is similar to a recent change that refactored datapath syncing.
This works in a very similar way.
* Input nodes create type-agnostic ovn_unpaired_port_bindings. These are
fed into the en-port-binding-pair node.
* The en-port-binding-pair node ensures that a southbound Port_Binding
is created for each unpaired port binding. Any remaining soutbound
Port_Bindings are deleted.
* Type-specific nodes then convert the paired port bindings into
type-specific paired port bindings that can be consumed by other
nodes.
However, there are some important differences to note between this and
the datapath sync patch:
* This patch opts for the word "pair" instead of "sync". This is because
en-port-binding-pair ensures that all southbound Port_Bindings are
paired with some northbound structure. However, the columns in the
southobund Port_Binding are not all synced with their northd
counterpart. Other engine nodes are responsible for fully syncing the
data.
* Not all southbound Port_Bindings have a corresponding northbound row.
Therefore, the engine nodes that create unpaired port bindings pass an
opaque cookie pointer to the pairing node instead of a database row.
* Port bindings tend to be identified by name. This means the code has
to have several safeguards in place to catch scenarios such as a port
"moving" from one datapath to another, or a port being deleted and
re-added quickly.
* Unlike with the datapath syncing code, this required changes to
northd's incremental processing since northd was already incrementally
processing some northbound logical switch port changes.
Signed-off-by: Mark Michelson <mmich...@redhat.com>
---
Hi Mark,
Thanks for this new revision! This is not the actual review though, just an
update with some scale testing results we got with ovn-heater.
v9 -> v10:
* Rebased.
* Used designated initializers after allocating structures.
* Added a new "free_cookie" callback to the
ovn_unsynced_port_binding_map callbacks. All port binding engine
nodes now provide this callback. This means there is no longer a
fallback to the default callbacks if none are provided.
* Some formatting fixes.
v8 -> v9:
* Rebased.
* Changed some file names to be in alignment with others in the
directory.
* Fixed memory leak of port tunnel ids.
* Added a missing "static" to a structure in
en-port-binding-logical-switch-port.c.
v7 -> v8:
* Rebased.
v6 -> v7:
* Rebased.
* Made some recommended changes to formatting (combining if statements,
fixing indentation, etc.)
* Switched from an array of router DPGs in the chassisredirect code
to a vector.
* Fixed several memory leaks in the chassisredirect code.
* Fixed a crash in northd.c in the incremental case where a northbound
logical switch port did not have a corresponding paired logical
switch port to find due to tunnel key issues.
* Ensured that we only compare a requested tunnel key against the
VXLAN multicast minimum when VXLAN is actually enabled.
v5 -> v6:
* Rebased.
* All calls to xzalloc() in this patch have been changed to xmalloc()
or xcalloc().
* Candidate paired port bindings use a vector instead of a list now.
* Chassisredirect ports have been simplified a great deal.
* This version introduces an enum for port binding types to make type
checking more efficient.
* This patch abandons the attempt to blindly pull in unpaired port
binding maps based on their positions in the engine node inputs
array. Instead, we now explicitly pull them in by input name and
assign them to array positions based on their port binding type.
v4 -> v5:
* Rebased.
* Fixed some formatting anomalies in mirror port syncing.
* Fixed checkpatch warnings.
v3 -> v4:
* Rebased.
* Addressed newly-added mirror port functionality.
* Fixed many memory leaks.
v3: This is the first version with this patch present.
---
[...]
diff --git a/TODO.rst b/TODO.rst
index 78962bb92..60ae155c5 100644
--- a/TODO.rst
+++ b/TODO.rst
@@ -168,3 +168,15 @@ OVN To-do List
ovn\_synced\_logical_router and ovn\_synced\_logical\_switch. This will
allow for the eventual removal of the ovn\_datapath structure from the
codebase.
+
+* Port Binding sync nodes
+
+ * Southbound Port bindings are synced across three engine nodes:
+ - en_port_binding_pair
+ - en_northd
+ - en_sync_to_sb
+ It would be easier to work with if these were combined into a
+ single node instead.
+
+ * Add incremental processing to the en-port-binding-pair node, as
+ well as derivative nodes.
Unfortunately, it seems that the lack of incremental processing for these
northd nodes introduces a measurable performance regression.
With ovn-heater, I ran a 500 node density-heavy (simulating realistic
ovn-kubernetes workloads) test. That showed an increase of ~50% in
time it takes to run the simulated workloads.
Most of that seems to be due to these new nodes performing recomputation
on every input change.
For ease of debugging I uploaded the NB database extracted from that
test here:
https://people.redhat.com/~dceara/ovn-northd-datapath-port-binding-refactor-v10/ovnnb_db.db.gz
To make it easier to debug directly on my machine I hacked the ovn-sandbox
script in the following way:
diff --git a/tutorial/ovn-sandbox b/tutorial/ovn-sandbox
index ed334d1c31..a64bb930dd 100755
--- a/tutorial/ovn-sandbox
+++ b/tutorial/ovn-sandbox
@@ -661,14 +661,6 @@ else
run ovs-vsctl set open . external-ids:ovn-remote=$OVN_SB_DB
OVN_CTRLR_PKI=""
fi
-for i in $(seq $n_ics); do
- if [ $i -eq 1 ]; then inst=""; else inst=$i; fi
- rungdb $gdb_ovn_ic $gdb_ovn_ic_ex ovn-ic --detach \
- --no-chdir --pidfile=ovn-ic${inst}.pid -vconsole:off \
- --log-file=ovn-ic${inst}.log -vsyslog:off \
- --ovnsb-db="$OVN_SB_DB" --ovnnb-db="$OVN_NB_DB" \
- --ic-sb-db="$OVN_IC_SB_DB" --ic-nb-db="$OVN_IC_NB_DB"
-done
northd_args=
OVN_NORTHD=ovn-northd
@@ -680,17 +672,11 @@ for i in $(seq $n_northds); do
--log-file=$OVN_NORTHD$inst.log -vsyslog:off \
--ovnsb-db="$OVN_SB_DB" --ovnnb-db="$OVN_NB_DB" $northd_args
done
-for i in $(seq $n_controllers); do
- if [ $i -eq 1 ]; then inst=""; else inst=$i; fi
- rungdb $gdb_ovn_controller $gdb_ovn_controller_ex ovn-controller \
- $OVN_CTRLR_PKI --detach --no-chdir -vsyslog:off \
- --log-file=ovn-controller${inst}.log \
- --pidfile=ovn-controller${inst}.pid -vconsole:off
-done
-rungdb $gdb_ovn_controller_vtep $gdb_ovn_controller_vtep_ex \
- ovn-controller-vtep --detach --no-chdir --pidfile -vconsole:off \
- $OVN_CTRLR_PKI --log-file -vsyslog:off \
- --ovnsb-db="$OVN_SB_DB"
+
+run sleep 10
+run ovn-sbctl set connection . inactivity_probe=180000
+run ovn-nbctl set connection . inactivity_probe=180000
+run ovn-nbctl set NB_GLOBAL . options:northd_probe_interval=180000
cat <<EOF
---
Then ran a sandbox with the test NB database:
$ make -j4 sandbox SANDBOXFLAGS="--nbdb-source=/tmp/ovnnb_db.db"
Decreased the threshold for logging I-P node compute times:
$ ovn-appctl -t ovn-northd inc-engine/compute-log-timeout 10
I waited for things to settle down and then I added a new logical switch
port:
$ ovn-nbctl lsp-add lswitch-ovn-scale-99 lsp-foo \
-- lsp-set-addresses lsp-foo "00:00:00:00:00:01 42.42.42.42"
ovn-northd.log shows:
2025-07-03T11:49:59.869Z|00113|inc_proc_eng|INFO|node:
port_binding_logical_switch_port, recompute (missing handler for input
datapath_synced_logical_switch) took 101ms
2025-07-03T11:49:59.895Z|00114|inc_proc_eng|INFO|node: port_binding_mirror,
recompute (missing handler for input datapath_synced_logical_switch) took 15ms
2025-07-03T11:50:00.059Z|00115|inc_proc_eng|INFO|node: port_binding_pair,
recompute (missing handler for input port_binding_logical_switch_port) took
164ms
2025-07-03T11:50:00.126Z|00116|inc_proc_eng|INFO|node:
port_binding_paired_logical_switch_port, recompute (missing handler for input
port_binding_pair) took 62ms
2025-07-03T11:50:00.255Z|00117|inc_proc_eng|INFO|node:
port_binding_logical_switch_port, recompute (missing handler for input
datapath_synced_logical_switch) took 99ms
2025-07-03T11:50:00.290Z|00118|inc_proc_eng|INFO|node: port_binding_mirror,
recompute (missing handler for input datapath_synced_logical_switch) took 24ms
2025-07-03T11:50:00.480Z|00119|inc_proc_eng|INFO|node: port_binding_pair,
recompute (missing handler for input port_binding_logical_switch_port) took
190ms
2025-07-03T11:50:00.573Z|00120|inc_proc_eng|INFO|node:
port_binding_paired_logical_switch_port, recompute (missing handler for input
port_binding_pair) took 85ms
2025-07-03T11:50:00.609Z|00121|inc_proc_eng|INFO|node: sync_from_sb, recompute
(missing handler for input SB_port_binding) took 27ms
While on main most of this is done incrementally (only the sync_from_sb
node takes some time).
I plan to still review the series but I'm afraid that we cannot merge
it until we figure out how to reduce performance impact due to these
these new nodes.