From: Numan Siddique <[email protected]>
The commit in the 'Fixes' tag tried to fix the ovn-nbctl sync issue by
calling ovsdb_idl_txn_abort(ovnsb_txn) so that nb_cfg is not synced to
the SB DB while an SB transaction is still in progress.

However, the aborted transaction makes ovsdb_idl_loop_commit_and_wait()
report a failed commit, which feeds back into the force-recompute path
and creates a busy loop whenever NB.nb_cfg keeps advancing but the
engine produces no SB activity.
The issue can be reproduced by running the commands below when
ovn-northd takes 2 or more seconds to finish a recompute:

  ovn-nbctl set nb_global . nb_cfg=2 -- ls-add sw2 &&
  ovn-nbctl set nb_global . nb_cfg=3

Ideally the ovn-northd lflow engine node would recompute only once
for these commands, but it recomputes twice, and northd logs the
messages below.
---
2026-05-14T20:40:50.374Z|00176|coverage|INFO|124 events never hit
2026-05-14T20:40:50.418Z|00177|ovn_northd|INFO|OVNSB commit failed, force recompute next time.
2026-05-14T20:40:50.419Z|00178|ovn_northd|INFO|OVNSB commit failed, force recompute next time.
...
2026-05-14T20:40:50.419Z|00193|ovn_northd|INFO|OVNSB commit failed, force recompute next time.
2026-05-14T20:40:50.419Z|00194|ovn_northd|INFO|OVNSB commit failed, force recompute next time.
2026-05-14T20:40:51.168Z|00195|inc_proc_eng|INFO|node: northd, recompute (forced) took 636ms
2026-05-14T20:40:53.789Z|00196|inc_proc_eng|INFO|node: routes, recompute (forced) took 2493ms
2026-05-14T20:40:56.286Z|00197|inc_proc_eng|INFO|node: lflow, recompute (forced) took 2257ms
2026-05-14T20:40:56.299Z|00198|timeval|WARN|Unreasonably long 5880ms poll interval (5852ms user, 0ms system)
----
Fix this by skipping the forced recompute when the SB transaction
was aborted intentionally. A genuine commit failure still forces a
recompute, as before.
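
For illustration only, the feedback loop and the fix can be modelled
with a small standalone C sketch. It is not the ovn-northd code: every
name in it (sim_commit, sim_iteration, count_forced_recomputes) is
hypothetical, and the commit step is reduced to a boolean on the
assumption that an aborted transaction is reported as a failed commit.

```c
#include <stdbool.h>

/* Hypothetical model of one ovn-northd main-loop iteration.
 * None of these names exist in OVN or OVS. */

/* Models the commit step: returns false when the transaction was
 * aborted or when the commit genuinely failed, true otherwise. */
static bool
sim_commit(bool txn_aborted, bool real_failure)
{
    return !txn_aborted && !real_failure;
}

/* Runs one iteration and returns true if it requests a forced
 * recompute.  'cfg_pending' stands for cur_cfg != next_cfg.
 * 'track_intentional_abort' selects the patched behaviour. */
static bool
sim_iteration(bool activity, bool cfg_pending, bool real_failure,
              bool track_intentional_abort)
{
    bool txn_aborted = false;
    bool aborted_intentionally = false;

    /* No engine activity but nb_cfg is pending: abort the SB txn so
     * that next_cfg is not bumped too early. */
    if (!activity && cfg_pending) {
        txn_aborted = true;
        aborted_intentionally = track_intentional_abort;
    }

    /* Only a failure that we did not cause ourselves should force a
     * recompute. */
    return !sim_commit(txn_aborted, real_failure) && !aborted_intentionally;
}

/* Counts forced-recompute requests over a number of idle iterations
 * in which nb_cfg stays pending. */
static int
count_forced_recomputes(int iterations, bool track_intentional_abort)
{
    int forced = 0;

    for (int i = 0; i < iterations; i++) {
        if (sim_iteration(false, true, false, track_intentional_abort)) {
            forced++;
        }
    }
    return forced;
}
```

In this model, every idle iteration requests a forced recompute when
the abort is not tracked, and none does when it is tracked, while a
real commit failure still requests one in both cases.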
Fixes: 56819f04166b ("northd: Prevent ovn-nbctl --wait=sb from returning early.")
CC: Ales Musil <[email protected]>
Assisted-by: Claude Opus 4.7, Claude Code
Signed-off-by: Numan Siddique <[email protected]>
---
northd/ovn-northd.c | 13 +++++++++++--
1 file changed, 11 insertions(+), 2 deletions(-)
diff --git a/northd/ovn-northd.c b/northd/ovn-northd.c
index 7e0f888c1a..e2d1066ada 100644
--- a/northd/ovn-northd.c
+++ b/northd/ovn-northd.c
@@ -1113,10 +1113,18 @@ main(int argc, char *argv[])
/* Make sure we don't bump the next_cfg when we shouldn't.
* This should prevent ovn-nbctl sync calls to return before
- * the SB updates are actually done. */
+ * the SB updates are actually done.
+ *
+ * Track that the abort was intentional so we can distinguish
+ * it from a real commit failure below; otherwise the abort
+ * would feed back into the force-recompute path, creating a
+ * busy loop whenever NB.nb_cfg keeps advancing but the
+ * engine produces no SB activity. */
+ bool ovnsb_txn_aborted_intentionally = false;
if (!activity && ovnsb_txn &&
ovnsb_idl_loop.cur_cfg != ovnsb_idl_loop.next_cfg) {
ovsdb_idl_txn_abort(ovnsb_txn);
+ ovnsb_txn_aborted_intentionally = true;
}
/* If there are any errors, we force a full recompute in order
@@ -1127,7 +1135,8 @@ main(int argc, char *argv[])
inc_proc_northd_force_recompute_immediate();
}
- if (!ovsdb_idl_loop_commit_and_wait(&ovnsb_idl_loop)) {
+ if (!ovsdb_idl_loop_commit_and_wait(&ovnsb_idl_loop) &&
+ !ovnsb_txn_aborted_intentionally) {
VLOG_INFO("OVNSB commit failed, "
"force recompute next time.");
inc_proc_northd_force_recompute_immediate();
--
2.53.0