From: Numan Siddique <[email protected]>

The commit in the 'Fixes' tag tried to fix the ovn-nbctl sync issue by
calling ovsdb_idl_txn_abort(ovnsb_txn) to abort the SB transaction, so
that nb_cfg is not synced to the SB DB while an SB transaction is in
progress.

However, the intentional ovsdb_idl_txn_abort() is then handled like a
failed SB commit, so it feeds back into the force-recompute path and
creates a busy loop whenever NB.nb_cfg keeps advancing but the engine
produces no SB activity.

The issue can be reproduced by running the commands below when
ovn-northd takes 2 or more seconds to finish its recompute:

ovn-nbctl set nb_global . nb_cfg=2 -- ls-add sw2 &&
ovn-nbctl set nb_global . nb_cfg=3

Ideally the ovn-northd lflow engine should recompute only once for
these commands, but it recomputes twice, and the log messages below
are seen in ovn-northd.

---
2026-05-14T20:40:50.374Z|00176|coverage|INFO|124 events never hit
2026-05-14T20:40:50.418Z|00177|ovn_northd|INFO|OVNSB commit failed, force recompute next time.
2026-05-14T20:40:50.419Z|00178|ovn_northd|INFO|OVNSB commit failed, force recompute next time.
...
2026-05-14T20:40:50.419Z|00193|ovn_northd|INFO|OVNSB commit failed, force recompute next time.
2026-05-14T20:40:50.419Z|00194|ovn_northd|INFO|OVNSB commit failed, force recompute next time.
2026-05-14T20:40:51.168Z|00195|inc_proc_eng|INFO|node: northd, recompute (forced) took 636ms
2026-05-14T20:40:53.789Z|00196|inc_proc_eng|INFO|node: routes, recompute (forced) took 2493ms
2026-05-14T20:40:56.286Z|00197|inc_proc_eng|INFO|node: lflow, recompute (forced) took 2257ms
2026-05-14T20:40:56.299Z|00198|timeval|WARN|Unreasonably long 5880ms poll interval (5852ms user, 0ms system)
----

This patch fixes the issue by skipping the forced recompute when the
SB transaction was aborted intentionally.
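
As a rough illustration of the control flow only (not the ovn-northd
code), the sketch below models one main-loop iteration with plain
booleans. The struct, the function names and the track_abort switch are
hypothetical, and the model assumes that an aborted transaction is
reported as a failed commit, which is what the log above suggests.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model of the state one main-loop iteration looks at. */
struct iter_input {
    bool activity;      /* Engine produced SB activity. */
    bool have_txn;      /* An SB transaction is open. */
    bool cfg_pending;   /* cur_cfg != next_cfg. */
    bool real_failure;  /* The commit would fail for an unrelated reason. */
};

/* Returns true if this iteration would request a forced recompute.
 * 'track_abort' selects the behaviour with (true) or without (false)
 * the intentional-abort flag. */
static bool
iteration_forces_recompute(const struct iter_input *in, bool track_abort)
{
    bool aborted_intentionally = false;
    bool commit_ok = !in->real_failure;

    if (!in->activity && in->have_txn && in->cfg_pending) {
        /* Assumption: the aborted txn shows up as a failed commit. */
        commit_ok = false;
        aborted_intentionally = true;
    }

    /* Only a failure that was not self-inflicted forces a recompute
     * when the abort is tracked. */
    return !commit_ok && !(track_abort && aborted_intentionally);
}

/* Counts forced recomputes over 'iterations' loops in which nb_cfg keeps
 * advancing but the engine produces no SB activity. */
static int
count_forced_recomputes(int iterations, bool track_abort)
{
    assert(iterations >= 0);

    struct iter_input in = {
        .activity = false,
        .have_txn = true,
        .cfg_pending = true,
        .real_failure = false,
    };
    int forced = 0;

    for (int i = 0; i < iterations; i++) {
        if (iteration_forces_recompute(&in, track_abort)) {
            forced++;
        }
    }
    return forced;
}
```

In this model, count_forced_recomputes(n, false) requests a recompute on
every iteration, while count_forced_recomputes(n, true) requests none.
An iteration with real_failure set and activity present still forces a
recompute either way.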

Fixes: 56819f04166b ("northd: Prevent ovn-nbctl --wait=sb from returning
early.")

CC: Ales Musil <[email protected]>
Assisted-by: Claude Opus 4.7, Claude Code
Signed-off-by: Numan Siddique <[email protected]>
---
 northd/ovn-northd.c | 13 +++++++++++--
 1 file changed, 11 insertions(+), 2 deletions(-)

diff --git a/northd/ovn-northd.c b/northd/ovn-northd.c
index 7e0f888c1a..e2d1066ada 100644
--- a/northd/ovn-northd.c
+++ b/northd/ovn-northd.c
@@ -1113,10 +1113,18 @@ main(int argc, char *argv[])
 
                 /* Make sure we don't bump the next_cfg when we shouldn't.
                  * This should prevent ovn-nbctl sync calls to return before
-                 * the SB updates are actually done. */
+                 * the SB updates are actually done.
+                 *
+                 * Track that the abort was intentional so we can distinguish
+                 * it from a real commit failure below; otherwise the abort
+                 * would feed back into the force-recompute path, creating a
+                 * busy loop whenever NB.nb_cfg keeps advancing but the
+                 * engine produces no SB activity. */
+                bool ovnsb_txn_aborted_intentionally = false;
                 if (!activity && ovnsb_txn &&
                     ovnsb_idl_loop.cur_cfg != ovnsb_idl_loop.next_cfg) {
                     ovsdb_idl_txn_abort(ovnsb_txn);
+                    ovnsb_txn_aborted_intentionally = true;
                 }
 
                 /* If there are any errors, we force a full recompute in order
@@ -1127,7 +1135,8 @@ main(int argc, char *argv[])
                     inc_proc_northd_force_recompute_immediate();
                 }
 
-                if (!ovsdb_idl_loop_commit_and_wait(&ovnsb_idl_loop)) {
+                if (!ovsdb_idl_loop_commit_and_wait(&ovnsb_idl_loop) &&
+                    !ovnsb_txn_aborted_intentionally) {
                     VLOG_INFO("OVNSB commit failed, "
                               "force recompute next time.");
                     inc_proc_northd_force_recompute_immediate();
-- 
2.53.0
