On Fri, May 15, 2026 at 4:46 AM Ales Musil <[email protected]> wrote: > > > > On Thu, May 14, 2026 at 11:56 PM <[email protected]> wrote: >> >> From: Numan Siddique <[email protected]> > > > Hi Numan > > thank you for the fix. There is one issue with commit message see down below. > >> >> >> The commit in the 'Fixes' tag tried to fix the ovn-nbctl sync issue by >> aborting any SB transaction by calling ovsdb_idl_txn_abort(ovnsb_txn) >> to avoid syncing the nb_cfg to SB DB while a SB transaction is in >> progress. >> >> However the ovsdb_idl_txn_abort() would feed back into the >> force-recompute path, creating a busy loop whenever NB.nb_cfg >> keeps advancing but the engine produces no SB activity. >> >> The issue can be easily reproduced by running the below commands >> if ovn-northd takes 2 or more seconds to finish its recompute. >> >> ovn-nbctl set nb_global . nb_cfg=2 -- ls-add sw2 && >> ovn-nbctl set nb_global . nb_cfg=3. >> >> In the above command ideally ovn-northd lflow engine >> should recompute only once, but it would recompute >> twice with the below log messages seen in northd. >> >> --- > > > Those three dashes are why the 0-day bot complains, > it should be replaced with something else to avoid cutting > out part of the commit message. > >> 2026-05-14T20:40:50.374Z|00176|coverage|INFO|124 events never hit >> 2026-05-14T20:40:50.418Z|00177|ovn_northd|INFO|OVNSB commit failed, force >> recompute next time. >> 2026-05-14T20:40:50.419Z|00178|ovn_northd|INFO|OVNSB commit failed, force >> recompute next time. >> ... >> 2026-05-14T20:40:50.419Z|00193|ovn_northd|INFO|OVNSB commit failed, force >> recompute next time. >> 2026-05-14T20:40:50.419Z|00194|ovn_northd|INFO|OVNSB commit failed, force >> recompute next time. >> 2026-05-14T20:40:51.168Z|00195|inc_proc_eng|INFO|node: northd, recompute >> (forced) took 636ms >> 2026-05-14T20:40:53.789Z|00196|inc_proc_eng|INFO|node: routes, recompute >> (forced) took 2493ms >> 2026-05-14T20:40:56.286Z|00197|inc_proc_eng|INFO|node: lflow, recompute >> (forced) took 2257ms >> 2026-05-14T20:40:56.299Z|00198|timeval|WARN|Unreasonably long 5880ms poll >> interval (5852ms user, 0ms system) >> ---- > > > Same here. > >> >> >> This patch fixes this issue by avoiding the force recompute if >> the sb txn was aborted intentionally. >> >> Fixes: 56819f04166b ("northd: Prevent ovn-nbctl --wait=sb from returning >> early.") >> >> CC: Ales Musil <[email protected]> >> Assisted-by: Claude Opus 4.7, Claude Code >> Signed-off-by: Numan Siddique <[email protected]> >> --- >> northd/ovn-northd.c | 13 +++++++++++-- >> 1 file changed, 11 insertions(+), 2 deletions(-) >> >> diff --git a/northd/ovn-northd.c b/northd/ovn-northd.c >> index 7e0f888c1a..e2d1066ada 100644 >> --- a/northd/ovn-northd.c >> +++ b/northd/ovn-northd.c >> @@ -1113,10 +1113,18 @@ main(int argc, char *argv[]) >> >> /* Make sure we don't bump the next_cfg when we shouldn't. >> * This should prevent ovn-nbctl sync calls to return before >> - * the SB updates are actually done. */ >> + * the SB updates are actually done. >> + * >> + * Track that the abort was intentional so we can >> distinguish >> + * it from a real commit failure below; otherwise the abort >> + * would feed back into the force-recompute path, creating a >> + * busy loop whenever NB.nb_cfg keeps advancing but the >> + * engine produces no SB activity. */ >> + bool ovnsb_txn_aborted_intentionally = false; >> if (!activity && ovnsb_txn && >> ovnsb_idl_loop.cur_cfg != ovnsb_idl_loop.next_cfg) { >> ovsdb_idl_txn_abort(ovnsb_txn); >> + ovnsb_txn_aborted_intentionally = true; >> } >> >> /* If there are any errors, we force a full recompute in >> order >> @@ -1127,7 +1135,8 @@ main(int argc, char *argv[]) >> inc_proc_northd_force_recompute_immediate(); >> } >> >> - if (!ovsdb_idl_loop_commit_and_wait(&ovnsb_idl_loop)) { >> + if (!ovsdb_idl_loop_commit_and_wait(&ovnsb_idl_loop) && >> + !ovnsb_txn_aborted_intentionally) { >> VLOG_INFO("OVNSB commit failed, " >> "force recompute next time."); >> inc_proc_northd_force_recompute_immediate(); >> -- >> 2.53.0 >> > > With that addressed: > Acked-by: Ales Musil <[email protected]>
Thanks Ales. I fixed the commit message error and applied to main and I've backported till 25.03. I'll backport until 24.03 in some time. Thanks Numan _______________________________________________ dev mailing list [email protected] https://mail.openvswitch.org/mailman/listinfo/ovs-dev
