On Thu, May 14, 2026 at 11:56 PM <[email protected]> wrote:
> From: Numan Siddique <[email protected]>
>
Hi Numan,

Thank you for the fix. There is one issue with the commit message, see
below.
>
> The commit in the 'Fixes' tag tried to fix the ovn-nbctl sync issue by
> aborting any SB transaction by calling ovsdb_idl_txn_abort(ovnsb_txn)
> to avoid syncing the nb_cfg to SB DB while a SB transaction is in
> progress.
>
> However the ovsdb_idl_txn_abort() would feed back into the
> force-recompute path, creating a busy loop whenever NB.nb_cfg
> keeps advancing but the engine produces no SB activity.
>
> The issue can be easily reproduced by running the below commands
> if ovn-northd takes 2 or more seconds to finish its recompute.
>
> ovn-nbctl set nb_global . nb_cfg=2 -- ls-add sw2 &&
> ovn-nbctl set nb_global . nb_cfg=3.
>
> In the above command ideally ovn-northd lflow engine
> should recompute only once, but it would recompute
> twice with the below log messages seen in northd.
>
> ---
These three dashes are why the 0-day bot complains. They should be
replaced with something else, because git am treats a line starting
with "---" as the end of the commit message and drops the rest.
> 2026-05-14T20:40:50.374Z|00176|coverage|INFO|124 events never hit
> 2026-05-14T20:40:50.418Z|00177|ovn_northd|INFO|OVNSB commit failed, force recompute next time.
> 2026-05-14T20:40:50.419Z|00178|ovn_northd|INFO|OVNSB commit failed, force recompute next time.
> ...
> 2026-05-14T20:40:50.419Z|00193|ovn_northd|INFO|OVNSB commit failed, force recompute next time.
> 2026-05-14T20:40:50.419Z|00194|ovn_northd|INFO|OVNSB commit failed, force recompute next time.
> 2026-05-14T20:40:51.168Z|00195|inc_proc_eng|INFO|node: northd, recompute (forced) took 636ms
> 2026-05-14T20:40:53.789Z|00196|inc_proc_eng|INFO|node: routes, recompute (forced) took 2493ms
> 2026-05-14T20:40:56.286Z|00197|inc_proc_eng|INFO|node: lflow, recompute (forced) took 2257ms
> 2026-05-14T20:40:56.299Z|00198|timeval|WARN|Unreasonably long 5880ms poll interval (5852ms user, 0ms system)
> ----
>
Same here.
>
> This patch fixes this issue by avoiding the force recompute if
> the sb txn was aborted intentionally.
>
> Fixes: 56819f04166b ("northd: Prevent ovn-nbctl --wait=sb from returning early.")
>
> CC: Ales Musil <[email protected]>
> Assisted-by: Claude Opus 4.7, Claude Code
> Signed-off-by: Numan Siddique <[email protected]>
> ---
> northd/ovn-northd.c | 13 +++++++++++--
> 1 file changed, 11 insertions(+), 2 deletions(-)
>
> diff --git a/northd/ovn-northd.c b/northd/ovn-northd.c
> index 7e0f888c1a..e2d1066ada 100644
> --- a/northd/ovn-northd.c
> +++ b/northd/ovn-northd.c
> @@ -1113,10 +1113,18 @@ main(int argc, char *argv[])
>
> /* Make sure we don't bump the next_cfg when we shouldn't.
> * This should prevent ovn-nbctl sync calls to return before
> - * the SB updates are actually done. */
> + * the SB updates are actually done.
> + *
> + * Track that the abort was intentional so we can distinguish
> + * it from a real commit failure below; otherwise the abort
> + * would feed back into the force-recompute path, creating a
> + * busy loop whenever NB.nb_cfg keeps advancing but the
> + * engine produces no SB activity. */
> + bool ovnsb_txn_aborted_intentionally = false;
> if (!activity && ovnsb_txn &&
> ovnsb_idl_loop.cur_cfg != ovnsb_idl_loop.next_cfg) {
> ovsdb_idl_txn_abort(ovnsb_txn);
> + ovnsb_txn_aborted_intentionally = true;
> }
>
> /* If there are any errors, we force a full recompute in order
> @@ -1127,7 +1135,8 @@ main(int argc, char *argv[])
> inc_proc_northd_force_recompute_immediate();
> }
>
> - if (!ovsdb_idl_loop_commit_and_wait(&ovnsb_idl_loop)) {
> + if (!ovsdb_idl_loop_commit_and_wait(&ovnsb_idl_loop) &&
> + !ovnsb_txn_aborted_intentionally) {
> VLOG_INFO("OVNSB commit failed, "
> "force recompute next time.");
> inc_proc_northd_force_recompute_immediate();
> --
> 2.53.0
>
>
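For anyone following the thread, the feedback loop described in the
commit message can be modeled with a small standalone sketch. It does
not use the real OVSDB IDL or ovn-northd code: struct loop_model,
model_commit(), model_iteration() and model_run() are hypothetical
names, and the model assumes (as the commit message implies) that a
commit of an aborted transaction is reported as a failure and that a
forced recompute produces no SB activity.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model of the ovn-northd main loop state.
 * None of these names exist in OVN or OVS. */
struct loop_model {
    bool force_recompute;    /* Requested by the previous iteration. */
    int recomputes;          /* Number of forced recomputes performed. */
    int commit_failed_logs;  /* Number of "OVNSB commit failed" messages. */
};

/* Stand-in for the SB commit step.  Assumption: committing an aborted
 * transaction is reported as a failure (returns false). */
static bool
model_commit(bool txn_aborted)
{
    return !txn_aborted;
}

/* One main loop iteration.  'cfg_pending' models
 * cur_cfg != next_cfg, 'with_fix' selects the patched behavior. */
static void
model_iteration(struct loop_model *m, bool cfg_pending, bool with_fix)
{
    /* Assumption: the engine never produces SB activity in this model. */
    bool activity = false;

    if (m->force_recompute) {
        m->recomputes++;
        m->force_recompute = false;
    }

    /* Abort the SB transaction so nb_cfg is not synced too early. */
    bool aborted_intentionally = false;
    if (!activity && cfg_pending) {
        aborted_intentionally = true;
    }

    bool committed = model_commit(aborted_intentionally);

    /* Without the fix, the intentional abort is indistinguishable from
     * a real commit failure and schedules another forced recompute. */
    if (!committed && !(with_fix && aborted_intentionally)) {
        m->commit_failed_logs++;
        m->force_recompute = true;
    }
}

/* Runs 'iterations' loop iterations with a pending cfg change. */
static struct loop_model
model_run(int iterations, bool with_fix)
{
    assert(iterations >= 0);

    struct loop_model m = { false, 0, 0 };
    for (int i = 0; i < iterations; i++) {
        model_iteration(&m, true, with_fix);
    }
    return m;
}
```

In this model, model_run(5, false) logs the failure on every iteration
and performs a forced recompute on each one after the first, while
model_run(5, true) logs nothing and never recomputes.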
With that addressed:
Acked-by: Ales Musil <[email protected]>