On Fri, May 15, 2026 at 4:46 AM Ales Musil <[email protected]> wrote:
>
>
>
> On Thu, May 14, 2026 at 11:56 PM <[email protected]> wrote:
>>
>> From: Numan Siddique <[email protected]>
>
>
> Hi Numan
>
> thank you for the fix. There is one issue with the commit message, see below.
>
>>
>>
>> The commit in the 'Fixes' tag tried to fix the ovn-nbctl sync issue by
>> aborting any SB transaction by calling ovsdb_idl_txn_abort(ovnsb_txn)
>> to avoid syncing the nb_cfg to SB DB while a SB transaction is in
>> progress.
>>
>> However the ovsdb_idl_txn_abort() would feed back into the
>> force-recompute path, creating a busy loop whenever NB.nb_cfg
>> keeps advancing but the engine produces no SB activity.
>>
>> The issue can be easily reproduced by running the below commands
>> if ovn-northd takes 2 or more seconds to finish its recompute.
>>
>> ovn-nbctl set nb_global . nb_cfg=2 -- ls-add sw2 &&
>> ovn-nbctl set nb_global . nb_cfg=3
>>
>> In the above command ideally ovn-northd lflow engine
>> should recompute only once, but it would recompute
>> twice with the below log messages seen in northd.
>>
>> ---
>
>
> Those three dashes are why the 0-day bot complains.
> They should be replaced with something else to avoid cutting
> out part of the commit message.
>
>> 2026-05-14T20:40:50.374Z|00176|coverage|INFO|124 events never hit
>> 2026-05-14T20:40:50.418Z|00177|ovn_northd|INFO|OVNSB commit failed, force recompute next time.
>> 2026-05-14T20:40:50.419Z|00178|ovn_northd|INFO|OVNSB commit failed, force recompute next time.
>> ...
>> 2026-05-14T20:40:50.419Z|00193|ovn_northd|INFO|OVNSB commit failed, force recompute next time.
>> 2026-05-14T20:40:50.419Z|00194|ovn_northd|INFO|OVNSB commit failed, force recompute next time.
>> 2026-05-14T20:40:51.168Z|00195|inc_proc_eng|INFO|node: northd, recompute (forced) took 636ms
>> 2026-05-14T20:40:53.789Z|00196|inc_proc_eng|INFO|node: routes, recompute (forced) took 2493ms
>> 2026-05-14T20:40:56.286Z|00197|inc_proc_eng|INFO|node: lflow, recompute (forced) took 2257ms
>> 2026-05-14T20:40:56.299Z|00198|timeval|WARN|Unreasonably long 5880ms poll interval (5852ms user, 0ms system)
>> ----
>
>
> Same here.
>
>>
>>
>> This patch fixes this issue by avoiding the force recompute if
>> the sb txn was aborted intentionally.
>>
>> Fixes: 56819f04166b ("northd: Prevent ovn-nbctl --wait=sb from returning early.")
>>
>> CC: Ales Musil <[email protected]>
>> Assisted-by: Claude Opus 4.7, Claude Code
>> Signed-off-by: Numan Siddique <[email protected]>
>> ---
>>  northd/ovn-northd.c | 13 +++++++++++--
>>  1 file changed, 11 insertions(+), 2 deletions(-)
>>
>> diff --git a/northd/ovn-northd.c b/northd/ovn-northd.c
>> index 7e0f888c1a..e2d1066ada 100644
>> --- a/northd/ovn-northd.c
>> +++ b/northd/ovn-northd.c
>> @@ -1113,10 +1113,18 @@ main(int argc, char *argv[])
>>
>>                  /* Make sure we don't bump the next_cfg when we shouldn't.
>>                   * This should prevent ovn-nbctl sync calls to return before
>> -                 * the SB updates are actually done. */
>> +                 * the SB updates are actually done.
>> +                 *
>> +                 * Track that the abort was intentional so we can distinguish
>> +                 * it from a real commit failure below; otherwise the abort
>> +                 * would feed back into the force-recompute path, creating a
>> +                 * busy loop whenever NB.nb_cfg keeps advancing but the
>> +                 * engine produces no SB activity. */
>> +                bool ovnsb_txn_aborted_intentionally = false;
>>                  if (!activity && ovnsb_txn &&
>>                      ovnsb_idl_loop.cur_cfg != ovnsb_idl_loop.next_cfg) {
>>                      ovsdb_idl_txn_abort(ovnsb_txn);
>> +                    ovnsb_txn_aborted_intentionally = true;
>>                  }
>>
>>                  /* If there are any errors, we force a full recompute in order
>> @@ -1127,7 +1135,8 @@ main(int argc, char *argv[])
>>                      inc_proc_northd_force_recompute_immediate();
>>                  }
>>
>> -                if (!ovsdb_idl_loop_commit_and_wait(&ovnsb_idl_loop)) {
>> +                if (!ovsdb_idl_loop_commit_and_wait(&ovnsb_idl_loop) &&
>> +                    !ovnsb_txn_aborted_intentionally) {
>>                      VLOG_INFO("OVNSB commit failed, "
>>                                "force recompute next time.");
>>                      inc_proc_northd_force_recompute_immediate();
>> --
>> 2.53.0
>>
>
> With that addressed:
> Acked-by: Ales Musil <[email protected]>

Thanks Ales.  I fixed the commit message and applied the patch to main,
and I've backported it down to 25.03.
I'll backport it further, down to 24.03, a bit later.

Thanks
Numan
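
For readers of the archive, the decision this patch changes can be sketched
in isolation. The snippet below is only a simplified model under stated
assumptions, not the ovn-northd code: struct iter_state and
should_force_recompute() are hypothetical names, and the model assumes, as
the commit message describes, that an aborted SB transaction is reported as
a failed commit.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the state of one main-loop iteration. */
struct iter_state {
    bool activity;    /* The engine produced SB changes this iteration. */
    bool has_txn;     /* An SB transaction is open. */
    long cur_cfg;     /* Models ovnsb_idl_loop.cur_cfg. */
    long next_cfg;    /* Models ovnsb_idl_loop.next_cfg. */
    bool commit_ok;   /* Commit result if the txn is not aborted. */
};

/* Returns true if a forced recompute would be scheduled.
 * 'track_abort' selects the patched behaviour (true) or the
 * behaviour before the patch (false). */
static bool
should_force_recompute(const struct iter_state *s, bool track_abort)
{
    assert(s != NULL);

    bool commit_ok = s->commit_ok;
    bool aborted_intentionally = false;

    if (!s->activity && s->has_txn && s->cur_cfg != s->next_cfg) {
        /* Models ovsdb_idl_txn_abort(): assumed to make the later
         * commit call report failure. */
        commit_ok = false;
        aborted_intentionally = track_abort;
    }

    /* Only a commit failure that was not self-inflicted forces a
     * recompute. */
    return !commit_ok && !aborted_intentionally;
}
```

With no engine activity and cur_cfg != next_cfg, the model schedules a
forced recompute when track_abort is false and skips it when track_abort is
true. A commit failure without an intentional abort still forces a
recompute in both modes.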
_______________________________________________
dev mailing list
[email protected]
https://mail.openvswitch.org/mailman/listinfo/ovs-dev
