According to the ovn-architecture(7) sb_cfg counter should be propagated at the moment northd completed transaction of related changes to the southbound db. However, a processing engine call happened right between two events, a sb transaction commitment, and initiating the sb_cfg write. Normally, that processing engine call has nothing to do, because it has only update from sb db that fires back result of recent transaction from northd. But if, by some reason, engine change handler falls into full recompute there will be big delay before sb_cfg propagated to nb db, and in fact it really happened in old release, for example 22.09.
The fix defers engine run for one iteration (of main loop) if we have sb_cfg ready to propagate. Signed-off-by: Aleksandr Smirnov <alekssmirnov@k2.cloud> --- v2: Address Mark's comments. --- northd/ovn-northd.c | 23 +++++++++++++++++++++-- tests/ovn-northd.at | 43 +++++++++++++++++++++++++++++++++++++++++++ 2 files changed, 64 insertions(+), 2 deletions(-) diff --git a/northd/ovn-northd.c b/northd/ovn-northd.c index c11839e2b..e1e0b784c 100644 --- a/northd/ovn-northd.c +++ b/northd/ovn-northd.c @@ -798,6 +798,16 @@ run_memory_trimmer(struct ovsdb_idl *ovnnb_idl, bool activity) memory_trimmer_wait(mt); } +static bool +sb_cfg_is_updated(struct ovsdb_idl *ovnnb_idl, + struct ovsdb_idl_loop *sb_loop) +{ + const struct nbrec_nb_global *nb = nbrec_nb_global_first(ovnnb_idl); + + return (nb && sb_loop->cur_cfg && nb->sb_cfg != sb_loop->cur_cfg) + ? true : false; +} + int main(int argc, char *argv[]) { @@ -1063,8 +1073,17 @@ main(int argc, char *argv[]) if (ovnnb_txn && ovnsb_txn && inc_proc_northd_can_run(&eng_ctx)) { int64_t loop_start_time = time_wall_msec(); - activity = inc_proc_northd_run(ovnnb_txn, ovnsb_txn, - &eng_ctx); + + if (!sb_cfg_is_updated(ovnnb_idl_loop.idl, + &ovnsb_idl_loop)) { + activity = inc_proc_northd_run(ovnnb_txn, + ovnsb_txn, + &eng_ctx); + } else { + poll_immediate_wake(); + clear_idl_track = false; + } + check_and_add_supported_dhcp_opts_to_sb_db( ovnsb_txn, ovnsb_idl_loop.idl); check_and_add_supported_dhcpv6_opts_to_sb_db( diff --git a/tests/ovn-northd.at b/tests/ovn-northd.at index 0ddb12027..2be464bcb 100644 --- a/tests/ovn-northd.at +++ b/tests/ovn-northd.at @@ -16569,3 +16569,46 @@ check_row_count Advertised_Route 0 AT_CLEANUP ]) + +OVN_FOR_EACH_NORTHD_NO_HV([ +AT_SETUP([sb_cfg propagation]) + +ovn_start + +# +# Test engine call does not happen between sb db transaction +# commitment and sb_cfg write to the nb db (if have to) +# + +rm -f northd/ovn-northd.log +check as northd ovn-appctl -t ovn-northd vlog/reopen +check as northd ovn-appctl -t ovn-northd vlog/set jsonrpc:dbg +check as northd ovn-appctl -t ovn-northd vlog/set inc_proc_eng:dbg +# Any change that invoke engine processing, for example add logical switch. +# nb_cfg bump must be present in order to get feedback as sb_cfg. +check ovn-nbctl --wait=sb ls-add sw1 -- set nb_global . nb_cfg=1 + +# +# Get following messages from log file: +# 'unix... transact ... Southbound' --> Indicates the pack of updates has been +# committed to the sb db. +# 'unix... transact ... sb_cfg' --> Indicates the sb_cfg has been committed to +# the sb db. +# 'Initializing new run' --> Indicates the engine has been called. +# +# Then, take first letter of messages, here 'u' or 'I' and form them into one +# string. +# +# Finally, a 'good' string should have two 'u' conjuncted with several 'I' +# both as prefix and suffix, while 'bad' string will have 'I' between two 'u'. +# + +call_seq=`grep -E \ + "(transact.*sb_cfg)|(transact.*Southbound)|(Initializing new run)" \ + northd/ovn-northd.log | cut -d "|" -f 5- | cut -b 1 | tr -d '\n'` + +AT_CHECK([echo $call_seq | grep -qE "^I*uuI*$"], [0]) + +AT_CLEANUP +]) + -- 2.47.0 _______________________________________________ dev mailing list d...@openvswitch.org https://mail.openvswitch.org/mailman/listinfo/ovs-dev