On Fri, May 30, 2025 at 10:00:58AM +0000, Smirnov Aleksandr (K2 Cloud) wrote:
> On 5/27/25 5:54 PM, Felix Huettner wrote:
>
> Hi Felix,
>
> I am sorry, I misunderstood your plan with the countdown.
> If you want to trigger it after the first update is utilized, all is
> correct.
> So, I cancel my review complaint and will use the suggested function in
> my code instead of a direct compare of ovnsb_expected_cond_seqno vs
> ovnsb_cond_seqno.
Hi Alexander,

no worries, we still found an issue with this discussion, thanks a lot.

Felix

>
> Thank you,
>
> Alexander
>
> > On Fri, May 23, 2025 at 10:02:59AM +0000, Smirnov Aleksandr (K2 Cloud)
> > wrote:
> >> Hi Felix,
> >>
> >> Answering your proposal: yes, I can reuse this solution instead of the
> >> direct check of seqno vs expected seqno.
> >> However, I have inspected what you are doing in your patch and want to
> >> point out one mistake (at least in my opinion).
> >
> > Hi Alexander,
> >
> > thanks for the review.
> >
> >> When I tried to play with the seqno/expected seqno numbers I realized
> >> that they are only set to meaningful values BELOW the line
> >>
> >>     ovnsb_expected_cond_seqno = update_sb_monitors(
> >>         ovnsb_idl_loop.idl, chassis,
> >>         &runtime_data->local_lports,
> >>         &runtime_data->lbinding_data.bindings,
> >>         &runtime_data->local_datapaths,
> >>         sb_monitor_all);
> >
> > So per default ovnsb_expected_cond_seqno is initialized to UINT_MAX.
> > Then I see two locations where it might be set to an actual value:
> >
> > 1. At the start of the main loop in update_sb_db, but only if
> >    monitor_all is set.
> > 2. The point you mentioned that happens after the engine computed
> >    completely.
> >
> >> So, if you want to make a meaningful comparison of the seqnos to
> >> answer the question 'are there upcoming updates?' you have to do that
> >> check after the update_sb_monitors() call and before the next
> >> iteration of the main loop.
> >
> > Or at the start of the main loop if we are sure that
> > ovnsb_expected_cond_seqno is valid. I did this in my patch with an
> > `ovnsb_expected_cond_seqno != UINT_MAX` check.
> >
> >> Otherwise (again, in my opinion) your code will always hit the guard
> >> checks (time, max number, etc.).
> >
> > I am not sure if I got that point correctly. I want to share an example
> > case based on my understanding of what you mean. You can tell me if I
> > got it right:
> >
> > We assume a case where monitor-all=false.
> > When starting ovn-controller we will connect to the local ovsdb and
> > southbound. We then run the engine for the first time and set the
> > initial monitoring conditions.
> > At this point ovnsb_expected_cond_seqno is valid and larger than
> > ovnsb_cond_seqno.
> >
> > Before we now get the initial monitor results from southbound,
> > something happens that prevents the engine from running
> > (e.g. northd_version_match is false).
> > At some point the initial monitor updates arrive, which will cause
> > ovnsb_expected_cond_seqno == ovnsb_cond_seqno. At least in my patch,
> > any further main loop iteration would now cause
> > daemon_started_recently_countdown to be triggered, even though the
> > engine never had the chance to update and define a new
> > ovnsb_expected_cond_seqno.
> >
> > If the engine can then start running again, daemon_started_recently()
> > might return true, even though we did not yet allow for up to 20
> > initial condition changes.
> >
> >> This is still a dark area for me, but at least I need to share my
> >> understanding of the seqno life cycle.
> >
> > Same for me, and I think you found a valid point.
> >
> > However, it would be great if you could confirm the above
> > understanding. If you meant a different point, then maybe I need to
> > fix something else as well.
> >
> > I will try to fix the above and send a new version of the patchset if
> > you confirm that this was the point you meant.
> >
> > So thank you very much for finding that.
> >
> > Thanks,
> > Felix
> >
> >> Thank you,
> >>
> >> Alexander
> >>
> >> On 5/7/25 5:26 PM, Felix Huettner wrote:
> >>> On Tue, Mar 18, 2025 at 01:08:12PM +0300, Aleksandr Smirnov wrote:
> >>>> Delay the initial flow upload to OVS until all monitored updates
> >>>> are received.
> >>>> This is a kind of replacement for the wait-before-clear parameter.
> >>>> Instead of waiting a certain amount of time, we will wait
> >>>> until the final monitored update comes from the SB DB.
> >>>> The event we watch in the controller's main loop is the current
> >>>> condition sequence number compared with the expected condition
> >>>> sequence number.
> >>>> If they are not equal, we still have to receive updates in response
> >>>> to a recent monitor condition change request. This check makes sense
> >>>> only in code that lies after the update_sb_monitors() function call.
> >>>>
> >>>> Note, this update will only work if wait-before-clear == 0,
> >>>> i.e. you can still rely on the wait-before-clear behavior.
> >>>>
> >>>> Signed-off-by: Aleksandr Smirnov <alekssmirnov@k2.cloud>
> >>>> ---
> >>>>  controller/ofctrl.c         | 29 ++++++++++++++++-
> >>>>  controller/ofctrl.h         |  3 +-
> >>>>  controller/ovn-controller.c |  4 ++-
> >>>>  tests/ovn-controller.at     | 62 +++++++++++++++++++++++++++++++++++++
> >>>>  4 files changed, 95 insertions(+), 3 deletions(-)
> >>>>
> >>>> diff --git a/controller/ofctrl.c b/controller/ofctrl.c
> >>>> index 4a3d35b97..304b9bbc8 100644
> >>>> --- a/controller/ofctrl.c
> >>>> +++ b/controller/ofctrl.c
> >>>> @@ -349,6 +349,9 @@ static uint64_t cur_cfg;
> >>>>  /* Current state. */
> >>>>  static enum ofctrl_state state;
> >>>>
> >>>> +/* Release wait before clear stage. */
> >>>> +static bool wait_before_clear_proceed = false;
> >>>> +
> >>>>  /* The time (ms) to stay in the state S_WAIT_BEFORE_CLEAR. Read from
> >>>>   * external_ids: ovn-ofctrl-wait-before-clear. */
> >>>>  static unsigned int wait_before_clear_time = 0;
> >>>> @@ -446,6 +449,7 @@ run_S_NEW(void)
> >>>>      struct ofpbuf *buf = ofpraw_alloc(OFPRAW_NXT_TLV_TABLE_REQUEST,
> >>>>                                        rconn_get_version(swconn), 0);
> >>>>      xid = queue_msg(buf);
> >>>> +    wait_before_clear_proceed = false;
> >>>>      state = S_TLV_TABLE_REQUESTED;
> >>>>  }
> >>>>
> >>>> @@ -638,6 +642,14 @@ error:
> >>>>  static void
> >>>>  run_S_WAIT_BEFORE_CLEAR(void)
> >>>>  {
> >>>> +    if (wait_before_clear_time == 0) {
> >>>> +        if (wait_before_clear_proceed) {
> >>>> +            state = S_CLEAR_FLOWS;
> >>>> +        }
> >>>> +
> >>>> +        return;
> >>>> +    }
> >>>> +
> >>>>      if (!wait_before_clear_time ||
> >>>>          (wait_before_clear_expire &&
> >>>>           time_msec() >= wait_before_clear_expire)) {
> >>>> @@ -2695,11 +2707,26 @@ ofctrl_put(struct ovn_desired_flow_table *lflow_table,
> >>>>             uint64_t req_cfg,
> >>>>             bool lflows_changed,
> >>>>             bool pflows_changed,
> >>>> -           struct tracked_acl_ids *tracked_acl_ids)
> >>>> +           struct tracked_acl_ids *tracked_acl_ids,
> >>>> +           bool monitor_cond_complete)
> >>>>  {
> >>>>      static bool skipped_last_time = false;
> >>>>      static uint64_t old_req_cfg = 0;
> >>>>      bool need_put = false;
> >>>> +
> >>>> +    if (state == S_WAIT_BEFORE_CLEAR) {
> >>>> +        /* If no more monitored condition changes are expected,
> >>>> +         * release the wait before clear stage and skip over
> >>>> +         * poll wait. */
> >>>> +        if (monitor_cond_complete) {
> >>>> +            wait_before_clear_proceed = true;
> >>>> +            poll_immediate_wake();
> >>>> +        }
> >>>> +
> >>>> +        skipped_last_time = true;
> >>>> +        return;
> >>>> +    }
> >>>> +
> >>>>      if (lflows_changed || pflows_changed || skipped_last_time ||
> >>>>          ofctrl_initial_clear) {
> >>>>          need_put = true;
> >>>> diff --git a/controller/ofctrl.h b/controller/ofctrl.h
> >>>> index d1ee69cb0..76e2fbece 100644
> >>>> --- a/controller/ofctrl.h
> >>>> +++ b/controller/ofctrl.h
> >>>> @@ -68,7 +68,8 @@ void ofctrl_put(struct ovn_desired_flow_table *lflow_table,
> >>>>                  uint64_t nb_cfg,
> >>>>                  bool lflow_changed,
> >>>>                  bool pflow_changed,
> >>>> -                struct tracked_acl_ids *tracked_acl_ids);
> >>>> +                struct tracked_acl_ids *tracked_acl_ids,
> >>>> +                bool monitor_cond_complete);
> >>>>  bool ofctrl_has_backlog(void);
> >>>>  void ofctrl_wait(void);
> >>>>  void ofctrl_destroy(void);
> >>>> diff --git a/controller/ovn-controller.c b/controller/ovn-controller.c
> >>>> index bc8acddf1..2623fc758 100644
> >>>> --- a/controller/ovn-controller.c
> >>>> +++ b/controller/ovn-controller.c
> >>>> @@ -6384,7 +6384,9 @@ main(int argc, char *argv[])
> >>>>                             ofctrl_seqno_get_req_cfg(),
> >>>>                             engine_node_changed(&en_lflow_output),
> >>>>                             engine_node_changed(&en_pflow_output),
> >>>> -                           tracked_acl_ids);
> >>>> +                           tracked_acl_ids,
> >>>> +                           ovnsb_cond_seqno
> >>>> +                           == ovnsb_expected_cond_seqno);
> >>>
> >>> Hi Aleksandr,
> >>>
> >>> I just wanted to let you know that we could probably combine this
> >>> with
> >>> https://patchwork.ozlabs.org/project/ovn/patch/67010dca0a8dcc390e873349d662c9163e01050f.1746623091.git.felix.huettner@stackit.cloud/
> >>> then you could use "!daemon_started_recently()" here.
> >>>
> >>> But that is just for information.
> >>>
> >>> Thanks a lot,
> >>> Felix
> >>>
> >>>>          stopwatch_stop(OFCTRL_PUT_STOPWATCH_NAME,
> >>>>                         time_msec());
> >>>>      }
> >>>>      stopwatch_start(OFCTRL_SEQNO_RUN_STOPWATCH_NAME,
> >>>> diff --git a/tests/ovn-controller.at b/tests/ovn-controller.at
> >>>> index efb0a1741..812616711 100644
> >>>> --- a/tests/ovn-controller.at
> >>>> +++ b/tests/ovn-controller.at
> >>>> @@ -2463,6 +2463,68 @@ AT_CHECK([ovs-ofctl dump-flows br-int | grep group -c], [0], [3
> >>>>  OVN_CLEANUP([hv1])
> >>>>  AT_CLEANUP
> >>>>
> >>>> +OVN_FOR_EACH_NORTHD([
> >>>> +AT_SETUP([ovn-controller - ofctrl delay until all monitored updates come])
> >>>> +
> >>>> +# Prepare testing configuration
> >>>> +ovn_start
> >>>> +
> >>>> +net_add n1
> >>>> +sim_add hv1
> >>>> +as hv1
> >>>> +check ovs-vsctl add-br br-phys
> >>>> +ovn_attach n1 br-phys 192.168.0.1
> >>>> +check ovs-vsctl -- add-port br-int hv1-vif1 -- \
> >>>> +    set interface hv1-vif1 external-ids:iface-id=ls1-lp1
> >>>> +
> >>>> +check ovn-nbctl ls-add ls1
> >>>> +
> >>>> +check ovn-nbctl lsp-add ls1 ls1-lp1 \
> >>>> +-- lsp-set-addresses ls1-lp1 "f0:00:00:00:00:01 10.1.2.3"
> >>>> +
> >>>> +check ovn-nbctl lb-add lb1 1.1.1.1 10.1.2.3 \
> >>>> +-- ls-lb-add ls1 lb1
> >>>> +
> >>>> +check ovn-nbctl lb-add lb2 2.2.2.2 10.1.2.4 \
> >>>> +-- ls-lb-add ls1 lb2
> >>>> +
> >>>> +check ovn-nbctl --wait=hv sync
> >>>> +
> >>>> +# Stop ovn-controller
> >>>> +OVS_APP_EXIT_AND_WAIT([ovn-controller])
> >>>> +
> >>>> +# Since we test the initial flow upload, the controller is
> >>>> +# restarted. Clear the log file and start the controller.
> >>>> +rm -f hv1/ovn-controller.log
> >>>> +start_daemon ovn-controller -vfile:jsonrpc:dbg -vfile:ofctrl:dbg
> >>>> +
> >>>> +# Monitor the log file until the flows are finally uploaded to OVS.
> >>>> +OVS_WAIT_UNTIL([grep -q 'Setting lport.*in OVS' hv1/ovn-controller.log])
> >>>> +
> >>>> +# Analyse the log file, selecting records about:
> >>>> +# 1. monitor_cond changes made for the SB DB (message class is
> >>>> +#    'jsonrpc')
> >>>> +# 2. the 'clearing all flows' message, which is issued after the
> >>>> +#    'wait before clear' stage is released (message class is
> >>>> +#    'ofctrl')
> >>>> +#
> >>>> +# We expect here that all monitoring condition changes are made
> >>>> +# before the OVS flows are cleared / uploaded.
> >>>> +# For now all monitoring updates come in three iterations (initial,
> >>>> +# direct dps, indirect dps), which corresponds to three messages of
> >>>> +# type 1 followed by one message of type 2.
> >>>> +#
> >>>> +# For monitor-all=true: one message of type 1 followed by one
> >>>> +# message of type 2.
> >>>> +#
> >>>> +# Then we cut off the message class and take its first letter
> >>>> +# (j for jsonrpc and o for ofctrl).
> >>>> +#
> >>>> +call_seq=`grep -E \
> >>>> +    "(clearing all flows)|(monitor_cond.*South)" \
> >>>> +    hv1/ovn-controller.log | cut -d "|" -f 3- | cut -b 1 | tr -d '\n'`
> >>>> +AT_CHECK([echo $call_seq | grep -qE "^j+o$"], [0])
> >>>> +
> >>>> +OVN_CLEANUP([hv1])
> >>>> +AT_CLEANUP
> >>>> +])
> >>>>
> >>>>  AT_SETUP([ovn-controller - check ovn-chassis-mac-mappings])
> >>>>
> >>>> --
> >>>> 2.47.0
> >>>>
> >>>> _______________________________________________
> >>>> dev mailing list
> >>>> d...@openvswitch.org
> >>>> https://mail.openvswitch.org/mailman/listinfo/ovs-dev