Hi Ilya,

In the scope of your discussion with Han:
   My code have no guard check (by time of max uploads limit).
   If I see that some update is expected to come I delay openflow upload.
   It is done with checking sb db seqno compared vs expected seqno.
   All concerns I saw in your discussion thread are about uncertain ( != 
infinite) number of update messages
   that could potentially come from sb db.
   I think this is not a problem in theory, we have to wait until last 
update message comes.

   Practically, I have tried to test ovn-controller with big and close 
to real database.
   Even in that case number of updates do not exceed 3.
   A real problem is that first update is quite simple and, if applied, 
stops traffic on the given controller.
   Second update is huge and takes lot of time to generate/upload 
openflow -- during that time controller
   can be considered as 'down'. So, in very practical approach, I think 
is minimally enough to wait until second update
   processed and then run openflow clear+upload.

   Finally, I got a message from Felix that he is working on similar 
code that should delay controller startup until all updates received.
    
(https://patchwork.ozlabs.org/project/ovn/patch/67010dca0a8dcc390e873349d662c9163e01050f.1746623091.git.felix.huettner@stackit.cloud/)
   I have sent my review and concerns about his implementation.
   If his code will be successfully reviewed and merged I can just use 
his work to manage openflow uploads.


   Anyways, extra explanations will be added to the commit message.

Thank you,

Alexander

On 5/12/25 1:25 PM, Ilya Maximets wrote:
> On 5/6/25 10:50 PM, Numan Siddique wrote:
>> On Tue, Mar 18, 2025 at 6:08 AM Aleksandr Smirnov <alekssmirnov@k2.cloud> 
>> wrote:
>>> Delay initial flow upload to OVS until all monitored updates received.
>>> This is a kind of replacement of the wait-before-clear parameter.
>>> Instead of waiting a certain amount of time we will wait
>>> until the final monitored update comes from SB DB.
>>> An event we watch in the controller's main loop is current condition
>>> sequence number compared vs expected condition sequence number.
>>> If they are not equal we still have to receive updates in response
>>> to recent monitor condition change request. This check makes sense
>>> only in code that lies after update_sb_monitors() function call.
>>>
>>> Note, this update will only work if wait-before-clear == 0,
>>> i.e. you can still rely on wait-before-clear behavior.
>>>
>>> Signed-off-by: Aleksandr Smirnov <alekssmirnov@k2.cloud>
>> Hi Aleksandr,
>>
>> Sorry for the delay in reviewing this.  I was a little concerned about
>> this patch series initially
>> when I saw the patch commit message.  After reviewing this patch,  my
>> concerns are cleared
>> and I think it's a smart approach.
>>
>> I tested it locally with a pretty big sized OVN SB DB.  ovn-controller
>> installs around 400k openflow rules.
>> Without this patch,  ovn-controller takes around 10 seconds to add
>> back all the flows when it is restarted.
>> With this patch,  I don't see any delay.  The existing flows are
>> cleared and added back without any delay.
>>
>> I'd definitely like to get opinions from other maintainers too if they
>> have any concerns.
>>
>> @Han Zhou   As you added the option - ovn-ofctrl-wait-before-clear
>> option can you please take a look at it ?
>>
> We had a conversation about a similar change here before:
>    
> https://patchwork.ozlabs.org/project/ovn/patch/20220822091918.2804852-1-i.maxim...@ovn.org/
>
> I'm not sure if this patch is addressing all the concerns raised by Han
> in that thread.  I.e. how does this change behave when we have multiple
> stages of condition changes (more than 2)?  Or is it not an issue?
> At the very least, I'd like to see concerns from that thread to be
> addressed in the commit message, so we don't need to look for information
> in many different email chains.
>
> Best regards, Ilya Maximets.
>
>
>>  From my side
>>
>> Acked-by: Numan Siddque <num...@ovn.org>
>>
>> Numan
>>
>>
>>
>>> ---
>>>   controller/ofctrl.c         | 29 ++++++++++++++++-
>>>   controller/ofctrl.h         |  3 +-
>>>   controller/ovn-controller.c |  4 ++-
>>>   tests/ovn-controller.at     | 62 +++++++++++++++++++++++++++++++++++++
>>>   4 files changed, 95 insertions(+), 3 deletions(-)
>>>
>>> diff --git a/controller/ofctrl.c b/controller/ofctrl.c
>>> index 4a3d35b97..304b9bbc8 100644
>>> --- a/controller/ofctrl.c
>>> +++ b/controller/ofctrl.c
>>> @@ -349,6 +349,9 @@ static uint64_t cur_cfg;
>>>   /* Current state. */
>>>   static enum ofctrl_state state;
>>>
>>> +/* Release wait before clear stage. */
>>> +static bool wait_before_clear_proceed = false;
>>> +
>>>   /* The time (ms) to stay in the state S_WAIT_BEFORE_CLEAR. Read from
>>>    * external_ids: ovn-ofctrl-wait-before-clear. */
>>>   static unsigned int wait_before_clear_time = 0;
>>> @@ -446,6 +449,7 @@ run_S_NEW(void)
>>>       struct ofpbuf *buf = ofpraw_alloc(OFPRAW_NXT_TLV_TABLE_REQUEST,
>>>                                         rconn_get_version(swconn), 0);
>>>       xid = queue_msg(buf);
>>> +    wait_before_clear_proceed = false;
>>>       state = S_TLV_TABLE_REQUESTED;
>>>   }
>>>
>>> @@ -638,6 +642,14 @@ error:
>>>   static void
>>>   run_S_WAIT_BEFORE_CLEAR(void)
>>>   {
>>> +    if (wait_before_clear_time == 0) {
>>> +        if (wait_before_clear_proceed) {
>>> +            state = S_CLEAR_FLOWS;
>>> +        }
>>> +
>>> +        return;
>>> +    }
>>> +
>>>       if (!wait_before_clear_time ||
>>>           (wait_before_clear_expire &&
>>>            time_msec() >= wait_before_clear_expire)) {
>>> @@ -2695,11 +2707,26 @@ ofctrl_put(struct ovn_desired_flow_table 
>>> *lflow_table,
>>>              uint64_t req_cfg,
>>>              bool lflows_changed,
>>>              bool pflows_changed,
>>> -           struct tracked_acl_ids *tracked_acl_ids)
>>> +           struct tracked_acl_ids *tracked_acl_ids,
>>> +           bool monitor_cond_complete)
>>>   {
>>>       static bool skipped_last_time = false;
>>>       static uint64_t old_req_cfg = 0;
>>>       bool need_put = false;
>>> +
>>> +    if (state == S_WAIT_BEFORE_CLEAR) {
>>> +        /* If no more monitored condition changes expected
>>> +           Release wait before clear stage and skip
>>> +           over poll wait. */
>>> +        if (monitor_cond_complete) {
>>> +            wait_before_clear_proceed = true;
>>> +            poll_immediate_wake();
>>> +        }
>>> +
>>> +        skipped_last_time = true;
>>> +        return;
>>> +    }
>>> +
>>>       if (lflows_changed || pflows_changed || skipped_last_time ||
>>>           ofctrl_initial_clear) {
>>>           need_put = true;
>>> diff --git a/controller/ofctrl.h b/controller/ofctrl.h
>>> index d1ee69cb0..76e2fbece 100644
>>> --- a/controller/ofctrl.h
>>> +++ b/controller/ofctrl.h
>>> @@ -68,7 +68,8 @@ void ofctrl_put(struct ovn_desired_flow_table 
>>> *lflow_table,
>>>                   uint64_t nb_cfg,
>>>                   bool lflow_changed,
>>>                   bool pflow_changed,
>>> -                struct tracked_acl_ids *tracked_acl_ids);
>>> +                struct tracked_acl_ids *tracked_acl_ids,
>>> +                bool monitor_cond_complete);
>>>   bool ofctrl_has_backlog(void);
>>>   void ofctrl_wait(void);
>>>   void ofctrl_destroy(void);
>>> diff --git a/controller/ovn-controller.c b/controller/ovn-controller.c
>>> index bc8acddf1..2623fc758 100644
>>> --- a/controller/ovn-controller.c
>>> +++ b/controller/ovn-controller.c
>>> @@ -6384,7 +6384,9 @@ main(int argc, char *argv[])
>>>                                      ofctrl_seqno_get_req_cfg(),
>>>                                      engine_node_changed(&en_lflow_output),
>>>                                      engine_node_changed(&en_pflow_output),
>>> -                                   tracked_acl_ids);
>>> +                                   tracked_acl_ids,
>>> +                                   ovnsb_cond_seqno
>>> +                                   == ovnsb_expected_cond_seqno);
>>>                           stopwatch_stop(OFCTRL_PUT_STOPWATCH_NAME, 
>>> time_msec());
>>>                       }
>>>                       stopwatch_start(OFCTRL_SEQNO_RUN_STOPWATCH_NAME,
>>> diff --git a/tests/ovn-controller.at b/tests/ovn-controller.at
>>> index efb0a1741..812616711 100644
>>> --- a/tests/ovn-controller.at
>>> +++ b/tests/ovn-controller.at
>>> @@ -2463,6 +2463,68 @@ AT_CHECK([ovs-ofctl dump-flows br-int | grep group 
>>> -c], [0], [3
>>>   OVN_CLEANUP([hv1])
>>>   AT_CLEANUP
>>>
>>> +OVN_FOR_EACH_NORTHD([
>>> +AT_SETUP([ovn-controller - ofctrl delay until all monitored updates come])
>>> +
>>> +# Prepare testing configuration
>>> +ovn_start
>>> +
>>> +net_add n1
>>> +sim_add hv1
>>> +as hv1
>>> +check ovs-vsctl add-br br-phys
>>> +ovn_attach n1 br-phys 192.168.0.1
>>> +check ovs-vsctl -- add-port br-int hv1-vif1 -- \
>>> +    set interface hv1-vif1 external-ids:iface-id=ls1-lp1
>>> +
>>> +check ovn-nbctl ls-add ls1
>>> +
>>> +check ovn-nbctl lsp-add ls1 ls1-lp1 \
>>> +-- lsp-set-addresses ls1-lp1 "f0:00:00:00:00:01 10.1.2.3"
>>> +
>>> +check ovn-nbctl lb-add lb1 1.1.1.1 10.1.2.3 \
>>> +-- ls-lb-add ls1 lb1
>>> +
>>> +check ovn-nbctl lb-add lb2 2.2.2.2 10.1.2.4 \
>>> +-- ls-lb-add ls1 lb2
>>> +
>>> +check ovn-nbctl --wait=hv sync
>>> +
>>> +# Stop ovn-controller
>>> +OVS_APP_EXIT_AND_WAIT([ovn-controller])
>>> +
>>> +# Sine we test initial flow upload controller is restarted.
>>> +# Clear log file and start controller.
>>> +rm -f hv1/ovn-controller.log
>>> +start_daemon ovn-controller -vfile:jsonrpc:dbg -vfile:ofctrl:dbg
>>> +
>>> +# Monitor log file until flow finally uploaded to OVS
>>> +OVS_WAIT_UNTIL([grep -q 'Setting lport.*in OVS' hv1/ovn-controller.log])
>>> +
>>> +# Analyse log file, select records about:
>>> +# 1. monitor_cond changes made for SB DB (message class is 'jsonrpc')
>>> +# 2. 'clearing all flows' message which is issued after 'wait before
>>> +#    clear' stage released (message class is 'ofctrl')
>>> +#
>>> +# We expect here that all monitoring condition changes should be made 
>>> before
>>> +# OVS flow cleared / uploaded.
>>> +# For now all monitoring updates comes in three iterations: initial,
>>> +# direct dps, indirect dps that corresponds to
>>> +# three messages of type 1 followed by one message of type 2
>>> +#
>>> +# For monitor-all=true one message of type 1 followed by one message of 
>>> type 2
>>> +#
>>> +# Then we cut off message class and take first letter
>>> +# (j for jsonrpc and o for ofctrl)
>>> +#
>>> +call_seq=`grep -E \
>>> + "(clearing all flows)|(monitor_cond.*South)" \
>>> + hv1/ovn-controller.log | cut -d "|" -f 3- | cut -b 1 | tr -d '\n'`
>>> +AT_CHECK([echo $call_seq | grep -qE "^j+o$"], [0])
>>> +
>>> +OVN_CLEANUP([hv1])
>>> +AT_CLEANUP
>>> +])
>>>
>>>   AT_SETUP([ovn-controller - check ovn-chassis-mac-mappings])
>>>
>>> --
>>> 2.47.0
>>>
>>> _______________________________________________
>>> dev mailing list
>>> d...@openvswitch.org
>>> https://mail.openvswitch.org/mailman/listinfo/ovs-dev


_______________________________________________
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev

Reply via email to