Re: [ovs-dev] ovn: branch-22 (?): sb_cfg propagation delay. Research & fix.

Aleksandr Smirnov (K2 Cloud) Wed, 05 Feb 2025 08:45:29 -0800

Hi Dumitru,

I sent my proposal to patchwork
https://patchwork.ozlabs.org/project/ovn/patch/[email protected]/


While I am not sure you really need to adopt that patch to main stream 
your critics will be valuable because anyways I need this fix 
implemented for our working environment.

Thank you,

Alexander

On 2/5/25 3:09 PM, Dumitru Ceara wrote:
> On 2/5/25 1:04 PM, Aleksandr Smirnov (K2 Cloud) wrote:
>> On 2/5/25 2:00 PM, Dumitru Ceara wrote:
>>> On 2/5/25 11:50 AM, Aleksandr Smirnov (K2 Cloud) wrote:
>>>> Hi guys,
>>>>
>>> Hi Aleksandr,
>>>
>>>> We experienced some problems with CMS when watching nb_cfg/sb_cfg up.
>>>> A problem is that inappropriate time delay happens between nb_cfg bump
>>>> and sb_cfg bump.
>>>> A research shows that the reason is double run of northd engine combined
>>>> with improper sequence of calls
>>>> engine <-> counter update.
>>>>
>>>> Exactly, making the change in logical network structure combined with
>>>> nb_cfg bump results in the following
>>>> execution sequence:
>>>>
>>>> 1. 'update3' notification of changed made in ovn-nbctl
>>>> 2. engine recompute run, (long time here)
>>>> 3. 'transact' request to NB DB + SB DB, send results of engine's work
>>>> 4. 'update3' + 'reply' received. this update3 fires back our changes we
>>>> just sent to SB DB.
>>>> 5. engine recompute second run, (long time here, because change handles
>>>> are not implemented in branch-22)
>>>> 6. Call update_sequence_numbers(), sb_cfg is sent to NB DB.
>>>>
>>>> Thus, time delay between p4 - p6 need to be eliminated.
>>>>
>>>> For main branch, where all change handlers are implemented, p.5 takes a
>>>> short time so, problem is not so visible.
>>>>
>>> Is there a chance that we hit this on main too though?  E.g. when some
>>> of the change handlers fail to incrementally process inputs?
>> Yes, there is. I just checked a behavior of the main branch in the same
>> test environment.
>> An execution sequence is the same.
>> So, if, by some reason, the engine falls info full recompute, the
>> problem will becomes visible.
>>
> Ack, thanks for checking!
>
>>>> I have prepared a patch that is going to fix this problem in our
>>>> environment, made over branch-22.09, see in attachment.
>>>>
>>> I can't see the attachment (maybe mailman scrubbed it?).
>> Ok, place it in text right below:
>>
>> diff --git a/northd/ovn-northd.c b/northd/ovn-northd.c
>> index 1ebb85498..d0f9b96ad 100644
>> --- a/northd/ovn-northd.c
>> +++ b/northd/ovn-northd.c
>> @@ -536,6 +536,17 @@ update_sequence_numbers(int64_t loop_start_time,
>>            }
>>        }
>>    }
>> +
>> +static bool
>> +sb_cfg_is_updated(struct ovsdb_idl *ovnnb_idl,
>> +                  struct ovsdb_idl_loop *sb_loop)
>> +{
>> +    const struct nbrec_nb_global *nb = nbrec_nb_global_first(ovnnb_idl);
>> +
>> +    return (nb && sb_loop->cur_cfg && nb->sb_cfg != sb_loop->cur_cfg)
>> +           ? true : false;
>> +}
>> +
>>    
>>    static void
>>    usage(void)
>> @@ -853,6 +864,7 @@ main(int argc, char *argv[])
>>                simap_destroy(&usage);
>>            }
>>
>> +        bool clear_idl_track = true;
>>            if (!state.paused) {
>>                if (!ovsdb_idl_has_lock(ovnsb_idl_loop.idl) &&
>>                    !ovsdb_idl_is_lock_contended(ovnsb_idl_loop.idl))
>> @@ -903,8 +915,26 @@ main(int argc, char *argv[])
>>
>>                if (ovsdb_idl_has_lock(ovnsb_idl_loop.idl)) {
>>                    int64_t loop_start_time = time_wall_msec();
>> -                inc_proc_northd_run(ovnnb_txn, ovnsb_txn, recompute);
>> -                recompute = false;
>> +
>> +                /* Check if sb_cfg can be propagated.
>> +                   Meaning is that the engine already made a bunch of
>> updates
>> +                   and those updated are already committed to SB DB.
>> +                   So, it's time to bump sb_cfg in the NB DB.
>> +                   However, the engine will run again as response to
>> 'update3'
>> +                   notification from the server that reflects our recent
>> +                  'transact'. This will delay sb_cfg bump for an
>> inappropriate
>> +                  long time. So, the solution is to skip engine run for
>> this
>> +                  iteration and make bump transaction that delivers sb_cfg
>> +                  immediately. Engine will be run during the next
>> iteration.
>> +                  We also preserve DB tracked info for the next engine run
>> +                  (see clear_idl_track). */
>> +                if
>> (!sb_cfg_is_updated(ovnnb_idl_loop.idl,&ovnsb_idl_loop)) {
>> +                    inc_proc_northd_run(ovnnb_txn, ovnsb_txn, recompute);
>> +                    recompute = false;
>> +                } else {
>> +                    clear_idl_track = false;
>> +                }
>> +
> This is not a trivial change and needs proper review.  Skipping the IDL
> "track clear" can be complex.
>
> I'm not necessarily against it though (I didn't think it through
> properly) but I think it deserves a formal patch to be posted for main
> branch so we can review it properly.
>
> Thanks,
> Dumitru
>
>>                    if (ovnsb_txn) {
>>                        check_and_add_supported_dhcp_opts_to_sb_db(
>>                                     ovnsb_txn, ovnsb_idl_loop.idl);
>> @@ -968,8 +998,10 @@ main(int argc, char *argv[])
>>                recompute = true;
>>            }
>>
>> -        ovsdb_idl_track_clear(ovnnb_idl_loop.idl);
>> -        ovsdb_idl_track_clear(ovnsb_idl_loop.idl);
>> +        if (clear_idl_track) {
>> +            ovsdb_idl_track_clear(ovnnb_idl_loop.idl);
>> +            ovsdb_idl_track_clear(ovnsb_idl_loop.idl);
>> +        }
>>
>>            unixctl_server_run(unixctl);
>>            unixctl_server_wait(unixctl);


_______________________________________________
dev mailing list
[email protected]
https://mail.openvswitch.org/mailman/listinfo/ovs-dev

Re: [ovs-dev] ovn: branch-22 (?): sb_cfg propagation delay. Research & fix.

Reply via email to