Hi Ales,
Thank you for the review
i addressed all your comments in v2.
regarding the test actually, it's a bit hard to test that using a unit test
because we need to let ovn-sb ignore the first binding request which is not
applicable using a unit test, i was testing that manually by dropping the
connection to the SB that will drop the first bind request but it a bit
hard
to do that i an unit test.


On Fri, Feb 2, 2024 at 1:21 PM Ales Musil <amu...@redhat.com> wrote:

>
>
> On Fri, Feb 2, 2024 at 12:19 PM Ales Musil <amu...@redhat.com> wrote:
>
>>
>>
>> On Tue, Jan 30, 2024 at 2:59 PM Mohammad Heib <mh...@redhat.com> wrote:
>>
>>> ovn-controller immediately removes the vport_bindings requests that were
>>> generated by VIFs after handling them locally, this approach is intended
>>> to avoid binding the vport to one VIF only and allocate the vport
>>> between the different VIFs that exist in the vport:virtual-parents.
>>>
>>> Although the behavior mentioned above is correct, in some cases when the
>>> SB Database is busy the transaction that binds this vport to the desired
>>> VIF/chassis can fail and the controller will not re-try to bind the
>>> vport again because we deleted the bind_vport request in the previous
>>> loop/TXN.
>>>
>>> This patch aims to change the above behavior by storing the bind_vport
>>> requests for a bit longer time and this is done by the following:
>>>     1. add relevancy_time for each new bind_vport request and
>>>        mark this request as new.
>>>
>>>     2. loop0: ovn-controller will try to handle this bind_vport request
>>>        for the first time as usual (no change).
>>>
>>>    3. loop0: ovn-controller will try to delete the already handled
>>> bind_vport
>>>       request as usual but first, it will check if this request is
>>> marked as new and
>>>       if the relevancy_time is still valid if so the controller will
>>> mark this
>>>       request as an old request and keep it, otherwise remove it.
>>>
>>>    4.loop1: ovn-controller will try to commit the same change again for
>>>      the old request, if the previous commit in loop0 succeeded the
>>>      change will not have any effect on SB, otherwise we will try to
>>>      commit the same vport_bind request again.
>>>
>>>   5. loop1: delete the old bind_vport request.
>>>
>>> Reported-at: https://bugzilla.redhat.com/show_bug.cgi?id=1954659
>>> Signed-off-by: Mohammad Heib <mh...@redhat.com>
>>> ---
>>>
>>
>> Hi  Mohammad,
>>
>> overall the change makes sense, I have a couple of comments see down
>> below.
>>
>>  controller/pinctrl.c | 58 +++++++++++++++++++++++++++++++++++++++-----
>>>  1 file changed, 52 insertions(+), 6 deletions(-)
>>>
>>> diff --git a/controller/pinctrl.c b/controller/pinctrl.c
>>> index bd3bd3d81..152962448 100644
>>> --- a/controller/pinctrl.c
>>> +++ b/controller/pinctrl.c
>>> @@ -6519,10 +6519,52 @@ struct put_vport_binding {
>>>      uint32_t vport_key;
>>>
>>>      uint32_t vport_parent_key;
>>> +
>>> +    /* This vport record Only relevant if "relevancy_time"
>>> +     * is earlier than the current_time, "new_record" is true.
>>> +     */
>>> +    long long int relevancy_time;
>>>
>>
>> The intention of the variable should be probably clearer e.g.
>> relevant_until_ms.
>>
>> Also reading through the rest of the code it doesn't seem possible that
>> the binding wouldn't be deleted, hence I think there isn't any need for the
>> relevancy time, it should be enough to have a flag that will be flipped. In
>> any case we will try to commit twice. I would leave out the whole relevancy
>> and keep the flag flipping it on first commit WDYT?
>>
>>
>>> +    bool new_record;
>>>  };
>>>
>>>  /* Contains "struct put_vport_binding"s. */
>>>  static struct hmap put_vport_bindings;
>>> +/* the relevance time in ms of vport record before deleteing. */
>>> +#define VPORT_RELEVANCE_TIME 1500
>>> +
>>> +/*
>>> + * Validate if the vport_binding record that was added
>>> + * by the pinctrl thread is still relevant and needs
>>> + * to be updated in the SBDB or not.
>>> + *
>>> + * vport_binding record is only relevant and needs to be updated in SB
>>> if:
>>> + *   1. The put_vport_binding:relevancy_time still valid.
>>> + *   2. The put_vport_binding:new_record is true:
>>> + *       The new_record will be set to "true" when this vport record is
>>> created
>>> + *       by function "pinctrl_handle_bind_vport".
>>> + *
>>> + *       After the first attempt to bind this vport to the chassis and
>>> + *       virtual_parent by function "run_put_vport_bindings" we will
>>> set the
>>> + *       value of vpb:new_record to "false" and keep it in
>>> "put_vport_bindings"
>>> + *
>>> + *       After the second attempt of binding the vpb it will be removed
>>> by
>>> + *       this function.
>>> + *
>>> + *       The above guarantees that we will try to bind the vport twice
>>> in
>>> + *       a certain amount of time.
>>> + *
>>> +*/
>>> +static bool
>>> +is_vport_binding_relevant(struct put_vport_binding *vpb)
>>> +{
>>> +    long long int cur_time = time_msec();
>>> +
>>> +    if (vpb->new_record && vpb->relevancy_time > cur_time) {
>>> +        vpb->new_record = false;
>>>
>>
>> This is a nasty side effect that I wouldn't expect from starting with
>> is_.
>>
>>
>>> +        return true;
>>> +    }
>>> +    return false;
>>> +}
>>>
>>>  static void
>>>  init_put_vport_bindings(void)
>>> @@ -6531,18 +6573,21 @@ init_put_vport_bindings(void)
>>>  }
>>>
>>>  static void
>>> -flush_put_vport_bindings(void)
>>> +flush_put_vport_bindings(bool force_flush)
>>>  {
>>>      struct put_vport_binding *vport_b;
>>> -    HMAP_FOR_EACH_POP (vport_b, hmap_node, &put_vport_bindings) {
>>> -        free(vport_b);
>>> +    HMAP_FOR_EACH_SAFE (vport_b, hmap_node, &put_vport_bindings) {
>>> +        if (!is_vport_binding_relevant(vport_b) || force_flush) {
>>> +            hmap_remove(&put_vport_bindings, &vport_b->hmap_node);
>>> +            free(vport_b);
>>> +        }
>>>      }
>>>  }
>>>
>>>  static void
>>>  destroy_put_vport_bindings(void)
>>>  {
>>> -    flush_put_vport_bindings();
>>> +    flush_put_vport_bindings(true);
>>>      hmap_destroy(&put_vport_bindings);
>>>  }
>>>
>>> @@ -6620,7 +6665,7 @@ run_put_vport_bindings(struct ovsdb_idl_txn
>>> *ovnsb_idl_txn,
>>>                                sbrec_port_binding_by_key, chassis, vpb);
>>>      }
>>>
>>> -    flush_put_vport_bindings();
>>> +    flush_put_vport_bindings(false);
>>>  }
>>>
>>>  /* Called with in the pinctrl_handler thread context. */
>>> @@ -6658,7 +6703,8 @@ pinctrl_handle_bind_vport(
>>>      vpb->dp_key = dp_key;
>>>      vpb->vport_key = vport_key;
>>>      vpb->vport_parent_key = vport_parent_key;
>>> -
>>> +    vpb->new_record = true;
>>> +    vpb->relevancy_time = time_msec() + VPORT_RELEVANCE_TIME;
>>>      notify_pinctrl_main();
>>>  }
>>>
>>> --
>>> 2.34.3
>>>
>>> _______________________________________________
>>> dev mailing list
>>> d...@openvswitch.org
>>> https://mail.openvswitch.org/mailman/listinfo/ovs-dev
>>>
>>>
>> Thanks,
>> Ales
>> --
>>
>> Ales Musil
>>
>> Senior Software Engineer - OVN Core
>>
>> Red Hat EMEA <https://www.redhat.com>
>>
>> amu...@redhat.com
>> <https://red.ht/sig>
>>
>
> I forgot to mention, is it possible to come up with a test that would
> check this behavior?
>
> --
>
> Ales Musil
>
> Senior Software Engineer - OVN Core
>
> Red Hat EMEA <https://www.redhat.com>
>
> amu...@redhat.com
> <https://red.ht/sig>
>
_______________________________________________
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev

Reply via email to