On 3/6/23 15:22, Xavier Simonart wrote:
> Hi Dumitru
> 

Hi Xavier,

> Thanks for the review and the comments
> 

Thanks for followin up!

> On Wed, Feb 15, 2023 at 2:15 PM Dumitru Ceara <[email protected]> wrote:
> 
>> Hi Xavier,
>>
>> Thanks for this new version and sorry for the time it took to review it!
>>  I think it mostly looks good; I only have a few minor remarks/questions
>> below.
>>
>> On 1/4/23 09:32, Xavier Simonart wrote:
>>> OVN checks whether ovn-installed is already present (in OVS) before
>> updating it.
>>> This might cause ovn-installed related issues in the following case:
>>> - (1) ovn-installed is present
>>> - (2) we claim the interface
>>> - (3) we update ovs, removing ovn-installed and start installing flows
>>> - (4) (next loop), after flows installed, we check if ovn-installed is
>> absent,
>>>   and if yes, we update OVS with ovn-installed.
>>> However, in step4, if OVS is still busy from step 3, ovn-installed is
>> read as
>>> present; hence OVN controller does not update it and move to INSTALLED
>> state.
>>>
>>> ovn-installed was also not properly deleted on port or binding removal.
>>>
>>> Note that port->up is not properly removed on external_ids:iface-id
>> removal when
>>> sb is read-only. This will be fixed in a further patch.
>>>
>>> Fixes: a7c7d4519e50 ("controller: avoid recomputes triggered by SBDB
>> Port_Binding updates.")
>>>
>>> Signed-off-by: Xavier Simonart <[email protected]>
>>>
>>
>> Nit: there should be a "---" here.  Like that the comments below don't
>> get included in the actual commit.  We can probably fix this at apply
>> time so there's no need to send a v3 for now.
>>
>> Oops - thanks, good catch.
> 
>>> v2: - handled Dumitru's comments
>>>     - rebased on main
>>>     - updated state machine diagram as commented in v1 commit message
>>>     - remove ovn-installed on port deletion or external_ids:iface-id
>> removal.
>>>     - added test case
>>> ---
>>>  controller/binding.c   |  59 +++++++++-
>>>  controller/binding.h   |   6 +
>>>  controller/if-status.c | 113 +++++++++++++-----
>>>  tests/ovn.at           | 257 +++++++++++++++++++++++++++++++++++++++++
>>>  4 files changed, 403 insertions(+), 32 deletions(-)
>>>
>>> diff --git a/controller/binding.c b/controller/binding.c
>>> index 5df62baef..d85a17730 100644
>>> --- a/controller/binding.c
>>> +++ b/controller/binding.c
>>> @@ -746,6 +746,19 @@ local_binding_get_lport_ofport(const struct shash
>> *local_bindings,
>>>              u16_to_ofp(lbinding->iface->ofport[0]) : 0;
>>>  }
>>>
>>> +bool
>>> +local_binding_is_ovn_installed(struct shash *local_bindings,
>>> +                               const char *pb_name)
>>> +{
>>> +    struct local_binding *lbinding =
>>> +        local_binding_find(local_bindings, pb_name);
>>> +    if (lbinding && lbinding->iface) {
>>> +        return smap_get_bool(&lbinding->iface->external_ids,
>>> +                             OVN_INSTALLED_EXT_ID, false);
>>> +    }
>>> +    return false;
>>> +}
>>> +
>>>  bool
>>>  local_binding_is_up(struct shash *local_bindings, const char *pb_name,
>>>                      const struct sbrec_chassis *chassis_rec)
>>> @@ -783,6 +796,7 @@ local_binding_is_down(struct shash *local_bindings,
>> const char *pb_name,
>>>          } else if (b_lport->pb->chassis) {
>>>              VLOG_DBG("lport %s already claimed by other chassis",
>>>                       b_lport->pb->logical_port);
>>> +            return true;
>>>          }
>>>      }
>>>
>>> @@ -834,6 +848,33 @@ local_binding_set_up(struct shash *local_bindings,
>> const char *pb_name,
>>>      }
>>>  }
>>>
>>> +static void
>>> +remove_ovn_installed(struct local_binding *lbinding, const char
>> *pb_name)
>>> +{
>>> +    if (lbinding && lbinding->iface &&
>>> +        smap_get_bool(&lbinding->iface->external_ids,
>>> +                             OVN_INSTALLED_EXT_ID, false)) {
>>> +        /* If iface has been deleted, do not try to delete a key from
>> it */
>>> +        if (!ovsrec_interface_is_deleted(lbinding->iface)) {
>>
>> Not really a problem, it just felt a bit weird to read this knowing that
>> the function can be called even when not processing tracked changes.
>> Still, it should be fine.
>>
>> I think this is really due to the fact that, when we handle tracked
> changes and a db is ro, we have two choices
> - fail as we were doing before, and recompute
> - do nothing in the current loop, and have if-status "reconcile", but hence
> not based on tracked changes
> 

OK, I guess.

>>> +            VLOG_INFO("Removing lport %s ovn-installed in OVS",
>> pb_name);
>>> +            ovsrec_interface_update_external_ids_delkey(lbinding->iface,
>>> +
>> OVN_INSTALLED_EXT_ID);
>>> +        }
>>> +    }
>>> +}
>>> +
>>> +void
>>> +local_binding_remove_ovn_installed(struct shash *local_bindings,
>>> +                                   const char *pb_name, bool
>> ovs_readonly)
>>> +{
>>> +    if (ovs_readonly) {
>>> +        return;
>>> +    }
>>> +    struct local_binding *lbinding =
>>> +        local_binding_find(local_bindings, pb_name);
>>> +    remove_ovn_installed(lbinding, pb_name);
>>> +}
>>> +
>>>  void
>>>  local_binding_set_down(struct shash *local_bindings, const char
>> *pb_name,
>>>                         const struct sbrec_chassis *chassis_rec,
>>> @@ -1239,7 +1280,9 @@ claim_lport(const struct sbrec_port_binding *pb,
>>>                      return false;
>>>                  }
>>>              } else {
>>> -                if (pb->n_up && !pb->up[0]) {
>>> +                if ((pb->n_up && !pb->up[0]) ||
>>> +                    !smap_get_bool(&iface_rec->external_ids,
>>> +                                   OVN_INSTALLED_EXT_ID, false)) {
>>>                      if_status_mgr_claim_iface(if_mgr, pb, chassis_rec,
>>>                                                sb_readonly);
>>>                  }
>>> @@ -1464,9 +1507,11 @@ consider_vif_lport_(const struct
>> sbrec_port_binding *pb,
>>>              const char *requested_chassis_option = smap_get(
>>>                  &pb->options, "requested-chassis");
>>>              VLOG_INFO_RL(&rl,
>>> -                "Not claiming lport %s, chassis %s requested-chassis
>> %s",
>>> +                "Not claiming lport %s, chassis %s requested-chassis %s
>> "
>>> +                "pb->chassis %s",
>>>                  pb->logical_port, b_ctx_in->chassis_rec->name,
>>> -                requested_chassis_option ? requested_chassis_option :
>> "[]");
>>> +                requested_chassis_option ? requested_chassis_option :
>> "[]",
>>> +                pb->chassis ? pb->chassis->name: "");
>>>          }
>>>      }
>>>
>>> @@ -2288,6 +2333,9 @@ consider_iface_release(const struct
>> ovsrec_interface *iface_rec,
>>>                  return false;
>>>              }
>>>          }
>>> +        if (b_ctx_in->ovs_idl_txn) {
>>> +            remove_ovn_installed(lbinding, iface_id);
>>> +        }
>>
>> Shouldn't this be done through local_binding_remove_ovn_installed()
>> instead?
> 
> I think that both are quite similar:  local_binding_remove_ovn_installed()
> might be more readable, but it requires an additional local_binding_find,
> so slightly slower
> 
>> Can't we just move the iface to OIF_REMOVE_OVN_INST state here
>> and let if_status_mgr_update_bindings() synchronize ovn_installed?  Like
>> that we reconcile the OVS interface in a single place, in if-status.c.
>>
> I do not think that we can easily use the if-status module for this, as
> lbinding->iface is cleared(a few lines down) before if-status has a chance
> to run and remove ovn-install. Once binding->iface is cleared we can't
> (easily) remove ovn-install
> 

Hmm, OK, but what if b_ctx_in->ovs_idl_txn is NULL?  We're just not
going to remove ovn-installed at all then AFAICT.  Would it be costly to
lookup the interface in the if-status module?

>>
>>>
>>>      } else if (lbinding && b_lport && b_lport->type == LP_LOCALPORT) {
>>>          /* lbinding is associated with a localport.  Remove it from the
>>> @@ -2558,6 +2606,7 @@ handle_deleted_lport(const struct
>> sbrec_port_binding *pb,
>>>      if (ld) {
>>>          remove_pb_from_local_datapath(pb,
>>>                                        b_ctx_out, ld);
>>> +        if_status_mgr_release_iface(b_ctx_out->if_mgr,
>> pb->logical_port);
>>>          return;
>>>      }
>>>
>>> @@ -2581,6 +2630,7 @@ handle_deleted_lport(const struct
>> sbrec_port_binding *pb,
>>>              remove_pb_from_local_datapath(pb, b_ctx_out,
>>>                                            ld);
>>>          }
>>> +        if_status_mgr_release_iface(b_ctx_out->if_mgr,
>> pb->logical_port);
>>>      }
>>>  }
>>>
>>> @@ -2627,6 +2677,9 @@ handle_deleted_vif_lport(const struct
>> sbrec_port_binding *pb,
>>>      }
>>>
>>>      handle_deleted_lport(pb, b_ctx_in, b_ctx_out);
>>> +    if (b_ctx_in->ovs_idl_txn) {
>>> +        remove_ovn_installed(lbinding, pb->logical_port);
>>> +    }
>>
>> Same comment as above applies here I think.
>>
>>>      return true;
>>>  }
>>>
>>> diff --git a/controller/binding.h b/controller/binding.h
>>> index 6c3a98b02..72b4a35b6 100644
>>> --- a/controller/binding.h
>>> +++ b/controller/binding.h
>>> @@ -159,6 +159,12 @@ bool local_binding_is_up(struct shash
>> *local_bindings, const char *pb_name,
>>>  bool local_binding_is_down(struct shash *local_bindings, const char
>> *pb_name,
>>>                             const struct sbrec_chassis *);
>>>
>>> +bool local_binding_is_ovn_installed(struct shash *local_bindings,
>>> +                                    const char *pb_name);
>>> +void local_binding_remove_ovn_installed(struct shash *local_bindings,
>>> +                                        const char *pb_name,
>>> +                                        bool ovs_readonly);
>>> +
>>>  void local_binding_set_up(struct shash *local_bindings, const char
>> *pb_name,
>>>                            const struct sbrec_chassis *chassis_rec,
>>>                            const char *ts_now_str, bool sb_readonly,
>>> diff --git a/controller/if-status.c b/controller/if-status.c
>>> index d1c14ac30..c008aa79e 100644
>>> --- a/controller/if-status.c
>>> +++ b/controller/if-status.c
>>> @@ -54,32 +54,37 @@ VLOG_DEFINE_THIS_MODULE(if_status);
>>>   */
>>>
>>>  enum if_state {
>>> -    OIF_CLAIMED,       /* Newly claimed interface. pb->chassis update
>> not yet
>>> -                          initiated. */
>>> -    OIF_INSTALL_FLOWS, /* Claimed interface with pb->chassis update
>> sent to
>>> -                        * SB (but update notification not confirmed, so
>> the
>>> -                        * update may be resent in any of the following
>> states)
>>> -                        * and for which flows are still being installed.
>>> -                        */
>>> -    OIF_MARK_UP,       /* Interface with flows successfully installed
>> in OVS
>>> -                        * but not yet marked "up" in the binding module
>> (in
>>> -                        * SB and OVS databases).
>>> -                        */
>>> -    OIF_MARK_DOWN,     /* Released interface but not yet marked "down"
>> in the
>>> -                        * binding module (in SB and/or OVS databases).
>>> -                        */
>>> -    OIF_INSTALLED,     /* Interface flows programmed in OVS and binding
>> marked
>>> -                        * "up" in the binding module.
>>> -                        */
>>> +    OIF_CLAIMED,         /* Newly claimed interface. pb->chassis update
>> not yet
>>> +                            initiated. */
>>> +    OIF_INSTALL_FLOWS,   /* Claimed interface with pb->chassis update
>> sent to
>>> +                          * SB (but update notification not confirmed,
>> so the
>>> +                          * update may be resent in any of the following
>>> +                          * states and for which flows are still being
>>> +                          * installed.
>>> +                          */
>>> +    OIF_REMOVE_OVN_INST, /* Interface with flows successfully installed
>> in OVS,
>>
>> This doesn't sound correct to me.  Did you mean something like
>> "Interface for which flows are still being installed in OVS"?
>>
> I think it is correct, even though it sounds strange (see the graph below):
> this state is used instead of the MARK_UP state
> when flows are successfully installed, but ovn-install was present before
> we started installing flows, and is still being removed.
> While installing flows, we were also removing ovn-install
> (local_binding_set_down in if_status_mgr_update_bindings).
> In this case,  moving directly to MARK_UP might result in a race condition
> ending up in ovn-install
> not installed.
> Maybe the name of the state should be changed ? OIF_REMOVE_OLD_OVN_INST ?
> 

I think I prefer this one, yes.

>>
>>> +                          * but with ovn-installed still in OVSDB.
>>> +                          */
>>> +    OIF_MARK_UP,         /* Interface with flows successfully installed
>> in OVS
>>> +                          * but not yet marked "up" in the binding
>> module (in
>>> +                          * SB and OVS databases).
>>> +                          */
>>> +    OIF_MARK_DOWN,       /* Released interface but not yet marked
>> "down" in the
>>> +                          * binding module (in SB and/or OVS databases).
>>> +                          */
>>> +    OIF_INSTALLED,       /* Interface flows programmed in OVS and
>> binding
>>> +                          * marked "up" in the binding module.
>>> +                          */
>>>      OIF_MAX,
>>>  };
>>>
>>>  static const char *if_state_names[] = {
>>> -    [OIF_CLAIMED]       = "CLAIMED",
>>> -    [OIF_INSTALL_FLOWS] = "INSTALL_FLOWS",
>>> -    [OIF_MARK_UP]       = "MARK_UP",
>>> -    [OIF_MARK_DOWN]     = "MARK_DOWN",
>>> -    [OIF_INSTALLED]     = "INSTALLED",
>>> +    [OIF_CLAIMED]         = "CLAIMED",
>>> +    [OIF_INSTALL_FLOWS]   = "INSTALL_FLOWS",
>>> +    [OIF_REMOVE_OVN_INST] = "REMOVE_OVN_INST",
>>> +    [OIF_MARK_UP]         = "MARK_UP",
>>> +    [OIF_MARK_DOWN]       = "MARK_DOWN",
>>> +    [OIF_INSTALLED]       = "INSTALLED",
>>>  };
>>>
>>>  /*
>>> @@ -109,11 +114,26 @@ static const char *if_state_names[] = {
>>>   * |     |                      |   - remove ovn-installed from ovsdb
>>   | | |
>>>   * |     |                      |  mgr_update()
>>   | | |
>>>   * |     +----------------------+   - sbrec_update_chassis if needed
>>  | | |
>>> - * |                    |
>>   | | |
>>> - * |                    |  mgr_run(seqno rcvd)
>>  | | |
>>> - * |                    |  - set port up in sb
>>  | | |
>>> - * | release_iface      |  - set ovn-installed in ovs
>>   | | |
>>> - * |                    V
>>   | | |
>>> + * |        |            |
>>  | | |
>>> + * |        |            +----------------------------------------+
>>   | | |
>>> + * |        |                                                     |
>>   | | |
>>> + * |        | mgr_run(seqno rcvd, ovn-installed present)          |
>>   | | |
>>> + * |        V                                                     |
>>   | | |
>>> + * |    +--------------------+                                    |
>>   | | |
>>> + * |    |                    |  mgr_run()                         |
>>   | | |
>>> + * +--- | REMOVE_OVN_INST    |  - remove ovn-installed in ovs     |
>>   | | |
>>> + * |    +--------------------+                                    |
>>   | | |
>>> + * |               |                                              |
>>   | | |
>>> + * |               |                                              |
>>   | | |
>>> + * |               | mgr_update( ovn_installed not present)       |
>>   | | |
>>> + * |               |                                              |
>>   | | |
>>> + * |               |  +-------------------------------------------+
>>   | | |
>>> + * |               |  |
>>   | | |
>>> + * |               |  |  mgr_run(seqno rcvd, ovn-installed not
>> present)  | | |
>>> + * |               |  |  - set port up in sb
>>  | | |
>>> + * |               |  |  - set ovn-installed in ovs
>>   | | |
>>> + * |release_iface  |  |
>>   | | |
>>> + * |               V  V
>>   | | |
>>>   * |   +----------------------+
>>   | | |
>>>   * |   |                      |  mgr_run()
>>  | | |
>>>   * +-- |       MARK_UP        |  - set port up in sb
>>  | | |
>>> @@ -238,6 +258,7 @@ if_status_mgr_claim_iface(struct if_status_mgr *mgr,
>>>      switch (iface->state) {
>>>      case OIF_CLAIMED:
>>>      case OIF_INSTALL_FLOWS:
>>> +    case OIF_REMOVE_OVN_INST:
>>>      case OIF_MARK_UP:
>>>          /* Nothing to do here. */
>>>          break;
>>> @@ -274,6 +295,7 @@ if_status_mgr_release_iface(struct if_status_mgr
>> *mgr, const char *iface_id)
>>>          /* Not yet fully installed interfaces can be safely deleted. */
>>>          ovs_iface_destroy(mgr, iface);
>>>          break;
>>> +    case OIF_REMOVE_OVN_INST:
>>>      case OIF_MARK_UP:
>>>      case OIF_INSTALLED:
>>>          /* Properly mark interfaces "down" if their flows were already
>>> @@ -305,6 +327,7 @@ if_status_mgr_delete_iface(struct if_status_mgr
>> *mgr, const char *iface_id)
>>>          /* Not yet fully installed interfaces can be safely deleted. */
>>>          ovs_iface_destroy(mgr, iface);
>>>          break;
>>> +    case OIF_REMOVE_OVN_INST:
>>>      case OIF_MARK_UP:
>>>      case OIF_INSTALLED:
>>>          /* Properly mark interfaces "down" if their flows were already
>>> @@ -359,6 +382,17 @@ if_status_mgr_update(struct if_status_mgr *mgr,
>>>      struct shash *bindings = &binding_data->bindings;
>>>      struct hmapx_node *node;
>>>
>>> +    /* Move all interfaces that have been confirmed without
>> ovn-installed,
>>> +     * from OIF_REMOVE_OVN_INST to OIF_MARK_UP.
>>> +     */
>>> +    HMAPX_FOR_EACH_SAFE (node,
>> &mgr->ifaces_per_state[OIF_REMOVE_OVN_INST]) {
>>> +        struct ovs_iface *iface = node->data;
>>> +
>>> +        if (!local_binding_is_ovn_installed(bindings, iface->id)) {
>>> +            ovs_iface_set_state(mgr, iface, OIF_MARK_UP);
>>> +        }
>>> +    }
>>> +
>>>      /* Interfaces in OIF_MARK_UP/INSTALL_FLOWS state have already set
>> their
>>>       * pb->chassis. However, the update might still be in fly
>> (confirmation
>>>       * not received yet) or pb->chassis was overwitten by another
>> chassis.
>>> @@ -471,7 +505,19 @@ if_status_mgr_run(struct if_status_mgr *mgr,
>>>                                            iface->install_seqno)) {
>>>              continue;
>>>          }
>>> -        ovs_iface_set_state(mgr, iface, OIF_MARK_UP);
>>> +        /* Wait for ovn-installed to be absent before moving to MARK_UP
>> state.
>>> +         * Most of the times ovn-installed is already absent and hence
>> we will
>>> +         * not have to wait.
>>> +         * If there is no binding_data, we can't determine if
>> ovn-installed is
>>> +         * present or not; hence also go to the OIF_REMOVE_OVN_INST
>> state.
>>> +         */
>>> +        if (!binding_data ||
>>> +            local_binding_is_ovn_installed(&binding_data->bindings,
>>> +                                           iface->id)) {
>>> +            ovs_iface_set_state(mgr, iface, OIF_REMOVE_OVN_INST);
> 
> Will this cause significant/measurable delays in setting ovn-installed?
>> I think not but I was wondering if you have some test data for this part
>> here.
>>
> I do not think so either as:
> - this case (having ovn-installed present when we start to install flows,
> and so having to remove it) should not occur much anymore. We were hitting
> this case before because we were not properly removing ovn-installed when
> removing bindings.
> - we remove ovn-install while installing flows (i.e. in parallel, not in
> serie). So, there might be a delay if DB is slow. But before this patch,
> the same case was causing ovn-install to be missing
> 

I see, thanks for the explanation!

Regards,
Dumitru


_______________________________________________
dev mailing list
[email protected]
https://mail.openvswitch.org/mailman/listinfo/ovs-dev

Reply via email to