On Mon, Apr 6, 2026 at 7:46 PM Jacob Tanenbaum <[email protected]> wrote:

> when ovn-monitor-all is set to false the ovn-controller sets
> ovn-installed on OVS interfaces too early. ovn-controller needs to wait
> for the response from the southbound database with the updates to the
> newly monitored southbound fields and after that wait for flows to be
> installed in OVS before labeling as installed.
>
> Reported-at: https://redhat.atlassian.net/browse/FDP-2887
> Signed-off-by: Jacob Tanenbaum <[email protected]>
> ---
>

Hi Jacob,

thank you for the v5. I have some comments down below.


> v4->v5
> * corrected a sanitizer error: used bool update_seqno without
>   initializing
>
> v3->v4
> * Added state OIF_WAITING_SB_COND to the state machine that manages
>   the adding of interfaces. This state waits until the Southbound has
>   updated the ovn-controller of relevent information about ports related
>   to it
> * Addressed several nits
>
> v2->v3
> * adding the ld->monitor_updated required the additiona of checking for
>   monitor_all in update_sb_monitors. I didn't account for being able to
>   toggle on monitor_all
>
> v1->v2
> * if_status_mgr_run() will run everytime the conditional seqno is
>   changed so it should be safe to only skip when the expected_seqno and
>   seqno returned from ovn are strictly not equal, that way we do not
>   have to deal with overflow in the seqno. Additionally add a boolean to
>   the local_datapath in the event that the seqno wraps around at the
>   same time the datapath would go back into the state OIF_INSTALL_FLOWS.
> * remove setting the state to itself for OIF_INSTALL_FLOWS in
>   if_status_mgr_update()
> * added assert(pb) in if_status_mgr_run()
> * removed a manual loop looking for the local_datapath and replaced with
>   get_local_datapath() in if_status_mgr_run
> * remove a few nit spelling errors in the test case
>
> asfdd
>
> diff --git a/controller/if-status.c b/controller/if-status.c
> index ee9337e63..5f807dd80 100644
> --- a/controller/if-status.c
> +++ b/controller/if-status.c
> @@ -18,6 +18,7 @@
>  #include "binding.h"
>  #include "if-status.h"
>  #include "lib/ofctrl-seqno.h"
> +#include "local_data.h"
>  #include "ovsport.h"
>  #include "simap.h"
>
> @@ -58,6 +59,9 @@ VLOG_DEFINE_THIS_MODULE(if_status);
>  enum if_state {
>      OIF_CLAIMED,          /* Newly claimed interface. pb->chassis update
> not
>                               yet initiated. */
> +    OIF_WAITING_SB_COND,  /*  Waiting for the SB updates for a given
> datapath
> +                           *
> +                           */
>      OIF_INSTALL_FLOWS,    /* Claimed interface with pb->chassis update
> sent to
>                             * SB (but update notification not confirmed,
> so the
>                             * update may be resent in any of the following
> @@ -87,6 +91,7 @@ enum if_state {
>
>  static const char *if_state_names[] = {
>      [OIF_CLAIMED]          = "CLAIMED",
> +    [OIF_WAITING_SB_COND]  = "WAITING_SB_COND",
>      [OIF_INSTALL_FLOWS]    = "INSTALL_FLOWS",
>      [OIF_REM_OLD_OVN_INST] = "REM_OLD_OVN_INST",
>      [OIF_MARK_UP]          = "MARK_UP",
> @@ -103,19 +108,35 @@ static const char *if_state_names[] = {
>   * | |   +----------------------+
>   * | |     ^ release_iface   | claim_iface()
>   * | |     |                 V - sbrec_update_chassis(if sb is rw)
> - * | |   +----------------------+
> - * | |   |                      |
> <------------------------------------------+
> - * | |   |       CLAIMED        |
> <----------------------------------------+ |
> - * | |   |                      |
> <--------------------------------------+ | |
> + * | | | +----------------------+
> + * | | | |                      |
> <------------------------------------------+
> + * | | | |       CLAIMED        |
> <----------------------------------------+ |
> + * | | | |                      |
> <--------------------------------------+ | |
> + * | | | +----------------------+
> | | |
> + * | | |               |  V  ^
>  | | |
> + * | | |               |  |  | handle_claims()
>  | | |
> + * | | |               |  |  | - sbrec_update_chassis(if sb is rw)
>  | | |
> + * | | |               |  +--+
>  | | |
> + * | | |               |
>  | | |
> + * | | |               | mgr_update(when sb is rw i.e. pb->chassis)
> | | |
> + * | | |               |            has been updated
>  | | |
> + * | | | release_iface |
>  | | |
> + * | | |               |
>  | | |
> + * | | |               V
>  | | |
> + * | | | +----------------------+
> | | |
> + * | | +-|                      |
> | | |
> + * | |   |    WAITING_SB_COND   |
> | | |
> + * | |   |                      |
> | | |
> + * | |   |                      |
> | | |
>   * | |   +----------------------+
> | | |
> - * | |                 |  V  ^
>  | | |
> - * | |                 |  |  | handle_claims()
>  | | |
> - * | |                 |  |  | - sbrec_update_chassis(if sb is rw)
>  | | |
> - * | |                 |  +--+
>  | | |
>   * | |                 |
>  | | |
> - * | |                 | mgr_update(when sb is rw i.e. pb->chassis)
> | | |
> - * | |                 |            has been updated
>  | | |
> - * | | release_iface   | - request seqno
>  | | |
> + * | |                 |
>  | | |
> + * | |                 |
>  | | |
> + * | |                 |   mgr_update(when sb_cond_seqno == expected)
> | | |
> + * | |                 |   - request seqno
>  | | |
> + * | |                 |
>  | | |
> + * | |                 |
>  | | |
> + * | | release_iface   |
>  | | |
>   * | |                 |
>  | | |
>   * | |                 V
>  | | |
>   * | |   +----------------------+
> | | |
> @@ -335,6 +356,7 @@ if_status_mgr_claim_iface(struct if_status_mgr *mgr,
>
>      switch (iface->state) {
>      case OIF_CLAIMED:
> +    case OIF_WAITING_SB_COND:
>      case OIF_INSTALL_FLOWS:
>      case OIF_REM_OLD_OVN_INST:
>      case OIF_MARK_UP:
> @@ -383,6 +405,7 @@ if_status_mgr_release_iface(struct if_status_mgr *mgr,
> const char *iface_id)
>
>      switch (iface->state) {
>      case OIF_CLAIMED:
> +    case OIF_WAITING_SB_COND:
>      case OIF_INSTALL_FLOWS:
>          /* Not yet fully installed interfaces:
>           * pb->chassis still need to be deleted.
> @@ -424,6 +447,7 @@ if_status_mgr_delete_iface(struct if_status_mgr *mgr,
> const char *iface_id,
>
>      switch (iface->state) {
>      case OIF_CLAIMED:
> +    case OIF_WAITING_SB_COND:
>      case OIF_INSTALL_FLOWS:
>          /* Not yet fully installed interfaces:
>           * pb->chassis still need to be deleted.
> @@ -500,6 +524,8 @@ if_status_mgr_update(struct if_status_mgr *mgr,
>                       const struct sbrec_chassis *chassis_rec,
>                       const struct ovsrec_interface_table *iface_table,
>                       const struct sbrec_port_binding_table *pb_table,
> +                     const struct hmap *local_datapaths,
> +                     const unsigned int ovnsb_cond_seqno,
>                       bool ovs_readonly,
>                       bool sb_readonly)
>  {
> @@ -614,7 +640,6 @@ if_status_mgr_update(struct if_status_mgr *mgr,
>
>      /* Move newly claimed interfaces from OIF_CLAIMED to
> OIF_INSTALL_FLOWS.
>       */
> -    bool new_ifaces = false;
>      if (!sb_readonly) {
>          HMAPX_FOR_EACH_SAFE (node, &mgr->ifaces_per_state[OIF_CLAIMED]) {
>              struct ovs_iface *iface = node->data;
> @@ -622,9 +647,7 @@ if_status_mgr_update(struct if_status_mgr *mgr,
>               * in if_status_handle_claims or if_status_mgr_claim_iface
>               */
>              if (iface->is_vif) {
> -                ovs_iface_set_state(mgr, iface, OIF_INSTALL_FLOWS);
> -                iface->install_seqno = mgr->iface_seqno + 1;
> -                new_ifaces = true;
> +                ovs_iface_set_state(mgr, iface, OIF_WAITING_SB_COND);
>              } else {
>                  ovs_iface_set_state(mgr, iface, OIF_MARK_UP);
>              }
> @@ -654,12 +677,42 @@ if_status_mgr_update(struct if_status_mgr *mgr,
>                           iface->id);
>          }
>      }
> +    /* Check the WAITING_SB_COND nodes after transitioning nodes from
> CLAIMED
> +     * as condition could already be satisfied to move to INSTALL_FLOWS.
> +     */
> +    bool update_seqno = false;
> +    if (!sb_readonly) {
> +        HMAPX_FOR_EACH_SAFE (node,
> +                             &mgr->ifaces_per_state[OIF_WAITING_SB_COND])
> {
> +            struct ovs_iface *iface = node->data;
> +            if (local_datapaths) {
> +                const struct sbrec_port_binding *pb =
> +                    sbrec_port_binding_table_get_for_uuid(pb_table,
> +
> &iface->pb_uuid);
> +                ovs_assert(pb);
> +                struct local_datapath *ld =
> +                    get_local_datapath(local_datapaths,
> +                                       pb->datapath->tunnel_key);
> +                if (!ld) {
> +                    continue;
> +                }
> +                if (ld->monitor_updated ||
> +                    ld->expected_cond_seqno == ovnsb_cond_seqno) {
>

This check should be done outside, this has the potential to
actually skip an iface update because of the sb_readonly
thus never marking the interface as up.


> +
> +                    ovs_iface_set_state(mgr, iface, OIF_INSTALL_FLOWS);
> +                    iface->install_seqno = mgr->iface_seqno + 1;
> +                    update_seqno = true;
> +                }
> +            }
> +        }
> +    }
> +
>      /* Register for a notification about flows being installed in OVS for
> all
> -     * newly claimed interfaces for which pb->chassis has been updated.
> -     * Request a seqno update when the flows for new interfaces have been
> -     * installed in OVS.
> +     * newly claimed interfaces for which pb->chassis has been updated
> and all
> +     * updates have been received from SB. Request a seqno update when the
> +     * flows for new interfaces have been installed in OVS.
>       */
> -    if (new_ifaces) {
> +    if (update_seqno) {
>

I don't think we need to rename this as the
semantics are still the same just delayed by the potential
seqno update.


>          mgr->iface_seqno++;
>          ofctrl_seqno_update_create(mgr->iface_seq_type_pb_cfg,
>                                     mgr->iface_seqno);
> diff --git a/controller/if-status.h b/controller/if-status.h
> index d15ca3008..08aa4faf6 100644
> --- a/controller/if-status.h
> +++ b/controller/if-status.h
> @@ -43,6 +43,8 @@ void if_status_mgr_update(struct if_status_mgr *, struct
> local_binding_data *,
>                            const struct sbrec_chassis *chassis,
>                            const struct ovsrec_interface_table
> *iface_table,
>                            const struct sbrec_port_binding_table *pb_table,
> +                          const struct hmap *local_datapaths,
> +                          const unsigned int ovnsb_cond_seqno,
>                            bool ovs_readonly,
>                            bool sb_readonly);
>  void if_status_mgr_run(struct if_status_mgr *mgr, struct
> local_binding_data *,
> diff --git a/controller/local_data.h b/controller/local_data.h
> index 948c1a935..3ec14c30c 100644
> --- a/controller/local_data.h
> +++ b/controller/local_data.h
> @@ -65,6 +65,11 @@ struct local_datapath {
>
>      struct shash external_ports;
>      struct shash multichassis_ports;
> +
> +    /* The expected seqno from the sb to be fully updated for this
> datapath. */
> +    unsigned int expected_cond_seqno;
> +    /* If the monitor has been updated for this datapath. */
> +    bool monitor_updated;
>

I'm confused why is this needed? We don't need this see below.


>  };
>
>  struct local_datapath *local_datapath_alloc(
> diff --git a/controller/ovn-controller.c b/controller/ovn-controller.c
> index 4161fe2b3..dc8f7c0cc 100644
> --- a/controller/ovn-controller.c
> +++ b/controller/ovn-controller.c
> @@ -443,6 +443,18 @@ out:;
>          expected_cond_seqno = MAX(expected_cond_seqno, cond_seqnos[i]);
>      }
>
> +    if (local_datapaths) {
> +        struct local_datapath *ld;
> +        HMAP_FOR_EACH (ld, hmap_node, local_datapaths) {
> +            if (monitor_all) {
> +                ld->monitor_updated = true;
> +            }
> +            if (!ld->monitor_updated) {
> +                ld->expected_cond_seqno = expected_cond_seqno;
>

This doesn't feel right, we will update the seqno even for
datapaths that already have the condition updated.


> +            }
> +        }
> +    }
> +
>      ovsdb_idl_condition_destroy(&pb);
>      ovsdb_idl_condition_destroy(&lf);
>      ovsdb_idl_condition_destroy(&ldpg);
> @@ -7862,6 +7874,10 @@ main(int argc, char *argv[])
>                                                      ovs_idl_loop.idl),
>                                           sbrec_port_binding_table_get(
>                                                      ovnsb_idl_loop.idl),
> +                                         runtime_data ?
> +
>  &runtime_data->local_datapaths
> +                                               : NULL,
> +                                         ovnsb_cond_seqno,
>                                           !ovs_idl_txn,
>                                           !ovnsb_idl_txn);
>                      stopwatch_stop(IF_STATUS_MGR_UPDATE_STOPWATCH_NAME,
> diff --git a/tests/ovn-controller.at b/tests/ovn-controller.at
> index c98de9bc4..d8db9c345 100644
> --- a/tests/ovn-controller.at
> +++ b/tests/ovn-controller.at
> @@ -3944,3 +3944,68 @@ OVN_CLEANUP([hv1], [hv2
>  /already has encap ip.*cannot duplicate on/d])
>  AT_CLEANUP
>  ])
> +
> +OVN_FOR_EACH_NORTHD([
> +AT_SETUP([ovn-installed])
> +ovn_start
> +
> +net_add n1
> +sim_add hv1
> +
> +as hv1
> +ovs-vsctl add-br br-phys
> +ovn_attach n1 br-phys 192.168.0.1
> +ovn-appctl vlog/set dbg
>

nit: Leftover?


> +ovs-vsctl add-port br-int vif1 -- \
> +    set Interface vif1 external-ids:iface-id=lsp1
> +
> +check ovn-nbctl ls-add ls1
> +sleep_controller hv1
> +check ovn-nbctl --wait=sb lsp-add ls1 lsp1 -- \
> +                          lsp-set-addresses lsp1 "f0:00:00:00:00:01
> 10.0.0.1"
> +
> +sleep_sb
> +wake_up_controller hv1
> +
> +# Wait for pflow for lsp1
> +OVS_WAIT_UNTIL([
> +    ofport=$(as hv1 ovs-vsctl --bare --columns ofport find Interface
> name=vif1)
> +    echo "vif1 port=$ofport"
> +    test -n "$ofport" && test 1 -le $(as hv1 ovs-ofctl dump-flows br-int
> | grep -c in_port=$ofport)
> +])
> +
> +# If ovn-installed in ovs, all flows should be installed.
> +# In that case, there should be at least one flow with lsp1 address.
> +OVS_WAIT_UNTIL([
> +    ovn_installed=$(as hv1 ovs-vsctl get Interface vif1
> external_ids:ovn-installed)
> +    flow_count=$(as hv1 ovs-ofctl dump-flows br-int | grep -Fc "10.0.0.1")
> +    # for the monitor-all=true case the flow gets installed because
> ovn-controller is monitoring all
> +    # tables in OVN_SOUTHBOUND.
> +    if test -n "$ovn_installed"; then
> +        as hv1 ovs-ofctl dump-flows br-int > output
> +        test $flow_count -ge 1
> +    else
> +        true
> +    fi
> +])
> +
> +wake_up_sb
> +# After the southbound db has woken up and can send the update to the
> +# ovn-controller not monitoring all tables in the southbound db it
> +# should be able to install the interface.
> +OVS_WAIT_UNTIL([
> +    ovn_installed=$(as hv1 ovs-vsctl get Interface vif1
> external_ids:ovn-installed)
> +    flow_count=$(as hv1 ovs-ofctl dump-flows br-int | grep -Fc "10.0.0.1")
> +    echo "installed=$ovn_installed, count=$flow_count"
> +    if test -n "$ovn_installed"; then
> +        as hv1 ovs-ofctl dump-flows br-int > output
> +        test $flow_count -ge 1
> +    else
> +        false
> +    fi
> +])
> +wait_for_ports_up
> +
> +OVN_CLEANUP([hv1])
> +AT_CLEANUP
> +])
> --
> 2.53.0
>
>
I think we only need a boolean flag in the local datapath
e.g. "monitor_cond_updated" or something like that.
The flag will be false by default so any newly added datapath
is not cosidered to be ready. Now we need to do a few things:

1) If we have monitor_all mark all datapaths as ready, that needs
    to be done before we run the "if_status_mgr_run()". It probably
    fits within "if (engine_node_changed(&en_runtime_data)
    || engine_node_changed(&en_sb_logical_dp_group))".

2) We need to set the flag to true for datapaths that were updated,
     that can probably happen inside:
     "if (ovnsb_cond_seqno == ovnsb_expected_cond_seqno &&
          ovnsb_expected_cond_seqno != UINT_MAX &&
          sb_monitor_all)".

Unfortunately both can't happen in option 2 because that would mean
one loop delay for monitor_all case.

Regards,
Ales
_______________________________________________
dev mailing list
[email protected]
https://mail.openvswitch.org/mailman/listinfo/ovs-dev

Reply via email to