Hi Ilya,

On 8/22/22 11:19, Ilya Maximets wrote:
> daemon_started_recently() concept is flawed in terms that it uses
> fixed number of iterations for a countdown and a fixed timeout, so
> the undesired removal of configured resources can still happen under
> certain conditions even with this mechanism in place.
> 
> The root cause of the original problem is that conditional monitoring
> in ovn-controller works in stages.  ovn-controller doesn't know all
> the information it needs to monitor until the first monitor reply
> arrives, then it adjusts conditions and receives all the data.  That
> doesn't work well with the ovsdb_idl_has_ever_connected() check that
> will return 'true' right after the first monitor reply.
> This leads to situation where resources can be removed after the
> first monitor reply and re-added back after condition change.
> 
> Instead of waiting a fixed number of iterations/seconds, the correct
> solution should be to always start with unconditional monitoring.
> When the full initial snapshot is received, switch to conditional
> monitoring as data is complete.  This way ovsdb_idl_has_ever_connected()
> check will provide an accurate answer to the question: if we have
> a sufficient amount of data to make decisions about resource removal?

This does sound like the cleanest solution, I agree.  But does it create
an issue in large scale deployments where conditional monitoring is
used?  The full initial SB snapshot is likely quite large.

> 
> Once switched to conditional monitoring, southbound database will
> remove all the unnecessary data from the ovn-controller, so it will
> not need to keep it for long.  And since setup is using conditional
> monitoring, it's likely not that big anyway to be concerned about
> a brief spike in memory consumption on startup.
> 

For OpenShift and RedHat OpenStack this stands, we use
ovn-monitor-all=true for now.  But I'm not sure about other large scale
deployments.  I'll let Han comment more on this too.

> This change will also remove unnecessary waiting on startup when all
> the data is already available.
> 
> The main change is in the ovsdb_idl_has_ever_connected() check in the
> update_sb_monitors(), the rest is mostly a revert of commits that
> introduced the startup delay.
> 
> Test enhanced to run with and w/o conditional monitoring, since the
> issue is closely related it it.
> 

Good idea.

> Alternative solution might be to store the first condition change
> sequence number and replace all ovsdb_idl_has_ever_connected() calls
> with a function that also checks for the current sequence number being
> higher than condition change one, but that will require slightly more
> code.  Can be implemented in the future, if the first unconditional
> monitor reply will become a problem some day.
> 
> Fixes: 2b27eb3482a3 ("patch.c: Avoid patch interface deletion & recreation 
> during restart.")

It seems to me we need to do something similar for OpenFlow clearing
too.  Right now we wait a configured amount of time:

https://github.com/ovn-org/ovn/commit/896adfd2d

Regards,
Dumitru

> Signed-off-by: Ilya Maximets <[email protected]>
> ---
>  TODO.rst                    |  7 -----
>  controller/ovn-controller.c | 31 ++++++++--------------
>  controller/patch.c          |  9 +------
>  controller/vif-plug.c       | 33 +++++------------------
>  controller/vif-plug.h       |  1 +
>  lib/inc-proc-eng.c          | 11 --------
>  lib/inc-proc-eng.h          |  4 ---
>  lib/ovn-util.c              | 52 -------------------------------------
>  lib/ovn-util.h              |  4 ---
>  tests/ovn-controller.at     |  3 ++-
>  tests/ovn.at                |  2 --
>  11 files changed, 22 insertions(+), 135 deletions(-)
> 
> diff --git a/TODO.rst b/TODO.rst
> index 12738280b..d54817011 100644
> --- a/TODO.rst
> +++ b/TODO.rst
> @@ -172,10 +172,3 @@ OVN To-do List
>    * Multi-threaded logical flow computation was optimized for the case
>      when datapath groups are disabled.  Datpath groups are always enabled
>      now so northd parallel processing should be revisited.
> -
> -* ovn-controller daemon module
> -
> -  * Dumitru Ceara: Add a new module e.g. ovn/lib/daemon-ovn.c that wraps
> -    OVS' daemonize_start() call and initializes the additional things, like
> -    the unixctl commands. Or, we should move the APIs such as
> -    daemon_started_recently() to OVS's lib/daemon.
> diff --git a/controller/ovn-controller.c b/controller/ovn-controller.c
> index 0b0ccc48a..2ac6409bc 100644
> --- a/controller/ovn-controller.c
> +++ b/controller/ovn-controller.c
> @@ -94,7 +94,6 @@ static unixctl_cb_func debug_dump_lflow_conj_ids;
>  static unixctl_cb_func lflow_cache_flush_cmd;
>  static unixctl_cb_func lflow_cache_show_stats_cmd;
>  static unixctl_cb_func debug_delay_nb_cfg_report;
> -static unixctl_cb_func debug_ignore_startup_delay;
>  
>  #define DEFAULT_BRIDGE_NAME "br-int"
>  #define DEFAULT_DATAPATH "system"
> @@ -190,7 +189,14 @@ update_sb_monitors(struct ovsdb_idl *ovnsb_idl,
>       * avoid the unnecessarily extra wake-ups of ovn-controller. */
>      ovsdb_idl_condition_add_clause_true(&ldpg);
>  
> -    if (monitor_all) {
> +    /* Don't enable conditional monitoring until we have a full view on a
> +     * data that we need to monitor.  Without that we will send a first
> +     * monitor request with incomplete conditions, receive a reply, update
> +     * conditions and receive the rest of data.  Such a split may result in
> +     * removal and re-addition of certain resources (e.g. patch ports)
> +     * on restart of ovn-controller.  Database will send deletion requiests
> +     * for all the data that we don't actually need. */
> +    if (monitor_all || !ovsdb_idl_has_ever_connected(ovnsb_idl)) {
>          ovsdb_idl_condition_add_clause_true(&pb);
>          ovsdb_idl_condition_add_clause_true(&lf);
>          ovsdb_idl_condition_add_clause_true(&mb);
> @@ -867,12 +873,11 @@ static void
>  store_nb_cfg(struct ovsdb_idl_txn *sb_txn, struct ovsdb_idl_txn *ovs_txn,
>               const struct sbrec_chassis_private *chassis,
>               const struct ovsrec_bridge *br_int,
> -             unsigned int delay_nb_cfg_report)
> +             unsigned int delay_nb_cfg_report, int64_t startup_ts)
>  {
>      struct ofctrl_acked_seqnos *acked_nb_cfg_seqnos =
>          ofctrl_acked_seqnos_get(ofctrl_seq_type_nb_cfg);
>      uint64_t cur_cfg = acked_nb_cfg_seqnos->last_acked;
> -    int64_t startup_ts = daemon_startup_ts();
>  
>      if (ovs_txn && br_int
>              && startup_ts != smap_get_ullong(&br_int->external_ids,
> @@ -3860,9 +3865,6 @@ main(int argc, char *argv[])
>                               debug_dump_lflow_conj_ids,
>                               &lflow_output_data->conj_ids);
>  
> -    unixctl_command_register("debug/ignore-startup-delay", "", 0, 0,
> -                             debug_ignore_startup_delay, NULL);
> -
>      unsigned int ovs_cond_seqno = UINT_MAX;
>      unsigned int ovnsb_cond_seqno = UINT_MAX;
>      unsigned int ovnsb_expected_cond_seqno = UINT_MAX;
> @@ -3884,6 +3886,7 @@ main(int argc, char *argv[])
>      /* Main loop. */
>      exiting = false;
>      restart = false;
> +    int64_t startup_ts = time_wall_msec();
>      bool sb_monitor_all = false;
>      while (!exiting) {
>          memory_run();
> @@ -4062,10 +4065,6 @@ main(int argc, char *argv[])
>                      }
>                      stopwatch_stop(CONTROLLER_LOOP_STOPWATCH_NAME,
>                                     time_msec());
> -                    if (engine_has_updated()) {
> -                        daemon_started_recently_countdown();
> -                    }
> -
>                      ct_zones_data = engine_get_data(&en_ct_zones);
>                      if (ovs_idl_txn) {
>                          if (ct_zones_data) {
> @@ -4228,7 +4227,7 @@ main(int argc, char *argv[])
>              }
>  
>              store_nb_cfg(ovnsb_idl_txn, ovs_idl_txn, chassis_private,
> -                         br_int, delay_nb_cfg_report);
> +                         br_int, delay_nb_cfg_report, startup_ts);
>  
>              if (pending_pkt.conn) {
>                  struct ed_type_addr_sets *as_data =
> @@ -4705,11 +4704,3 @@ debug_dump_lflow_conj_ids(struct unixctl_conn *conn, 
> int argc OVS_UNUSED,
>      unixctl_command_reply(conn, ds_cstr(&conj_ids_dump));
>      ds_destroy(&conj_ids_dump);
>  }
> -
> -static void
> -debug_ignore_startup_delay(struct unixctl_conn *conn, int argc OVS_UNUSED,
> -                           const char *argv[] OVS_UNUSED, void *arg 
> OVS_UNUSED)
> -{
> -    daemon_started_recently_ignore();
> -    unixctl_command_reply(conn, NULL);
> -}
> diff --git a/controller/patch.c b/controller/patch.c
> index 12e0b6f7c..ed831302c 100644
> --- a/controller/patch.c
> +++ b/controller/patch.c
> @@ -311,14 +311,7 @@ patch_run(struct ovsdb_idl_txn *ovs_idl_txn,
>      SHASH_FOR_EACH_SAFE (port_node, &existing_ports) {
>          port = port_node->data;
>          shash_delete(&existing_ports, port_node);
> -        /* Wait for some iterations before really deleting any patch ports,
> -         * because with conditional monitoring it is possible that related SB
> -         * data is not completely downloaded yet after last restart of
> -         * ovn-controller.  Otherwise it may cause unncessary dataplane
> -         * interruption during restart/upgrade. */
> -        if (!daemon_started_recently()) {
> -            remove_port(bridge_table, port);
> -        }
> +        remove_port(bridge_table, port);
>      }
>      shash_destroy(&existing_ports);
>  }
> diff --git a/controller/vif-plug.c b/controller/vif-plug.c
> index 38348bf54..eafc513f5 100644
> --- a/controller/vif-plug.c
> +++ b/controller/vif-plug.c
> @@ -469,8 +469,7 @@ vif_plug_iface_touched_this_txn(
>  static bool
>  vif_plug_handle_lport_vif(const struct sbrec_port_binding *pb,
>                            struct vif_plug_ctx_in *vif_plug_ctx_in,
> -                          struct vif_plug_ctx_out *vif_plug_ctx_out,
> -                          bool can_unplug)
> +                          struct vif_plug_ctx_out *vif_plug_ctx_out)
>  {
>      if (vif_plug_iface_touched_this_txn(vif_plug_ctx_out, pb->logical_port)) 
> {
>          return true;
> @@ -482,7 +481,7 @@ vif_plug_handle_lport_vif(const struct sbrec_port_binding 
> *pb,
>      if (lport_can_bind_on_this_chassis(vif_plug_ctx_in->chassis_rec, pb)) {
>          handled &= consider_plug_lport(pb, lbinding,
>                                         vif_plug_ctx_in, vif_plug_ctx_out);
> -    } else if (can_unplug && lbinding && lbinding->iface) {
> +    } else if (lbinding && lbinding->iface) {
>          handled &= consider_unplug_iface(lbinding->iface, pb,
>                                           vif_plug_ctx_in, vif_plug_ctx_out);
>      }
> @@ -492,8 +491,7 @@ vif_plug_handle_lport_vif(const struct sbrec_port_binding 
> *pb,
>  static bool
>  vif_plug_handle_iface(const struct ovsrec_interface *iface_rec,
>                        struct vif_plug_ctx_in *vif_plug_ctx_in,
> -                      struct vif_plug_ctx_out *vif_plug_ctx_out,
> -                      bool can_unplug)
> +                      struct vif_plug_ctx_out *vif_plug_ctx_out)
>  {
>      bool handled = true;
>      const char *vif_plug_type = smap_get(&iface_rec->external_ids,
> @@ -513,10 +511,8 @@ vif_plug_handle_iface(const struct ovsrec_interface 
> *iface_rec,
>           * consider updating it */
>          handled &= consider_plug_lport(pb, lbinding,
>                                         vif_plug_ctx_in, vif_plug_ctx_out);
> -    } else if (can_unplug
> -               && (!pb
> -                   || !lport_can_bind_on_this_chassis(
> -                       vif_plug_ctx_in->chassis_rec, pb))) {
> +    } else if (!pb || !lport_can_bind_on_this_chassis(
> +                           vif_plug_ctx_in->chassis_rec, pb)) {
>          /* No lport for this interface or it is destined for different 
> chassis,
>           * consuder unplugging it */
>          handled &= consider_unplug_iface(iface_rec, pb,
> @@ -525,31 +521,17 @@ vif_plug_handle_iface(const struct ovsrec_interface 
> *iface_rec,
>      return handled;
>  }
>  
> -/* On initial startup or on IDL reconnect, several rounds of the main loop 
> may
> - * run before data is actually loaded in the IDL, primarily depending on
> - * conditional monitoring status and other events that could trigger main 
> loop
> - * runs during this period.  Until we find a reliable way to determine the
> - * completeness of the initial data downloading we need this counter so that 
> we
> - * do not erronously unplug ports because the data is just not loaded yet.
> - */
>  void
>  vif_plug_run(struct vif_plug_ctx_in *vif_plug_ctx_in,
>               struct vif_plug_ctx_out *vif_plug_ctx_out)
>  {
> -    bool delay_plug = daemon_started_recently();
> -    if (delay_plug) {
> -        VLOG_DBG("vif_plug_run: daemon started recently, will not unplug "
> -                 "ports in this iteration.");
> -    }
> -
>      if (!vif_plug_ctx_in->chassis_rec) {
>          return;
>      }
>      const struct ovsrec_interface *iface_rec;
>      OVSREC_INTERFACE_TABLE_FOR_EACH (iface_rec,
>                                       vif_plug_ctx_in->iface_table) {
> -        vif_plug_handle_iface(iface_rec, vif_plug_ctx_in, vif_plug_ctx_out,
> -                              !delay_plug);
> +        vif_plug_handle_iface(iface_rec, vif_plug_ctx_in, vif_plug_ctx_out);
>      }
>  
>      struct sbrec_port_binding *target =
> @@ -564,8 +546,7 @@ vif_plug_run(struct vif_plug_ctx_in *vif_plug_ctx_in,
>              vif_plug_ctx_in->sbrec_port_binding_by_requested_chassis) {
>          enum en_lport_type lport_type = get_lport_type(pb);
>          if (lport_type == LP_VIF) {
> -            vif_plug_handle_lport_vif(pb, vif_plug_ctx_in, vif_plug_ctx_out,
> -                                      !delay_plug);
> +            vif_plug_handle_lport_vif(pb, vif_plug_ctx_in, vif_plug_ctx_out);
>          }
>      }
>      sbrec_port_binding_index_destroy_row(target);
> diff --git a/controller/vif-plug.h b/controller/vif-plug.h
> index 7a1978e38..76063591b 100644
> --- a/controller/vif-plug.h
> +++ b/controller/vif-plug.h
> @@ -71,6 +71,7 @@ void vif_plug_clear_changed(struct shash 
> *deleted_iface_ids);
>  void vif_plug_finish_changed(struct shash *changed_iface_ids);
>  void vif_plug_clear_deleted(struct shash *deleted_iface_ids);
>  void vif_plug_finish_deleted(struct shash *changed_iface_ids);
> +void vif_plug_reset_idl_prime_counter(void);
>  
>  #ifdef  __cplusplus
>  }
> diff --git a/lib/inc-proc-eng.c b/lib/inc-proc-eng.c
> index 2e2b31033..575b774ae 100644
> --- a/lib/inc-proc-eng.c
> +++ b/lib/inc-proc-eng.c
> @@ -312,17 +312,6 @@ engine_has_run(void)
>      return false;
>  }
>  
> -bool
> -engine_has_updated(void)
> -{
> -    for (size_t i = 0; i < engine_n_nodes; i++) {
> -        if (engine_nodes[i]->state == EN_UPDATED) {
> -            return true;
> -        }
> -    }
> -    return false;
> -}
> -
>  bool
>  engine_aborted(void)
>  {
> diff --git a/lib/inc-proc-eng.h b/lib/inc-proc-eng.h
> index dc365dc18..9bfab1f7c 100644
> --- a/lib/inc-proc-eng.h
> +++ b/lib/inc-proc-eng.h
> @@ -238,10 +238,6 @@ bool engine_node_changed(struct engine_node *node);
>  /* Return true if the engine has run in the last iteration. */
>  bool engine_has_run(void);
>  
> -/* Return true if the engine has any update in any node, i.e. any input
> - * has changed; false if nothing has changed. */
> -bool engine_has_updated(void);
> -
>  /* Returns true if during the last engine run we had to abort processing. */
>  bool engine_aborted(void);
>  
> diff --git a/lib/ovn-util.c b/lib/ovn-util.c
> index d80db179a..616999eab 100644
> --- a/lib/ovn-util.c
> +++ b/lib/ovn-util.c
> @@ -883,55 +883,3 @@ get_bridge(const struct ovsrec_bridge_table 
> *bridge_table, const char *br_name)
>      }
>      return NULL;
>  }
> -
> -#define DAEMON_STARTUP_DELAY_SEED 20
> -#define DAEMON_STARTUP_DELAY_MS   10000
> -
> -static int64_t startup_ts;
> -static int startup_delay = DAEMON_STARTUP_DELAY_SEED;
> -
> -/* Used by debug command only, for tests. */
> -static bool ignore_startup_delay = false;
> -
> -OVS_CONSTRUCTOR(startup_ts_initializer) {
> -    startup_ts = time_wall_msec();
> -}
> -
> -int64_t
> -daemon_startup_ts(void)
> -{
> -    return startup_ts;
> -}
> -
> -void
> -daemon_started_recently_countdown(void)
> -{
> -    if (startup_delay > 0) {
> -        startup_delay--;
> -    }
> -}
> -
> -void
> -daemon_started_recently_ignore(void)
> -{
> -    ignore_startup_delay = true;
> -}
> -
> -bool
> -daemon_started_recently(void)
> -{
> -    if (ignore_startup_delay) {
> -        return false;
> -    }
> -
> -    VLOG_DBG("startup_delay: %d, startup_ts: %"PRId64, startup_delay,
> -             startup_ts);
> -
> -    /* Ensure that at least an amount of updates has been handled. */
> -    if (startup_delay) {
> -        return true;
> -    }
> -
> -    /* Ensure that at least an amount of time has passed. */
> -    return time_wall_msec() - startup_ts <= DAEMON_STARTUP_DELAY_MS;
> -}
> diff --git a/lib/ovn-util.h b/lib/ovn-util.h
> index 145f974ed..b3905ef7b 100644
> --- a/lib/ovn-util.h
> +++ b/lib/ovn-util.h
> @@ -309,9 +309,5 @@ struct ovsrec_bridge_table;
>  const struct ovsrec_bridge *get_bridge(const struct ovsrec_bridge_table *,
>                                         const char *br_name);
>  
> -void daemon_started_recently_countdown(void);
> -void daemon_started_recently_ignore(void);
> -bool daemon_started_recently(void);
> -int64_t daemon_startup_ts(void);
>  
>  #endif /* OVN_UTIL_H */
> diff --git a/tests/ovn-controller.at b/tests/ovn-controller.at
> index 3c3fb31c7..2d765f31a 100644
> --- a/tests/ovn-controller.at
> +++ b/tests/ovn-controller.at
> @@ -114,7 +114,6 @@ check_patches \
>      'br-eth0 patch-localnet2-to-br-int patch-br-int-to-localnet2'
>  
>  # Delete the mapping and the ovn-bridge-mapping patch ports should go away.
> -check ovn-appctl -t ovn-controller debug/ignore-startup-delay
>  AT_CHECK([ovs-vsctl remove Open_vSwitch . external-ids ovn-bridge-mappings])
>  check_bridge_mappings
>  check_patches
> @@ -2268,6 +2267,7 @@ OVS_WAIT_UNTIL([test 6 = $(as hv1 ovs-ofctl dump-flows 
> br-int | grep -c $(port_b
>  OVN_CLEANUP([hv1])
>  AT_CLEANUP
>  
> +OVN_FOR_EACH_NORTHD([
>  AT_SETUP([ovn-controller - restart should not delete patch ports])
>  
>  ovn_start
> @@ -2337,3 +2337,4 @@ done
>  AT_CHECK([grep "deleted interface patch" hv1/ovs-vswitchd.log], [1], 
> [ignore])
>  OVN_CLEANUP([hv1])
>  AT_CLEANUP
> +])
> diff --git a/tests/ovn.at b/tests/ovn.at
> index bba2c9c1d..21a99089f 100644
> --- a/tests/ovn.at
> +++ b/tests/ovn.at
> @@ -31681,7 +31681,6 @@ OVS_WAIT_UNTIL([
>      test x43 = x$(as hv1 ovs-vsctl get Interface ${iface1_uuid} mtu_request)
>  ])
>  
> -as hv1 check ovn-appctl -t ovn-controller debug/ignore-startup-delay
>  # Check that pointing requested-chassis somewhere else will unplug the port
>  check ovn-nbctl --wait=hv set Logical_Switch_Port lsp1 \
>      options:requested-chassis=non-existent-chassis
> @@ -31689,7 +31688,6 @@ OVS_WAIT_UNTIL([
>      ! as hv1 ovs-vsctl get Interface ${iface1_uuid} _uuid
>  ])
>  
> -as hv2 check ovn-appctl -t ovn-controller debug/ignore-startup-delay
>  # Check that removing an lport will unplug it
>  AT_CHECK([test x${iface2_uuid} = x$(as hv2 ovs-vsctl get Interface 
> ${iface2_uuid} _uuid)], [0], [])
>  check ovn-nbctl --wait=hv lsp-del ${lsp2_uuid}

_______________________________________________
dev mailing list
[email protected]
https://mail.openvswitch.org/mailman/listinfo/ovs-dev

Reply via email to