Hi Ilya,
On 8/22/22 11:19, Ilya Maximets wrote:
> daemon_started_recently() concept is flawed in terms that it uses
> fixed number of iterations for a countdown and a fixed timeout, so
> the undesired removal of configured resources can still happen under
> certain conditions even with this mechanism in place.
>
> The root cause of the original problem is that conditional monitoring
> in ovn-controller works in stages. ovn-controller doesn't know all
> the information it needs to monitor until the first monitor reply
> arrives, then it adjusts conditions and receives all the data. That
> doesn't work well with the ovsdb_idl_has_ever_connected() check that
> will return 'true' right after the first monitor reply.
> This leads to situation where resources can be removed after the
> first monitor reply and re-added back after condition change.
>
> Instead of waiting a fixed number of iterations/seconds, the correct
> solution should be to always start with unconditional monitoring.
> When the full initial snapshot is received, switch to conditional
> monitoring as data is complete. This way ovsdb_idl_has_ever_connected()
> check will provide an accurate answer to the question: if we have
> a sufficient amount of data to make decisions about resource removal?
This does sound like the cleanest solution, I agree. But does it create
an issue in large scale deployments where conditional monitoring is
used? The full initial SB snapshot is likely quite large.
>
> Once switched to conditional monitoring, southbound database will
> remove all the unnecessary data from the ovn-controller, so it will
> not need to keep it for long. And since setup is using conditional
> monitoring, it's likely not that big anyway to be concerned about
> a brief spike in memory consumption on startup.
>
For OpenShift and RedHat OpenStack this stands, we use
ovn-monitor-all=true for now. But I'm not sure about other large scale
deployments. I'll let Han comment more on this too.
> This change will also remove unnecessary waiting on startup when all
> the data is already available.
>
> The main change is in the ovsdb_idl_has_ever_connected() check in the
> update_sb_monitors(), the rest is mostly a revert of commits that
> introduced the startup delay.
>
> Test enhanced to run with and w/o conditional monitoring, since the
> issue is closely related it it.
>
Good idea.
> Alternative solution might be to store the first condition change
> sequence number and replace all ovsdb_idl_has_ever_connected() calls
> with a function that also checks for the current sequence number being
> higher than condition change one, but that will require slightly more
> code. Can be implemented in the future, if the first unconditional
> monitor reply will become a problem some day.
>
> Fixes: 2b27eb3482a3 ("patch.c: Avoid patch interface deletion & recreation
> during restart.")
It seems to me we need to do something similar for OpenFlow clearing
too. Right now we wait a configured amount of time:
https://github.com/ovn-org/ovn/commit/896adfd2d
Regards,
Dumitru
> Signed-off-by: Ilya Maximets <[email protected]>
> ---
> TODO.rst | 7 -----
> controller/ovn-controller.c | 31 ++++++++--------------
> controller/patch.c | 9 +------
> controller/vif-plug.c | 33 +++++------------------
> controller/vif-plug.h | 1 +
> lib/inc-proc-eng.c | 11 --------
> lib/inc-proc-eng.h | 4 ---
> lib/ovn-util.c | 52 -------------------------------------
> lib/ovn-util.h | 4 ---
> tests/ovn-controller.at | 3 ++-
> tests/ovn.at | 2 --
> 11 files changed, 22 insertions(+), 135 deletions(-)
>
> diff --git a/TODO.rst b/TODO.rst
> index 12738280b..d54817011 100644
> --- a/TODO.rst
> +++ b/TODO.rst
> @@ -172,10 +172,3 @@ OVN To-do List
> * Multi-threaded logical flow computation was optimized for the case
> when datapath groups are disabled. Datpath groups are always enabled
> now so northd parallel processing should be revisited.
> -
> -* ovn-controller daemon module
> -
> - * Dumitru Ceara: Add a new module e.g. ovn/lib/daemon-ovn.c that wraps
> - OVS' daemonize_start() call and initializes the additional things, like
> - the unixctl commands. Or, we should move the APIs such as
> - daemon_started_recently() to OVS's lib/daemon.
> diff --git a/controller/ovn-controller.c b/controller/ovn-controller.c
> index 0b0ccc48a..2ac6409bc 100644
> --- a/controller/ovn-controller.c
> +++ b/controller/ovn-controller.c
> @@ -94,7 +94,6 @@ static unixctl_cb_func debug_dump_lflow_conj_ids;
> static unixctl_cb_func lflow_cache_flush_cmd;
> static unixctl_cb_func lflow_cache_show_stats_cmd;
> static unixctl_cb_func debug_delay_nb_cfg_report;
> -static unixctl_cb_func debug_ignore_startup_delay;
>
> #define DEFAULT_BRIDGE_NAME "br-int"
> #define DEFAULT_DATAPATH "system"
> @@ -190,7 +189,14 @@ update_sb_monitors(struct ovsdb_idl *ovnsb_idl,
> * avoid the unnecessarily extra wake-ups of ovn-controller. */
> ovsdb_idl_condition_add_clause_true(&ldpg);
>
> - if (monitor_all) {
> + /* Don't enable conditional monitoring until we have a full view on a
> + * data that we need to monitor. Without that we will send a first
> + * monitor request with incomplete conditions, receive a reply, update
> + * conditions and receive the rest of data. Such a split may result in
> + * removal and re-addition of certain resources (e.g. patch ports)
> + * on restart of ovn-controller. Database will send deletion requiests
> + * for all the data that we don't actually need. */
> + if (monitor_all || !ovsdb_idl_has_ever_connected(ovnsb_idl)) {
> ovsdb_idl_condition_add_clause_true(&pb);
> ovsdb_idl_condition_add_clause_true(&lf);
> ovsdb_idl_condition_add_clause_true(&mb);
> @@ -867,12 +873,11 @@ static void
> store_nb_cfg(struct ovsdb_idl_txn *sb_txn, struct ovsdb_idl_txn *ovs_txn,
> const struct sbrec_chassis_private *chassis,
> const struct ovsrec_bridge *br_int,
> - unsigned int delay_nb_cfg_report)
> + unsigned int delay_nb_cfg_report, int64_t startup_ts)
> {
> struct ofctrl_acked_seqnos *acked_nb_cfg_seqnos =
> ofctrl_acked_seqnos_get(ofctrl_seq_type_nb_cfg);
> uint64_t cur_cfg = acked_nb_cfg_seqnos->last_acked;
> - int64_t startup_ts = daemon_startup_ts();
>
> if (ovs_txn && br_int
> && startup_ts != smap_get_ullong(&br_int->external_ids,
> @@ -3860,9 +3865,6 @@ main(int argc, char *argv[])
> debug_dump_lflow_conj_ids,
> &lflow_output_data->conj_ids);
>
> - unixctl_command_register("debug/ignore-startup-delay", "", 0, 0,
> - debug_ignore_startup_delay, NULL);
> -
> unsigned int ovs_cond_seqno = UINT_MAX;
> unsigned int ovnsb_cond_seqno = UINT_MAX;
> unsigned int ovnsb_expected_cond_seqno = UINT_MAX;
> @@ -3884,6 +3886,7 @@ main(int argc, char *argv[])
> /* Main loop. */
> exiting = false;
> restart = false;
> + int64_t startup_ts = time_wall_msec();
> bool sb_monitor_all = false;
> while (!exiting) {
> memory_run();
> @@ -4062,10 +4065,6 @@ main(int argc, char *argv[])
> }
> stopwatch_stop(CONTROLLER_LOOP_STOPWATCH_NAME,
> time_msec());
> - if (engine_has_updated()) {
> - daemon_started_recently_countdown();
> - }
> -
> ct_zones_data = engine_get_data(&en_ct_zones);
> if (ovs_idl_txn) {
> if (ct_zones_data) {
> @@ -4228,7 +4227,7 @@ main(int argc, char *argv[])
> }
>
> store_nb_cfg(ovnsb_idl_txn, ovs_idl_txn, chassis_private,
> - br_int, delay_nb_cfg_report);
> + br_int, delay_nb_cfg_report, startup_ts);
>
> if (pending_pkt.conn) {
> struct ed_type_addr_sets *as_data =
> @@ -4705,11 +4704,3 @@ debug_dump_lflow_conj_ids(struct unixctl_conn *conn,
> int argc OVS_UNUSED,
> unixctl_command_reply(conn, ds_cstr(&conj_ids_dump));
> ds_destroy(&conj_ids_dump);
> }
> -
> -static void
> -debug_ignore_startup_delay(struct unixctl_conn *conn, int argc OVS_UNUSED,
> - const char *argv[] OVS_UNUSED, void *arg
> OVS_UNUSED)
> -{
> - daemon_started_recently_ignore();
> - unixctl_command_reply(conn, NULL);
> -}
> diff --git a/controller/patch.c b/controller/patch.c
> index 12e0b6f7c..ed831302c 100644
> --- a/controller/patch.c
> +++ b/controller/patch.c
> @@ -311,14 +311,7 @@ patch_run(struct ovsdb_idl_txn *ovs_idl_txn,
> SHASH_FOR_EACH_SAFE (port_node, &existing_ports) {
> port = port_node->data;
> shash_delete(&existing_ports, port_node);
> - /* Wait for some iterations before really deleting any patch ports,
> - * because with conditional monitoring it is possible that related SB
> - * data is not completely downloaded yet after last restart of
> - * ovn-controller. Otherwise it may cause unncessary dataplane
> - * interruption during restart/upgrade. */
> - if (!daemon_started_recently()) {
> - remove_port(bridge_table, port);
> - }
> + remove_port(bridge_table, port);
> }
> shash_destroy(&existing_ports);
> }
> diff --git a/controller/vif-plug.c b/controller/vif-plug.c
> index 38348bf54..eafc513f5 100644
> --- a/controller/vif-plug.c
> +++ b/controller/vif-plug.c
> @@ -469,8 +469,7 @@ vif_plug_iface_touched_this_txn(
> static bool
> vif_plug_handle_lport_vif(const struct sbrec_port_binding *pb,
> struct vif_plug_ctx_in *vif_plug_ctx_in,
> - struct vif_plug_ctx_out *vif_plug_ctx_out,
> - bool can_unplug)
> + struct vif_plug_ctx_out *vif_plug_ctx_out)
> {
> if (vif_plug_iface_touched_this_txn(vif_plug_ctx_out, pb->logical_port))
> {
> return true;
> @@ -482,7 +481,7 @@ vif_plug_handle_lport_vif(const struct sbrec_port_binding
> *pb,
> if (lport_can_bind_on_this_chassis(vif_plug_ctx_in->chassis_rec, pb)) {
> handled &= consider_plug_lport(pb, lbinding,
> vif_plug_ctx_in, vif_plug_ctx_out);
> - } else if (can_unplug && lbinding && lbinding->iface) {
> + } else if (lbinding && lbinding->iface) {
> handled &= consider_unplug_iface(lbinding->iface, pb,
> vif_plug_ctx_in, vif_plug_ctx_out);
> }
> @@ -492,8 +491,7 @@ vif_plug_handle_lport_vif(const struct sbrec_port_binding
> *pb,
> static bool
> vif_plug_handle_iface(const struct ovsrec_interface *iface_rec,
> struct vif_plug_ctx_in *vif_plug_ctx_in,
> - struct vif_plug_ctx_out *vif_plug_ctx_out,
> - bool can_unplug)
> + struct vif_plug_ctx_out *vif_plug_ctx_out)
> {
> bool handled = true;
> const char *vif_plug_type = smap_get(&iface_rec->external_ids,
> @@ -513,10 +511,8 @@ vif_plug_handle_iface(const struct ovsrec_interface
> *iface_rec,
> * consider updating it */
> handled &= consider_plug_lport(pb, lbinding,
> vif_plug_ctx_in, vif_plug_ctx_out);
> - } else if (can_unplug
> - && (!pb
> - || !lport_can_bind_on_this_chassis(
> - vif_plug_ctx_in->chassis_rec, pb))) {
> + } else if (!pb || !lport_can_bind_on_this_chassis(
> + vif_plug_ctx_in->chassis_rec, pb)) {
> /* No lport for this interface or it is destined for different
> chassis,
> * consuder unplugging it */
> handled &= consider_unplug_iface(iface_rec, pb,
> @@ -525,31 +521,17 @@ vif_plug_handle_iface(const struct ovsrec_interface
> *iface_rec,
> return handled;
> }
>
> -/* On initial startup or on IDL reconnect, several rounds of the main loop
> may
> - * run before data is actually loaded in the IDL, primarily depending on
> - * conditional monitoring status and other events that could trigger main
> loop
> - * runs during this period. Until we find a reliable way to determine the
> - * completeness of the initial data downloading we need this counter so that
> we
> - * do not erronously unplug ports because the data is just not loaded yet.
> - */
> void
> vif_plug_run(struct vif_plug_ctx_in *vif_plug_ctx_in,
> struct vif_plug_ctx_out *vif_plug_ctx_out)
> {
> - bool delay_plug = daemon_started_recently();
> - if (delay_plug) {
> - VLOG_DBG("vif_plug_run: daemon started recently, will not unplug "
> - "ports in this iteration.");
> - }
> -
> if (!vif_plug_ctx_in->chassis_rec) {
> return;
> }
> const struct ovsrec_interface *iface_rec;
> OVSREC_INTERFACE_TABLE_FOR_EACH (iface_rec,
> vif_plug_ctx_in->iface_table) {
> - vif_plug_handle_iface(iface_rec, vif_plug_ctx_in, vif_plug_ctx_out,
> - !delay_plug);
> + vif_plug_handle_iface(iface_rec, vif_plug_ctx_in, vif_plug_ctx_out);
> }
>
> struct sbrec_port_binding *target =
> @@ -564,8 +546,7 @@ vif_plug_run(struct vif_plug_ctx_in *vif_plug_ctx_in,
> vif_plug_ctx_in->sbrec_port_binding_by_requested_chassis) {
> enum en_lport_type lport_type = get_lport_type(pb);
> if (lport_type == LP_VIF) {
> - vif_plug_handle_lport_vif(pb, vif_plug_ctx_in, vif_plug_ctx_out,
> - !delay_plug);
> + vif_plug_handle_lport_vif(pb, vif_plug_ctx_in, vif_plug_ctx_out);
> }
> }
> sbrec_port_binding_index_destroy_row(target);
> diff --git a/controller/vif-plug.h b/controller/vif-plug.h
> index 7a1978e38..76063591b 100644
> --- a/controller/vif-plug.h
> +++ b/controller/vif-plug.h
> @@ -71,6 +71,7 @@ void vif_plug_clear_changed(struct shash
> *deleted_iface_ids);
> void vif_plug_finish_changed(struct shash *changed_iface_ids);
> void vif_plug_clear_deleted(struct shash *deleted_iface_ids);
> void vif_plug_finish_deleted(struct shash *changed_iface_ids);
> +void vif_plug_reset_idl_prime_counter(void);
>
> #ifdef __cplusplus
> }
> diff --git a/lib/inc-proc-eng.c b/lib/inc-proc-eng.c
> index 2e2b31033..575b774ae 100644
> --- a/lib/inc-proc-eng.c
> +++ b/lib/inc-proc-eng.c
> @@ -312,17 +312,6 @@ engine_has_run(void)
> return false;
> }
>
> -bool
> -engine_has_updated(void)
> -{
> - for (size_t i = 0; i < engine_n_nodes; i++) {
> - if (engine_nodes[i]->state == EN_UPDATED) {
> - return true;
> - }
> - }
> - return false;
> -}
> -
> bool
> engine_aborted(void)
> {
> diff --git a/lib/inc-proc-eng.h b/lib/inc-proc-eng.h
> index dc365dc18..9bfab1f7c 100644
> --- a/lib/inc-proc-eng.h
> +++ b/lib/inc-proc-eng.h
> @@ -238,10 +238,6 @@ bool engine_node_changed(struct engine_node *node);
> /* Return true if the engine has run in the last iteration. */
> bool engine_has_run(void);
>
> -/* Return true if the engine has any update in any node, i.e. any input
> - * has changed; false if nothing has changed. */
> -bool engine_has_updated(void);
> -
> /* Returns true if during the last engine run we had to abort processing. */
> bool engine_aborted(void);
>
> diff --git a/lib/ovn-util.c b/lib/ovn-util.c
> index d80db179a..616999eab 100644
> --- a/lib/ovn-util.c
> +++ b/lib/ovn-util.c
> @@ -883,55 +883,3 @@ get_bridge(const struct ovsrec_bridge_table
> *bridge_table, const char *br_name)
> }
> return NULL;
> }
> -
> -#define DAEMON_STARTUP_DELAY_SEED 20
> -#define DAEMON_STARTUP_DELAY_MS 10000
> -
> -static int64_t startup_ts;
> -static int startup_delay = DAEMON_STARTUP_DELAY_SEED;
> -
> -/* Used by debug command only, for tests. */
> -static bool ignore_startup_delay = false;
> -
> -OVS_CONSTRUCTOR(startup_ts_initializer) {
> - startup_ts = time_wall_msec();
> -}
> -
> -int64_t
> -daemon_startup_ts(void)
> -{
> - return startup_ts;
> -}
> -
> -void
> -daemon_started_recently_countdown(void)
> -{
> - if (startup_delay > 0) {
> - startup_delay--;
> - }
> -}
> -
> -void
> -daemon_started_recently_ignore(void)
> -{
> - ignore_startup_delay = true;
> -}
> -
> -bool
> -daemon_started_recently(void)
> -{
> - if (ignore_startup_delay) {
> - return false;
> - }
> -
> - VLOG_DBG("startup_delay: %d, startup_ts: %"PRId64, startup_delay,
> - startup_ts);
> -
> - /* Ensure that at least an amount of updates has been handled. */
> - if (startup_delay) {
> - return true;
> - }
> -
> - /* Ensure that at least an amount of time has passed. */
> - return time_wall_msec() - startup_ts <= DAEMON_STARTUP_DELAY_MS;
> -}
> diff --git a/lib/ovn-util.h b/lib/ovn-util.h
> index 145f974ed..b3905ef7b 100644
> --- a/lib/ovn-util.h
> +++ b/lib/ovn-util.h
> @@ -309,9 +309,5 @@ struct ovsrec_bridge_table;
> const struct ovsrec_bridge *get_bridge(const struct ovsrec_bridge_table *,
> const char *br_name);
>
> -void daemon_started_recently_countdown(void);
> -void daemon_started_recently_ignore(void);
> -bool daemon_started_recently(void);
> -int64_t daemon_startup_ts(void);
>
> #endif /* OVN_UTIL_H */
> diff --git a/tests/ovn-controller.at b/tests/ovn-controller.at
> index 3c3fb31c7..2d765f31a 100644
> --- a/tests/ovn-controller.at
> +++ b/tests/ovn-controller.at
> @@ -114,7 +114,6 @@ check_patches \
> 'br-eth0 patch-localnet2-to-br-int patch-br-int-to-localnet2'
>
> # Delete the mapping and the ovn-bridge-mapping patch ports should go away.
> -check ovn-appctl -t ovn-controller debug/ignore-startup-delay
> AT_CHECK([ovs-vsctl remove Open_vSwitch . external-ids ovn-bridge-mappings])
> check_bridge_mappings
> check_patches
> @@ -2268,6 +2267,7 @@ OVS_WAIT_UNTIL([test 6 = $(as hv1 ovs-ofctl dump-flows
> br-int | grep -c $(port_b
> OVN_CLEANUP([hv1])
> AT_CLEANUP
>
> +OVN_FOR_EACH_NORTHD([
> AT_SETUP([ovn-controller - restart should not delete patch ports])
>
> ovn_start
> @@ -2337,3 +2337,4 @@ done
> AT_CHECK([grep "deleted interface patch" hv1/ovs-vswitchd.log], [1],
> [ignore])
> OVN_CLEANUP([hv1])
> AT_CLEANUP
> +])
> diff --git a/tests/ovn.at b/tests/ovn.at
> index bba2c9c1d..21a99089f 100644
> --- a/tests/ovn.at
> +++ b/tests/ovn.at
> @@ -31681,7 +31681,6 @@ OVS_WAIT_UNTIL([
> test x43 = x$(as hv1 ovs-vsctl get Interface ${iface1_uuid} mtu_request)
> ])
>
> -as hv1 check ovn-appctl -t ovn-controller debug/ignore-startup-delay
> # Check that pointing requested-chassis somewhere else will unplug the port
> check ovn-nbctl --wait=hv set Logical_Switch_Port lsp1 \
> options:requested-chassis=non-existent-chassis
> @@ -31689,7 +31688,6 @@ OVS_WAIT_UNTIL([
> ! as hv1 ovs-vsctl get Interface ${iface1_uuid} _uuid
> ])
>
> -as hv2 check ovn-appctl -t ovn-controller debug/ignore-startup-delay
> # Check that removing an lport will unplug it
> AT_CHECK([test x${iface2_uuid} = x$(as hv2 ovs-vsctl get Interface
> ${iface2_uuid} _uuid)], [0], [])
> check ovn-nbctl --wait=hv lsp-del ${lsp2_uuid}
_______________________________________________
dev mailing list
[email protected]
https://mail.openvswitch.org/mailman/listinfo/ovs-dev