Hi Numan! A few nits I thought of mentioning [inline]
> On Nov 17, 2020, at 6:57 AM, [email protected] wrote: > > From: Numan Siddique <[email protected]> > > OVN recommends updating/upgrading ovn-controllers first and then > ovn-northd and OVN DB ovsdb-servers. This is to ensure that any > new functionality specified by the database or logical flows created > by ovn-northd is understood by ovn-controller. > > However certain deployments may upgrade ovn-northd services first and > then ovn-controllers. In a large scal deployment, this can result in > downtime during upgrades as old ovn-controllers may not understand > new logical flows or new actions added by ovn-northd. > > Even with upgrading ovn-controllers first, can result in ovn-controllers > rejecting some of the logical flows if an existing OVN action is > changed. One such example is ct_commit action which recently was updated > to take new arguments. > > To avoid such downtimes during upgrades, this patch adds the > functionality of pinning ovn-controller and ovn-northd to a specific > version. An internal OVN version is generated and this version is stored > by ovn-northd in the Southbound SB_Global table's > options:northd_internal_version. > > When ovn-controller notices that the internal version has changed, it > stops handling the database changes - both Southbound and OVS. All the > existing OF flows are preserved. When ovn-controller is upgraded to the > same version as ovn-northd services, it will process the database > changes. > > This feature is made optional and disabled by default. Any CMS can > enable it by configuring the OVS local database with the option - > ovn-pin-to-northd=false. ^^=false or =true^^ ? > > Signed-off-by: Numan Siddique <[email protected]> > --- > > v1 -> v2 > ---- > * Added test cases and missing documentation. > * Renamed the ovsdb option from ignore_northd_verison > to ovn-pin-to-northd. > * Added NEWS item. > > NEWS | 3 + > controller/ovn-controller.8.xml | 11 +++ > controller/ovn-controller.c | 58 ++++++++++++- > lib/ovn-util.c | 17 ++++ > lib/ovn-util.h | 4 + > northd/ovn-northd.c | 18 +++- > tests/ovn.at | 147 ++++++++++++++++++++++++++++++++ > 7 files changed, 251 insertions(+), 7 deletions(-) > > diff --git a/NEWS b/NEWS > index 6010230679..bb95724b36 100644 > --- a/NEWS > +++ b/NEWS > @@ -8,6 +8,9 @@ Post-v20.09.0 > server. > - Support other_config:vlan-passthru=true to allow VLAN tagged incoming > traffic. > + - Support version pinning between ovn-northd and ovn-controller as an > + option. If the option is enabled and the versions don't match, > + ovn-controller will not process any DB changes. > > OVN v20.09.0 - 28 Sep 2020 > -------------------------- > diff --git a/controller/ovn-controller.8.xml b/controller/ovn-controller.8.xml > index 16bc47b205..357edd1f5c 100644 > --- a/controller/ovn-controller.8.xml > +++ b/controller/ovn-controller.8.xml > @@ -233,6 +233,17 @@ > The boolean flag indicates if the chassis is used as an > interconnection gateway. > </dd> > + > + <dt><code>external_ids:ovn-pin-to-northd</code></dt> > + <dd> > + The boolean flag indicates if <code>ovn-controller</code> needs to > + be pinned to the exact <code>ovn-northd</code> version. If this > + flag is set to true and the <code>ovn-northd's</code> version > (reported > + in the Southbound database) doesn't match with the > + <code>ovn-controller's</code> internal version, then it will stop > + processing the Southbound and local Open vSwitch database changes. > + The default value is considered false if this option is not defined. > + </dd> > </dl> > > <p> > diff --git a/controller/ovn-controller.c b/controller/ovn-controller.c > index a06cae3ccb..a7344ea9c4 100644 > --- a/controller/ovn-controller.c > +++ b/controller/ovn-controller.c > @@ -2136,6 +2136,37 @@ struct ovn_controller_exit_args { > bool *restart; > }; > > +static bool > +pin_to_northd(struct ovsdb_idl *ovs_idl) > +{ > + const struct ovsrec_open_vswitch *cfg = > ovsrec_open_vswitch_first(ovs_idl); > + return !cfg ? false : smap_get_bool(&cfg->external_ids, > + "ovn-pin-to-northd", false); > +} > + > +/* Returns false if the northd internal version and ovn-controller internal > + * version doesn't match. > + */ > +static bool > +check_northd_version(const struct sbrec_sb_global *sb, const char > *my_version) > +{ > + if (!sb) { > + return false; > + } > + > + const char *northd_version = > + smap_get_def(&sb->options, "northd_internal_version", ""); > + > + bool mismatch = strcmp(northd_version, my_version); > + if (mismatch) { > + static struct vlog_rate_limit rl = VLOG_RATE_LIMIT_INIT(1, 1); > + VLOG_WARN_RL(&rl, "controller version - %s mismatch with northd " > + "version - %s", my_version, northd_version); > + } > + > + return mismatch; > +} > + > int > main(int argc, char *argv[]) > { > @@ -2428,10 +2459,14 @@ main(int argc, char *argv[]) > .enable_lflow_cache = true > }; > > + char *ovn_internal_version = ovn_get_internal_version(); > + VLOG_INFO("OVN internal version is : [%s]", ovn_internal_version); > + > /* Main loop. */ > exiting = false; > restart = false; > bool sb_monitor_all = false; > + bool version_mismatch_with_northd = false; > while (!exiting) { > /* If we're paused just run the unixctl server and skip most of the > * processing loop. > @@ -2442,8 +2477,6 @@ main(int argc, char *argv[]) > goto loop_done; > } > > - engine_init_run(); > - > struct ovsdb_idl_txn *ovs_idl_txn = ovsdb_idl_loop_run(&ovs_idl_loop); > unsigned int new_ovs_cond_seqno > = ovsdb_idl_get_condition_seqno(ovs_idl_loop.idl); > @@ -2473,6 +2506,22 @@ main(int argc, char *argv[]) > ovnsb_cond_seqno = new_ovnsb_cond_seqno; > } > > + bool version_mismatched = false; > + if (pin_to_northd(ovs_idl_loop.idl)) { > + version_mismatched = check_northd_version( > + sbrec_sb_global_first(ovnsb_idl_loop.idl), > + ovn_internal_version); > + } else { > + version_mismatched = false; > + } NIT: I think the else block is redundant, since that is the value that version_mismatched is initialized with. > + > + if (version_mismatch_with_northd != version_mismatched) { > + version_mismatch_with_northd = version_mismatched; > + engine_set_force_recompute(true); > + } > + > + engine_init_run(); > + > struct engine_context eng_ctx = { > .ovs_idl_txn = ovs_idl_txn, > .ovnsb_idl_txn = ovnsb_idl_txn, > @@ -2481,7 +2530,8 @@ main(int argc, char *argv[]) > > engine_set_context(&eng_ctx); > > - if (ovsdb_idl_has_ever_connected(ovnsb_idl_loop.idl)) { > + if (ovsdb_idl_has_ever_connected(ovnsb_idl_loop.idl) && > + !version_mismatch_with_northd) { > /* Contains the transport zones that this Chassis belongs to */ > struct sset transport_zones = SSET_INITIALIZER(&transport_zones); > sset_from_delimited_string(&transport_zones, > @@ -2770,6 +2820,7 @@ loop_done: > } > } > > + free(ovn_internal_version); > unixctl_server_destroy(unixctl); > lflow_destroy(); > ofctrl_destroy(); > @@ -2822,6 +2873,7 @@ parse_options(int argc, char *argv[]) > > case 'V': > ovs_print_version(OFP15_VERSION, OFP15_VERSION); > + printf("SB DB Schema %s\n", sbrec_get_db_version()); > exit(EXIT_SUCCESS); > > VLOG_OPTION_HANDLERS > diff --git a/lib/ovn-util.c b/lib/ovn-util.c > index 18aac8da34..9dc9ade3b4 100644 > --- a/lib/ovn-util.c > +++ b/lib/ovn-util.c > @@ -20,6 +20,7 @@ > #include <unistd.h> > > #include "daemon.h" > +#include "include/ovn/actions.h" > #include "openvswitch/ofp-parse.h" > #include "openvswitch/vlog.h" > #include "ovn-dirs.h" > @@ -713,3 +714,19 @@ ip_address_and_port_from_lb_key(const char *key, char > **ip_address, > *addr_family = ss.ss_family; > return true; > } > + > +#define N_OVN_ACTIONS N_OVNACTS > +BUILD_ASSERT_DECL(N_OVN_ACTIONS == 47); > + > +/* Increment this for any logical flow changes or if existing OVN action is > + * modified. */ > +#define OVN_INTERNAL_MINOR_VER 0 > + > +/* Returns the OVN version. The caller must free the returned value. */ > +char * > +ovn_get_internal_version(void) > +{ > + return xasprintf("%s-%s-%d.%d", OVN_PACKAGE_VERSION, > + sbrec_get_db_version(), > + N_OVN_ACTIONS, OVN_INTERNAL_MINOR_VER); > +} > diff --git a/lib/ovn-util.h b/lib/ovn-util.h > index 3496673b26..7edd301802 100644 > --- a/lib/ovn-util.h > +++ b/lib/ovn-util.h > @@ -221,4 +221,8 @@ char *str_tolower(const char *orig); > bool ip_address_and_port_from_lb_key(const char *key, char **ip_address, > uint16_t *port, int *addr_family); > > +/* Returns the internal OVN version. The caller must free the returned > + * value. */ > +char *ovn_get_internal_version(void); > + > #endif > diff --git a/northd/ovn-northd.c b/northd/ovn-northd.c > index 4d4190cb9b..2475605cc0 100644 > --- a/northd/ovn-northd.c > +++ b/northd/ovn-northd.c > @@ -12062,7 +12062,8 @@ ovnnb_db_run(struct northd_context *ctx, > struct ovsdb_idl_loop *sb_loop, > struct hmap *datapaths, struct hmap *ports, > struct ovs_list *lr_list, > - int64_t loop_start_time) > + int64_t loop_start_time, > + const char *ovn_internal_version) > { > if (!ctx->ovnsb_txn || !ctx->ovnnb_txn) { > return; > @@ -12139,6 +12140,8 @@ ovnnb_db_run(struct northd_context *ctx, > smap_replace(&options, "max_tunid", max_tunid); > free(max_tunid); > > + smap_replace(&options, "northd_internal_version", ovn_internal_version); > + > nbrec_nb_global_verify_options(nb); > nbrec_nb_global_set_options(nb, &options); > > @@ -12746,7 +12749,8 @@ ovnsb_db_run(struct northd_context *ctx, > static void > ovn_db_run(struct northd_context *ctx, > struct ovsdb_idl_index *sbrec_chassis_by_name, > - struct ovsdb_idl_loop *ovnsb_idl_loop) > + struct ovsdb_idl_loop *ovnsb_idl_loop, > + const char *ovn_internal_version) > { > struct hmap datapaths, ports; > struct ovs_list lr_list; > @@ -12756,7 +12760,8 @@ ovn_db_run(struct northd_context *ctx, > > int64_t start_time = time_wall_msec(); > ovnnb_db_run(ctx, sbrec_chassis_by_name, ovnsb_idl_loop, > - &datapaths, &ports, &lr_list, start_time); > + &datapaths, &ports, &lr_list, start_time, > + ovn_internal_version); > ovnsb_db_run(ctx, ovnsb_idl_loop, &ports, start_time); > destroy_datapaths_and_ports(&datapaths, &ports, &lr_list); > } > @@ -13113,6 +13118,9 @@ main(int argc, char *argv[]) > unixctl_command_register("sb-connection-status", "", 0, 0, > ovn_conn_show, ovnsb_idl_loop.idl); > > + char *ovn_internal_version = ovn_get_internal_version(); > + VLOG_INFO("OVN internal version is : [%s]", ovn_internal_version); > + > /* Main loop. */ > exiting = false; > state.had_lock = false; > @@ -13154,7 +13162,8 @@ main(int argc, char *argv[]) > } > > if (ovsdb_idl_has_lock(ovnsb_idl_loop.idl)) { > - ovn_db_run(&ctx, sbrec_chassis_by_name, &ovnsb_idl_loop); > + ovn_db_run(&ctx, sbrec_chassis_by_name, &ovnsb_idl_loop, > + ovn_internal_version); > if (ctx.ovnsb_txn) { > check_and_add_supported_dhcp_opts_to_sb_db(&ctx); > check_and_add_supported_dhcpv6_opts_to_sb_db(&ctx); > @@ -13216,6 +13225,7 @@ main(int argc, char *argv[]) > } > } > > + free(ovn_internal_version); > unixctl_server_destroy(unixctl); > ovsdb_idl_loop_destroy(&ovnnb_idl_loop); > ovsdb_idl_loop_destroy(&ovnsb_idl_loop); > diff --git a/tests/ovn.at b/tests/ovn.at > index c0219bbd47..42a0f9a00f 100644 > --- a/tests/ovn.at > +++ b/tests/ovn.at > @@ -22687,3 +22687,150 @@ AT_CHECK([test "$encap_rec_mvtep" = > "$encap_rec_mvtep1"], [0], []) > > OVN_CLEANUP([hv1]) > AT_CLEANUP > + > +AT_SETUP([ovn -- check ovn-northd and ovn-controller version pinning]) > +ovn_start > + > +net_add n1 > +sim_add hv1 > +as hv1 > +ovs-vsctl add-br br-phys > +ovn_attach n1 br-phys 192.168.0.10 > + > +check ovn-nbctl ls-add sw0 > +check ovn-nbctl lsp-add sw0 sw0-p1 > +check ovn-nbctl lsp-add sw0 sw0-p2 > + > +as hv1 > +ovs-vsctl \ > + -- add-port br-int vif1 \ > + -- set Interface vif1 external_ids:iface-id=sw0-p1 \ > + ofport-request=1 > +ovs-vsctl \ > + -- add-port br-int vif2 \ > + -- set Interface vif2 external_ids:iface-id=sw0-p2 \ > + ofport-request=2 > + > +# Wait for port to be bound. > +wait_row_count Chassis 1 name=hv1 > +ch=$(fetch_column Chassis _uuid name=hv1) > +wait_row_count Port_Binding 1 logical_port=sw0-p1 chassis=$ch > +wait_row_count Port_Binding 1 logical_port=sw0-p2 chassis=$ch > + > +northd_version=$(ovn-sbctl get SB_Global . options:northd_internal_version | > sed s/\"//g) > +echo "northd version = $northd_version" > +AT_CHECK([grep -c $northd_version hv1/ovn-controller.log], [0], [1 > +]) > + > +# Stop ovn-northd so that we can modify the northd_version. > +as northd > +OVS_APP_EXIT_AND_WAIT([ovn-northd]) > + > +as northd-backup > +OVS_APP_EXIT_AND_WAIT([ovn-northd]) > + > +check ovn-sbctl set SB_Global . options:northd_internal_version=foo > + > +as hv1 > +check ovs-vsctl set interface vif2 external_ids:iface-id=foo > + > +# ovn-controller should release the lport sw0-p2 since ovn-pin-to-northd > +# is not true. > +wait_row_count Port_Binding 1 logical_port=sw0-p2 'chassis=[[]]' > + > +as hv1 ovs-ofctl dump-flows br-int table=0 > offlows_table0.txt > +AT_CAPTURE_FILE([offlows_table0.txt]) > +AT_CHECK_UNQUOTED([grep -c "in_port=2" offlows_table0.txt], [1], [dnl > +0 > +]) > + > +echo > +echo "__file__:__line__: Pin ovn-controller to ovn-northd version." > + > +as hv1 > +check ovs-vsctl set open . external_ids:ovn-pin-to-northd=true > + > +OVS_WAIT_UNTIL( > + [test 1 = $(grep -c "controller version - $northd_version mismatch with > northd version - foo" hv1/ovn-controller.log) > +]) > + > +as hv1 > +check ovs-vsctl set interface vif2 external_ids:iface-id=sw0-p2 > + > +# ovn-controller should not claim sw0-p2 since there is version mismatch > +as hv1 ovn-appctl -t ovn-controller recompute > +wait_row_count Port_Binding 1 logical_port=sw0-p2 'chassis=[[]]' > + > +as hv1 ovs-ofctl dump-flows br-int table=0 > offlows_table0.txt > +AT_CAPTURE_FILE([offlows_table0.txt]) > +AT_CHECK_UNQUOTED([grep -c "in_port=2" offlows_table0.txt], [1], [dnl > +0 > +]) > + > +check ovn-sbctl set SB_Global . > options:northd_internal_version=$northd_version > + > +# It should claim sw0-p2 > +wait_row_count Port_Binding 1 logical_port=sw0-p2 chassis=$ch > + > +as hv1 ovs-ofctl dump-flows br-int table=0 > offlows_table0.txt > +AT_CAPTURE_FILE([offlows_table0.txt]) > +AT_CHECK_UNQUOTED([grep -c "in_port=2" offlows_table0.txt], [0], [dnl > +1 > +]) > + > +as hv1 > +ovn_remote=$(ovs-vsctl get open . external_ids:ovn-remote | sed s/\"//g) > +ovs-vsctl set open . external_ids:ovn-remote=unix:foo > +check ovs-vsctl set interface vif2 external_ids:iface-id=foo > + > +# ovn-controller is not connected to the SB DB. Even though it > +# releases sw0-p2, it will not delete the OF flows. > +as hv1 ovs-ofctl dump-flows br-int table=0 > offlows_table0.txt > +AT_CAPTURE_FILE([offlows_table0.txt]) > +AT_CHECK_UNQUOTED([grep -c "in_port=2" offlows_table0.txt], [0], [dnl > +1 > +]) > + > +# Change the version to incorrect one and reconnect to the SB DB. > +check ovn-sbctl set SB_Global . options:northd_internal_version=bar > + > +as hv1 > +check ovs-vsctl set open . external_ids:ovn-remote=$ovn_remote > + > +sleep 1 > + > +as hv1 ovs-ofctl dump-flows br-int table=0 > offlows_table0.txt > +AT_CAPTURE_FILE([offlows_table0.txt]) > +AT_CHECK_UNQUOTED([grep -c "in_port=2" offlows_table0.txt], [0], [dnl > +1 > +]) > + > +wait_row_count Port_Binding 1 logical_port=sw0-p2 chassis=$ch > + > +# Change the ovn-remote to incorrect and set the correct northd version > +# and then change back to the correct ovn-remote > +as hv1 > +check ovs-vsctl set open . external_ids:ovn-remote=unix:foo > + > +check ovn-sbctl set SB_Global . > options:northd_internal_version=$northd_version > + > +as hv1 > +check ovs-vsctl set open . external_ids:ovn-remote=$ovn_remote > + > +wait_row_count Port_Binding 1 logical_port=sw0-p2 'chassis=[[]]' > +as hv1 ovs-ofctl dump-flows br-int table=0 > offlows_table0.txt > +AT_CAPTURE_FILE([offlows_table0.txt]) > +AT_CHECK_UNQUOTED([grep -c "in_port=2" offlows_table0.txt], [1], [dnl > +0 > +]) > + > +as hv1 > +OVS_APP_EXIT_AND_WAIT([ovn-controller]) > + > +as ovn-sb > +OVS_APP_EXIT_AND_WAIT([ovsdb-server]) > + > +as ovn-nb > +OVS_APP_EXIT_AND_WAIT([ovsdb-server]) > + > +AT_CLEANUP > -- > 2.28.0 > > _______________________________________________ > dev mailing list > [email protected] > https://mail.openvswitch.org/mailman/listinfo/ovs-dev > _______________________________________________ dev mailing list [email protected] https://mail.openvswitch.org/mailman/listinfo/ovs-dev
