On 11/18/20 2:08 PM, Numan Siddique wrote:
> On Wed, Nov 18, 2020 at 5:23 PM Dumitru Ceara <[email protected]> wrote:
>>
>> On 11/17/20 12:57 PM, [email protected] wrote:
>>> From: Numan Siddique <[email protected]>
>>>
>>> OVN recommends updating/upgrading ovn-controllers first and then
>>> ovn-northd and OVN DB ovsdb-servers. This is to ensure that any
>>> new functionality specified by the database or logical flows created
>>> by ovn-northd is understood by ovn-controller.
>>>
>>> However certain deployments may upgrade ovn-northd services first and
>>> then ovn-controllers. In a large scal deployment, this can result in
>>> downtime during upgrades as old ovn-controllers may not understand
>>> new logical flows or new actions added by ovn-northd.
>>>
>>> Even with upgrading ovn-controllers first, can result in ovn-controllers
>>> rejecting some of the logical flows if an existing OVN action is
>>> changed. One such example is ct_commit action which recently was updated
>>> to take new arguments.
>>>
>>> To avoid such downtimes during upgrades, this patch adds the
>>> functionality of pinning ovn-controller and ovn-northd to a specific
>>> version. An internal OVN version is generated and this version is stored
>>> by ovn-northd in the Southbound SB_Global table's
>>> options:northd_internal_version.
>>>
>>> When ovn-controller notices that the internal version has changed, it
>>> stops handling the database changes - both Southbound and OVS. All the
>>> existing OF flows are preserved. When ovn-controller is upgraded to the
>>> same version as ovn-northd services, it will process the database
>>> changes.
>>>
>>> This feature is made optional and disabled by default. Any CMS can
>>> enable it by configuring the OVS local database with the option -
>>> ovn-pin-to-northd=false.
>>>
>>> Signed-off-by: Numan Siddique <[email protected]>
>>> ---
>>
>> Hi Numan,
>>
>> Next to Flavio's comments I have some too.
>>
>
> Thanks Flavio and Dumitru for the reviews.
>
>
>>>
>>> v1 -> v2
>>> ----
>>> * Added test cases and missing documentation.
>>> * Renamed the ovsdb option from ignore_northd_verison
>>> to ovn-pin-to-northd.
>>> * Added NEWS item.
>>>
>>> NEWS | 3 +
>>> controller/ovn-controller.8.xml | 11 +++
>>> controller/ovn-controller.c | 58 ++++++++++++-
>>> lib/ovn-util.c | 17 ++++
>>> lib/ovn-util.h | 4 +
>>> northd/ovn-northd.c | 18 +++-
>>> tests/ovn.at | 147 ++++++++++++++++++++++++++++++++
>>> 7 files changed, 251 insertions(+), 7 deletions(-)
>>>
>>> diff --git a/NEWS b/NEWS
>>> index 6010230679..bb95724b36 100644
>>> --- a/NEWS
>>> +++ b/NEWS
>>> @@ -8,6 +8,9 @@ Post-v20.09.0
>>> server.
>>> - Support other_config:vlan-passthru=true to allow VLAN tagged incoming
>>> traffic.
>>> + - Support version pinning between ovn-northd and ovn-controller as an
>>> + option. If the option is enabled and the versions don't match,
>>> + ovn-controller will not process any DB changes.
>>>
>>> OVN v20.09.0 - 28 Sep 2020
>>> --------------------------
>>> diff --git a/controller/ovn-controller.8.xml
>>> b/controller/ovn-controller.8.xml
>>> index 16bc47b205..357edd1f5c 100644
>>> --- a/controller/ovn-controller.8.xml
>>> +++ b/controller/ovn-controller.8.xml
>>> @@ -233,6 +233,17 @@
>>> The boolean flag indicates if the chassis is used as an
>>> interconnection gateway.
>>> </dd>
>>> +
>>> + <dt><code>external_ids:ovn-pin-to-northd</code></dt>
>>
>> This still seems misleading to me. We don't actually pin to a northd
>> version, we pin to an OVN internal version. What if we call it
>> "ovn-pin-to-version"?
>
> It's true that we pin to the OVN internal version, but my idea is that
> ovn-northd writes
> its internal version to SB_Global. And ovn-controller reads the northd
> version and compares
> with its own version.
>
> Few other points
> - I think ovn-pin-to-northd (or ovn-pin-to-northd-version) makes
> sense from a CMS point of view.
> - CMS doesn't need to know about the internal version mechanism used.
> - In production deployments when northd services are updated first,
> ovn-northd will update its new
> internal version to SB_Global.
> - And the idea of this patch is to pin the controller to the same
> northd version.
So I guess this translates to "make ovn-controller reject SB contents if
they were created by the wrong northd version".
>
> So I'd prefer to call it ovn-pin-to-northd.
>
With that in mind, and considering your explanation, what about
"ovn-match-northd-version"?
>
>>
>>> + <dd>
>>> + The boolean flag indicates if <code>ovn-controller</code> needs to
>>> + be pinned to the exact <code>ovn-northd</code> version. If this
>>> + flag is set to true and the <code>ovn-northd's</code> version
>>> (reported
>>> + in the Southbound database) doesn't match with the
>>> + <code>ovn-controller's</code> internal version, then it will stop
>>> + processing the Southbound and local Open vSwitch database changes.
>>> + The default value is considered false if this option is not
>>> defined.
>>> + </dd>
>>> </dl>
>>>
>>> <p>
>>> diff --git a/controller/ovn-controller.c b/controller/ovn-controller.c
>>> index a06cae3ccb..a7344ea9c4 100644
>>> --- a/controller/ovn-controller.c
>>> +++ b/controller/ovn-controller.c
>>> @@ -2136,6 +2136,37 @@ struct ovn_controller_exit_args {
>>> bool *restart;
>>> };
>>>
>>> +static bool
>>> +pin_to_northd(struct ovsdb_idl *ovs_idl)
>>> +{
>>> + const struct ovsrec_open_vswitch *cfg =
>>> ovsrec_open_vswitch_first(ovs_idl);
>>> + return !cfg ? false : smap_get_bool(&cfg->external_ids,
>>> + "ovn-pin-to-northd", false);
>>> +}
>>> +
>>> +/* Returns false if the northd internal version and ovn-controller internal
>>> + * version doesn't match.
>>> + */
>>> +static bool
>>> +check_northd_version(const struct sbrec_sb_global *sb, const char
>>> *my_version)
>>> +{
>>> + if (!sb) {
>>> + return false;
>>> + }
>>> +
>>> + const char *northd_version =
>>> + smap_get_def(&sb->options, "northd_internal_version", "");
>>> +
>>> + bool mismatch = strcmp(northd_version, my_version);
>>> + if (mismatch) {
>>> + static struct vlog_rate_limit rl = VLOG_RATE_LIMIT_INIT(1, 1);
>>> + VLOG_WARN_RL(&rl, "controller version - %s mismatch with northd "
>>> + "version - %s", my_version, northd_version);
>>> + }
>>> +
>>> + return mismatch;
>>> +}
>>> +
>>> int
>>> main(int argc, char *argv[])
>>> {
>>> @@ -2428,10 +2459,14 @@ main(int argc, char *argv[])
>>> .enable_lflow_cache = true
>>> };
>>>
>>> + char *ovn_internal_version = ovn_get_internal_version();
>>> + VLOG_INFO("OVN internal version is : [%s]", ovn_internal_version);
>>> +
>>> /* Main loop. */
>>> exiting = false;
>>> restart = false;
>>> bool sb_monitor_all = false;
>>> + bool version_mismatch_with_northd = false;
>>> while (!exiting) {
>>> /* If we're paused just run the unixctl server and skip most of the
>>> * processing loop.
>>> @@ -2442,8 +2477,6 @@ main(int argc, char *argv[])
>>> goto loop_done;
>>> }
>>>
>>> - engine_init_run();
>>> -
>>
>> I'm not sure why it's better to move engine_init_run() lower. I think
>> it can stay here.
>
> Ack. I'm fine either way.
>
>>
>>> struct ovsdb_idl_txn *ovs_idl_txn =
>>> ovsdb_idl_loop_run(&ovs_idl_loop);
>>> unsigned int new_ovs_cond_seqno
>>> = ovsdb_idl_get_condition_seqno(ovs_idl_loop.idl);
>>> @@ -2473,6 +2506,22 @@ main(int argc, char *argv[])
>>> ovnsb_cond_seqno = new_ovnsb_cond_seqno;
>>> }
>>>
>>> + bool version_mismatched = false;
>>> + if (pin_to_northd(ovs_idl_loop.idl)) {
>>> + version_mismatched = check_northd_version(
>>> + sbrec_sb_global_first(ovnsb_idl_loop.idl),
>>> + ovn_internal_version);
>>> + } else {
>>> + version_mismatched = false;
>>> + }
>>> +
>>> + if (version_mismatch_with_northd != version_mismatched) {
>>> + version_mismatch_with_northd = version_mismatched;
>>> + engine_set_force_recompute(true);
>>> + }
>>> +
>>> + engine_init_run();
>>> +
>>
>> I think we can make the version checking more contained, what do you
>> think about the following incremental?
>>
>> https://github.com/dceara/ovn/commit/29c382137eb974aff3ea080b8098b2919189e6e3
>
> It looks fine to me. I'd replace check_version with
> check_northd_version based on the reasons I gave above.
>
Ok.
>
>>
>>> struct engine_context eng_ctx = {
>>> .ovs_idl_txn = ovs_idl_txn,
>>> .ovnsb_idl_txn = ovnsb_idl_txn,
>>> @@ -2481,7 +2530,8 @@ main(int argc, char *argv[])
>>>
>>> engine_set_context(&eng_ctx);
>>>
>>> - if (ovsdb_idl_has_ever_connected(ovnsb_idl_loop.idl)) {
>>> + if (ovsdb_idl_has_ever_connected(ovnsb_idl_loop.idl) &&
>>> + !version_mismatch_with_northd) {
>>> /* Contains the transport zones that this Chassis belongs to */
>>> struct sset transport_zones =
>>> SSET_INITIALIZER(&transport_zones);
>>> sset_from_delimited_string(&transport_zones,
>>> @@ -2770,6 +2820,7 @@ loop_done:
>>> }
>>> }
>>>
>>> + free(ovn_internal_version);
>>> unixctl_server_destroy(unixctl);
>>> lflow_destroy();
>>> ofctrl_destroy();
>>> @@ -2822,6 +2873,7 @@ parse_options(int argc, char *argv[])
>>>
>>> case 'V':
>>> ovs_print_version(OFP15_VERSION, OFP15_VERSION);
>>> + printf("SB DB Schema %s\n", sbrec_get_db_version());
>>> exit(EXIT_SUCCESS);
>>>
>>> VLOG_OPTION_HANDLERS
>>> diff --git a/lib/ovn-util.c b/lib/ovn-util.c
>>> index 18aac8da34..9dc9ade3b4 100644
>>> --- a/lib/ovn-util.c
>>> +++ b/lib/ovn-util.c
>>> @@ -20,6 +20,7 @@
>>> #include <unistd.h>
>>>
>>> #include "daemon.h"
>>> +#include "include/ovn/actions.h"
>>> #include "openvswitch/ofp-parse.h"
>>> #include "openvswitch/vlog.h"
>>> #include "ovn-dirs.h"
>>> @@ -713,3 +714,19 @@ ip_address_and_port_from_lb_key(const char *key, char
>>> **ip_address,
>>> *addr_family = ss.ss_family;
>>> return true;
>>> }
>>> +
>>> +#define N_OVN_ACTIONS N_OVNACTS
>>> +BUILD_ASSERT_DECL(N_OVN_ACTIONS == 47);
>>> +
>>> +/* Increment this for any logical flow changes or if existing OVN action is
>>> + * modified. */
>>
>> Except for this comment here (that can be easily overlooked), we don't
>> mention anywhere else that the minor version should be incremented when
>> changing existing OVN actions.
>>
>> I'm not sure if we could automatically generate the minor version but in
>> any case we should make this more visible (both for contributors and
>> reviewers).
>
> Maybe we should never update an existing action (from now on ) ? If
> there is a need we can
> add a new action ?
>
> Or can we come up with a way to version the ovnacts ? Right now we
> have a 'pad' field in the struct ovnact.
> I think we can rename it to version. Whenever an action is updated we
> need to increment this field.
> And the OVN_INTERNAL_MINOR_VER will be the sum of all the versions of ovnacts
> ?
There would still be no easy way to enforce this, except at review time
but it's probably better than with a hardcoded OVN_INTERNAL_MINOR_VER value.
>
> If this sounds reasonable I can look into this as a follow up patch.
>
I think this would be nice to have.
>>
>> I'd suggest at least updating the comment in actions.h where we define
>> OVNACTS.
>
> Ack.
>
>>
>> Maybe we should also mention this in one of the internals/*.rst docs.
>> Is Documentation/internals/submitting-patches.rst the best place to
>> mention this? I see we have a section for "Feature Deprecation
>> Guidelines", maybe we can add one for "OVN compatibility" or "OVN Upgrade"?
>
> I'll see if we can add more documentation about it here.
>
Thanks!
_______________________________________________
dev mailing list
[email protected]
https://mail.openvswitch.org/mailman/listinfo/ovs-dev