On Thu, Sep 7, 2023 at 9:07 AM Han Zhou <[email protected]> wrote:
>
>
>
> On Wed, Sep 6, 2023 at 12:33 AM Ales Musil <[email protected]> wrote:
> >
> >
> >
> > On Wed, Sep 6, 2023 at 8:28 AM Han Zhou <[email protected]> wrote:
> >>
> >> skip_snat and force_snat were added in ct_lb/ct_lb_mark action by
commit
> >> c1d6b8ac to set skip_snat and force_snat bits in ct-lable/mark. It is
> >> not related to the ct_lb_related feature. This patch removes the check.
> >>
> >> Without this fix, the skip_snat/force_snat features are broken when
> >> northd is running in "backward compatible" mode, when there is any
> >> ovn-controller running an older version that doesn't support
> >> "ct_lb_related". northd would not add the skip_snat/force_snat in the
> >> ct_lb action in such case, and the relevant bits won't be set in CT
> >> (even on nodes running ovn-controller that supports the
> >> skip_snat/force_snat in ct_lb action), but those CT bits are required
to
> >> be matched in another logical flows so that the relevant register flags
> >> can be set (the behavior introduced by the commit ce46a1bacf69):
> >>
> >> match=(ct.est && !ct.rel && !ct.new && ct_mark.natted &&
ct_mark.force_snat == 1), action=(flags.force_snat_for_lb = 1; next;)
> >> match=(ct.est && !ct.rel && !ct.new && ct_mark.natted &&
ct_mark.skip_snat == 1), action=(flags.skip_snat_for_lb = 1; next;)
> >>
> >> Because of this, the skip_snat/force_snat doesn't work in "backward
> >> compatible" mode. With this fix, the feature continues to work between
> >> "backward compatible" mode northd and all nodes that has the ct_lb
> >> skip_snat/force_snat support, while nodes running older version of
> >> ovn-controller would report syntax error on such logical flows. While
> >> this may not sound perfect, but if users follow the suggested upgrade
> >> order so that ovn-controllers are upgraded before ovn-northd, there is
> >> no problem. The biggest benefit of this fix is that when there is a bad
> >> node that fails upgrading ovn-controller, the skip_snat/force_snat
> >> features are not broken.
> >>
> >> Alternatively, we could fix the problem by reverting commit
ce46a1bacf69.
> >> However, there were already several fixes and refactors for the related
> >> code on top of that, it is not straightforward. The code would become
> >> more complex and the value of the backward compatibility for a
> >> northd-first upgrade order is not obvious for this feature which was
> >> introduced 2 release ago. In addition, there are other places when the
> >> ct_lb skip_snat/force_snat are used without checking any chassis
feature
> >> support, such as in build_lb_affinity_lr_flows(). So, based on all the
> >> above considerations, simply removing the feature compatiblity check
> >> seems to be a more reasonable fix.
> >>
> >> Fixes: c1d6b8ac34eb ("northd: Store skip_snat and force_snat in
ct_label/mark")
> >> Fixes: ce46a1bacf69 ("northd: Use generic ct.est flows for LR LBs")
> >> Signed-off-by: Han Zhou <[email protected]>
> >> ---
> >>  northd/northd.c     | 5 ++---
> >>  tests/ovn-northd.at | 4 ++--
> >>  2 files changed, 4 insertions(+), 5 deletions(-)
> >>
> >> diff --git a/northd/northd.c b/northd/northd.c
> >> index 3eaa43f07a1f..cede7f98d61d 100644
> >> --- a/northd/northd.c
> >> +++ b/northd/northd.c
> >> @@ -4098,13 +4098,12 @@ build_lb_vip_actions(struct ovn_lb_vip *lb_vip,
> >>      const char *enclose = is_lb_action ? ");" : "";
> >>
> >>      if (!ls_dp) {
> >> -        bool flag_supported = is_lb_action && features->ct_lb_related;
> >>          ds_put_format(skip_snat_action, "flags.skip_snat_for_lb = 1;
%s%s",
> >>                        ds_cstr(action),
> >> -                      flag_supported ? "; skip_snat);" : enclose);
> >> +                      is_lb_action ? "; skip_snat);" : enclose);
> >>          ds_put_format(force_snat_action, "flags.force_snat_for_lb =
1; %s%s",
> >>                        ds_cstr(action),
> >> -                      flag_supported ? "; force_snat);" : enclose);
> >> +                      is_lb_action ? "; force_snat);" : enclose);
> >>      }
> >>
> >>      ds_put_cstr(action, enclose);
> >> diff --git a/tests/ovn-northd.at b/tests/ovn-northd.at
> >> index 23dbe111fb7b..939a31956615 100644
> >> --- a/tests/ovn-northd.at
> >> +++ b/tests/ovn-northd.at
> >> @@ -8461,7 +8461,7 @@ AT_CHECK([ovn-sbctl lflow-list | grep -e natted
-e ct_lb], [0], [dnl
> >>
> >>  check ovn-nbctl --wait=sb set logical_router lr
options:lb_force_snat_ip="42.42.42.1"
> >>  AT_CHECK([ovn-sbctl lflow-list | grep lr_in_dnat], [0], [dnl
> >> -  table=7 (lr_in_dnat         ), priority=110  , match=(ct.new &&
!ct.rel && ip4 && ip4.dst == 66.66.66.66), action=(flags.force_snat_for_lb
= 1; ct_lb(backends=42.42.42.2);)
> >> +  table=7 (lr_in_dnat         ), priority=110  , match=(ct.new &&
!ct.rel && ip4 && ip4.dst == 66.66.66.66), action=(flags.force_snat_for_lb
= 1; ct_lb(backends=42.42.42.2; force_snat);)
> >>    table=7 (lr_in_dnat         ), priority=70   , match=(ct.est &&
!ct.rel && !ct.new && ct_label.natted && ct_label.force_snat == 1),
action=(flags.force_snat_for_lb = 1; next;)
> >>    table=7 (lr_in_dnat         ), priority=70   , match=(ct.est &&
!ct.rel && !ct.new && ct_label.natted && ct_label.skip_snat == 1),
action=(flags.skip_snat_for_lb = 1; next;)
> >>    table=7 (lr_in_dnat         ), priority=50   , match=(ct.est &&
!ct.rel && !ct.new && ct_label.natted), action=(next;)
> >> @@ -8471,7 +8471,7 @@ check ovn-nbctl remove logical_router lr options
lb_force_snat_ip
> >>
> >>  check ovn-nbctl --wait=sb set load_balancer lb-test
options:skip_snat="true"
> >>  AT_CHECK([ovn-sbctl lflow-list | grep lr_in_dnat], [0], [dnl
> >> -  table=7 (lr_in_dnat         ), priority=110  , match=(ct.new &&
!ct.rel && ip4 && ip4.dst == 66.66.66.66), action=(flags.skip_snat_for_lb =
1; ct_lb(backends=42.42.42.2);)
> >> +  table=7 (lr_in_dnat         ), priority=110  , match=(ct.new &&
!ct.rel && ip4 && ip4.dst == 66.66.66.66), action=(flags.skip_snat_for_lb =
1; ct_lb(backends=42.42.42.2; skip_snat);)
> >>    table=7 (lr_in_dnat         ), priority=70   , match=(ct.est &&
!ct.rel && !ct.new && ct_label.natted && ct_label.force_snat == 1),
action=(flags.force_snat_for_lb = 1; next;)
> >>    table=7 (lr_in_dnat         ), priority=70   , match=(ct.est &&
!ct.rel && !ct.new && ct_label.natted && ct_label.skip_snat == 1),
action=(flags.skip_snat_for_lb = 1; next;)
> >>    table=7 (lr_in_dnat         ), priority=50   , match=(ct.est &&
!ct.rel && !ct.new && ct_label.natted), action=(next;)
> >> --
> >> 2.38.1
> >>
> >> _______________________________________________
> >> dev mailing list
> >> [email protected]
> >> https://mail.openvswitch.org/mailman/listinfo/ovs-dev
> >>
> >
> > Hi Han,
> >
> > thank you for fixing this, it's a bit unfortunate there is so much
involved in the backward compatibility for northd upgrade first. On the
other hand I wonder if we should pursue those feature flags in the future
if it's clearly stated that the upgrade order is controller first.
> >
> > Acked-by: Ales Musil <[email protected]>
> >
>
> Thanks Ales. I applied this to main.
> It is a bug fix but it is only related to a use case with a special
upgrade order. Please let me know if this needs to be backported.
>
> Regards,
> Han
>

Sorry that after running regression tests I fixed only the "Load balancer
ct_lb_mark backwards compatibility" test but missed the "Load balancer CT
related backwards compatibility" test, and somehow forgot to rerun the
whole test before submitting the patch ...
I just applied a followup commit to fix the CI:
bfc39651b1bc: ovn-northd.at: Fix "Load balancer CT related backwards
compatibility".

I didn't bother sending for another review for this small fix just to avoid
leaving the main branch in a broken state. Sorry again.

Regards,
Han

> > Thanks,
> > Ales
> > --
> >
> > Ales Musil
> >
> > Senior Software Engineer - OVN Core
> >
> > Red Hat EMEA
> >
> > [email protected]    IM: amusil
_______________________________________________
dev mailing list
[email protected]
https://mail.openvswitch.org/mailman/listinfo/ovs-dev

Reply via email to