On Wed, Sep 6, 2023 at 12:33 AM Ales Musil <[email protected]> wrote: > > > > On Wed, Sep 6, 2023 at 8:28 AM Han Zhou <[email protected]> wrote: >> >> skip_snat and force_snat were added in ct_lb/ct_lb_mark action by commit >> c1d6b8ac to set skip_snat and force_snat bits in ct-lable/mark. It is >> not related to the ct_lb_related feature. This patch removes the check. >> >> Without this fix, the skip_snat/force_snat features are broken when >> northd is running in "backward compatible" mode, when there is any >> ovn-controller running an older version that doesn't support >> "ct_lb_related". northd would not add the skip_snat/force_snat in the >> ct_lb action in such case, and the relevant bits won't be set in CT >> (even on nodes running ovn-controller that supports the >> skip_snat/force_snat in ct_lb action), but those CT bits are required to >> be matched in another logical flows so that the relevant register flags >> can be set (the behavior introduced by the commit ce46a1bacf69): >> >> match=(ct.est && !ct.rel && !ct.new && ct_mark.natted && ct_mark.force_snat == 1), action=(flags.force_snat_for_lb = 1; next;) >> match=(ct.est && !ct.rel && !ct.new && ct_mark.natted && ct_mark.skip_snat == 1), action=(flags.skip_snat_for_lb = 1; next;) >> >> Because of this, the skip_snat/force_snat doesn't work in "backward >> compatible" mode. With this fix, the feature continues to work between >> "backward compatible" mode northd and all nodes that has the ct_lb >> skip_snat/force_snat support, while nodes running older version of >> ovn-controller would report syntax error on such logical flows. While >> this may not sound perfect, but if users follow the suggested upgrade >> order so that ovn-controllers are upgraded before ovn-northd, there is >> no problem. The biggest benefit of this fix is that when there is a bad >> node that fails upgrading ovn-controller, the skip_snat/force_snat >> features are not broken. >> >> Alternatively, we could fix the problem by reverting commit ce46a1bacf69. >> However, there were already several fixes and refactors for the related >> code on top of that, it is not straightforward. The code would become >> more complex and the value of the backward compatibility for a >> northd-first upgrade order is not obvious for this feature which was >> introduced 2 release ago. In addition, there are other places when the >> ct_lb skip_snat/force_snat are used without checking any chassis feature >> support, such as in build_lb_affinity_lr_flows(). So, based on all the >> above considerations, simply removing the feature compatiblity check >> seems to be a more reasonable fix. >> >> Fixes: c1d6b8ac34eb ("northd: Store skip_snat and force_snat in ct_label/mark") >> Fixes: ce46a1bacf69 ("northd: Use generic ct.est flows for LR LBs") >> Signed-off-by: Han Zhou <[email protected]> >> --- >> northd/northd.c | 5 ++--- >> tests/ovn-northd.at | 4 ++-- >> 2 files changed, 4 insertions(+), 5 deletions(-) >> >> diff --git a/northd/northd.c b/northd/northd.c >> index 3eaa43f07a1f..cede7f98d61d 100644 >> --- a/northd/northd.c >> +++ b/northd/northd.c >> @@ -4098,13 +4098,12 @@ build_lb_vip_actions(struct ovn_lb_vip *lb_vip, >> const char *enclose = is_lb_action ? ");" : ""; >> >> if (!ls_dp) { >> - bool flag_supported = is_lb_action && features->ct_lb_related; >> ds_put_format(skip_snat_action, "flags.skip_snat_for_lb = 1; %s%s", >> ds_cstr(action), >> - flag_supported ? "; skip_snat);" : enclose); >> + is_lb_action ? "; skip_snat);" : enclose); >> ds_put_format(force_snat_action, "flags.force_snat_for_lb = 1; %s%s", >> ds_cstr(action), >> - flag_supported ? "; force_snat);" : enclose); >> + is_lb_action ? "; force_snat);" : enclose); >> } >> >> ds_put_cstr(action, enclose); >> diff --git a/tests/ovn-northd.at b/tests/ovn-northd.at >> index 23dbe111fb7b..939a31956615 100644 >> --- a/tests/ovn-northd.at >> +++ b/tests/ovn-northd.at >> @@ -8461,7 +8461,7 @@ AT_CHECK([ovn-sbctl lflow-list | grep -e natted -e ct_lb], [0], [dnl >> >> check ovn-nbctl --wait=sb set logical_router lr options:lb_force_snat_ip="42.42.42.1" >> AT_CHECK([ovn-sbctl lflow-list | grep lr_in_dnat], [0], [dnl >> - table=7 (lr_in_dnat ), priority=110 , match=(ct.new && !ct.rel && ip4 && ip4.dst == 66.66.66.66), action=(flags.force_snat_for_lb = 1; ct_lb(backends=42.42.42.2);) >> + table=7 (lr_in_dnat ), priority=110 , match=(ct.new && !ct.rel && ip4 && ip4.dst == 66.66.66.66), action=(flags.force_snat_for_lb = 1; ct_lb(backends=42.42.42.2; force_snat);) >> table=7 (lr_in_dnat ), priority=70 , match=(ct.est && !ct.rel && !ct.new && ct_label.natted && ct_label.force_snat == 1), action=(flags.force_snat_for_lb = 1; next;) >> table=7 (lr_in_dnat ), priority=70 , match=(ct.est && !ct.rel && !ct.new && ct_label.natted && ct_label.skip_snat == 1), action=(flags.skip_snat_for_lb = 1; next;) >> table=7 (lr_in_dnat ), priority=50 , match=(ct.est && !ct.rel && !ct.new && ct_label.natted), action=(next;) >> @@ -8471,7 +8471,7 @@ check ovn-nbctl remove logical_router lr options lb_force_snat_ip >> >> check ovn-nbctl --wait=sb set load_balancer lb-test options:skip_snat="true" >> AT_CHECK([ovn-sbctl lflow-list | grep lr_in_dnat], [0], [dnl >> - table=7 (lr_in_dnat ), priority=110 , match=(ct.new && !ct.rel && ip4 && ip4.dst == 66.66.66.66), action=(flags.skip_snat_for_lb = 1; ct_lb(backends=42.42.42.2);) >> + table=7 (lr_in_dnat ), priority=110 , match=(ct.new && !ct.rel && ip4 && ip4.dst == 66.66.66.66), action=(flags.skip_snat_for_lb = 1; ct_lb(backends=42.42.42.2; skip_snat);) >> table=7 (lr_in_dnat ), priority=70 , match=(ct.est && !ct.rel && !ct.new && ct_label.natted && ct_label.force_snat == 1), action=(flags.force_snat_for_lb = 1; next;) >> table=7 (lr_in_dnat ), priority=70 , match=(ct.est && !ct.rel && !ct.new && ct_label.natted && ct_label.skip_snat == 1), action=(flags.skip_snat_for_lb = 1; next;) >> table=7 (lr_in_dnat ), priority=50 , match=(ct.est && !ct.rel && !ct.new && ct_label.natted), action=(next;) >> -- >> 2.38.1 >> >> _______________________________________________ >> dev mailing list >> [email protected] >> https://mail.openvswitch.org/mailman/listinfo/ovs-dev >> > > Hi Han, > > thank you for fixing this, it's a bit unfortunate there is so much involved in the backward compatibility for northd upgrade first. On the other hand I wonder if we should pursue those feature flags in the future if it's clearly stated that the upgrade order is controller first. > > Acked-by: Ales Musil <[email protected]> >
Thanks Ales. I applied this to main. It is a bug fix but it is only related to a use case with a special upgrade order. Please let me know if this needs to be backported. Regards, Han > Thanks, > Ales > -- > > Ales Musil > > Senior Software Engineer - OVN Core > > Red Hat EMEA > > [email protected] IM: amusil _______________________________________________ dev mailing list [email protected] https://mail.openvswitch.org/mailman/listinfo/ovs-dev
