On Thu, Apr 6, 2023 at 4:42 PM Dumitru Ceara <[email protected]> wrote:

> On 4/6/23 16:26, Ales Musil wrote:
> > On Thu, Apr 6, 2023 at 1:17 PM Dumitru Ceara <[email protected]> wrote:
> >
> >> On 3/29/23 16:57, Numan Siddique wrote:
> >>> On Wed, Mar 29, 2023 at 6:15 AM Dumitru Ceara <[email protected]>
> wrote:
> >>>>
> >>>> On 3/29/23 11:20, Ales Musil wrote:
> >>>>> There are essentially three problems with the current
> >>>>> combination of DGP + SNAT + LB:
> >>>>>
> >>>>> 1) The first packet is being SNATed in common zone due
> >>>>> to a problem with pinctrl not preserving ct_mark/ct_label.
> >>>>> The commit would create a SNAT entry within the same with DNAT
> >>>>> entry.
> >>>>>
> >>>>> 2) The unSNAT for reply always happened in common zone because of
> >>>>> the loopback check which would be triggered only when we loop
> >>>>> the packet through the LR. Now there are two possibilities how
> >>>>> the reply packet would be handled:
> >>>>>
> >>>>> a) If the entry for SNAT in common zone did not time out yet, the
> >>>>> packet would be passed through unSNAT in common zone which would
> >>>>> be fine and continue on. However, the unDNAT wouldn't work due to
> >>>>> the limitation of CT not capable of doing unSNAT/unDNAT on the same
> >>>>> packet twice in the same zone. So the reply would arrive to
> >>>>> the original interface, but with wrong source address.
> >>>>>
> >>>>> b) If the entry for SNAT timed out it would loop and do unSNAT
> >> correctly
> >>>>> in separate zone and then also unDNAT. This is not possible anymore
> >> with
> >>>>> a recent change 8c341b9d (northd: Drop packets destined to router
> >> owned NAT IP for DGP).
> >>>>> The reply would be dropped before looping after that change co the
> >> traffic
> >>>>> would never arrive to the original interface.
> >>>>>
> >>>>> 3) The unDNAT was happening only if the DGP was outport meaning
> >>>>> the reply traffic was routed out, but in the opposite case
> >>>>> the unDNAT was happening only because of the looping which made
> >>>>> outport=inport. That's why it worked before introduction of explicit
> >> drop.
> >>>>>
> >>>>> In order to fix all those issues do two changes:
> >>>>>
> >>>>> 1) Include inport in the unDNAT match, so that we have both
> >>>>> routing directions covered e.g. (inport == "dgp_port" || outport ==
> >> "dpg_port").
> >>>>>
> >>>>> 2) Always use the separate zone for SNAT and DNAT. As the common
> >>>>> zone was needed for HWOL make the common zone optional with
> >>>>> configuration option called "use_common_zone". This option is
> >>>>> by default "false" and can be specified per LR. Use of separate
> >>>>> zones also eliminates the need for the flag propagation
> >>>>> in "lr_out_chk_dnat_local" stage, removing the match on
> >> ct_mark/ct_label.
> >>>>>
> >>>>> Reported-at: https://bugzilla.redhat.com/2161281
> >>>>> Signed-off-by: Ales Musil <[email protected]>
> >>>>> ---
> >>>>> v2: Fix flaky system test.
> >>>>> v3: Rebase on top of current main.
> >>>>> v4: Rebase on top of current main.
> >>>>>     Move the system test to system-ovn-kmod
> >>>>>     to unblock the failures with userspace.
> >>>>> v5: Rebase on top of current main.
> >>>>> v6: Rebase on top of current main.
> >>>>>     Change the config to a global option instead.
> >>>>
> >>>> I know this was discussed on v5 [0][1] but I wanted to raise another
> >> point:
> >>>>
> >>>> The common zone was added to make deployments more "HWOL-friendly".
> >>>> Turning it into a global option means we can never have hybrid
> clusters,
> >>>> with only some of the nodes capable of offloading traffic efficiently.
> >>>>
> >>>> Do you foresee this becoming a problem?
> >>>
> >>> Having this option per router also is not going to solve this problem
> >> right ?
> >>> As the router would be distributed.
> >>>
> >>> From RedHat OSP perspective (after talking to Daniel) ,  I think this
> >>> would be a global option during deployment
> >>> and it's easier for the openstack installer to set this in NB_Global
> >>> rather than neutron setting this option for every router.
> >>>
> >>> I think it should be fine to have a global option.   In the case of
> >>> hybrid clusters you mentioned,  I suppose CMS should
> >>> enable using common zones.
> >>
> >> OK, sounds reasonable, I guess.
> >>
> >> Ales, I finally got a chance to read through the patch and I have a few
> >> comments below.  Overall the patch looks in good shape but please
> >> address those and then I think we can accept it.
> >>
> >> Regards,
> >> Dumitru
> >>
> >>>
> >>> Thanks
> >>> Numan
> >>>
> >>>>
> >>>> Thanks,
> >>>> Dumitru
> >>>>
> >>>> [0]
> >> https://mail.openvswitch.org/pipermail/ovs-dev/2023-March/403184.html
> >>>> [1]
> >> https://mail.openvswitch.org/pipermail/ovs-dev/2023-March/403187.html
> >>>>
> >>>>> ---
> >>>>>  northd/northd.c          | 496
> ++++++++++++++++++++-------------------
> >>>>>  northd/ovn-northd.8.xml  |  90 +------
> >>>>>  ovn-nb.xml               |  10 +
> >>>>>  tests/ovn-northd.at      | 213 ++++++++++++-----
> >>>>>  tests/ovn.at             |   3 +
> >>>>>  tests/system-ovn-kmod.at | 166 +++++++++++++
> >>>>>  tests/system-ovn.at      | 117 ---------
> >>>>>  7 files changed, 601 insertions(+), 494 deletions(-)
> >>>>>
> >>>>> diff --git a/northd/northd.c b/northd/northd.c
> >>>>> index 7a10e4dcf..2e4e99f11 100644
> >>>>> --- a/northd/northd.c
> >>>>> +++ b/northd/northd.c
> >>>>> @@ -66,6 +66,9 @@ static bool check_lsp_is_up;
> >>>>>
> >>>>>  static bool install_ls_lb_from_router;
> >>>>>
> >>>>> +/* Use common zone for SNAT and DNAT if this option is set to
> "true".
> >> */
> >>>>> +static bool use_common_zone = false;
> >>>>> +
> >>>>>  /* MAC allocated for service monitor usage. Just one mac is
> allocated
> >>>>>   * for this purpose and ovn-controller's on each chassis will make
> use
> >>>>>   * of this mac when sending out the packets to monitor the services
> >>>>> @@ -10662,6 +10665,8 @@ build_distr_lrouter_nat_flows_for_lb(struct
> >> lrouter_nat_lb_flows_ctx *ctx,
> >>>>>                                       enum lrouter_nat_lb_flow_type
> >> type,
> >>>>>                                       struct ovn_datapath *od)
> >>>>>  {
> >>>>> +    struct ovn_port *dgp = od->l3dgw_ports[0];
> >>>>> +
> >>>>>      const char *undnat_action;
> >>>>>
> >>>>>      switch (type) {
> >>>>> @@ -10673,9 +10678,12 @@ build_distr_lrouter_nat_flows_for_lb(struct
> >> lrouter_nat_lb_flows_ctx *ctx,
> >>>>>          break;
> >>>>>      case LROUTER_NAT_LB_FLOW_NORMAL:
> >>>>>      case LROUTER_NAT_LB_FLOW_MAX:
> >>>>> -        undnat_action = od->is_gw_router ? "ct_dnat;" :
> >> "ct_dnat_in_czone;";
> >>>>> +        undnat_action = !od->is_gw_router && use_common_zone
> >>>>> +                        ? "ct_dnat_in_czone;"
> >>>>> +                        : "ct_dnat;";
> >>
> >> I think it's more clear if we have a helper function that we use in all
> >> places where we build the dnat action, e.g.:
> >>
> >> static const char *
> >> lrouter_get_dnat_action(const struct ovn_datapath *od)
> >> {
> >>     return !od->is_gw_router && use_common_zone
> >>            ? "ct_dnat_in_czone;"
> >>            : "ct_dnat;";
> >> }
> >>
> >
> > There are only two places and both require slightly different
> > string in the end. This one is with ";" the other is without.
> > Which to me seems a bit more hassle for the helper function for two cases
> > WDYT?
> >
>
> You're right about the helper to get the action.  It's not that helpful.
>  However, in my previous email I had started typing the comment above
> with the goal of writing a helper function to determine if the router
> should use the common zone or not but then I changed my mind.  Maybe a
> helper like that is still useful?
>
> static bool
> lrouter_use_common_zone(const struct ovn_datapath *od)
> {
>     return !od->is_gw_router && use_common_zone;
> }
>
> Could we use this in all places where we currently check
> 'use_common_zone' directly?
>
>
I sort of did that, but not for every place. But your suggestion is probably
better, it would fit into more places. Let's see what do you think about
the use in v7, I can always spin v8 if needed.

Thanks,
Ales

-- 

Ales Musil

Senior Software Engineer - OVN Core

Red Hat EMEA <https://www.redhat.com>

[email protected]    IM: amusil
<https://red.ht/sig>
_______________________________________________
dev mailing list
[email protected]
https://mail.openvswitch.org/mailman/listinfo/ovs-dev

Reply via email to