On Tue, Apr 11, 2023 at 3:44 PM Dumitru Ceara <dce...@redhat.com> wrote:

> On 4/6/23 19:03, Ales Musil wrote:
> > There are essentially three problems with the current
> > combination of DGP + SNAT + LB:
> >
> > 1) The first packet is being SNATed in common zone due
> > to a problem with pinctrl not preserving ct_mark/ct_label.
> > The commit would create a SNAT entry within the same with DNAT
> > entry.
> >
> > 2) The unSNAT for reply always happened in common zone because of
> > the loopback check which would be triggered only when we loop
> > the packet through the LR. Now there are two possibilities how
> > the reply packet would be handled:
> >
> > a) If the entry for SNAT in common zone did not time out yet, the
> > packet would be passed through unSNAT in common zone which would
> > be fine and continue on. However, the unDNAT wouldn't work due to
> > the limitation of CT not capable of doing unSNAT/unDNAT on the same
> > packet twice in the same zone. So the reply would arrive to
> > the original interface, but with wrong source address.
> >
> > b) If the entry for SNAT timed out it would loop and do unSNAT correctly
> > in separate zone and then also unDNAT. This is not possible anymore with
> > a recent change 8c341b9d (northd: Drop packets destined to router owned
> NAT IP for DGP).
> > The reply would be dropped before looping after that change co the
> traffic
> > would never arrive to the original interface.
> >
> > 3) The unDNAT was happening only if the DGP was outport meaning
> > the reply traffic was routed out, but in the opposite case
> > the unDNAT was happening only because of the looping which made
> > outport=inport. That's why it worked before introduction of explicit
> drop.
> >
> > In order to fix all those issues do two changes:
> >
> > 1) Include inport in the unDNAT match, so that we have both
> > routing directions covered e.g. (inport == "dgp_port" || outport ==
> "dpg_port").
> >
> > 2) Always use the separate zone for SNAT and DNAT. As the common
> > zone was needed for HWOL make the common zone optional with
> > configuration option called "use_common_zone". This option is
> > by default "false" and can be specified per LR. Use of separate
> > zones also eliminates the need for the flag propagation
> > in "lr_out_chk_dnat_local" stage, removing the match on ct_mark/ct_label.
> >
> > The "SNAT in separate zone from DNAT" system test is moved to run only
> > with kernel datapath. The reason is that this test doesn't work with
> > userspace datapath due to recirculation limit, currently set to 6 [0].
> >
> > [0]
> https://github.com/openvswitch/ovs/blob/9d840923d32124fe427de76e8234c49d64e4bb77/lib/dpif-netdev.c#L102
> > Reported-at: https://bugzilla.redhat.com/2161281
> > Signed-off-by: Ales Musil <amu...@redhat.com>
> > ---
>
> Hi Ales,
>
> Thanks for this new revision!
>

Hi Dumitru,

thank you for the review, everything should be addressed in v9.

Thanks,
Ales


>
> [...]
>
> > @@ -13846,17 +13838,17 @@ build_lrouter_out_undnat_flow(struct hmap
> *lflows, struct ovn_datapath *od,
> >          ds_put_format(match, " && is_chassis_resident(%s)",
> >                        l3dgw_port->cr_port->json_key);
> >      }
> > -    ds_clear(actions);
> > +
> >      if (distributed) {
> >          ds_put_format(actions, "eth.src = "ETH_ADDR_FMT"; ",
> >                        ETH_ADDR_ARGS(mac));
> >      }
> >
> > -    if (lrouter_dnat_and_snat_is_stateless(nat)) {
> > +    if (stateless) {
> >          ds_put_format(actions, "next;");
> >      } else {
> > -        ds_put_format(actions,
> > -                      od->is_gw_router ? "ct_dnat;" :
> "ct_dnat_in_czone;");
> > +        ds_put_cstr(actions,
> > +                    use_common_zone ? "ct_dnat_in_czone;" : "ct_dnat;");
>
> This misses a lrouter_use_common_zone() call.  We shouldn't use the
> "common zone" for gw routers.
>
> >      }
> >
> >      ovn_lflow_add_with_hint(lflows, od, S_ROUTER_OUT_UNDNAT, 100,
>
> [...]
>
> >  static void
> > @@ -14414,18 +14423,36 @@ build_lrouter_nat_defrag_and_lb(struct
> ovn_datapath *od, struct hmap *lflows,
> >          int cidr_bits;
> >          struct ovn_port *l3dgw_port;
> >
> > +        bool stateless = lrouter_dnat_and_snat_is_stateless(nat);
> > +
> >          if (lrouter_check_nat_entry(od, nat, lr_ports, &mask, &is_v6,
> >                                      &cidr_bits,
> >                                      &mac, &distributed, &l3dgw_port) <
> 0) {
> >              continue;
> >          }
> >
> > -        /* S_ROUTER_IN_UNSNAT */
> > -        build_lrouter_in_unsnat_flow(lflows, od, nat, match, actions,
> distributed,
> > -                                     is_v6, l3dgw_port);
> > +        /* S_ROUTER_IN_UNSNAT
> > +         * Ingress UNSNAT table: It is for already established
> connections'
> > +         * reverse traffic. i.e., SNAT has already been done in egress
> > +         * pipeline and now the packet has entered the ingress pipeline
> as
> > +         * part of a reply. We undo the SNAT here.
> > +         *
> > +         * Undoing SNAT has to happen before DNAT processing.  This is
> > +         * because when the packet was DNATed in ingress pipeline, it
> did
> > +         * not know about the possibility of eventual additional SNAT in
> > +         * egress pipeline. */
> > +        if (lrouter_use_common_zone(od) && !stateless) {
> > +            build_lrouter_in_unsnat_in_czone_flow(lflows, od, nat,
> match,
> > +                                                  distributed, is_v6,
> > +                                                  l3dgw_port);
> > +        } else {
> > +            build_lrouter_in_unsnat_flow(lflows, od, nat, match,
> distributed,
> > +                                         is_v6, l3dgw_port, stateless);
> > +        }
>
> Now that we have the two explicit branches I realized that I find this
> part a bit confusing.  Stateless NAT doesn't need any zone, we modify
> the packet inline.  I think I'd rewrite this as:
>
> if (stateless) {
>     build_lrouter_in_unsnat_stateless_flow(...);
> } else if (lrouter_use_common_zone(od)) {
>     build_lrouter_in_unsnat_in_czone_flow(...);
> } else {
>     build_lrouter_in_unsnat_flow(lflows, od, nat, match, distributed,
>                                  is_v6, l3dgw_port);
> }
>
> >          /* S_ROUTER_IN_DNAT */
> > -        build_lrouter_in_dnat_flow(lflows, od, nat, match, actions,
> distributed,
> > -                                   cidr_bits, is_v6, l3dgw_port);
> > +        build_lrouter_in_dnat_flow(lflows, od, nat, match, actions,
> > +                                   distributed, cidr_bits, is_v6,
> l3dgw_port,
> > +                                   stateless);
> >
> >          /* ARP resolve for NAT IPs. */
> >          if (od->is_gw_router) {
> > @@ -14494,16 +14521,29 @@ build_lrouter_nat_defrag_and_lb(struct
> ovn_datapath *od, struct hmap *lflows,
> >              }
> >          }
> >
> > -        /* S_ROUTER_OUT_DNAT_LOCAL */
> > -        build_lrouter_out_is_dnat_local(lflows, od, nat, match, actions,
> > -                                        distributed, is_v6, l3dgw_port);
> > +        if (use_common_zone) {
> > +            /* S_ROUTER_OUT_DNAT_LOCAL */
> > +            build_lrouter_out_is_dnat_local(lflows, od, nat, match,
> actions,
> > +                                            distributed, is_v6,
> l3dgw_port);
> > +        }
> >
> >          /* S_ROUTER_OUT_UNDNAT */
> > -        build_lrouter_out_undnat_flow(lflows, od, nat, match, actions,
> distributed,
> > -                                      mac, is_v6, l3dgw_port);
> > -        /* S_ROUTER_OUT_SNAT */
> > -        build_lrouter_out_snat_flow(lflows, od, nat, match, actions,
> distributed,
> > -                                    mac, cidr_bits, is_v6, l3dgw_port);
> > +        build_lrouter_out_undnat_flow(lflows, od, nat, match, actions,
> > +                                      distributed, mac, is_v6,
> l3dgw_port,
> > +                                      stateless);
> > +        /* S_ROUTER_OUT_SNAT
> > +         * Egress SNAT table: Packets enter the egress pipeline with
> > +         * source ip address that needs to be SNATted to a external ip
> > +         * address. */
> > +        if (lrouter_use_common_zone(od) && !stateless) {
> > +            build_lrouter_out_snat_in_czone_flow(lflows, od, nat, match,
> > +                                                 actions, distributed,
> mac,
> > +                                                 cidr_bits, is_v6,
> l3dgw_port);
> > +        } else {
> > +            build_lrouter_out_snat_flow(lflows, od, nat, match, actions,
> > +                                        distributed, mac, cidr_bits,
> is_v6,
> > +                                        l3dgw_port, stateless);
> > +        }
>
> Similar comment about treating stateless rules separately as above for
> in_unsnat.
>
> Thanks,
> Dumitru
>
>

-- 

Ales Musil

Senior Software Engineer - OVN Core

Red Hat EMEA <https://www.redhat.com>

amu...@redhat.com    IM: amusil
<https://red.ht/sig>
_______________________________________________
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev

Reply via email to