On Tue, Apr 11, 2023 at 3:44 PM Dumitru Ceara <dce...@redhat.com> wrote:
> On 4/6/23 19:03, Ales Musil wrote: > > There are essentially three problems with the current > > combination of DGP + SNAT + LB: > > > > 1) The first packet is being SNATed in common zone due > > to a problem with pinctrl not preserving ct_mark/ct_label. > > The commit would create a SNAT entry within the same with DNAT > > entry. > > > > 2) The unSNAT for reply always happened in common zone because of > > the loopback check which would be triggered only when we loop > > the packet through the LR. Now there are two possibilities how > > the reply packet would be handled: > > > > a) If the entry for SNAT in common zone did not time out yet, the > > packet would be passed through unSNAT in common zone which would > > be fine and continue on. However, the unDNAT wouldn't work due to > > the limitation of CT not capable of doing unSNAT/unDNAT on the same > > packet twice in the same zone. So the reply would arrive to > > the original interface, but with wrong source address. > > > > b) If the entry for SNAT timed out it would loop and do unSNAT correctly > > in separate zone and then also unDNAT. This is not possible anymore with > > a recent change 8c341b9d (northd: Drop packets destined to router owned > NAT IP for DGP). > > The reply would be dropped before looping after that change co the > traffic > > would never arrive to the original interface. > > > > 3) The unDNAT was happening only if the DGP was outport meaning > > the reply traffic was routed out, but in the opposite case > > the unDNAT was happening only because of the looping which made > > outport=inport. That's why it worked before introduction of explicit > drop. > > > > In order to fix all those issues do two changes: > > > > 1) Include inport in the unDNAT match, so that we have both > > routing directions covered e.g. (inport == "dgp_port" || outport == > "dpg_port"). > > > > 2) Always use the separate zone for SNAT and DNAT. As the common > > zone was needed for HWOL make the common zone optional with > > configuration option called "use_common_zone". This option is > > by default "false" and can be specified per LR. Use of separate > > zones also eliminates the need for the flag propagation > > in "lr_out_chk_dnat_local" stage, removing the match on ct_mark/ct_label. > > > > The "SNAT in separate zone from DNAT" system test is moved to run only > > with kernel datapath. The reason is that this test doesn't work with > > userspace datapath due to recirculation limit, currently set to 6 [0]. > > > > [0] > https://github.com/openvswitch/ovs/blob/9d840923d32124fe427de76e8234c49d64e4bb77/lib/dpif-netdev.c#L102 > > Reported-at: https://bugzilla.redhat.com/2161281 > > Signed-off-by: Ales Musil <amu...@redhat.com> > > --- > > Hi Ales, > > Thanks for this new revision! > Hi Dumitru, thank you for the review, everything should be addressed in v9. Thanks, Ales > > [...] > > > @@ -13846,17 +13838,17 @@ build_lrouter_out_undnat_flow(struct hmap > *lflows, struct ovn_datapath *od, > > ds_put_format(match, " && is_chassis_resident(%s)", > > l3dgw_port->cr_port->json_key); > > } > > - ds_clear(actions); > > + > > if (distributed) { > > ds_put_format(actions, "eth.src = "ETH_ADDR_FMT"; ", > > ETH_ADDR_ARGS(mac)); > > } > > > > - if (lrouter_dnat_and_snat_is_stateless(nat)) { > > + if (stateless) { > > ds_put_format(actions, "next;"); > > } else { > > - ds_put_format(actions, > > - od->is_gw_router ? "ct_dnat;" : > "ct_dnat_in_czone;"); > > + ds_put_cstr(actions, > > + use_common_zone ? "ct_dnat_in_czone;" : "ct_dnat;"); > > This misses a lrouter_use_common_zone() call. We shouldn't use the > "common zone" for gw routers. > > > } > > > > ovn_lflow_add_with_hint(lflows, od, S_ROUTER_OUT_UNDNAT, 100, > > [...] > > > static void > > @@ -14414,18 +14423,36 @@ build_lrouter_nat_defrag_and_lb(struct > ovn_datapath *od, struct hmap *lflows, > > int cidr_bits; > > struct ovn_port *l3dgw_port; > > > > + bool stateless = lrouter_dnat_and_snat_is_stateless(nat); > > + > > if (lrouter_check_nat_entry(od, nat, lr_ports, &mask, &is_v6, > > &cidr_bits, > > &mac, &distributed, &l3dgw_port) < > 0) { > > continue; > > } > > > > - /* S_ROUTER_IN_UNSNAT */ > > - build_lrouter_in_unsnat_flow(lflows, od, nat, match, actions, > distributed, > > - is_v6, l3dgw_port); > > + /* S_ROUTER_IN_UNSNAT > > + * Ingress UNSNAT table: It is for already established > connections' > > + * reverse traffic. i.e., SNAT has already been done in egress > > + * pipeline and now the packet has entered the ingress pipeline > as > > + * part of a reply. We undo the SNAT here. > > + * > > + * Undoing SNAT has to happen before DNAT processing. This is > > + * because when the packet was DNATed in ingress pipeline, it > did > > + * not know about the possibility of eventual additional SNAT in > > + * egress pipeline. */ > > + if (lrouter_use_common_zone(od) && !stateless) { > > + build_lrouter_in_unsnat_in_czone_flow(lflows, od, nat, > match, > > + distributed, is_v6, > > + l3dgw_port); > > + } else { > > + build_lrouter_in_unsnat_flow(lflows, od, nat, match, > distributed, > > + is_v6, l3dgw_port, stateless); > > + } > > Now that we have the two explicit branches I realized that I find this > part a bit confusing. Stateless NAT doesn't need any zone, we modify > the packet inline. I think I'd rewrite this as: > > if (stateless) { > build_lrouter_in_unsnat_stateless_flow(...); > } else if (lrouter_use_common_zone(od)) { > build_lrouter_in_unsnat_in_czone_flow(...); > } else { > build_lrouter_in_unsnat_flow(lflows, od, nat, match, distributed, > is_v6, l3dgw_port); > } > > > /* S_ROUTER_IN_DNAT */ > > - build_lrouter_in_dnat_flow(lflows, od, nat, match, actions, > distributed, > > - cidr_bits, is_v6, l3dgw_port); > > + build_lrouter_in_dnat_flow(lflows, od, nat, match, actions, > > + distributed, cidr_bits, is_v6, > l3dgw_port, > > + stateless); > > > > /* ARP resolve for NAT IPs. */ > > if (od->is_gw_router) { > > @@ -14494,16 +14521,29 @@ build_lrouter_nat_defrag_and_lb(struct > ovn_datapath *od, struct hmap *lflows, > > } > > } > > > > - /* S_ROUTER_OUT_DNAT_LOCAL */ > > - build_lrouter_out_is_dnat_local(lflows, od, nat, match, actions, > > - distributed, is_v6, l3dgw_port); > > + if (use_common_zone) { > > + /* S_ROUTER_OUT_DNAT_LOCAL */ > > + build_lrouter_out_is_dnat_local(lflows, od, nat, match, > actions, > > + distributed, is_v6, > l3dgw_port); > > + } > > > > /* S_ROUTER_OUT_UNDNAT */ > > - build_lrouter_out_undnat_flow(lflows, od, nat, match, actions, > distributed, > > - mac, is_v6, l3dgw_port); > > - /* S_ROUTER_OUT_SNAT */ > > - build_lrouter_out_snat_flow(lflows, od, nat, match, actions, > distributed, > > - mac, cidr_bits, is_v6, l3dgw_port); > > + build_lrouter_out_undnat_flow(lflows, od, nat, match, actions, > > + distributed, mac, is_v6, > l3dgw_port, > > + stateless); > > + /* S_ROUTER_OUT_SNAT > > + * Egress SNAT table: Packets enter the egress pipeline with > > + * source ip address that needs to be SNATted to a external ip > > + * address. */ > > + if (lrouter_use_common_zone(od) && !stateless) { > > + build_lrouter_out_snat_in_czone_flow(lflows, od, nat, match, > > + actions, distributed, > mac, > > + cidr_bits, is_v6, > l3dgw_port); > > + } else { > > + build_lrouter_out_snat_flow(lflows, od, nat, match, actions, > > + distributed, mac, cidr_bits, > is_v6, > > + l3dgw_port, stateless); > > + } > > Similar comment about treating stateless rules separately as above for > in_unsnat. > > Thanks, > Dumitru > > -- Ales Musil Senior Software Engineer - OVN Core Red Hat EMEA <https://www.redhat.com> amu...@redhat.com IM: amusil <https://red.ht/sig> _______________________________________________ dev mailing list d...@openvswitch.org https://mail.openvswitch.org/mailman/listinfo/ovs-dev