On 4/6/23 19:03, Ales Musil wrote:
> There are essentially three problems with the current
> combination of DGP + SNAT + LB:
>
> 1) The first packet is being SNATed in common zone due
> to a problem with pinctrl not preserving ct_mark/ct_label.
> The commit would create a SNAT entry within the same with DNAT
> entry.
>
> 2) The unSNAT for reply always happened in common zone because of
> the loopback check which would be triggered only when we loop
> the packet through the LR. Now there are two possibilities how
> the reply packet would be handled:
>
> a) If the entry for SNAT in common zone did not time out yet, the
> packet would be passed through unSNAT in common zone which would
> be fine and continue on. However, the unDNAT wouldn't work due to
> the limitation of CT not capable of doing unSNAT/unDNAT on the same
> packet twice in the same zone. So the reply would arrive to
> the original interface, but with wrong source address.
>
> b) If the entry for SNAT timed out it would loop and do unSNAT correctly
> in separate zone and then also unDNAT. This is not possible anymore with
> a recent change 8c341b9d (northd: Drop packets destined to router owned NAT
> IP for DGP).
> The reply would be dropped before looping after that change co the traffic
> would never arrive to the original interface.
>
> 3) The unDNAT was happening only if the DGP was outport meaning
> the reply traffic was routed out, but in the opposite case
> the unDNAT was happening only because of the looping which made
> outport=inport. That's why it worked before introduction of explicit drop.
>
> In order to fix all those issues do two changes:
>
> 1) Include inport in the unDNAT match, so that we have both
> routing directions covered e.g. (inport == "dgp_port" || outport ==
> "dpg_port").
>
> 2) Always use the separate zone for SNAT and DNAT. As the common
> zone was needed for HWOL make the common zone optional with
> configuration option called "use_common_zone". This option is
> by default "false" and can be specified per LR. Use of separate
> zones also eliminates the need for the flag propagation
> in "lr_out_chk_dnat_local" stage, removing the match on ct_mark/ct_label.
>
> The "SNAT in separate zone from DNAT" system test is moved to run only
> with kernel datapath. The reason is that this test doesn't work with
> userspace datapath due to recirculation limit, currently set to 6 [0].
>
> [0]https://github.com/openvswitch/ovs/blob/9d840923d32124fe427de76e8234c49d64e4bb77/lib/dpif-netdev.c#L102
> Reported-at: https://bugzilla.redhat.com/2161281
> Signed-off-by: Ales Musil <[email protected]>
> ---
Hi Ales,
Thanks for this new revision!
[...]
> @@ -13846,17 +13838,17 @@ build_lrouter_out_undnat_flow(struct hmap *lflows,
> struct ovn_datapath *od,
> ds_put_format(match, " && is_chassis_resident(%s)",
> l3dgw_port->cr_port->json_key);
> }
> - ds_clear(actions);
> +
> if (distributed) {
> ds_put_format(actions, "eth.src = "ETH_ADDR_FMT"; ",
> ETH_ADDR_ARGS(mac));
> }
>
> - if (lrouter_dnat_and_snat_is_stateless(nat)) {
> + if (stateless) {
> ds_put_format(actions, "next;");
> } else {
> - ds_put_format(actions,
> - od->is_gw_router ? "ct_dnat;" : "ct_dnat_in_czone;");
> + ds_put_cstr(actions,
> + use_common_zone ? "ct_dnat_in_czone;" : "ct_dnat;");
This misses a lrouter_use_common_zone() call. We shouldn't use the
"common zone" for gw routers.
> }
>
> ovn_lflow_add_with_hint(lflows, od, S_ROUTER_OUT_UNDNAT, 100,
[...]
> static void
> @@ -14414,18 +14423,36 @@ build_lrouter_nat_defrag_and_lb(struct ovn_datapath
> *od, struct hmap *lflows,
> int cidr_bits;
> struct ovn_port *l3dgw_port;
>
> + bool stateless = lrouter_dnat_and_snat_is_stateless(nat);
> +
> if (lrouter_check_nat_entry(od, nat, lr_ports, &mask, &is_v6,
> &cidr_bits,
> &mac, &distributed, &l3dgw_port) < 0) {
> continue;
> }
>
> - /* S_ROUTER_IN_UNSNAT */
> - build_lrouter_in_unsnat_flow(lflows, od, nat, match, actions,
> distributed,
> - is_v6, l3dgw_port);
> + /* S_ROUTER_IN_UNSNAT
> + * Ingress UNSNAT table: It is for already established connections'
> + * reverse traffic. i.e., SNAT has already been done in egress
> + * pipeline and now the packet has entered the ingress pipeline as
> + * part of a reply. We undo the SNAT here.
> + *
> + * Undoing SNAT has to happen before DNAT processing. This is
> + * because when the packet was DNATed in ingress pipeline, it did
> + * not know about the possibility of eventual additional SNAT in
> + * egress pipeline. */
> + if (lrouter_use_common_zone(od) && !stateless) {
> + build_lrouter_in_unsnat_in_czone_flow(lflows, od, nat, match,
> + distributed, is_v6,
> + l3dgw_port);
> + } else {
> + build_lrouter_in_unsnat_flow(lflows, od, nat, match, distributed,
> + is_v6, l3dgw_port, stateless);
> + }
Now that we have the two explicit branches I realized that I find this
part a bit confusing. Stateless NAT doesn't need any zone, we modify
the packet inline. I think I'd rewrite this as:
if (stateless) {
build_lrouter_in_unsnat_stateless_flow(...);
} else if (lrouter_use_common_zone(od)) {
build_lrouter_in_unsnat_in_czone_flow(...);
} else {
build_lrouter_in_unsnat_flow(lflows, od, nat, match, distributed,
is_v6, l3dgw_port);
}
> /* S_ROUTER_IN_DNAT */
> - build_lrouter_in_dnat_flow(lflows, od, nat, match, actions,
> distributed,
> - cidr_bits, is_v6, l3dgw_port);
> + build_lrouter_in_dnat_flow(lflows, od, nat, match, actions,
> + distributed, cidr_bits, is_v6, l3dgw_port,
> + stateless);
>
> /* ARP resolve for NAT IPs. */
> if (od->is_gw_router) {
> @@ -14494,16 +14521,29 @@ build_lrouter_nat_defrag_and_lb(struct ovn_datapath
> *od, struct hmap *lflows,
> }
> }
>
> - /* S_ROUTER_OUT_DNAT_LOCAL */
> - build_lrouter_out_is_dnat_local(lflows, od, nat, match, actions,
> - distributed, is_v6, l3dgw_port);
> + if (use_common_zone) {
> + /* S_ROUTER_OUT_DNAT_LOCAL */
> + build_lrouter_out_is_dnat_local(lflows, od, nat, match, actions,
> + distributed, is_v6, l3dgw_port);
> + }
>
> /* S_ROUTER_OUT_UNDNAT */
> - build_lrouter_out_undnat_flow(lflows, od, nat, match, actions,
> distributed,
> - mac, is_v6, l3dgw_port);
> - /* S_ROUTER_OUT_SNAT */
> - build_lrouter_out_snat_flow(lflows, od, nat, match, actions,
> distributed,
> - mac, cidr_bits, is_v6, l3dgw_port);
> + build_lrouter_out_undnat_flow(lflows, od, nat, match, actions,
> + distributed, mac, is_v6, l3dgw_port,
> + stateless);
> + /* S_ROUTER_OUT_SNAT
> + * Egress SNAT table: Packets enter the egress pipeline with
> + * source ip address that needs to be SNATted to a external ip
> + * address. */
> + if (lrouter_use_common_zone(od) && !stateless) {
> + build_lrouter_out_snat_in_czone_flow(lflows, od, nat, match,
> + actions, distributed, mac,
> + cidr_bits, is_v6,
> l3dgw_port);
> + } else {
> + build_lrouter_out_snat_flow(lflows, od, nat, match, actions,
> + distributed, mac, cidr_bits, is_v6,
> + l3dgw_port, stateless);
> + }
Similar comment about treating stateless rules separately as above for
in_unsnat.
Thanks,
Dumitru
_______________________________________________
dev mailing list
[email protected]
https://mail.openvswitch.org/mailman/listinfo/ovs-dev