On Fri, Mar 10, 2023 at 2:25 PM Simon Horman <[email protected]>
wrote:

> On Thu, Mar 09, 2023 at 07:16:04AM +0100, Ales Musil wrote:
> > There are essentially three problems with the current
> > combination of DGP + SNAT + LB:
> >
> > 1) The first packet is being SNATed in common zone due
> > to a problem with pinctrl not preserving ct_mark/ct_label.
> > The commit would create a SNAT entry within the same with DNAT
> > entry.
> >
> > 2) The unSNAT for reply always happened in common zone because of
> > the loopback check which would be triggered only when we loop
> > the packet through the LR. Now there are two possibilities how
> > the reply packet would be handled:
> >
> > a) If the entry for SNAT in common zone did not time out yet, the
> > packet would be passed through unSNAT in common zone which would
> > be fine and continue on. However, the unDNAT wouldn't work due to
> > the limitation of CT not capable of doing unSNAT/unDNAT on the same
> > packet twice in the same zone. So the reply would arrive to
> > the original interface, but with wrong source address.
> >
> > b) If the entry for SNAT timed out it would loop and do unSNAT correctly
> > in separate zone and then also unDNAT. This is not possible anymore with
> > a recent change 8c341b9d (northd: Drop packets destined to router owned
> NAT IP for DGP).
> > The reply would be dropped before looping after that change co the
> traffic
> > would never arrive to the original interface.
> >
> > 3) The unDNAT was happening only if the DGP was outport meaning
> > the reply traffic was routed out, but in the opposite case
> > the unDNAT was happening only because of the looping which made
> > outport=inport. That's why it worked before introduction of explicit
> drop.
> >
> > In order to fix all those issues do two changes:
> >
> > 1) Include inport in the unDNAT match, so that we have both
> > routing directions covered e.g. (inport == "dgp_port" || outport ==
> "dpg_port").
> >
> > 2) Always use the separate zone for SNAT and DNAT. As the common
> > zone was needed for HWOL make the common zone optional with
> > configuration option called "use_common_zone". This option is
> > by default "false" and can be specified per LR. Use of separate
> > zones also eliminates the need for the flag propagation
> > in "lr_out_chk_dnat_local" stage, removing the match on ct_mark/ct_label.
> >
> > Reported-at: https://bugzilla.redhat.com/2161281
> > Signed-off-by: Ales Musil <[email protected]>
> > ---
> > v2: Fix flaky system test.
> > v3: Rebase on top of current main.
>
> I am seeing consistent failure of system-tests with this version :(
>
> 237: SNAT in separate zone from DNAT -- ovn-northd -- parallelization=yes
> -- ovn_monitor_all=yes FAILED (system-ovn.at:8702)
> 238: SNAT in separate zone from DNAT -- ovn-northd -- parallelization=yes
> -- ovn_monitor_all=no FAILED (system-ovn.at:8702)
> 239: SNAT in separate zone from DNAT -- ovn-northd -- parallelization=no
> -- ovn_monitor_all=yes FAILED (system-ovn.at:8702)
> 240: SNAT in separate zone from DNAT -- ovn-northd -- parallelization=no
> -- ovn_monitor_all=no FAILED (system-ovn.at:8702)
>
> Link:
> https://github.com/horms/ovn/actions/runs/4383139214/jobs/7676416462#step:13:3734
>
>
I was wondering why it was always green for me and this is the system tests
over userspace datapath.
It's nice that we are catching bugs with that already. I will take a look
at why it is failing. But IMO it shouldn't
be as this patch doesn't have anything specific per datapath type.

Thanks,
Ales

-- 

Ales Musil

Senior Software Engineer - OVN Core

Red Hat EMEA <https://www.redhat.com>

[email protected]    IM: amusil
<https://red.ht/sig>
_______________________________________________
dev mailing list
[email protected]
https://mail.openvswitch.org/mailman/listinfo/ovs-dev

Reply via email to