On 4/22/24 10:55, Ales Musil wrote: > On Mon, Apr 22, 2024 at 9:45 AM Martin Kalcok <[email protected]> > wrote: > >> wrt the failed ovn-kubernetes tests, they seemed to have passed >> successfully in my branch [0]. Is it possible that the tests are unstable? >> >> [0] https://github.com/mkalcok/ovn/actions/runs/8752915096 >> > > Hi Martin, > > yeah unfortunately the test was broken and AFAIK it should be fixed now [0]. > > >> >> On Fri, Apr 19, 2024 at 2:33 PM Martin Kalcok <[email protected]> >> wrote: >> >>> This change fixes bug that breaks ability of machines from external >>> networks, to communicate with machines in SNATed networks (specifically >>> when using a Distributed router). >>> >>> Currently when a machine (S1) on an external network tries to talk >>> over TCP with a machine (A1) in a network that has enabled SNAT, the >>> connection is established successfully. However after the three-way >>> handshake, any packets that come from the A1 machine will have their >>> source address translated by the Distributed router, breaking the >>> communication. >>> >>> Existing rule in `build_lrouter_out_snat_flow` that decides which >>> packets should be SNATed already tries to avoid SNATing packets in >>> reply direction with `(!ct.trk || !ct.rpl)`. However, previous stages >>> in the distributed LR egress pipeline do not initiate the CT state. >>> >>> Additionally we need to commit new connections that originate from >>> external networks into CT, so that the packets in the reply direction >>> (back to the external network) can be properly identified. >>> >>> Rationale: >>> >>> In my original RFC [0], there were questions about the motivation for >>> fixing this issue. I'll try to summarize why I think this is a bug >>> that should be fixed. >>> >>> 1. Current implementation for Distributed router already tries to >>> avoid SNATing packets in the reply direction, it's just missing >>> initialized CT states to make proper decisions. >>> >>> 2. This same scenario works with Gateway Router. I tested with >>> following setup: >>> >>> foo -- R1 -- join - R3 -- alice >>> | >>> bar ----------R2 >>> >>> R1 is a Distributed router with SNAT for foo. R2 is a Gateway >>> router with SNAT for bar. R3 is a Gateway router with no SNAT. >>> Using 'alice1' as a client I was able to talk over TCP with >>> 'bar1' but connection with 'foo1' failed. >>> >>> 3. Regarding security and "leaking" of internal IPs. Reading through >>> RFC 4787 [1], 5382 [2] and their update in 7857 [3], the >>> specifications do not seem to mandate that SNAT implementations >>> must filter incoming traffic destined directly to the internal >>> network. Sections like "5. Filtering Behavior" in 4787 and >>> "4.3 Externally Initiated Connections" in 5382 describe only >>> behavior for traffic destined to external IP/Port associated >>> with NAT on the device that performs NAT. >>> >>> Besides, with the current implementation, it's already possible >>> to scan the internal network with pings and TCP syn scanning. >>> >>> 4. We do have customers/clouds that depend on this functionality. >>> This is a scenario that used to work in Openstack with ML2/OVS >>> and migrating those clouds to ML2/OVN would break it. >>> >>> [0] >>> https://mail.openvswitch.org/pipermail/ovs-dev/2024-February/411670.html >>> [1]https://datatracker.ietf.org/doc/html/rfc4787 >>> [2]https://datatracker.ietf.org/doc/html/rfc5382 >>> [3]https://datatracker.ietf.org/doc/html/rfc7857 >>> >>> Signed-off-by: Martin Kalcok <[email protected]> >>> --- >>> northd/northd.c | 66 ++++++++++++++++++++++++++++------- >>> northd/ovn-northd.8.xml | 28 +++++++++++++++ >>> tests/ovn-northd.at | 76 +++++++++++++++++++++++++++++++++++++---- >>> tests/system-ovn.at | 68 ++++++++++++++++++++++++++++++++++++ >>> 4 files changed, 219 insertions(+), 19 deletions(-) >>> >>> diff --git a/northd/northd.c b/northd/northd.c >>> index 37f443e70..2c2eee445 100644 >>> --- a/northd/northd.c >>> +++ b/northd/northd.c >>> @@ -14425,20 +14425,27 @@ build_lrouter_out_is_dnat_local(struct >>> lflow_table *lflows, >>> >>> static void >>> build_lrouter_out_snat_match(struct lflow_table *lflows, >>> - const struct ovn_datapath *od, >>> - const struct nbrec_nat *nat, struct ds >>> *match, >>> - bool distributed_nat, int cidr_bits, bool >>> is_v6, >>> - struct ovn_port *l3dgw_port, >>> - struct lflow_ref *lflow_ref) >>> + const struct ovn_datapath *od, >>> + const struct nbrec_nat *nat, >>> + struct ds *match, >>> + bool distributed_nat, int >>> cidr_bits, >>> + bool is_v6, >>> + struct ovn_port *l3dgw_port, >>> + struct lflow_ref *lflow_ref, >>> + bool is_reverse)
I re-indented this part. [...] > > The change looks good to me, thanks! > > Acked-by: Ales Musil <[email protected]> > > [0] > https://github.com/ovn-org/ovn-kubernetes/commit/236e63c665bb60dd48dd5cc7f8ef2324a061d229 > Applied to main and backported to 24.03. Regards, Dumitru _______________________________________________ dev mailing list [email protected] https://mail.openvswitch.org/mailman/listinfo/ovs-dev
