Re: [ovs-dev] [PATCH ovn] northd: Fix routing loop in LRs with one-to-many SNAT

2021-08-11 Thread Krzysztof Klimonda
Hi Han, Numan,

@Han I'll post v2 that is rebased against master with fixes you've mentioned 
below. Check my comments.

@Numan I've extended system test and added loadbalancer plus a connection check 
- seems to be fine, however I'm having a hell of a time trying to figure out 
how to use ovn-trace with CT (and also with LB) to understand how packets are 
matched by flows.

As far as I understand loadbalancer action ct_lb will perform dnat, changing 
the destination IP and port to backend, and the flow I've installed in 
lr_in_ip_routing will no longer match it.

In version 2 patch, after pausing system test 149 I've tried a trace like that 
to see which flows are matched, but the output doesn't seem to actually show 
related flows, and seem to be showing the default matching (without ct and lb). 
The command I run is:

ovn-trace --db unix:./ovn-sb/ovn-sb.sock --ct new 'inport=="rp-public" && 
eth.src==00:10:10:01:02:13 && eth.dst==00:00:02:01:02:03 && ip4 && 
ip4.src==172.16.1.100 && ip4.dst==172.16.1.254 && ip.ttl==64 && tcp && 
tcp.dst==90' --lb-dst 192.168.1.2:90 

On Wed, Aug 4, 2021, at 20:17, Han Zhou wrote:
> 
> 
> On Mon, Mar 22, 2021 at 2:16 AM Krzysztof Klimonda 
>  wrote:
> >
> > If there are snat entries on the router, and some logical_ip are set to
> > network instead of an IP address then given SNAT is masquerade. In such
> > case ct_snat action is used in lr_in_unsnat table to ensure that the
> > packet is matched against conntrack and destination IP is replaced with
> > one from matching conntrack entry.
> >
> > This however breaks down for new connections sent to router's external IP
> > address. In such case, when packet is checked against conntrack table,
> > there is no match, and its destination IP remains unchanged. This causes a
> > loop in lr_in_ip_routing.
> >
> > This commit installs a new logical flow in lr_in_ip_routing table for
> > routers that have SNAT entry with logical_ip set to network (that being
> > masquerade). This flow drops packets that, after going through conntrack
> > via ct_snat action in lr_in_unsnat table, are not in established or
> > related state (!ct.est && !ct.rel) and which destination IP still matches
> > router's external IP. This prevents vswitchd from looping such packets
> > until their TTL reaches zero, as well as installing bogus flows in
> > datapath that lead to ovs module dropping such packages with "deferred
> > action limit reached, drop recirc action" message.
> >
> > Signed-off-by: Krzysztof Klimonda 
> 
> Thanks for the patch. As mentioned in the discussion ML could you post 
> more details on how the loop happens?
> Please see some more comments below.

As far as I understood it when writing the patch, there is a problem with how 
packets that are not part of conntrack flows are handled by ct_snat action:

When packet arrives and goes through lr_in_unsnat table, the ct_snat action 
does not match it to any existing conntrack flow, and destination IP remains 
unchaged. The packet then goes through lr_in_ip_routing and because destination 
IP is still router IP it is sent to egress only to end up in the same ingress 
pipeline, with the same logical flow in lr_in_unsnat  table, and with the same 
resulting ct_snat action - this results in a loop until ttl is decremented 
enough time.

My idea for the fix was to install a route in the router that would drop 
packets which, after being unSNATted, still have destination IP of the router.

I've attached two ovn-traces that try to illustrate the fact, although I can't 
really figure out how to properly use --ct with ovn-trace to simulate new and 
established connections.

I'm also having some weird system test failures with current master where in my 
tests ping fails to send first few packages against router IP with EBUSY 
(resource temporarily unavailable) being returned from recvfrom 

> 
> > ---
> >  northd/ovn-northd.c |  57 +++
> >  tests/ovn.at|  45 ++
> >  tests/system-ovn.at | 108 
> >  3 files changed, 210 insertions(+)
> >
> > diff --git a/northd/ovn-northd.c b/northd/ovn-northd.c
> > index 4783e43d7..ea7db3d47 100644
> > --- a/northd/ovn-northd.c
> > +++ b/northd/ovn-northd.c
> > @@ -11304,6 +11304,7 @@ build_lrouter_nat_defrag_and_lb(struct ovn_datapath 
> > *od,
> >  ovs_be32 ip, mask;
> >  struct in6_addr ipv6, mask_v6, v6_exact = IN6ADDR_EXACT_INIT;
> >  bool is_v6 = false;
> > +bool is_masquerade = false;
> >  bool stateless = lrouter_nat_is_stateless(nat);
> >  struct nbrec_address_set *allowed_ext_ips =
> >nat->allowed_ext_ips;
> > @@ -11343,9 +11344,15 @@ build_lrouter_nat_defrag_and_lb(struct 
> > ovn_datapath *od,
> >  if (is_v6) {
> >  error = ipv6_parse_masked(nat->logical_ip, , _v6);
> >  cidr_bits = ipv6_count_cidr_bits(_v6);
> > +if (cidr_bits < 128) {
> 

Re: [ovs-dev] [PATCH ovn] northd: Fix routing loop in LRs with one-to-many SNAT

2021-08-04 Thread Han Zhou
On Mon, Mar 22, 2021 at 2:16 AM Krzysztof Klimonda <
kklimo...@syntaxhighlighted.com> wrote:
>
> If there are snat entries on the router, and some logical_ip are set to
> network instead of an IP address then given SNAT is masquerade. In such
> case ct_snat action is used in lr_in_unsnat table to ensure that the
> packet is matched against conntrack and destination IP is replaced with
> one from matching conntrack entry.
>
> This however breaks down for new connections sent to router's external IP
> address. In such case, when packet is checked against conntrack table,
> there is no match, and its destination IP remains unchanged. This causes a
> loop in lr_in_ip_routing.
>
> This commit installs a new logical flow in lr_in_ip_routing table for
> routers that have SNAT entry with logical_ip set to network (that being
> masquerade). This flow drops packets that, after going through conntrack
> via ct_snat action in lr_in_unsnat table, are not in established or
> related state (!ct.est && !ct.rel) and which destination IP still matches
> router's external IP. This prevents vswitchd from looping such packets
> until their TTL reaches zero, as well as installing bogus flows in
> datapath that lead to ovs module dropping such packages with "deferred
> action limit reached, drop recirc action" message.
>
> Signed-off-by: Krzysztof Klimonda 

Thanks for the patch. As mentioned in the discussion ML could you post more
details on how the loop happens?
Please see some more comments below.

> ---
>  northd/ovn-northd.c |  57 +++
>  tests/ovn.at|  45 ++
>  tests/system-ovn.at | 108 
>  3 files changed, 210 insertions(+)
>
> diff --git a/northd/ovn-northd.c b/northd/ovn-northd.c
> index 4783e43d7..ea7db3d47 100644
> --- a/northd/ovn-northd.c
> +++ b/northd/ovn-northd.c
> @@ -11304,6 +11304,7 @@ build_lrouter_nat_defrag_and_lb(struct
ovn_datapath *od,
>  ovs_be32 ip, mask;
>  struct in6_addr ipv6, mask_v6, v6_exact = IN6ADDR_EXACT_INIT;
>  bool is_v6 = false;
> +bool is_masquerade = false;
>  bool stateless = lrouter_nat_is_stateless(nat);
>  struct nbrec_address_set *allowed_ext_ips =
>nat->allowed_ext_ips;
> @@ -11343,9 +11344,15 @@ build_lrouter_nat_defrag_and_lb(struct
ovn_datapath *od,
>  if (is_v6) {
>  error = ipv6_parse_masked(nat->logical_ip, , _v6);
>  cidr_bits = ipv6_count_cidr_bits(_v6);
> +if (cidr_bits < 128) {
> +is_masquerade = true;
> +}
>  } else {
>  error = ip_parse_masked(nat->logical_ip, , );
>  cidr_bits = ip_count_cidr_bits(mask);
> +if (cidr_bits < 32) {
> +is_masquerade = 32;

Typo? s/32/true

> +}
>  }
>  if (!strcmp(nat->type, "snat")) {
>  if (error) {
> @@ -11396,6 +11403,56 @@ build_lrouter_nat_defrag_and_lb(struct
ovn_datapath *od,
>  build_lrouter_in_dnat_flow(lflows, od, nat, match, actions,
distributed,
> mask, is_v6);
>
> +/* When router have SNAT enabled, and logical_ip is a network
(router
> + * is doing masquerade), then we need to make sure that packets
> + * unrelated to any established connection that still have
router's
> + * external IP as a next hop after going through lr_in_unsnat
table
> + * are dropped properly. Otherwise such packets will loop around
> + * between tables until their ttl reaches zero - this
additionally
> + * causes kernel module to drop such packages due to
recirculation
> + * limit being exceeded.
> + *
> + * Install a logical flow in lr_in_ip_routing table that will
> + * match packet with router's external IP that have no related
> + * conntrack entries and drop them. Flow priority must be higher
> + * than any other flow in lr_in_ip_routing that matches router's
> + * external IP.
> + *
> + * The priority for destination routes is calculated as
> + * (prefix length * 2) + 1, and there is an additional flow
> + * for when BFD is in play with priority + 1. Set priority that
> + * is higher than any other potential routing flow for that
> + * network, that is (prefix length * 2) + offset, where offset
> + * is 1 (dst) + 1 (bfd) + 1. */
> +if (is_masquerade) {
> +uint16_t priority, prefix_length, offset;
> +
> +if (is_v6) {
> +prefix_length = 128;
> +} else {
> +prefix_length = 32;
> +}
> +offset = 3;
> +priority = (prefix_length * 2) + offset;
> +
> +ds_clear(match);
> +
> +if (is_v6) {
> +ds_put_format(match,
> +  "ct.new && 

Re: [ovs-dev] [PATCH ovn] northd: Fix routing loop in LRs with one-to-many SNAT

2021-03-23 Thread Ben Pfaff
On Mon, Mar 22, 2021 at 10:14:46AM +0100, Krzysztof Klimonda wrote:
> If there are snat entries on the router, and some logical_ip are set to
> network instead of an IP address then given SNAT is masquerade. In such
> case ct_snat action is used in lr_in_unsnat table to ensure that the
> packet is matched against conntrack and destination IP is replaced with
> one from matching conntrack entry.
> 
> This however breaks down for new connections sent to router's external IP
> address. In such case, when packet is checked against conntrack table,
> there is no match, and its destination IP remains unchanged. This causes a
> loop in lr_in_ip_routing.
> 
> This commit installs a new logical flow in lr_in_ip_routing table for
> routers that have SNAT entry with logical_ip set to network (that being
> masquerade). This flow drops packets that, after going through conntrack
> via ct_snat action in lr_in_unsnat table, are not in established or
> related state (!ct.est && !ct.rel) and which destination IP still matches
> router's external IP. This prevents vswitchd from looping such packets
> until their TTL reaches zero, as well as installing bogus flows in
> datapath that lead to ovs module dropping such packages with "deferred
> action limit reached, drop recirc action" message.
> 
> Signed-off-by: Krzysztof Klimonda 

Thanks for contributing to OVN.

I see that you've submitted a patch that updates ovn-northd.c, but not
the DDlog implementation of the same logic in ovn_northd.dl.  We would
like to keep these two implementations in sync.  If you need help
figuring out how to write the DDlog code for this change, please let me
know.
___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


[ovs-dev] [PATCH ovn] northd: Fix routing loop in LRs with one-to-many SNAT

2021-03-22 Thread Krzysztof Klimonda
If there are snat entries on the router, and some logical_ip are set to
network instead of an IP address then given SNAT is masquerade. In such
case ct_snat action is used in lr_in_unsnat table to ensure that the
packet is matched against conntrack and destination IP is replaced with
one from matching conntrack entry.

This however breaks down for new connections sent to router's external IP
address. In such case, when packet is checked against conntrack table,
there is no match, and its destination IP remains unchanged. This causes a
loop in lr_in_ip_routing.

This commit installs a new logical flow in lr_in_ip_routing table for
routers that have SNAT entry with logical_ip set to network (that being
masquerade). This flow drops packets that, after going through conntrack
via ct_snat action in lr_in_unsnat table, are not in established or
related state (!ct.est && !ct.rel) and which destination IP still matches
router's external IP. This prevents vswitchd from looping such packets
until their TTL reaches zero, as well as installing bogus flows in
datapath that lead to ovs module dropping such packages with "deferred
action limit reached, drop recirc action" message.

Signed-off-by: Krzysztof Klimonda 
---
 northd/ovn-northd.c |  57 +++
 tests/ovn.at|  45 ++
 tests/system-ovn.at | 108 
 3 files changed, 210 insertions(+)

diff --git a/northd/ovn-northd.c b/northd/ovn-northd.c
index 4783e43d7..ea7db3d47 100644
--- a/northd/ovn-northd.c
+++ b/northd/ovn-northd.c
@@ -11304,6 +11304,7 @@ build_lrouter_nat_defrag_and_lb(struct ovn_datapath *od,
 ovs_be32 ip, mask;
 struct in6_addr ipv6, mask_v6, v6_exact = IN6ADDR_EXACT_INIT;
 bool is_v6 = false;
+bool is_masquerade = false;
 bool stateless = lrouter_nat_is_stateless(nat);
 struct nbrec_address_set *allowed_ext_ips =
   nat->allowed_ext_ips;
@@ -11343,9 +11344,15 @@ build_lrouter_nat_defrag_and_lb(struct ovn_datapath 
*od,
 if (is_v6) {
 error = ipv6_parse_masked(nat->logical_ip, , _v6);
 cidr_bits = ipv6_count_cidr_bits(_v6);
+if (cidr_bits < 128) {
+is_masquerade = true;
+}
 } else {
 error = ip_parse_masked(nat->logical_ip, , );
 cidr_bits = ip_count_cidr_bits(mask);
+if (cidr_bits < 32) {
+is_masquerade = 32;
+}
 }
 if (!strcmp(nat->type, "snat")) {
 if (error) {
@@ -11396,6 +11403,56 @@ build_lrouter_nat_defrag_and_lb(struct ovn_datapath 
*od,
 build_lrouter_in_dnat_flow(lflows, od, nat, match, actions, 
distributed,
mask, is_v6);
 
+/* When router have SNAT enabled, and logical_ip is a network (router
+ * is doing masquerade), then we need to make sure that packets
+ * unrelated to any established connection that still have router's
+ * external IP as a next hop after going through lr_in_unsnat table
+ * are dropped properly. Otherwise such packets will loop around
+ * between tables until their ttl reaches zero - this additionally
+ * causes kernel module to drop such packages due to recirculation
+ * limit being exceeded.
+ *
+ * Install a logical flow in lr_in_ip_routing table that will
+ * match packet with router's external IP that have no related
+ * conntrack entries and drop them. Flow priority must be higher
+ * than any other flow in lr_in_ip_routing that matches router's
+ * external IP.
+ *
+ * The priority for destination routes is calculated as
+ * (prefix length * 2) + 1, and there is an additional flow
+ * for when BFD is in play with priority + 1. Set priority that
+ * is higher than any other potential routing flow for that
+ * network, that is (prefix length * 2) + offset, where offset
+ * is 1 (dst) + 1 (bfd) + 1. */
+if (is_masquerade) {
+uint16_t priority, prefix_length, offset;
+
+if (is_v6) {
+prefix_length = 128;
+} else {
+prefix_length = 32;
+}
+offset = 3;
+priority = (prefix_length * 2) + offset;
+
+ds_clear(match);
+
+if (is_v6) {
+ds_put_format(match,
+  "ct.new && ip6.dst == %s",
+  nat->external_ip);
+} else {
+ds_put_format(match,
+  "ct.new && ip4.dst == %s",
+  nat->external_ip);
+}
+
+ovn_lflow_add_unique_with_hint(lflows, od,
+   S_ROUTER_IN_IP_ROUTING, priority,
+   ds_cstr(match),