Hello Dumitru,

sorry to bother you, did i understand right your comment regarding the third 
rules? about the need to take into account hairpin_snat_ip ?
On 04.04.2025 13:55, Dumitru Ceara wrote:

On 4/3/25 6:57 PM, Aleksandra Rukomoinikova wrote:


On 03.04.2025 11:58, Dumitru Ceara wrote:
/Hi, Dumitru/
Thank you for your time, I will respond to your comments below.



Hi Alexandra,



On 3/23/25 3:07 PM, Vladislav Odintsov wrote:


Hi Dumitru, Venu,



Hi Vladislav,



thanks for your time spent digging into this!

The code within current patch sends reply traffic to conntrack only if it sees 
that it needs to do that (LB is created within the datapath). If there is no 
LB, so traffic will not be sent to conntrack and HW-offload will probably work.
I’m worried why load balancer is created even if it doesn’t work in such 
scenario? Shouldn’t that be fixed from the orchestrator side (not to create LB)?



I guess I'm a bit confused by the statement above.  I don't see all
reply traffic being sent to conntrack.  Instead the patch only sends
traffic destined to the LB VIP to conntrack.

@@ -7752,8 +7806,7 @@ build_lb_rules_pre_stateful(struct lflow_table *lflows,
          }
          ds_put_cstr(action, "ct_lb_mark;");

-        ds_put_format(match, REGBIT_CONNTRACK_NAT" == 1 && %s.dst == %s",
-                      ip_match, lb_vip->vip_str);
+        ds_put_format(match, "%s.dst == %s", ip_match, lb_vip->vip_str);
          if (lb_vip->port_str) {
              ds_put_format(match, " && %s.dst == %s", lb->proto,
                            lb_vip->port_str);
@@ -7763,6 +7816,22 @@ build_lb_rules_pre_stateful(struct lflow_table *lflows,
              lflows, lb_dps->nb_ls_map, ods_size(ls_datapaths),
              S_SWITCH_IN_PRE_STATEFUL, 120, ds_cstr(match), ds_cstr(action),
              &lb->nlb->header_, lb_dps->lflow_ref);
+
+        /* Pass all VIP traffic to the conntrack to support
+           load balancers in the case of stateless acl. */
+        if (lb_vip->port_str && lb->proto) {
+            ds_clear(action);
+            ds_clear(match);
+
+            ds_put_cstr(action, "ct_lb_mark;");
+            ds_put_format(match, "%s && %s.dst == %s",
+                          lb->proto, ip_match, lb_vip->vip_str);
+
+            ovn_lflow_add_with_dp_group(
+                lflows, lb_dps->nb_ls_map, ods_size(ls_datapaths),
+                S_SWITCH_IN_PRE_STATEFUL, 115, ds_cstr(match), ds_cstr(action),
+                &lb->nlb->header_, lb_dps->lflow_ref);
+        }
      }
  }

Also, the code above generates logical flows in in_pre_stateful, at priority
115, that can never be matched.

E.g., "make sandbox" and then:
$ ovn-nbctl ls-add sw1
$ ovn-nbctl lb-add lb 1.1.1.1:80 2.2.2.2:80 tcp
$ ovn-nbctl ls-lb-add sw1 lb
$ ovn-nbctl acl-add sw1 from-lport 1002 'ip4 || ip6'  allow-stateless

Dump the in_pre_stateful relevant logical flows:

   table=6 (ls_in_pre_stateful ), priority=120  , match=(ip4.dst == 1.1.1.1 && 
tcp.dst == 80), action=(reg4 = 1.1.1.1; reg2[0..15] = 80; ct_lb_mark;)
   table=6 (ls_in_pre_stateful ), priority=115  , match=(reg0[2] == 1 && 
ip.is_frag), action=(reg0[19] = 1; ct_lb_mark;)
   table=6 (ls_in_pre_stateful ), priority=115  , match=(tcp && ip4.dst == 
1.1.1.1), action=(ct_lb_mark;)

The first flow match (prio 120) is a super set of the match of the third



I had missed the tcp.dst == 80 match here, it's not a super set.  But
still, please see my comment on LB hairpin below.



flow match (prio 115).  That means the third flow will never be hit.

This seems a bit weird.

Regards,
Dumitru



regards,
Vladislav Odintsov



On 23 Mar 2025, at 11:56, Dumitru Ceara via dev 
<ovs-dev@openvswitch.org><mailto:ovs-dev@openvswitch.org> wrote:

On 3/22/25 2:01 AM, Venu Iyer wrote:


Sorry for the much delayed response Dumitru; I am just catching up on this and 
will look at the patch and follow-ups.
Just a quick response below.



Hi Venu,



________________________________
From: Dumitru Ceara <dce...@redhat.com><mailto:dce...@redhat.com>
Sent: Wednesday, March 19, 2025 6:00 AM
To: Aleksandra Rukomoinikova 
<ARukomoinikova@k2.cloud><mailto:ARukomoinikova@k2.cloud>; 
d...@openvswitch.org<mailto:d...@openvswitch.org> 
<d...@openvswitch.org><mailto:d...@openvswitch.org>
Cc: Venu Iyer <venugop...@nvidia.com><mailto:venugop...@nvidia.com>; Han Zhou 
<hz...@ovn.org><mailto:hz...@ovn.org>; Numan Siddique 
<num...@ovn.org><mailto:num...@ovn.org>
Subject: Re: [PATCH ovn v2] northd: Make work load balancer with stateless ACL.

External email: Use caution opening links or attachments




On 3/18/25 5:08 PM, Aleksandra Rukomoinikova wrote:
Hi Dumitru,



Hi Alexandra,



I want to clarify a little my question that I asked yesterday at the meeting,
it was formulated incorrectly by me

in a situation when we have stateless acl and related one (or lb configured)

before patch [0] (that is, before version v23.03.0 judging by the log) there 
was a lookup in ingress
and in the egress of all packages in the conntrack

because the packages fell under:

table=4 (ls_in_pre_acl), priority=100, match=(ip), action=(reg0[0] = 1; next;)
table=0 (ls_out_pre_lb), priority=100, match=(ip), action=(reg0[2] = 1; next;)

and then the packages were not committed because:
table=8 (ls_in_acl), priority=2002, match=(stateless acl match), action=(next;)

That is, we were cheating traffic to the contract, but did not commit, an 
invalid state was returned.

That is, such a situation arose when using stateless acls, which are higher 
priority than statefull ones
or also when using lb.

A similar situation is now occurring in my patch, I bypass it this way:

Despite the enabled ct_match_inv option, we add it only if there is no 
stateless acl + lb bundle

What do you think?

[0] - 
https://github.com/ovn-org/ovn/commit/a0f82efdd9dfd3ef2d9606c1890e353df1097a51.



Thanks for the additional details!

I'm CC-ing Venu, Han and Numan for more input on this.  Venu, you added
this restriction that doesn't allow traffic matched by stateless ACLs to
be load balanced, what is your opinion with respect to Alexandra's
suggested change below?



[0] Removed support for using load balancers in conjunction with stateless ACL.
This commit removes the ability to use load balancers alongside stateless ACL.
If a load balancer is created, the datapath is no longer fully stateless.
Therefore, to avoid traffic being directed to the contract, it is recommended
to refrain from creating a load balancer entirely.



This was an issue with ovn-k8s where the logical switch has LB for the service 
IPs  ..
and this made configuring any stateless rules impossible - e.g if there  is a 
application
with unidirection UDP flows such flows will not be offloaded without stateless.





Commit [0] ensures the separation of stateful and stateless scenarios
in the absence of load balancers, without altering the functionality
of load balancers themselves.

When a logical switch is configured with stateless ACL and a load balancer,
the check for the `REGBIT_CONNTRACK_NAT` flag in the `pre_lb` stage of
the ingress pipeline becomes redundant. Traffic directed to the load balancer
must be processed through the conntrack.

To ensure proper load balancer operation, a rule must be added to match
the load balancer's VIP address and its protocol (if applicable). This rule
is added to the datapath group and does not negatively impact performance.
Packets matching this rule would still be directed to the contract via
lower-priority rules in the absence of stateless ACL. However, with stateless 
ACL,
this rule enables load balancing when the client balances traffic to itself.

In the egress pipeline, the stateless register should only be set if no
load balancers are present on the datapath. This maintains a clear separation
between Stateful and Stateless modes when using ACL.
If a user creates a load balancer on a logical switch, they should be aware
that the traffic will no longer be fully stateless.


I think statelesss and LB should not overlap and, hence, should be able to
coexist. Maybe Han can provide some input as well.



Ah, I see, so the problem was for unidirectional traffic that needs to
be offloaded (which doesn't happen if it's forwarded on
ct_state=+trk+new).

I guess Alexandra's change here doesn't suffice though:

@@ -7763,6 +7816,22 @@  build_lb_rules_pre_stateful(struct lflow_table *lflows,
             lflows, lb_dps->nb_ls_map, ods_size(ls_datapaths),
             S_SWITCH_IN_PRE_STATEFUL, 120, ds_cstr(match), ds_cstr(action),
             &lb->nlb->header_, lb_dps->lflow_ref);
+
+        /* Pass all VIP traffic to the conntrack to support
+           load balancers in the case of stateless acl. */
+        if (lb_vip->port_str && lb->proto) {
+            ds_clear(action);
+            ds_clear(match);
+
+            ds_put_cstr(action, "ct_lb_mark;");
+            ds_put_format(match, "%s && %s.dst == %s",
+                          lb->proto, ip_match, lb_vip->vip_str);
+
+            ovn_lflow_add_with_dp_group(
+                lflows, lb_dps->nb_ls_map, ods_size(ls_datapaths),
+                S_SWITCH_IN_PRE_STATEFUL, 115, ds_cstr(match), ds_cstr(action),
+                &lb->nlb->header_, lb_dps->lflow_ref);
+        }
     }

This will send to conntrack in the original direction only traffic that is
destined to the LB VIP.

But for replies we'll send everything to conntrack:

@@ -5894,7 +5894,11 @@  build_stateless_filter(const struct ovn_datapath *od,



you said that you don't see how all the replay traffic ends up in the conntrack:

this flow will be added to the egress pipeline only if there are no lb,
and it is responsible for sending packets to the conntrack.




I still don't understand, sorry.  Let's assume we have a TCP LB with
VIP=42.42.42.42:80 and backends=10.10.10.3:8080 configured on a switch;
and a client LSP (10.10.10.2) and a server LSP (10.10.10.3).

The client tries to connect to the LB VIP:

1. TCP SYN:
   ip_src=10.10.10.2, tcp_src=x, ip_dst=42.42.42.42, tcp_dst=80

2. TCP SYN is DNATed by the OVN LB:
   ip_src=10.10.10.2, tcp_src=x, ip_dst=10.10.10.3, tcp_dst=8080

3. The SYN reaches the server and the server replies with SYN+ACK:
   ip_src=10.10.10.3, tcp_src=8080, ip_dst=10.10.10.2, tcp_dst=x

This packet should be unDNATed.  How is this packet matched by the flow
you mentioned above?  That flow matches on ip.dst == lb.vip.



                                 action,
                                 &acl->header_,
                                 lflow_ref);
-    } else {
+    } else if (!od->nbs->n_load_balancer){
+        /* For cases when we have statefull ACLs but no load
+           balancer configured on logical switch - we should
+           completely bypass conntrack on egress, otherwise
+           it is necessary to check the balanced traffic. */
         ovn_lflow_add_with_hint(lflows, od, S_SWITCH_OUT_PRE_ACL,
                                 acl->priority + OVN_ACL_PRI_OFFSET,
                                 acl->match,

So, I'm afraid we are also going to need something that restricts traffic
that is sent to conntrack in the reply direction.  E.g., logical flows
that match on traffic originated by load balancer backends.

We need to be careful when we do that though.  We don't want to create
a logical flow explosion (there can be quite a few backends so a logical
flow per backend might be very costly for both ovn-northd and the SB DB).



regarding the added flows in ingress:

   table=6 (ls_in_pre_stateful ), priority=120  , match=(ip4.dst == 1.1.1.1 && 
tcp.dst == 80), action=(reg4 = 1.1.1.1; reg2[0..15] = 80; ct_lb_mark;)
   table=6 (ls_in_pre_stateful ), priority=115  , match=(reg0[2] == 1 && 
ip.is_frag), action=(reg0[19] = 1; ct_lb_mark;)
   table=6 (ls_in_pre_stateful ), priority=115  , match=(tcp && ip4.dst == 
1.1.1.1), action=(ct_lb_mark;)

priorities overlapped, i think, i should fix it like this ?

   table=6 (ls_in_pre_stateful ), priority=125  , match=(ip4.dst == 1.1.1.1 && 
tcp.dst == 80), action=(reg4 = 1.1.1.1; reg2[0..15] = 80; ct_lb_mark;)
   table=6 (ls_in_pre_stateful ), priority=120  , match=(reg0[2] == 1 && 
ip.is_frag), action=(reg0[19] = 1; ct_lb_mark;)
   table=6 (ls_in_pre_stateful ), priority=115  , match=(tcp && ip4.dst == 
1.1.1.1), action=(ct_lb_mark;)

the first rule will work for lb in general, the third rule is important
in balancing from the client to itself to establish a connection before
dnat occurs.



I'm not sure I follow.  For hairpin traffic, client -> LB -> client,
there's an additional SNAT happening after DNAT in the original
direction.  The SNAT IP is not necessarily the LB VIP, it's configurable
through LB.options.hairpin_snat_ip.  And we already support this kind of
connection so I'm not sure why we need that third flow.  We already have
LB hairpin system tests for the regular case.  Would it be possible to
add new ones for the case when we have (stateless) ACLs too?



Sorry again for not being on top of this.



No worries, thanks for the reply!

Regards,
Dumitru



-venu



Also in case of lb configured with stateless ACLs we no longer take into account
ct.inv packets in egress. They will be dropped further, at the hypervisor level.

[0] - ovn-org@a0f82ef.

Signed-off-by: Alexandra Rukomoinikova 
<arukomoinikova@k2.cloud><mailto:arukomoinikova@k2.cloud><mailto:arukomoinikova@k2.cloud><mailto:arukomoinikova@k2.cloud>
---
northd/northd.c         |  79 ++++++++++++++++++++++++++++--
northd/ovn-northd.8.xml |  13 ++++-
tests/ovn-northd.at     | 106 +++++++++++++++++++++++++++++++++++-----
tests/ovn.at            |   8 +--
tests/system-ovn.at     |  61 ++++++++++++++++++++---
5 files changed, 238 insertions(+), 29 deletions(-)


The "237: system-ovn-kmod.at:1031 Load Balancer LS hairpin IPv4 UDP -
larger than MTU" test is failing with this patch applied, see CI run here:
https://github.com/ovsrobot/ovn/actions/runs/13928303850/job/38978852134#step:9:4712

Something is weird because the test in question doesn't use stateless
ACLs at all.  So there must be a bug that's introduced by your patch.

I did a cursory review of the rest of the patch and shared some more
comments below.

However, I'd like to hear from Venu/Han/Numan first, if possible, before
moving towards a v3.



diff --git a/northd/northd.c b/northd/northd.c
index 1d3e132d4..fab420784 100644
--- a/northd/northd.c
+++ b/northd/northd.c
@@ -5894,7 +5894,11 @@ build_stateless_filter(const struct ovn_datapath *od,
                                 action,
                                 &acl->header_,
                                 lflow_ref);
-    } else {
+    } else if (!od->nbs->n_load_balancer){
+        /* For cases when we have statefull ACLs but no load
+           balancer configured on logical switch - we should
+           completely bypass conntrack on egress, otherwise
+           it is necessary to check the balanced traffic. */
         ovn_lflow_add_with_hint(lflows, od, S_SWITCH_OUT_PRE_ACL,
                                 acl->priority + OVN_ACL_PRI_OFFSET,
                                 acl->match,
@@ -7355,6 +7359,49 @@ choose_max_acl_tier(const struct ls_stateful_record 
*ls_stateful_rec,
     }
}

+/* In the case of stateless ACLs and load balancers, all traffic
+ * is passed to conntrack during egress pipeline. Conntrack will
+ * mark the packets as 'invalid,' but since stateless systems do
+ * not rely on conntrack state, these invalid packets will be
+ * discarded during processing on the hypervisor level.
+ */
+static bool
+stateless_inv_match(const struct ls_stateful_record *ls_stateful_rec,
+                    const struct ovn_datapath *od,
+                    const struct ls_port_group_table *ls_port_groups)
+{
+    bool inv_match = true;


There's no need for this variable, we always return after setting its value.



+
+    if (ls_stateful_rec->has_lb_vip) {
+        for (size_t i = 0; i < od->nbs->n_acls; i++) {
+            const struct nbrec_acl *acl = od->nbs->acls[i];
+            if (!strcmp(acl->action, "allow-stateless")) {
+                inv_match = false;
+                return inv_match;


Just return false here.

However, a cleaner and slightly more efficient way of doing this would
be to add a "has_stateless_acl" field to "struct ls_stateful_record" and
populate it in ls_stateful_record_set_acl_flags().



+            }
+        }
+
+        const struct ls_port_group *ls_pg =
+            ls_port_group_table_find(ls_port_groups, od->nbs);
+        if (!ls_pg) {
+            return inv_match;
+        }
+
+        const struct ls_port_group_record *ls_pg_rec;
+            HMAP_FOR_EACH (ls_pg_rec, key_node, &ls_pg->nb_pgs) {
+            for (size_t i = 0; i < ls_pg_rec->nb_pg->n_acls; i++) {
+                const struct nbrec_acl *acl = ls_pg_rec->nb_pg->acls[i];
+                if (!strcmp(acl->action, "allow-stateless")) {
+                    inv_match = false;
+                    return inv_match;
+                }
+            }
+        }
+    }
+
+    return inv_match;
+}
+
static void
build_acls(const struct ls_stateful_record *ls_stateful_rec,
            const struct ovn_datapath *od,
@@ -7457,8 +7504,15 @@ build_acls(const struct ls_stateful_record 
*ls_stateful_rec,
          *
          * This is enforced at a higher priority than ACLs can be defined. */
         ds_clear(&match);
-        ds_put_format(&match, "%s(ct.est && ct.rpl && ct_mark.blocked == 1)",
-                      use_ct_inv_match ? "ct.inv || " : "");
+
+        if (use_ct_inv_match && stateless_inv_match(ls_stateful_rec,
+                                                    od, ls_port_groups)) {
+            ds_put_cstr(&match, "ct.inv || (ct.est && ct.rpl && "
+                                "ct_mark.blocked == 1)");
+        } else {
+            ds_put_cstr(&match, "ct.est && ct.rpl && ct_mark.blocked == 1");
+        }


If we go this way we should probably also update the "use_ct_inv_match"
documentation in ovn-nb.xml.

This is also a very noticeable behavior change, we should at least
mention it in NEWS if we accept this patch.



+
         ovn_lflow_add(lflows, od, S_SWITCH_IN_ACL_EVAL, UINT16_MAX - 3,
                       ds_cstr(&match), REGBIT_ACL_VERDICT_DROP " = 1; next;",
                       lflow_ref);
@@ -7752,8 +7806,7 @@ build_lb_rules_pre_stateful(struct lflow_table *lflows,
         }
         ds_put_cstr(action, "ct_lb_mark;");

-        ds_put_format(match, REGBIT_CONNTRACK_NAT" == 1 && %s.dst == %s",
-                      ip_match, lb_vip->vip_str);
+        ds_put_format(match, "%s.dst == %s", ip_match, lb_vip->vip_str);
         if (lb_vip->port_str) {
             ds_put_format(match, " && %s.dst == %s", lb->proto,
                           lb_vip->port_str);
@@ -7763,6 +7816,22 @@ build_lb_rules_pre_stateful(struct lflow_table *lflows,
             lflows, lb_dps->nb_ls_map, ods_size(ls_datapaths),
             S_SWITCH_IN_PRE_STATEFUL, 120, ds_cstr(match), ds_cstr(action),
             &lb->nlb->header_, lb_dps->lflow_ref);
+
+        /* Pass all VIP traffic to the conntrack to support
+           load balancers in the case of stateless acl. */
+        if (lb_vip->port_str && lb->proto) {


lb->proto is always non-NULL, it can be omitted from the condition.

What about load balancers that have a port specified?



+            ds_clear(action);
+            ds_clear(match);
+
+            ds_put_cstr(action, "ct_lb_mark;");
+            ds_put_format(match, "%s && %s.dst == %s",
+                          lb->proto, ip_match, lb_vip->vip_str);


Why not restrict this flow to only match traffic destined to the LB
port, if the LB has a port configured?



+
+            ovn_lflow_add_with_dp_group(
+                lflows, lb_dps->nb_ls_map, ods_size(ls_datapaths),
+                S_SWITCH_IN_PRE_STATEFUL, 115, ds_cstr(match), ds_cstr(action),
+                &lb->nlb->header_, lb_dps->lflow_ref);
+        }
     }
}

diff --git a/northd/ovn-northd.8.xml b/northd/ovn-northd.8.xml
index 155ba8a49..2719ac77b 100644
--- a/northd/ovn-northd.8.xml
+++ b/northd/ovn-northd.8.xml
@@ -507,7 +507,7 @@
       <ref table="Logical_Switch_Port" db="OVN_Northbound"/>.  Multicast, IPv6
       Neighbor Discovery and MLD traffic also skips stateful ACLs. For
       "allow-stateless" ACLs, a flow is added to bypass setting the hint for
-      connection tracker processing when there are stateful ACLs or LB rules;
+      connection tracker processing when there are stateful ACLs without LB;
       <code>REGBIT_ACL_STATELESS</code> is set for traffic matching stateless
       ACL flows.
     </p>
@@ -624,6 +624,14 @@
          <code>ct_lb_mark;</code> action.
       </li>

+      <li>
+         A priority-115 flow sends all packet directed to VIP to connection
+         tracker. Packets that match this rule would still be subject to
+         connection tracking via lower-priority rules in the absence of
+         stateless ACLs. However, with stateless ACLs in place, this rule
+         enables load balancing when the client balances traffic to itself.
+      </li>
+
       <li>
          A priority-100 flow sends the packets to connection tracker based
          on a hint provided by the previous tables
@@ -770,7 +778,8 @@
       </li>
       <li>
         <code>allow-stateless</code> ACLs translate into logical flows that set
-        the allow bit and advance to the next table.
+        the allow bit and advance to the next table. In case of load balancers
+        with stateless ACLs in this table we allow ct.inv packets go further.
       </li>
       <li>
         <code>reject</code> ACLs translate into logical flows with that set the
diff --git a/tests/ovn-northd.at b/tests/ovn-northd.at
index 419130aa8..c6540f90d 100644
--- a/tests/ovn-northd.at
+++ b/tests/ovn-northd.at
@@ -3747,10 +3747,18 @@ for direction in from to; do
done
check ovn-nbctl --wait=sb sync

-# TCP packets should not go to conntrack for load balancing.
+# TCP packets should go to conntrack for load balancing.
flow="inport == \"lsp1\" && ${flow_eth} && ${flow_ip} && ${flow_tcp}"
AT_CHECK_UNQUOTED([ovn_trace --ct new --ct new --minimal ls "${flow}"], [0], 
[dnl
-output("lsp2");
+ct_lb_mark {
+    ct_lb_mark {
+        reg0[[6]] = 0;
+        reg0[[12]] = 0;
+        ct_lb_mark /* default (use --ct to customize) */ {
+            output("lsp2");
+        };
+    };
+};
])

# UDP packets still go to conntrack.
@@ -3883,10 +3891,18 @@ for direction in from to; do
done
check ovn-nbctl --wait=sb sync

-# TCP packets should not go to conntrack for load balancing.
+# TCP packets should go to conntrack for load balancing.
flow="inport == \"lsp1\" && ${flow_eth} && ${flow_ip} && ${flow_tcp}"
AT_CHECK_UNQUOTED([ovn_trace --ct new --ct new --minimal ls "${flow}"], [0], 
[dnl
-output("lsp2");
+ct_lb_mark {
+    ct_lb_mark {
+        reg0[[6]] = 0;
+        reg0[[12]] = 0;
+        ct_lb_mark /* default (use --ct to customize) */ {
+            output("lsp2");
+        };
+    };
+};
])

# UDP packets still go to conntrack.
@@ -4770,8 +4786,10 @@ check_stateful_flows() {
   table=??(ls_in_pre_stateful ), priority=100  , match=(reg0[[0]] == 1), 
action=(ct_next;)
   table=??(ls_in_pre_stateful ), priority=110  , match=(reg0[[2]] == 1), 
action=(ct_lb_mark;)
   table=??(ls_in_pre_stateful ), priority=115  , match=(reg0[[2]] == 1 && 
ip.is_frag), action=(reg0[[19]] = 1; ct_lb_mark;)
-  table=??(ls_in_pre_stateful ), priority=120  , match=(reg0[[2]] == 1 && 
ip4.dst == 10.0.0.10 && tcp.dst == 80), action=(reg4 = 10.0.0.10; reg2[[0..15]] 
= 80; ct_lb_mark;)
-  table=??(ls_in_pre_stateful ), priority=120  , match=(reg0[[2]] == 1 && 
ip4.dst == 10.0.0.20 && tcp.dst == 80), action=(reg4 = 10.0.0.20; reg2[[0..15]] 
= 80; ct_lb_mark;)
+  table=??(ls_in_pre_stateful ), priority=115  , match=(tcp && ip4.dst == 
10.0.0.10), action=(ct_lb_mark;)
+  table=??(ls_in_pre_stateful ), priority=115  , match=(tcp && ip4.dst == 
10.0.0.20), action=(ct_lb_mark;)
+  table=??(ls_in_pre_stateful ), priority=120  , match=(ip4.dst == 10.0.0.10 
&& tcp.dst == 80), action=(reg4 = 10.0.0.10; reg2[[0..15]] = 80; ct_lb_mark;)
+  table=??(ls_in_pre_stateful ), priority=120  , match=(ip4.dst == 10.0.0.20 
&& tcp.dst == 80), action=(reg4 = 10.0.0.20; reg2[[0..15]] = 80; ct_lb_mark;)
])

     AT_CHECK([grep "ls_in_lb " sw0flows | ovn_strip_lflows], [0], [dnl
@@ -5065,16 +5083,16 @@ AT_CAPTURE_FILE([sw0flows])

AT_CHECK([grep -w "ls_in_acl_eval" sw0flows | grep 6553 | ovn_strip_lflows], 
[0], [dnl
   table=??(ls_in_acl_eval     ), priority=65532, match=(!ct.est && ct.rel && 
!ct.new && ct_mark.blocked == 0), action=(reg0[[17]] = 1; reg8[[16]] = 1; 
ct_commit_nat;)
-  table=??(ls_in_acl_eval     ), priority=65532, match=((ct.est && ct.rpl && 
ct_mark.blocked == 1)), action=(reg8[[17]] = 1; next;)
   table=??(ls_in_acl_eval     ), priority=65532, match=(ct.est && !ct.rel && 
!ct.new && ct.rpl && ct_mark.blocked == 0), action=(reg0[[9]] = 0; reg0[[10]] = 
0; reg0[[17]] = 1; reg8[[16]] = 1; next;)
+  table=??(ls_in_acl_eval     ), priority=65532, match=(ct.est && ct.rpl && 
ct_mark.blocked == 1), action=(reg8[[17]] = 1; next;)
   table=??(ls_in_acl_eval     ), priority=65532, match=(ct.est && 
ct_mark.allow_established == 1), action=(reg0[[21]] = 1; reg8[[16]] = 1; next;)
   table=??(ls_in_acl_eval     ), priority=65532, match=(nd || nd_ra || nd_rs 
|| mldv1 || mldv2), action=(reg8[[16]] = 1; next;)
])

AT_CHECK([grep -w "ls_out_acl_eval" sw0flows | grep 6553 | ovn_strip_lflows], 
[0], [dnl
   table=??(ls_out_acl_eval    ), priority=65532, match=(!ct.est && ct.rel && 
!ct.new && ct_mark.blocked == 0), action=(reg8[[16]] = 1; ct_commit_nat;)
-  table=??(ls_out_acl_eval    ), priority=65532, match=((ct.est && ct.rpl && 
ct_mark.blocked == 1)), action=(reg8[[17]] = 1; next;)
   table=??(ls_out_acl_eval    ), priority=65532, match=(ct.est && !ct.rel && 
!ct.new && ct.rpl && ct_mark.blocked == 0), action=(reg8[[16]] = 1; next;)
+  table=??(ls_out_acl_eval    ), priority=65532, match=(ct.est && ct.rpl && 
ct_mark.blocked == 1), action=(reg8[[17]] = 1; next;)
   table=??(ls_out_acl_eval    ), priority=65532, match=(ct.est && 
ct_mark.allow_established == 1), action=(reg8[[16]] = 1; next;)
   table=??(ls_out_acl_eval    ), priority=65532, match=(nd || nd_ra || nd_rs 
|| mldv1 || mldv2), action=(reg8[[16]] = 1; next;)
])
@@ -14333,7 +14351,7 @@ ovn-sbctl dump-flows s1 > s1flows
AT_CAPTURE_FILE([s1flows])

AT_CHECK([grep "ls_in_pre_stateful" s1flows | ovn_strip_lflows | grep 
"30.0.0.1"], [0], [dnl
-  table=??(ls_in_pre_stateful ), priority=120  , match=(reg0[[2]] == 1 && 
ip4.dst == 30.0.0.1), action=(reg4 = 30.0.0.1; ct_lb_mark;)
+  table=??(ls_in_pre_stateful ), priority=120  , match=(ip4.dst == 30.0.0.1), 
action=(reg4 = 30.0.0.1; ct_lb_mark;)
])
AT_CHECK([grep "ls_in_lb" s1flows | ovn_strip_lflows | grep "30.0.0.1"], [0], 
[dnl
   table=??(ls_in_lb           ), priority=110  , match=(ct.new && ip4.dst == 
30.0.0.1), action=(reg4 = 30.0.0.1; 
ct_lb_mark(backends=172.16.0.103,172.16.0.102,172.16.0.101);)
@@ -14447,7 +14465,7 @@ ovn-sbctl dump-flows s1 > s1flows
AT_CAPTURE_FILE([s1flows])

AT_CHECK([grep "ls_in_pre_stateful" s1flows | ovn_strip_lflows | grep 
"2001:db8:cccc::1"], [0], [dnl
-  table=??(ls_in_pre_stateful ), priority=120  , match=(reg0[[2]] == 1 && 
ip6.dst == 2001:db8:cccc::1), action=(xxreg1 = 2001:db8:cccc::1; ct_lb_mark;)
+  table=??(ls_in_pre_stateful ), priority=120  , match=(ip6.dst == 
2001:db8:cccc::1), action=(xxreg1 = 2001:db8:cccc::1; ct_lb_mark;)
])
AT_CHECK([grep "ls_in_lb" s1flows | ovn_strip_lflows | grep 
"2001:db8:cccc::1"], [0], [dnl
   table=??(ls_in_lb           ), priority=110  , match=(ct.new && ip6.dst == 
2001:db8:cccc::1), action=(xxreg1 = 2001:db8:cccc::1; 
ct_lb_mark(backends=2001:db8:aaaa:3::103,2001:db8:aaaa:3::102,2001:db8:aaaa:3::101);)
@@ -14558,7 +14576,7 @@ ovn-sbctl dump-flows s1 > s1flows
AT_CAPTURE_FILE([s1flows])

AT_CHECK([grep "ls_in_pre_stateful" s1flows | ovn_strip_lflows | grep 
"30.0.0.1"], [0], [dnl
-  table=??(ls_in_pre_stateful ), priority=120  , match=(reg0[[2]] == 1 && 
ip4.dst == 30.0.0.1), action=(reg4 = 30.0.0.1; ct_lb_mark;)
+  table=??(ls_in_pre_stateful ), priority=120  , match=(ip4.dst == 30.0.0.1), 
action=(reg4 = 30.0.0.1; ct_lb_mark;)
])
AT_CHECK([grep "ls_in_lb" s1flows | ovn_strip_lflows | grep "30.0.0.1"], [0], 
[dnl
   table=??(ls_in_lb           ), priority=110  , match=(ct.new && ip4.dst == 
30.0.0.1), action=(reg4 = 30.0.0.1; 
ct_lb_mark(backends=172.16.0.103,172.16.0.102,172.16.0.101);)
@@ -16753,3 +16771,69 @@ check grep -q "Bad configuration: The peer of the 
switch port 'ls-lrp1' (LRP pee

AT_CLEANUP
])
+
+OVN_FOR_EACH_NORTHD_NO_HV([
+AT_SETUP([Load Balancers with stateless ACLs])
+ovn_start ovn-northd
+
+AS_BOX([create logical switches and ports])
+check ovn-nbctl ls-add sw0
+check ovn-nbctl --wait=sb lsp-add sw0 sw0-p1 -- lsp-set-addresses sw0-p1 \
+"00:00:00:00:00:02 10.0.0.2"
+
+check ovn-nbctl --wait=sb lsp-add sw0 sw0-p2 -- lsp-set-addresses sw0-p2 \
+"00:00:00:00:00:03 10.0.0.3"
+
+AS_BOX([create stateless ACLs])
+check ovn-nbctl acl-add sw0 from-lport 1001 "ip" allow-stateless
+check ovn-nbctl acl-add sw0 to-lport 1001 "ip" allow-stateless
+
+AS_BOX([create statefull ACLs])
+# check if allow-stateless acls have higher priority we skip conntrack.
+check ovn-nbctl acl-add sw0 from-lport 1000 "ip" allow-related
+check ovn-nbctl acl-add sw0 to-lport 1000 "ip" allow-related
+
+ovn-sbctl dump-flows sw0 > sw0flows
+
+AT_CHECK(
+  [grep -E 'ls_(in|out)_pre_acl' sw0flows | grep reg0 | ovn_strip_lflows], 
[0], [dnl
+  table=??(ls_in_pre_acl      ), priority=100  , match=(ip), action=(reg0[[0]] 
= 1; next;)
+  table=??(ls_in_pre_acl      ), priority=2001 , match=(ip), 
action=(reg0[[16]] = 1; next;)
+  table=??(ls_out_pre_acl     ), priority=100  , match=(ip), action=(reg0[[0]] 
= 1; next;)
+  table=??(ls_out_pre_acl     ), priority=2001 , match=(ip), 
action=(reg0[[16]] = 1; next;)
+])
+
+AT_CHECK(
+  [grep -E 'ls_out_acl_eval' sw0flows | grep 65532 | ovn_strip_lflows], [0], 
[dnl
+  table=??(ls_out_acl_eval    ), priority=65532, match=(!ct.est && ct.rel && 
!ct.new && !ct.inv && ct_mark.blocked == 0), action=(reg8[[16]] = 1; 
ct_commit_nat;)
+  table=??(ls_out_acl_eval    ), priority=65532, match=(ct.est && !ct.rel && 
!ct.new && !ct.inv && ct.rpl && ct_mark.blocked == 0), action=(reg8[[16]] = 1; 
next;)
+  table=??(ls_out_acl_eval    ), priority=65532, match=(ct.est && 
ct_mark.allow_established == 1), action=(reg8[[16]] = 1; next;)
+  table=??(ls_out_acl_eval    ), priority=65532, match=(ct.inv || (ct.est && 
ct.rpl && ct_mark.blocked == 1)), action=(reg8[[17]] = 1; next;)
+  table=??(ls_out_acl_eval    ), priority=65532, match=(nd || nd_ra || nd_rs 
|| mldv1 || mldv2), action=(reg8[[16]] = 1; next;)
+])
+
+AS_BOX([create Load Balancer])
+check ovn-nbctl lb-add lb1 10.0.0.4:80 10.0.0.2:80,10.0.0.3:80
+check ovn-nbctl --wait=sb ls-lb-add sw0 lb1
+
+ovn-sbctl dump-flows sw0 > sw0flows
+
+AT_CHECK(
+  [grep -E 'ls_(in|out)_pre_acl' sw0flows | grep reg0 | ovn_strip_lflows], 
[0], [dnl
+  table=??(ls_in_pre_acl      ), priority=100  , match=(ip), action=(reg0[[0]] 
= 1; next;)
+  table=??(ls_in_pre_acl      ), priority=2001 , match=(ip), 
action=(reg0[[16]] = 1; next;)
+  table=??(ls_out_pre_acl     ), priority=100  , match=(ip), action=(reg0[[0]] 
= 1; next;)
+])
+
+# we do not match conntrack invalide packets in case of load balancers with 
stateless ACLs
+AT_CHECK(
+  [grep -E 'ls_out_acl_eval' sw0flows | grep 65532 | ovn_strip_lflows], [0], 
[dnl
+  table=??(ls_out_acl_eval    ), priority=65532, match=(!ct.est && ct.rel && 
!ct.new && !ct.inv && ct_mark.blocked == 0), action=(reg8[[16]] = 1; 
ct_commit_nat;)
+  table=??(ls_out_acl_eval    ), priority=65532, match=(ct.est && !ct.rel && 
!ct.new && !ct.inv && ct.rpl && ct_mark.blocked == 0), action=(reg8[[16]] = 1; 
next;)
+  table=??(ls_out_acl_eval    ), priority=65532, match=(ct.est && ct.rpl && 
ct_mark.blocked == 1), action=(reg8[[17]] = 1; next;)
+  table=??(ls_out_acl_eval    ), priority=65532, match=(ct.est && 
ct_mark.allow_established == 1), action=(reg8[[16]] = 1; next;)
+  table=??(ls_out_acl_eval    ), priority=65532, match=(nd || nd_ra || nd_rs 
|| mldv1 || mldv2), action=(reg8[[16]] = 1; next;)
+])
+AT_CLEANUP
+])
+
diff --git a/tests/ovn.at b/tests/ovn.at
index afde2576f..3e299711b 100644
--- a/tests/ovn.at
+++ b/tests/ovn.at
@@ -26345,7 +26345,7 @@ OVS_WAIT_FOR_OUTPUT(
   [ovn-sbctl dump-flows > sbflows
    ovn-sbctl dump-flows sw0 | grep ct_lb_mark | grep priority=120 | sed 
's/table=..//'], 0,
   [dnl
-  (ls_in_pre_stateful ), priority=120  , match=(reg0[[2]] == 1 && ip4.dst == 
10.0.0.10 && tcp.dst == 80), action=(reg4 = 10.0.0.10; reg2[[0..15]] = 80; 
ct_lb_mark;)
+  (ls_in_pre_stateful ), priority=120  , match=(ip4.dst == 10.0.0.10 && 
tcp.dst == 80), action=(reg4 = 10.0.0.10; reg2[[0..15]] = 80; ct_lb_mark;)
   (ls_in_lb           ), priority=120  , match=(ct.new && ip4.dst == 10.0.0.10 
&& tcp.dst == 80), action=(reg4 = 10.0.0.10; reg2[[0..15]] = 80; 
ct_lb_mark(backends=10.0.0.3:80,20.0.0.3:80; 
hash_fields="ip_dst,ip_src,tcp_dst,tcp_src");)
])

@@ -26390,7 +26390,7 @@ AT_CHECK(
   [grep "ip4.dst == 10.0.0.10 && tcp.dst == 80" sbflows3 | grep priority=120 |\
    ovn_strip_lflows], [0], [dnl
   table=??(ls_in_lb           ), priority=120  , match=(ct.new && ip4.dst == 
10.0.0.10 && tcp.dst == 80), action=(drop;)
-  table=??(ls_in_pre_stateful ), priority=120  , match=(reg0[[2]] == 1 && 
ip4.dst == 10.0.0.10 && tcp.dst == 80), action=(reg4 = 10.0.0.10; reg2[[0..15]] 
= 80; ct_lb_mark;)
+  table=??(ls_in_pre_stateful ), priority=120  , match=(ip4.dst == 10.0.0.10 
&& tcp.dst == 80), action=(reg4 = 10.0.0.10; reg2[[0..15]] = 80; ct_lb_mark;)
])

AT_CAPTURE_FILE([sbflows4])
@@ -26551,7 +26551,7 @@ OVS_WAIT_FOR_OUTPUT(
   [ovn-sbctl dump-flows > sbflows
    ovn-sbctl dump-flows sw0 | grep ct_lb_mark | grep priority=120 | sed 
's/table=..//'], 0,
   [dnl
-  (ls_in_pre_stateful ), priority=120  , match=(reg0[[2]] == 1 && ip6.dst == 
2002::a && tcp.dst == 80), action=(xxreg1 = 2002::a; reg2[[0..15]] = 80; 
ct_lb_mark;)
+  (ls_in_pre_stateful ), priority=120  , match=(ip6.dst == 2002::a && tcp.dst 
== 80), action=(xxreg1 = 2002::a; reg2[[0..15]] = 80; ct_lb_mark;)
   (ls_in_lb           ), priority=120  , match=(ct.new && ip6.dst == 2002::a 
&& tcp.dst == 80), action=(xxreg1 = 2002::a; reg2[[0..15]] = 80; 
ct_lb_mark(backends=[[2001::3]]:80,[[2002::3]]:80; 
hash_fields="ipv6_dst,ipv6_src,tcp_dst,tcp_src");)
])

@@ -26595,7 +26595,7 @@ AT_CHECK(
   [grep "ip6.dst == 2002::a && tcp.dst == 80" sbflows3 | grep priority=120 |\
    ovn_strip_lflows], [0], [dnl
   table=??(ls_in_lb           ), priority=120  , match=(ct.new && ip6.dst == 
2002::a && tcp.dst == 80), action=(drop;)
-  table=??(ls_in_pre_stateful ), priority=120  , match=(reg0[[2]] == 1 && 
ip6.dst == 2002::a && tcp.dst == 80), action=(xxreg1 = 2002::a; reg2[[0..15]] = 
80; ct_lb_mark;)
+  table=??(ls_in_pre_stateful ), priority=120  , match=(ip6.dst == 2002::a && 
tcp.dst == 80), action=(xxreg1 = 2002::a; reg2[[0..15]] = 80; ct_lb_mark;)
])

AT_CAPTURE_FILE([sbflows4])
diff --git a/tests/system-ovn.at b/tests/system-ovn.at
index dd13575e4..658e3eb26 100644
--- a/tests/system-ovn.at
+++ b/tests/system-ovn.at
@@ -10087,6 +10087,28 @@ sed -e 's/zone=[[0-9]]*/zone=<cleared>/'], [0], [dnl

AT_CHECK([ovs-appctl dpctl/flush-conntrack])

+# del acls
+check ovn-nbctl acl-del foo
+
+# add statefull acl


Comments should be sentences: start with a capital letter and end with a
period.  This applies to most comments added by this patch.

Also, I might be wrong but I think the correct version is "stateful"
instead of "statefull".  This remark also applies to multiple places in
this patch.



+check ovn-nbctl acl-add foo from-lport 1 1 allow-related
+check ovn-nbctl --wait=hv acl-add foo to-lport 1 1 allow-related
+
+# add stateless acl
+check ovn-nbctl acl-add foo from-lport 2 1 allow-stateless
+check ovn-nbctl --wait=hv acl-add foo to-lport 2 1 allow-stateless
+
+AT_CHECK([ip netns exec foo1 wget   192.168.2.2 -t 1 -T 1], [0], [ignore], 
[ignore])
+
+# check conntrack zone has no entry
+AT_CHECK([ovs-appctl dpctl/dump-conntrack zone=$zone_id | \
+FORMAT_CT(192.168.2.2) | \
+sed -e 's/zone=[[0-9]]*/zone=<cleared>/'], [0], [dnl
+])
+
+AT_CHECK([ovs-appctl dpctl/flush-conntrack])
+
+# when lb configued: pass conntrack for vip traffic.


Typo: configued



# add lb back
check ovn-nbctl ls-lb-add foo lb1

@@ -10096,13 +10118,14 @@ check ovn-nbctl --wait=hv sync
OVS_WAIT_UNTIL([ovs-ofctl -O OpenFlow13 dump-groups br-int | \
grep 'nat(dst=192.168.2.2:80)'])

-# should not dnat so will not be able to connect
-AT_CHECK([ip netns exec foo1 curl 30.30.30.30 --retry 3 --max-time 1], [28], 
[ignore], [ignore])
+# now check with VIP
+AT_CHECK([ip netns exec foo1 wget   30.30.30.30  -t 3 -T 1], [0], [ignore], 
[ignore])

-# check conntrack zone has no tcp entry
+# check conntrack zone has tcp entry
AT_CHECK([ovs-appctl dpctl/dump-conntrack zone=$zone_id | \
FORMAT_CT(30.30.30.30) | \
sed -e 's/zone=[[0-9]]*/zone=<cleared>/'], [0], [dnl
+tcp,orig=(src=192.168.1.2,dst=30.30.30.30,sport=<cleared>,dport=<cleared>),reply=(src=192.168.2.2,dst=192.168.1.2,sport=<cleared>,dport=<cleared>),zone=<cleared>,mark=2,protoinfo=(state=<cleared>)
])

AT_CHECK([ovs-appctl dpctl/flush-conntrack])
@@ -10217,6 +10240,29 @@ AT_CHECK([ovs-appctl dpctl/flush-conntrack])
# remove lb
check ovn-nbctl ls-lb-del foo lb1

+# del acls
+check ovn-nbctl acl-del foo
+
+# add statefull acl
+check ovn-nbctl acl-add foo from-lport 1 1 allow-related
+check ovn-nbctl --wait=hv acl-add foo to-lport 1 1 allow-related
+
+# add stateless acl
+check ovn-nbctl acl-add foo from-lport 2 1 allow-stateless
+check ovn-nbctl --wait=hv acl-add foo to-lport 2 1 allow-stateless
+
+AT_CHECK([ip netns exec foo1  wget http://[[fd12::2]] -t 1 -T 1], [0], 
[ignore], [ignore])
+
+# check conntrack zone has no tcp entry
+AT_CHECK([ovs-appctl dpctl/dump-conntrack zone=$zone_id | \
+FORMAT_CT(fd12::2) |  grep -v fe80 | \
+sed -e 's/zone=[[0-9]]*/zone=<cleared>/'], [0], [dnl
+])
+
+AT_CHECK([ovs-appctl dpctl/flush-conntrack])
+
+check ovn-nbctl acl-del foo
+
# add stateless acl
check ovn-nbctl acl-add foo from-lport 1 1 allow-stateless
check ovn-nbctl --wait=hv acl-add foo to-lport 1 1 allow-stateless
@@ -10240,13 +10286,14 @@ check ovn-nbctl --wait=hv sync
OVS_WAIT_UNTIL([ovs-ofctl -O OpenFlow13 dump-groups br-int | \
grep 'nat(dst=\[[fd12::2\]]:80)'])

-# should not dnat so will not be able to connect
-AT_CHECK([ip netns exec foo1 curl http://[[fd30::2]] --retry 3 --max-time 1], 
[28], [ignore], [ignore])
-#
-# check conntrack zone has no tcp entry
+# now check with VIP
+AT_CHECK([ip netns exec foo1 wget  http://[[fd30::2]]  -t 3 -T 1], [0], 
[ignore], [ignore])
+
+# check conntrack zone has tcp entry
AT_CHECK([ovs-appctl dpctl/dump-conntrack zone=$zone_id | \
FORMAT_CT(fd30::2) | grep -v fe80 | \
sed -e 's/zone=[[0-9]]*/zone=<cleared>/'], [0], [dnl
+tcp,orig=(src=fd11::2,dst=fd30::2,sport=<cleared>,dport=<cleared>),reply=(src=fd12::2,dst=fd11::2,sport=<cleared>,dport=<cleared>),zone=<cleared>,mark=2,protoinfo=(state=<cleared>)
])

AT_CHECK([ovs-appctl dpctl/flush-conntrack])



--
regards,
Alexandra.



Regards,
Dumitru




--
regards,
Alexandra.
_______________________________________________
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev

Reply via email to