On Mon, Aug 17, 2020 at 8:57 PM Dumitru Ceara <[email protected]> wrote:
>
> On 8/17/20 4:32 PM, Numan Siddique wrote:
> > On Wed, Aug 12, 2020 at 5:52 PM Dumitru Ceara <[email protected]> wrote:
> >>
> >> A new configuration option is added to Logical_Switch:
> >> other_config:acl-stateful-bypass. This optional value determines which
> >> traffic should completely bypass connection tracking when ACLs are
> >> processed.
> >>
> >> In specific scenarios CMSs can predetermine which traffic must be
> >> firewalled statefully or not, e.g., UDP vs TCP. However, until now, if
> >> at least one stateful ACL (allow-related) is configured on the switch,
> >> all traffic gets sent to connection tracking. This induces a hit in
> >> performance when forwarding packets that don't need stateful processing.
> >>
> >> Signed-off-by: Dumitru Ceara <[email protected]>
> >
> > Hi Dumitru,
> >
> > Thanks for the patch. I have few comments.
> >
>
> Hi Numan,
>
> Thanks for the review.
>
> > 1. From what I understand, the packet now is not sent to the conntrack
> > in the ingress pipeline, but it
> > is still sent in the egress pipeline
> >
>
> Actually, it shouldn't be sent to conntrack in the egress pipeline
> either if it matches the acl-stateful-bypass filter.
I think if the logical switch has a load balancer, then the packet
will go to conntrack in the egress
pipeline right ?
Thanks
Numan
>
> > This patch breaks for the below OVN resources and ACL configuration.
> >
> > ***
> > ovn-nbctl ls-add sw0
> > ovn-nbctl lsp-add sw0 sw0-port1
> > ovn-nbctl lsp-set-addresses sw0-port1 "50:54:00:00:00:03 10.0.0.3"
> >
> > ovn-nbctl lr-add lr0
> > ovn-nbctl lrp-add lr0 lr0-sw0 00:00:00:00:ff:01 10.0.0.1/24
> > ovn-nbctl lsp-add sw0 sw0-lr0
> > ovn-nbctl lsp-set-type sw0-lr0 router
> > ovn-nbctl lsp-set-addresses sw0-lr0 router
> > ovn-nbctl lsp-set-options sw0-lr0 router-port=lr0-sw0
> >
> > ovn-nbctl ls-add public
> > ovn-nbctl lrp-add lr0 lr0-public 00:00:20:20:12:13 172.16.0.100/24
> > ovn-nbctl lsp-add public public-lr0
> > ovn-nbctl lsp-set-type public-lr0 router
> > ovn-nbctl lsp-set-addresses public-lr0 router
> > ovn-nbctl lsp-set-options public-lr0 router-port=lr0-public
> >
> > # localnet port
> > ovn-nbctl lsp-add public ln-public
> > ovn-nbctl lsp-set-type ln-public localnet
> > ovn-nbctl lsp-set-addresses ln-public unknown
> > ovn-nbctl lsp-set-options ln-public network_name=public
> >
> > # schedule the gw router port to a chassis.
> > ovn-nbctl lrp-set-gateway-chassis lr0-public ovn-gw-1 20
> >
> > # Create NAT entries
> > ovn-nbctl lr-nat-add lr0 snat 172.16.0.100 10.0.0.0/24
> > ovn-nbctl lr-nat-add lr0 dnat_and_snat 172.16.0.110 10.0.0.3 sw0-port1
> > 30:54:00:00:00:03
> >
> > ovn-nbctl pg-add pg0 sw0-port1
> > ovn-nbctl pg-add pg0_drop sw0-port1
> >
> > ovn-nbctl acl-add pg0_drop from-lport 1001 "inport == @pg0_drop && ip" drop
> > ovn-nbctl acl-add pg0_drop to-lport 1001 "outport == @pg0_drop && ip" drop
> >
> > ovn-nbctl acl-add pg0 to-lport 1002 "outport == @pg0 && ip4 && tcp.dst
> > == 80" allow-related
> > ovn-nbctl set logical_switch sw0 other_config:acl-stateful-bypass="tcp"
> > *****
> >
> > Start a nc server on sw0-port1 and connect to the sw0-port1 nc server
> > from outside (i.e the nc client to 172.16.0.110 (North/South
> > traffic)).
> >
> > Without this patch, it works, but with this patch, the reply gets
> > dropped since the packet is not sent to the conntrack now.
> >
>
> I think this needs more documentation on my side. I might be wrong but I
> think we discussed this during the OVN community meeting with Han when I
> first suggested this approach: if the match of an allow-related ACL is a
> superset of the acl-stateful-bypass match then the ACL the ACL is
> implicitly converted to an "allow" ACL.
>
> The CMS should avoid such configurations. I think this is unfortunately
> the only way to go because the packets go to conntrack before the
> IN/OUT_ACL table. in the PRE_STATEFUL stage.
>
> In your case the SYN packet matches the ACL and is allowed while for the
> SYN-ACK packet there's no matching rule except for the drop rule.
It would be great to add more documentation on it.
>
> The configuration that would make this work is:
> ovn-nbctl set logical_switch sw0 other_config:acl-stateful-bypass="tcp"
> ovn-nbctl acl-add pg0_drop from-lport 1001 "inport == @pg0_drop && ip" drop
> ovn-nbctl acl-add pg0_drop to-lport 1001 "outport == @pg0_drop && ip" drop
> ovn-nbctl acl-add pg0 to-lport 1002 "outport == @pg0 && ip4 && tcp.dst
> == 80" allow
> ovn-nbctl acl-add pg0 to-lport 1002 "inport == @pg0 && ip4 && tcp.src ==
> 80" allow
>
> If there would be another allow-related ACL (e.g. for UDP):
> ovn-nbctl acl-add pg0 to-lport 1002 "outport == @pg0 && ip4 && udp.dst
> == 4242" allow-related
>
> Without this patch the two "allow" rules for TCP would implicitly get
> transformed to "allow-related" because all packets would go to conntrack
> and all ACLs are considered stateful.
>
> With this patch, TCP can be processed without any conntrack
> recirculation while for UDP we'd be using conntrack.
>
> >
> > 2. After this patch we see the below lflow for each ACL.
> >
> > ****
> > table=4 (ls_out_acl ), priority=2002 , match=((reg0[7] ==
> > 1 || !ct.trk || (!ct.new && ct.est && !ct.rpl && ct_label.blocked ==
> > 0)) && (ACL MATCH)), action=(next;)
> > ****
> >
> > The above flow (with the match - "reg0[7] == 1 " will result in
> > one extra Open vSwitch flow for every ACL.
> >
> > I am not sure if this can be avoided, but if we can find a way to
> > avoid it would be great. Ignore this comment if this is not possible.
> >
>
> I don't think it can be avoided because we'd have to parse the match
> expression of the ACL to determine if the "acl-stateful-bypass" is a
> subexpression of it.
My initial thought was to add a high prio (may be 65535) flow in n_acl/out_acl
stage with the action - next if reg0[7] ==1 because we don't need to send
the packet to conntrack. But then realized that we need to handle drop ACLs.
>
> >
> > 3. The CMS can set any match in the option -
> > "other_config:acl-stateful-bypass".
> > For example:
> > ovn-nbctl set logical_switch sw0
> > other_config:acl-stateful-bypass="tcp && tcp.dst >= 1000 && tcp.dst <=
> > 1005"
> >
> > Although it works, it seems a bit odd to me that a config option
> > can be a match.
> > Can't we instead add another ACL action ? Maybe "allow-stateless"
> > ? I am not sure if this is feasible, but just a thought.
> >
>
> We could do that but then these "allow-stateless" ACLs would be
> translated to flows in the pre-acl table like the patch does for
> "acl-stateful-bypass" now. So some of the ACLs will generate flows in
> the IN/OUT_ACL table while others will generate flows in the PRE_ACL
> table. Is that acceptable?
I think it should be fine. But I won't press too hard on it.
>
> > 4. A Small minor comment. In the lflow-list, I see below flows. Can
> > you please correct the indentations and spacing.
> > ********
> > table=6 (ls_in_acl ), priority=65535, match=(reg0[7] ==
> > 0&& (ct.inv || (ct.est && ct.rpl && ct_label.blocked == 1))),
> > action=(drop;)
> > table=6 (ls_in_acl ), priority=65535, match=(reg0[7]== 0 &&
> > !ct.est && ct.rel && !ct.new && !ct.inv && ct_label.blocked == 0),
> > action=(next;)
> > table=6 (ls_in_acl ), priority=65535, match=(reg0[7]== 0 &&
> > ct.est && !ct.rel && !ct.new && !ct.inv && ct.rpl && ct_label.blocked
> > == 0), action=(next;)
> > ********
> >
> > Notice - reg0[7] == 0&&... and "reg0[7]== 0 && ....
>
> Thanks, noted, I'll fix it in the next iteration.
Thanks
Numan
>
> Regards,
> Dumitru
>
> >
> > Thanks
> > Numan
> >
> >
> >
> >
> >> ---
> >> NEWS | 3 +
> >> northd/ovn-northd.8.xml | 18 ++++++
> >> northd/ovn-northd.c | 114 ++++++++++++++++++++++++++++---------
> >> ovn-nb.xml | 27 ++++++++-
> >> tests/ovn-northd.at | 129
> >> ++++++++++++++++++++++++++++++++++++++++++
> >> tests/system-common-macros.at | 8 +++
> >> tests/system-ovn.at | 111 ++++++++++++++++++++++++++++++++++++
> >> utilities/ovn-nbctl.c | 10 ++++
> >> 8 files changed, 390 insertions(+), 30 deletions(-)
> >>
> >> diff --git a/NEWS b/NEWS
> >> index a1ce4e8..0ef7905 100644
> >> --- a/NEWS
> >> +++ b/NEWS
> >> @@ -11,6 +11,9 @@ Post-v20.06.0
> >> called Chassis_Private now contains the nb_cfg column which is
> >> updated
> >> by incrementing the value in the NB_Global table, CMSes relying on
> >> this mechanism should update their code to use this new table.
> >> + - Added support for bypassing connection tracking for ACL processing
> >> for
> >> + specific types of traffic through the user supplied
> >> acl-stateful-bypass
> >> + configuration.
> >>
> >> OVN v20.06.0
> >> --------------------------
> >> diff --git a/northd/ovn-northd.8.xml b/northd/ovn-northd.8.xml
> >> index ee21c82..fc315f8 100644
> >> --- a/northd/ovn-northd.8.xml
> >> +++ b/northd/ovn-northd.8.xml
> >> @@ -322,6 +322,15 @@
> >> </p>
> >>
> >> <p>
> >> + A priority-105 flow which sets <code>reg0[7] = 1</code> for traffic
> >> + that matches the condition set in
> >> + <ref column="other_config:acl-stateful-bypass"/> and advances to
> >> next
> >> + table. <code>reg0[7]</code> acts as a hint for tables
> >> + <code>Pre-Stateful</code> and <code>ACL</code> to avoid sending this
> >> + traffic to the connection tracker.
> >> + </p>
> >> +
> >> + <p>
> >> This table also has a priority-110 flow with the match
> >> <code>eth.dst == <var>E</var></code> for all logical switch
> >> datapaths to move traffic to the next table. Where <var>E</var>
> >> @@ -1372,6 +1381,15 @@ output;
> >> </p>
> >>
> >> <p>
> >> + A priority-105 flow which sets <code>reg0[7] = 1</code> for traffic
> >> + that matches the condition set in
> >> + <ref column="other_config:acl-stateful-bypass"/> and advances to
> >> next
> >> + table. <code>reg0[7]</code> acts as a hint for tables
> >> + <code>Pre-Stateful</code> and <code>ACL</code> to avoid sending this
> >> + traffic to the connection tracker.
> >> + </p>
> >> +
> >> + <p>
> >> This table also has a priority-110 flow with the match
> >> <code>eth.src == <var>E</var></code> for all logical switch
> >> datapaths to move traffic to the next table. Where <var>E</var>
> >> diff --git a/northd/ovn-northd.c b/northd/ovn-northd.c
> >> index dc45929..1ff9190 100644
> >> --- a/northd/ovn-northd.c
> >> +++ b/northd/ovn-northd.c
> >> @@ -211,6 +211,7 @@ enum ovn_stage {
> >> #define REGBIT_DNS_LOOKUP_RESULT "reg0[4]"
> >> #define REGBIT_ND_RA_OPTS_RESULT "reg0[5]"
> >> #define REGBIT_HAIRPIN "reg0[6]"
> >> +#define REGBIT_SKIP_ACL_CT "reg0[7]"
> >>
> >> /* Register definitions for switches and routers. */
> >>
> >> @@ -245,11 +246,11 @@ enum ovn_stage {
> >> * OVS register usage:
> >> *
> >> * Logical Switch pipeline:
> >> - * +---------+-------------------------------------+
> >> - * | R0 | REGBIT_{CONNTRACK/DHCP/DNS/HAIRPIN} |
> >> - * +---------+-------------------------------------+
> >> - * | R1 - R9 | UNUSED |
> >> - * +---------+-------------------------------------+
> >> + * +---------+-------------------------------------------------+
> >> + * | R0 | REGBIT_{CONNTRACK/DHCP/DNS/HAIRPIN/SKIP_ACL_CT} |
> >> + * +---------+-------------------------------------------------+
> >> + * | R1 - R9 | UNUSED |
> >> + * +---------+-------------------------------------------------+
> >> *
> >> * Logical Router pipeline:
> >> *
> >> +-----+--------------------------+---+-----------------+---+---------------+
> >> @@ -4713,6 +4714,18 @@ has_stateful_acl(struct ovn_datapath *od)
> >> return false;
> >> }
> >>
> >> +static const char *
> >> +get_stateful_acl_bypass(struct ovn_datapath *od)
> >> +{
> >> + return smap_get(&od->nbs->other_config, "acl-stateful-bypass");
> >> +}
> >> +
> >> +static bool
> >> +has_stateful_acl_bypass(struct ovn_datapath *od)
> >> +{
> >> + return !!get_stateful_acl_bypass(od);
> >> +}
> >> +
> >> static void
> >> build_lswitch_input_port_sec(struct hmap *ports, struct hmap *datapaths,
> >> struct hmap *lflows)
> >> @@ -4926,6 +4939,19 @@ build_pre_acls(struct ovn_datapath *od, struct hmap
> >> *lflows)
> >> "nd || nd_rs || nd_ra || "
> >> "(udp && udp.src == 546 && udp.dst == 547)",
> >> "next;");
> >>
> >> + /* Ingress and Egress Pre-ACL Table (Priority 105).
> >> + *
> >> + * If the logical switch is configured to bypass conntrack for
> >> + * specific types of traffic, skip conntrack for that traffic.
> >> + */
> >> + const char *stateful_bypass = get_stateful_acl_bypass(od);
> >> + if (stateful_bypass) {
> >> + ovn_lflow_add(lflows, od, S_SWITCH_IN_PRE_ACL, 105,
> >> + stateful_bypass, REGBIT_SKIP_ACL_CT" = 1;
> >> next;");
> >> + ovn_lflow_add(lflows, od, S_SWITCH_OUT_PRE_ACL, 105,
> >> + stateful_bypass, REGBIT_SKIP_ACL_CT" = 1;
> >> next;");
> >> + }
> >> +
> >> /* Ingress and Egress Pre-ACL Table (Priority 100).
> >> *
> >> * Regardless of whether the ACL is "from-lport" or "to-lport",
> >> @@ -5260,7 +5286,8 @@ build_reject_acl_rules(struct ovn_datapath *od,
> >> struct hmap *lflows,
> >>
> >> static void
> >> consider_acl(struct hmap *lflows, struct ovn_datapath *od,
> >> - struct nbrec_acl *acl, bool has_stateful)
> >> + struct nbrec_acl *acl, bool has_stateful,
> >> + bool has_stateful_bypass)
> >> {
> >> bool ingress = !strcmp(acl->direction, "from-lport") ? true :false;
> >> enum ovn_stage stage = ingress ? S_SWITCH_IN_ACL : S_SWITCH_OUT_ACL;
> >> @@ -5285,7 +5312,19 @@ consider_acl(struct hmap *lflows, struct
> >> ovn_datapath *od,
> >> struct ds match = DS_EMPTY_INITIALIZER;
> >> struct ds actions = DS_EMPTY_INITIALIZER;
> >>
> >> - /* Commit the connection tracking entry if it's a new
> >> + /* If traffic matched the acl-stateful-bypass rule, we don't
> >> + * need to commit the connection tracking entry.
> >> + */
> >> + if (has_stateful_bypass) {
> >> + ds_put_format(&match, "(" REGBIT_SKIP_ACL_CT "== 1 &&
> >> (%s)",
> >> + acl->match);
> >> + build_acl_log(&actions, acl);
> >> + ds_put_format(&actions, "next;");
> >> + ds_clear(&match);
> >> + ds_clear(&actions);
> >> + }
> >> +
> >> + /* Otherwise commit the connection tracking entry if it's a
> >> new
> >> * connection that matches this ACL. After this commit,
> >> * the reply traffic is allowed by a flow we create at
> >> * priority 65535, defined earlier.
> >> @@ -5297,10 +5336,11 @@ consider_acl(struct hmap *lflows, struct
> >> ovn_datapath *od,
> >> * by ct_commit in the "stateful" stage) to indicate that the
> >> * connection should be allowed to resume.
> >> */
> >> - ds_put_format(&match, "((ct.new && !ct.est)"
> >> - " || (!ct.new && ct.est && !ct.rpl "
> >> - "&& ct_label.blocked == 1)) "
> >> - "&& (%s)", acl->match);
> >> + ds_put_format(&match, REGBIT_SKIP_ACL_CT " == 0 "
> >> + "&& ((ct.new && !ct.est)"
> >> + " || (!ct.new && ct.est && !ct.rpl "
> >> + "&& ct_label.blocked == 1)) "
> >> + "&& (%s)", acl->match);
> >> ds_put_cstr(&actions, REGBIT_CONNTRACK_COMMIT" = 1; ");
> >> build_acl_log(&actions, acl);
> >> ds_put_cstr(&actions, "next;");
> >> @@ -5315,11 +5355,16 @@ consider_acl(struct hmap *lflows, struct
> >> ovn_datapath *od,
> >> * deletion. There is no need to commit here, so we can just
> >> * proceed to the next table. We use this to ensure that this
> >> * connection is still allowed by the currently defined
> >> - * policy. Match untracked packets too. */
> >> + * policy. Match untracked packets too.
> >> + *
> >> + * This flow also allows traffic that matches the
> >> + * acl-stateful-bypass rule.
> >> + */
> >> ds_clear(&match);
> >> ds_clear(&actions);
> >> ds_put_format(&match,
> >> - "(!ct.trk || (!ct.new && ct.est && !ct.rpl"
> >> + "(" REGBIT_SKIP_ACL_CT " == 1 || !ct.trk || "
> >> + "(!ct.new && ct.est && !ct.rpl"
> >> " && ct_label.blocked == 0)) && (%s)",
> >> acl->match);
> >>
> >> @@ -5346,7 +5391,7 @@ consider_acl(struct hmap *lflows, struct
> >> ovn_datapath *od,
> >> /* If the packet is not tracked or not part of an established
> >> * connection, then we can simply reject/drop it. */
> >> ds_put_cstr(&match,
> >> - "(!ct.trk || !ct.est"
> >> + "(" REGBIT_SKIP_ACL_CT " == 1 || !ct.trk ||
> >> !ct.est"
> >> " || (ct.est && ct_label.blocked == 1))");
> >> if (!strcmp(acl->action, "reject")) {
> >> build_reject_acl_rules(od, lflows, stage, acl, &match,
> >> @@ -5373,7 +5418,8 @@ consider_acl(struct hmap *lflows, struct
> >> ovn_datapath *od,
> >> */
> >> ds_clear(&match);
> >> ds_clear(&actions);
> >> - ds_put_cstr(&match, "ct.est && ct_label.blocked == 0");
> >> + ds_put_cstr(&match, REGBIT_SKIP_ACL_CT " == 0 "
> >> + "&& ct.est && ct_label.blocked == 0");
> >> ds_put_cstr(&actions, "ct_commit { ct_label.blocked = 1; };
> >> ");
> >> if (!strcmp(acl->action, "reject")) {
> >> build_reject_acl_rules(od, lflows, stage, acl, &match,
> >> @@ -5478,6 +5524,7 @@ build_acls(struct ovn_datapath *od, struct hmap
> >> *lflows,
> >> struct hmap *port_groups)
> >> {
> >> bool has_stateful = has_stateful_acl(od);
> >> + bool has_stateful_bypass = has_stateful_acl_bypass(od);
> >>
> >> /* Ingress and Egress ACL Table (Priority 0): Packets are allowed by
> >> * default. A related rule at priority 1 is added below if there
> >> @@ -5508,11 +5555,15 @@ build_acls(struct ovn_datapath *od, struct hmap
> >> *lflows,
> >> * Subsequent packets will hit the flow at priority 0 that just
> >> * uses "next;". */
> >> ovn_lflow_add(lflows, od, S_SWITCH_IN_ACL, 1,
> >> - "ip && (!ct.est || (ct.est && ct_label.blocked ==
> >> 1))",
> >> - REGBIT_CONNTRACK_COMMIT" = 1; next;");
> >> + REGBIT_SKIP_ACL_CT " == 0 "
> >> + "&& ip "
> >> + "&& (!ct.est || (ct.est && ct_label.blocked == 1))",
> >> + REGBIT_CONNTRACK_COMMIT" = 1; next;");
> >> ovn_lflow_add(lflows, od, S_SWITCH_OUT_ACL, 1,
> >> - "ip && (!ct.est || (ct.est && ct_label.blocked ==
> >> 1))",
> >> - REGBIT_CONNTRACK_COMMIT" = 1; next;");
> >> + REGBIT_SKIP_ACL_CT " == 0 "
> >> + "&& ip "
> >> + "&& (!ct.est || (ct.est && ct_label.blocked == 1))",
> >> + REGBIT_CONNTRACK_COMMIT" = 1; next;");
> >>
> >> /* Ingress and Egress ACL Table (Priority 65535).
> >> *
> >> @@ -5522,10 +5573,14 @@ build_acls(struct ovn_datapath *od, struct hmap
> >> *lflows,
> >> *
> >> * This is enforced at a higher priority than ACLs can be
> >> defined. */
> >> ovn_lflow_add(lflows, od, S_SWITCH_IN_ACL, UINT16_MAX,
> >> - "ct.inv || (ct.est && ct.rpl && ct_label.blocked ==
> >> 1)",
> >> + REGBIT_SKIP_ACL_CT " == 0"
> >> + "&& (ct.inv "
> >> + "|| (ct.est && ct.rpl && ct_label.blocked ==
> >> 1))",
> >> "drop;");
> >> ovn_lflow_add(lflows, od, S_SWITCH_OUT_ACL, UINT16_MAX,
> >> - "ct.inv || (ct.est && ct.rpl && ct_label.blocked ==
> >> 1)",
> >> + REGBIT_SKIP_ACL_CT " == 0 "
> >> + "&& (ct.inv "
> >> + "|| (ct.est && ct.rpl && ct_label.blocked ==
> >> 1))",
> >> "drop;");
> >>
> >> /* Ingress and Egress ACL Table (Priority 65535).
> >> @@ -5538,11 +5593,13 @@ build_acls(struct ovn_datapath *od, struct hmap
> >> *lflows,
> >> *
> >> * This is enforced at a higher priority than ACLs can be
> >> defined. */
> >> ovn_lflow_add(lflows, od, S_SWITCH_IN_ACL, UINT16_MAX,
> >> - "ct.est && !ct.rel && !ct.new && !ct.inv "
> >> + REGBIT_SKIP_ACL_CT "== 0 "
> >> + "&& ct.est && !ct.rel && !ct.new && !ct.inv "
> >> "&& ct.rpl && ct_label.blocked == 0",
> >> "next;");
> >> ovn_lflow_add(lflows, od, S_SWITCH_OUT_ACL, UINT16_MAX,
> >> - "ct.est && !ct.rel && !ct.new && !ct.inv "
> >> + REGBIT_SKIP_ACL_CT "== 0 "
> >> + "&& ct.est && !ct.rel && !ct.new && !ct.inv "
> >> "&& ct.rpl && ct_label.blocked == 0",
> >> "next;");
> >>
> >> @@ -5558,11 +5615,13 @@ build_acls(struct ovn_datapath *od, struct hmap
> >> *lflows,
> >> * related traffic such as an ICMP Port Unreachable through
> >> * that's generated from a non-listening UDP port. */
> >> ovn_lflow_add(lflows, od, S_SWITCH_IN_ACL, UINT16_MAX,
> >> - "!ct.est && ct.rel && !ct.new && !ct.inv "
> >> + REGBIT_SKIP_ACL_CT "== 0 "
> >> + "&& !ct.est && ct.rel && !ct.new && !ct.inv "
> >> "&& ct_label.blocked == 0",
> >> "next;");
> >> ovn_lflow_add(lflows, od, S_SWITCH_OUT_ACL, UINT16_MAX,
> >> - "!ct.est && ct.rel && !ct.new && !ct.inv "
> >> + REGBIT_SKIP_ACL_CT "== 0 "
> >> + "&& !ct.est && ct.rel && !ct.new && !ct.inv "
> >> "&& ct_label.blocked == 0",
> >> "next;");
> >>
> >> @@ -5578,13 +5637,14 @@ build_acls(struct ovn_datapath *od, struct hmap
> >> *lflows,
> >> /* Ingress or Egress ACL Table (Various priorities). */
> >> for (size_t i = 0; i < od->nbs->n_acls; i++) {
> >> struct nbrec_acl *acl = od->nbs->acls[i];
> >> - consider_acl(lflows, od, acl, has_stateful);
> >> + consider_acl(lflows, od, acl, has_stateful, has_stateful_bypass);
> >> }
> >> struct ovn_port_group *pg;
> >> HMAP_FOR_EACH (pg, key_node, port_groups) {
> >> if (ovn_port_group_ls_find(pg, &od->nbs->header_.uuid)) {
> >> for (size_t i = 0; i < pg->nb_pg->n_acls; i++) {
> >> - consider_acl(lflows, od, pg->nb_pg->acls[i],
> >> has_stateful);
> >> + consider_acl(lflows, od, pg->nb_pg->acls[i], has_stateful,
> >> + has_stateful_bypass);
> >> }
> >> }
> >> }
> >> diff --git a/ovn-nb.xml b/ovn-nb.xml
> >> index 9f3621d..f3e368e 100644
> >> --- a/ovn-nb.xml
> >> +++ b/ovn-nb.xml
> >> @@ -271,9 +271,30 @@
> >> ip addresses.
> >> </column>
> >>
> >> - <column name="acls">
> >> - Access control rules that apply to packets within the logical
> >> switch.
> >> - </column>
> >> + <group title="ACL processing">
> >> + <column name="acls">
> >> + Access control rules that apply to packets within the logical
> >> switch.
> >> + </column>
> >> +
> >> + <column name="other_config" key="acl-stateful-bypass">
> >> + Set this value to a match expression to be used to determine for
> >> which
> >> + traffic the ACL processing should be stateless, without
> >> recirculating
> >> + through connection tracking, regardless of the type of ACL that is
> >> + hit.
> >> +
> >> + In normal operation, whenever an ACL associated to a
> >> Logical_Switch
> >> + has action <code>allow-related</code>, all IP traffic gets sent
> >> + to conntrack and related traffic is allowed.
> >> +
> >> + If <ref column="other_config" key="acl-stateful-bypass"/> is set
> >> to
> >> + <code>E</code> all <code>allow</code> and
> >> <code>allow-related</code>
> >> + ACLs that match packets for which <code>E</code> is true are
> >> applied
> >> + in a stateless way, without recirculating through connection
> >> tracking.
> >> +
> >> + This is useful when some specific types of traffic do not need
> >> + stateful processing.
> >> + </column>
> >> + </group>
> >>
> >> <column name="qos_rules">
> >> QoS marking and metering rules that apply to packets within the
> >> diff --git a/tests/ovn-northd.at b/tests/ovn-northd.at
> >> index 8344c7f..7aa13a1 100644
> >> --- a/tests/ovn-northd.at
> >> +++ b/tests/ovn-northd.at
> >> @@ -1781,3 +1781,132 @@ AT_CHECK([ovn-sbctl lflow-list | grep
> >> "ls_out_pre_lb.*priority=100" | grep reg0
> >> ])
> >>
> >> AT_CLEANUP
> >> +
> >> +AT_SETUP([ovn -- ACL Stateful Bypass])
> >> +ovn_start
> >> +
> >> +ovn-nbctl ls-add ls
> >> +ovn-nbctl lsp-add ls lsp1
> >> +ovn-nbctl lsp-set-addresses lsp1 00:00:00:00:00:01
> >> +ovn-nbctl lsp-add ls lsp2
> >> +ovn-nbctl lsp-set-addresses lsp2 00:00:00:00:00:02
> >> +
> >> +ovn-nbctl acl-add ls from-lport 3 "tcp" allow
> >> +ovn-nbctl acl-add ls from-lport 2 "udp" allow-related
> >> +ovn-nbctl acl-add ls from-lport 1 "ip" drop
> >> +ovn-nbctl --wait=sb sync
> >> +
> >> +flow_eth='eth.src == 00:00:00:00:00:01 && eth.dst == 00:00:00:00:00:02'
> >> +flow_ip='ip.ttl==64 && ip4.src == 42.42.42.1 && ip4.dst == 66.66.66.66'
> >> +flow_tcp='tcp && tcp.dst == 80'
> >> +flow_udp='udp && udp.dst == 80'
> >> +
> >> +# TCP packets should go to conntrack.
> >> +flow="inport == \"lsp1\" && ${flow_eth} && ${flow_ip} && ${flow_tcp}"
> >> +AT_CHECK([ovn-trace --ct new --ct new --minimal ls "${flow}"], [0], [dnl
> >> +#
> >> tcp,reg14=0x1,vlan_tci=0x0000,dl_src=00:00:00:00:00:01,dl_dst=00:00:00:00:00:02,nw_src=42.42.42.1,nw_dst=66.66.66.66,nw_tos=0,nw_ecn=0,nw_ttl=64,tp_src=0,tp_dst=80,tcp_flags=0
> >> +ct_next(ct_state=new|trk) {
> >> + ct_next(ct_state=new|trk) {
> >> + output("lsp2");
> >> + };
> >> +};
> >> +])
> >> +
> >> +# UDP packets should go to conntrack.
> >> +flow="inport == \"lsp1\" && ${flow_eth} && ${flow_ip} && ${flow_udp}"
> >> +AT_CHECK([ovn-trace --ct new --ct new --minimal ls "${flow}"], [0], [dnl
> >> +#
> >> udp,reg14=0x1,vlan_tci=0x0000,dl_src=00:00:00:00:00:01,dl_dst=00:00:00:00:00:02,nw_src=42.42.42.1,nw_dst=66.66.66.66,nw_tos=0,nw_ecn=0,nw_ttl=64,tp_src=0,tp_dst=80
> >> +ct_next(ct_state=new|trk) {
> >> + ct_next(ct_state=new|trk) {
> >> + output("lsp2");
> >> + };
> >> +};
> >> +])
> >> +
> >> +# Enable Stateful Bypass for TCP.
> >> +ovn-nbctl set logical_switch ls other_config:acl-stateful-bypass="tcp"
> >> +
> >> +# TCP packets should not go to conntrack anymore.
> >> +flow="inport == \"lsp1\" && ${flow_eth} && ${flow_ip} && ${flow_tcp}"
> >> +AT_CHECK([ovn-trace --minimal ls "${flow}"], [0], [dnl
> >> +#
> >> tcp,reg14=0x1,vlan_tci=0x0000,dl_src=00:00:00:00:00:01,dl_dst=00:00:00:00:00:02,nw_src=42.42.42.1,nw_dst=66.66.66.66,nw_tos=0,nw_ecn=0,nw_ttl=64,tp_src=0,tp_dst=80,tcp_flags=0
> >> +output("lsp2");
> >> +])
> >> +
> >> +# UDP packets still go to conntrack.
> >> +flow="inport == \"lsp1\" && ${flow_eth} && ${flow_ip} && ${flow_udp}"
> >> +AT_CHECK([ovn-trace --ct new --ct new --minimal ls "${flow}"], [0], [dnl
> >> +#
> >> udp,reg14=0x1,vlan_tci=0x0000,dl_src=00:00:00:00:00:01,dl_dst=00:00:00:00:00:02,nw_src=42.42.42.1,nw_dst=66.66.66.66,nw_tos=0,nw_ecn=0,nw_ttl=64,tp_src=0,tp_dst=80
> >> +ct_next(ct_state=new|trk) {
> >> + ct_next(ct_state=new|trk) {
> >> + output("lsp2");
> >> + };
> >> +};
> >> +])
> >> +
> >> +# Add a load balancer.
> >> +ovn-nbctl lb-add lb-tcp 66.66.66.66:80 42.42.42.2:8080 tcp
> >> +ovn-nbctl lb-add lb-udp 66.66.66.66:80 42.42.42.2:8080 udp
> >> +ovn-nbctl ls-lb-add ls lb-tcp
> >> +ovn-nbctl ls-lb-add ls lb-udp
> >> +
> >> +# Disable Stateful Bypass for TCP.
> >> +ovn-nbctl remove logical_switch ls other_config acl-stateful-bypass
> >> +ovn-nbctl --wait=sb sync
> >> +
> >> +# TCP packets should go to conntrack.
> >> +flow="inport == \"lsp1\" && ${flow_eth} && ${flow_ip} && ${flow_tcp}"
> >> +AT_CHECK([ovn-trace --ct new --ct new --minimal ls "${flow}"], [0], [dnl
> >> +#
> >> tcp,reg14=0x1,vlan_tci=0x0000,dl_src=00:00:00:00:00:01,dl_dst=00:00:00:00:00:02,nw_src=42.42.42.1,nw_dst=66.66.66.66,nw_tos=0,nw_ecn=0,nw_ttl=64,tp_src=0,tp_dst=80,tcp_flags=0
> >> +ct_next(ct_state=new|trk) {
> >> + ct_lb {
> >> + ct_next(ct_state=new|trk) {
> >> + output("lsp2");
> >> + };
> >> + };
> >> +};
> >> +])
> >> +
> >> +# UDP packets should go to conntrack.
> >> +flow="inport == \"lsp1\" && ${flow_eth} && ${flow_ip} && ${flow_udp}"
> >> +AT_CHECK([ovn-trace --ct new --ct new --minimal ls "${flow}"], [0], [dnl
> >> +#
> >> udp,reg14=0x1,vlan_tci=0x0000,dl_src=00:00:00:00:00:01,dl_dst=00:00:00:00:00:02,nw_src=42.42.42.1,nw_dst=66.66.66.66,nw_tos=0,nw_ecn=0,nw_ttl=64,tp_src=0,tp_dst=80
> >> +ct_next(ct_state=new|trk) {
> >> + ct_lb {
> >> + ct_next(ct_state=new|trk) {
> >> + output("lsp2");
> >> + };
> >> + };
> >> +};
> >> +])
> >> +
> >> +# Enable Stateful Bypass for TCP.
> >> +ovn-nbctl set logical_switch ls other_config:acl-stateful-bypass="tcp"
> >> +
> >> +# TCP packets should go to conntrack for load balancing.
> >> +flow="inport == \"lsp1\" && ${flow_eth} && ${flow_ip} && ${flow_tcp}"
> >> +AT_CHECK([ovn-trace --ct new --ct new --minimal ls "${flow}"], [0], [dnl
> >> +#
> >> tcp,reg14=0x1,vlan_tci=0x0000,dl_src=00:00:00:00:00:01,dl_dst=00:00:00:00:00:02,nw_src=42.42.42.1,nw_dst=66.66.66.66,nw_tos=0,nw_ecn=0,nw_ttl=64,tp_src=0,tp_dst=80,tcp_flags=0
> >> +ct_next(ct_state=new|trk) {
> >> + ct_lb {
> >> + ct_next(ct_state=new|trk) {
> >> + output("lsp2");
> >> + };
> >> + };
> >> +};
> >> +])
> >> +
> >> +# UDP packets still go to conntrack.
> >> +flow="inport == \"lsp1\" && ${flow_eth} && ${flow_ip} && ${flow_udp}"
> >> +AT_CHECK([ovn-trace --ct new --ct new --minimal ls "${flow}"], [0], [dnl
> >> +#
> >> udp,reg14=0x1,vlan_tci=0x0000,dl_src=00:00:00:00:00:01,dl_dst=00:00:00:00:00:02,nw_src=42.42.42.1,nw_dst=66.66.66.66,nw_tos=0,nw_ecn=0,nw_ttl=64,tp_src=0,tp_dst=80
> >> +ct_next(ct_state=new|trk) {
> >> + ct_lb {
> >> + ct_next(ct_state=new|trk) {
> >> + output("lsp2");
> >> + };
> >> + };
> >> +};
> >> +])
> >> +
> >> +AT_CLEANUP
> >> diff --git a/tests/system-common-macros.at b/tests/system-common-macros.at
> >> index c8fa6f0..65904ed 100644
> >> --- a/tests/system-common-macros.at
> >> +++ b/tests/system-common-macros.at
> >> @@ -234,6 +234,14 @@ m4_define([FORMAT_PING], [grep "transmitted" | sed
> >> 's/time.*ms$/time 0ms/'])
> >> #
> >> m4_define([STRIP_MONITOR_CSUM], [grep "csum:" | sed 's/csum:.*/csum:
> >> <skip>/'])
> >>
> >> +# FORMAT_CT_STATE([ip-addr])
> >> +#
> >> +# Strip content from the piped input which would differ from test to test
> >> +# and limit the output to the rows containing 'ip-addr'. Don't strip
> >> state.
> >> +#
> >> +m4_define([FORMAT_CT_STATE],
> >> + [[grep "dst=$1" | sed -e 's/port=[0-9]*/port=<cleared>/g' -e
> >> 's/id=[0-9]*/id=<cleared>/g' | sort | uniq]])
> >> +
> >> # FORMAT_CT([ip-addr])
> >> #
> >> # Strip content from the piped input which would differ from test to test
> >> diff --git a/tests/system-ovn.at b/tests/system-ovn.at
> >> index 40ba6e4..583bca4 100644
> >> --- a/tests/system-ovn.at
> >> +++ b/tests/system-ovn.at
> >> @@ -5397,3 +5397,114 @@ as
> >> OVS_TRAFFIC_VSWITCHD_STOP(["/.*error receiving.*/d
> >> /.*terminating with signal 15.*/d"])
> >> AT_CLEANUP
> >> +
> >> +AT_SETUP([ovn -- ACL Stateful Bypass + Load balancer])
> >> +AT_SKIP_IF([test $HAVE_NC = no])
> >> +AT_KEYWORDS([lb])
> >> +AT_KEYWORDS([conntrack])
> >> +ovn_start
> >> +
> >> +OVS_TRAFFIC_VSWITCHD_START()
> >> +ADD_BR([br-int])
> >> +
> >> +# Set external-ids in br-int needed for ovn-controller
> >> +ovs-vsctl \
> >> + -- set Open_vSwitch . external-ids:system-id=hv1 \
> >> + -- set Open_vSwitch .
> >> external-ids:ovn-remote=unix:$ovs_base/ovn-sb/ovn-sb.sock \
> >> + -- set Open_vSwitch . external-ids:ovn-encap-type=geneve \
> >> + -- set Open_vSwitch . external-ids:ovn-encap-ip=169.0.0.1 \
> >> + -- set bridge br-int fail-mode=secure
> >> other-config:disable-in-band=true
> >> +
> >> +# Start ovn-controller
> >> +start_daemon ovn-controller
> >> +
> >> +# Logical network:
> >> +# One logical switch with a load balancer with one backend.
> >> +# On the LS we add "allow" ACLs for TCP and "allow-related" ACLs for UDP.
> >> +# The "allow-related" ACL normally forces all traffic to go to conntrack.
> >> +# We enable ACL stateful bypass for TCP so TCP traffic should not be
> >> +# sent to conntrack for ACLs (only for LB).
> >> +
> >> +ovn-nbctl ls-add ls
> >> +ovn-nbctl lsp-add ls lsp1
> >> +ovn-nbctl lsp-set-addresses lsp1 00:00:00:00:00:01
> >> +ovn-nbctl lsp-add ls lsp2
> >> +ovn-nbctl lsp-set-addresses lsp2 00:00:00:00:00:02
> >> +
> >> +ovn-nbctl acl-add ls from-lport 3 "tcp" allow
> >> +ovn-nbctl acl-add ls from-lport 2 "udp" allow-related
> >> +ovn-nbctl acl-add ls from-lport 1 "ip" drop
> >> +
> >> +ovn-nbctl lr-add rtr
> >> +ovn-nbctl lrp-add rtr rtr-ls 00:00:00:00:01:00 42.42.42.254/24
> >> +ovn-nbctl lsp-add ls ls-rtr \
> >> + -- lsp-set-type ls-rtr router \
> >> + -- lsp-set-addresses ls-rtr 00:00:00:00:01:00 \
> >> + -- lsp-set-options ls-rtr router-port=rtr-ls
> >> +
> >> +# Add a load balancer.
> >> +ovn-nbctl lb-add lb-tcp 66.66.66.66:80 42.42.42.2:8080 tcp
> >> +ovn-nbctl lb-add lb-udp 66.66.66.66:80 42.42.42.2:8080 udp
> >> +ovn-nbctl ls-lb-add ls lb-tcp
> >> +ovn-nbctl ls-lb-add ls lb-udp
> >> +
> >> +# Enable Stateful Bypass for TCP.
> >> +ovn-nbctl set logical_switch ls other_config:acl-stateful-bypass="tcp"
> >> +
> >> +ADD_NAMESPACES(lsp1)
> >> +ADD_VETH(lsp1, lsp1, br-int, "42.42.42.1/24", "00:00:00:00:00:01", \
> >> + "42.42.42.254")
> >> +
> >> +ADD_NAMESPACES(lsp2)
> >> +ADD_VETH(lsp2, lsp2, br-int, "42.42.42.2/24", "00:00:00:00:00:02", \
> >> + "42.42.42.254")
> >> +
> >> +ovn-nbctl --wait=hv sync
> >> +
> >> +# Start a UDP server on lsp2.
> >> +NETNS_DAEMONIZE([lsp2], [nc -l --no-shutdown -u 42.42.42.2 8080],
> >> [nc2.pid])
> >> +
> >> +# Start a UDP connection.
> >> +NS_CHECK_EXEC([lsp1], [echo "foo" | nc --no-shutdown -u 66.66.66.66 80])
> >> +
> >> +# There should be 2 UDP conntrack entries:
> >> +# - one for the allow-related ACL.
> >> +# - one for the LB dnat.
> >> +OVS_WAIT_UNTIL([test "$(ovs-appctl dpctl/dump-conntrack | grep udp | grep
> >> '42.42.42.1' -c)" = "2"])
> >> +
> >> +AT_CHECK([ovs-appctl dpctl/dump-conntrack | FORMAT_CT_STATE(42.42.42.1) |
> >> grep udp | \
> >> +sed -e 's/zone=[[0-9]]*/zone=<cleared>/'], [0], [dnl
> >> +udp,orig=(src=42.42.42.1,dst=42.42.42.2,sport=<cleared>,dport=<cleared>),reply=(src=42.42.42.2,dst=42.42.42.1,sport=<cleared>,dport=<cleared>),zone=<cleared>
> >> +udp,orig=(src=42.42.42.1,dst=66.66.66.66,sport=<cleared>,dport=<cleared>),reply=(src=42.42.42.2,dst=42.42.42.1,sport=<cleared>,dport=<cleared>),zone=<cleared>,labels=0x2
> >> +])
> >> +
> >> +# Start a TCP server on lsp2.
> >> +NETNS_DAEMONIZE([lsp2], [nc -l --no-shutdown 42.42.42.2 8080], [nc0.pid])
> >> +
> >> +# Start a TCP connection.
> >> +NETNS_DAEMONIZE([lsp1], [nc --no-shutdown 66.66.66.66 80], [nc1.pid])
> >> +
> >> +OVS_WAIT_UNTIL([test "$(ovs-appctl dpctl/dump-conntrack | grep tcp | grep
> >> '42.42.42.1' -c)" = "1"])
> >> +
> >> +# There should be only one TCP conntrack entry, for the LB dnat.
> >> +AT_CHECK([ovs-appctl dpctl/dump-conntrack | FORMAT_CT_STATE(42.42.42.1) |
> >> grep tcp | \
> >> +sed -e 's/zone=[[0-9]]*/zone=<cleared>/'], [0], [dnl
> >> +tcp,orig=(src=42.42.42.1,dst=66.66.66.66,sport=<cleared>,dport=<cleared>),reply=(src=42.42.42.2,dst=42.42.42.1,sport=<cleared>,dport=<cleared>),zone=<cleared>,labels=0x2,protoinfo=(state=ESTABLISHED)
> >> +])
> >> +
> >> +OVS_APP_EXIT_AND_WAIT([ovn-controller])
> >> +
> >> +as ovn-sb
> >> +OVS_APP_EXIT_AND_WAIT([ovsdb-server])
> >> +
> >> +as ovn-nb
> >> +OVS_APP_EXIT_AND_WAIT([ovsdb-server])
> >> +
> >> +as northd
> >> +OVS_APP_EXIT_AND_WAIT([ovn-northd])
> >> +
> >> +as
> >> +OVS_TRAFFIC_VSWITCHD_STOP(["/failed to query port patch-.*/d
> >> +/connection dropped.*/d"])
> >> +
> >> +AT_CLEANUP
> >> diff --git a/utilities/ovn-nbctl.c b/utilities/ovn-nbctl.c
> >> index d7bb4b4..695b0c3 100644
> >> --- a/utilities/ovn-nbctl.c
> >> +++ b/utilities/ovn-nbctl.c
> >> @@ -2079,6 +2079,16 @@ nbctl_acl_list(struct ctl_context *ctx)
> >> return;
> >> }
> >>
> >> + if (ls) {
> >> + const char *stateful_bypass = smap_get(&ls->other_config,
> >> + "acl-stateful-bypass");
> >> +
> >> + if (stateful_bypass) {
> >> + ds_put_format(&ctx->output, "Stateful bypass rule match:
> >> %s\n",
> >> + stateful_bypass);
> >> + }
> >> + }
> >> +
> >> size_t n_acls = pg ? pg->n_acls : ls->n_acls;
> >> struct nbrec_acl **nb_acls = pg ? pg->acls : ls->acls;
> >>
> >> --
> >> 1.8.3.1
> >>
> >> _______________________________________________
> >> dev mailing list
> >> [email protected]
> >> https://mail.openvswitch.org/mailman/listinfo/ovs-dev
> >>
> >
>
> _______________________________________________
> dev mailing list
> [email protected]
> https://mail.openvswitch.org/mailman/listinfo/ovs-dev
>
_______________________________________________
dev mailing list
[email protected]
https://mail.openvswitch.org/mailman/listinfo/ovs-dev