On Mon, Aug 17, 2020 at 8:57 PM Dumitru Ceara <[email protected]> wrote:
>
> On 8/17/20 4:32 PM, Numan Siddique wrote:
> > On Wed, Aug 12, 2020 at 5:52 PM Dumitru Ceara <[email protected]> wrote:
> >>
> >> A new configuration option is added to Logical_Switch:
> >> other_config:acl-stateful-bypass. This optional value determines which
> >> traffic should completely bypass connection tracking when ACLs are
> >> processed.
> >>
> >> In specific scenarios CMSs can predetermine which traffic must be
> >> firewalled statefully or not, e.g., UDP vs TCP. However, until now, if
> >> at least one stateful ACL (allow-related) is configured on the switch,
> >> all traffic gets sent to connection tracking. This induces a hit in
> >> performance when forwarding packets that don't need stateful processing.
> >>
> >> Signed-off-by: Dumitru Ceara <[email protected]>
> >
> > Hi Dumitru,
> >
> > Thanks for the patch. I have few comments.
> >
>
> Hi Numan,
>
> Thanks for the review.
>
> > 1. From what I understand, the packet now is not sent to the conntrack
> > in the ingress pipeline, but it
> >     is still sent in the egress pipeline
> >
>
> Actually, it shouldn't be sent to conntrack in the egress pipeline
> either if it matches the acl-stateful-bypass filter.

I think if the logical switch has a load balancer, then the packet
will go to conntrack in the egress
pipeline right ?

Thanks
Numan

>
> >    This patch breaks for the below OVN resources and ACL configuration.
> >
> >   ***
> > ovn-nbctl ls-add sw0
> > ovn-nbctl lsp-add sw0 sw0-port1
> > ovn-nbctl lsp-set-addresses sw0-port1 "50:54:00:00:00:03 10.0.0.3"
> >
> > ovn-nbctl lr-add lr0
> > ovn-nbctl lrp-add lr0 lr0-sw0 00:00:00:00:ff:01 10.0.0.1/24
> > ovn-nbctl lsp-add sw0 sw0-lr0
> > ovn-nbctl lsp-set-type sw0-lr0 router
> > ovn-nbctl lsp-set-addresses sw0-lr0 router
> > ovn-nbctl lsp-set-options sw0-lr0 router-port=lr0-sw0
> >
> > ovn-nbctl ls-add public
> > ovn-nbctl lrp-add lr0 lr0-public 00:00:20:20:12:13 172.16.0.100/24
> > ovn-nbctl lsp-add public public-lr0
> > ovn-nbctl lsp-set-type public-lr0 router
> > ovn-nbctl lsp-set-addresses public-lr0 router
> > ovn-nbctl lsp-set-options public-lr0 router-port=lr0-public
> >
> > # localnet port
> > ovn-nbctl lsp-add public ln-public
> > ovn-nbctl lsp-set-type ln-public localnet
> > ovn-nbctl lsp-set-addresses ln-public unknown
> > ovn-nbctl lsp-set-options ln-public network_name=public
> >
> > # schedule the gw router port to a chassis.
> > ovn-nbctl lrp-set-gateway-chassis lr0-public ovn-gw-1 20
> >
> > # Create NAT entries
> > ovn-nbctl lr-nat-add lr0 snat 172.16.0.100 10.0.0.0/24
> > ovn-nbctl lr-nat-add lr0 dnat_and_snat 172.16.0.110 10.0.0.3 sw0-port1
> > 30:54:00:00:00:03
> >
> > ovn-nbctl pg-add pg0 sw0-port1
> > ovn-nbctl pg-add pg0_drop sw0-port1
> >
> > ovn-nbctl acl-add pg0_drop from-lport 1001 "inport == @pg0_drop && ip" drop
> > ovn-nbctl acl-add pg0_drop to-lport 1001 "outport == @pg0_drop && ip" drop
> >
> > ovn-nbctl acl-add pg0 to-lport 1002 "outport == @pg0 && ip4 && tcp.dst
> > == 80" allow-related
> > ovn-nbctl set logical_switch sw0 other_config:acl-stateful-bypass="tcp"
> > *****
> >
> > Start a nc server on sw0-port1 and connect to the sw0-port1 nc server
> > from outside (i.e the nc client to 172.16.0.110 (North/South
> > traffic)).
> >
> > Without this patch, it works, but with this patch, the reply gets
> > dropped since the packet is not sent to the conntrack now.
> >
>
> I think this needs more documentation on my side. I might be wrong but I
> think we discussed this during the OVN community meeting with Han when I
> first suggested this approach: if the match of an allow-related ACL is a
> superset of the acl-stateful-bypass match then the ACL the ACL is
> implicitly converted to an "allow" ACL.
>
> The CMS should avoid such configurations. I think this is unfortunately
> the only way to go because the packets go to conntrack before the
> IN/OUT_ACL table. in the PRE_STATEFUL stage.
>
> In your case the SYN packet matches the ACL and is allowed while for the
> SYN-ACK packet there's no matching rule except for the drop rule.

It would be great to add more documentation on it.

>
> The configuration that would make this work is:
> ovn-nbctl set logical_switch sw0 other_config:acl-stateful-bypass="tcp"
> ovn-nbctl acl-add pg0_drop from-lport 1001 "inport == @pg0_drop && ip" drop
> ovn-nbctl acl-add pg0_drop to-lport 1001 "outport == @pg0_drop && ip" drop
> ovn-nbctl acl-add pg0 to-lport 1002 "outport == @pg0 && ip4 && tcp.dst
> == 80" allow
> ovn-nbctl acl-add pg0 to-lport 1002 "inport == @pg0 && ip4 && tcp.src ==
> 80" allow
>
> If there would be another allow-related ACL (e.g. for UDP):
> ovn-nbctl acl-add pg0 to-lport 1002 "outport == @pg0 && ip4 && udp.dst
> == 4242" allow-related
>
> Without this patch the two "allow" rules for TCP would implicitly get
> transformed to "allow-related" because all packets would go to conntrack
> and all ACLs are considered stateful.
>
> With this patch, TCP can be processed without any conntrack
> recirculation while for UDP we'd be using conntrack.
>
> >
> > 2. After this patch we see the below lflow for each ACL.
> >
> >     ****
> >     table=4 (ls_out_acl         ), priority=2002 , match=((reg0[7] ==
> > 1 || !ct.trk || (!ct.new && ct.est && !ct.rpl && ct_label.blocked ==
> > 0)) && (ACL MATCH)), action=(next;)
> >     ****
> >
> >     The above flow (with the match - "reg0[7] == 1 " will result in
> > one extra Open vSwitch flow for every ACL.
> >
> >      I am not sure if this can be avoided, but if we can find a way to
> > avoid it would be great. Ignore this comment if this is not possible.
> >
>
> I don't think it can be avoided because we'd have to parse the match
> expression of the ACL to determine if the "acl-stateful-bypass" is a
> subexpression of it.

My initial thought was to add a high prio (may be 65535) flow in n_acl/out_acl
stage with the action - next if reg0[7] ==1 because we don't need to send
the packet to conntrack. But then realized that we need to handle drop ACLs.

>
> >
> > 3. The CMS can set any match in the option - 
> > "other_config:acl-stateful-bypass".
> >       For example:
> >      ovn-nbctl set logical_switch sw0
> > other_config:acl-stateful-bypass="tcp && tcp.dst >= 1000 && tcp.dst <=
> > 1005"
> >
> >      Although it works, it seems a bit odd to me that a config option
> > can be a match.
> >      Can't we instead add another ACL action ? Maybe "allow-stateless"
> > ? I am not sure if this is feasible, but just a thought.
> >
>
> We could do that but then these "allow-stateless" ACLs would be
> translated to flows in the pre-acl table like the patch does for
> "acl-stateful-bypass" now. So some of the ACLs will generate flows in
> the IN/OUT_ACL table while others will generate flows in the PRE_ACL
> table. Is that acceptable?

I think it should be fine. But I won't press too hard on it.

>
> > 4. A Small minor comment. In the lflow-list, I see below flows. Can
> > you please correct the indentations and spacing.
> >     ********
> >     table=6 (ls_in_acl          ), priority=65535, match=(reg0[7] ==
> > 0&& (ct.inv || (ct.est && ct.rpl && ct_label.blocked == 1))),
> > action=(drop;)
> >   table=6 (ls_in_acl          ), priority=65535, match=(reg0[7]== 0 &&
> > !ct.est && ct.rel && !ct.new && !ct.inv && ct_label.blocked == 0),
> > action=(next;)
> >   table=6 (ls_in_acl          ), priority=65535, match=(reg0[7]== 0 &&
> > ct.est && !ct.rel && !ct.new && !ct.inv && ct.rpl && ct_label.blocked
> > == 0), action=(next;)
> >     ********
> >
> >     Notice - reg0[7] == 0&&... and "reg0[7]== 0 && ....
>
> Thanks, noted, I'll fix it in the next iteration.

Thanks
Numan

>
> Regards,
> Dumitru
>
> >
> > Thanks
> > Numan
> >
> >
> >
> >
> >> ---
> >>  NEWS                          |   3 +
> >>  northd/ovn-northd.8.xml       |  18 ++++++
> >>  northd/ovn-northd.c           | 114 ++++++++++++++++++++++++++++---------
> >>  ovn-nb.xml                    |  27 ++++++++-
> >>  tests/ovn-northd.at           | 129 
> >> ++++++++++++++++++++++++++++++++++++++++++
> >>  tests/system-common-macros.at |   8 +++
> >>  tests/system-ovn.at           | 111 ++++++++++++++++++++++++++++++++++++
> >>  utilities/ovn-nbctl.c         |  10 ++++
> >>  8 files changed, 390 insertions(+), 30 deletions(-)
> >>
> >> diff --git a/NEWS b/NEWS
> >> index a1ce4e8..0ef7905 100644
> >> --- a/NEWS
> >> +++ b/NEWS
> >> @@ -11,6 +11,9 @@ Post-v20.06.0
> >>       called Chassis_Private now contains the nb_cfg column which is 
> >> updated
> >>       by incrementing the value in the NB_Global table, CMSes relying on
> >>       this mechanism should update their code to use this new table.
> >> +   - Added support for bypassing connection tracking for ACL processing 
> >> for
> >> +     specific types of traffic through the user supplied 
> >> acl-stateful-bypass
> >> +     configuration.
> >>
> >>  OVN v20.06.0
> >>  --------------------------
> >> diff --git a/northd/ovn-northd.8.xml b/northd/ovn-northd.8.xml
> >> index ee21c82..fc315f8 100644
> >> --- a/northd/ovn-northd.8.xml
> >> +++ b/northd/ovn-northd.8.xml
> >> @@ -322,6 +322,15 @@
> >>      </p>
> >>
> >>      <p>
> >> +      A priority-105 flow which sets <code>reg0[7] = 1</code> for traffic
> >> +      that matches the condition set in
> >> +      <ref column="other_config:acl-stateful-bypass"/> and advances to 
> >> next
> >> +      table. <code>reg0[7]</code> acts as a hint for tables
> >> +      <code>Pre-Stateful</code> and <code>ACL</code> to avoid sending this
> >> +      traffic to the connection tracker.
> >> +    </p>
> >> +
> >> +    <p>
> >>        This table also has a priority-110 flow with the match
> >>        <code>eth.dst == <var>E</var></code> for all logical switch
> >>        datapaths to move traffic to the next table. Where <var>E</var>
> >> @@ -1372,6 +1381,15 @@ output;
> >>      </p>
> >>
> >>      <p>
> >> +      A priority-105 flow which sets <code>reg0[7] = 1</code> for traffic
> >> +      that matches the condition set in
> >> +      <ref column="other_config:acl-stateful-bypass"/> and advances to 
> >> next
> >> +      table. <code>reg0[7]</code> acts as a hint for tables
> >> +      <code>Pre-Stateful</code> and <code>ACL</code> to avoid sending this
> >> +      traffic to the connection tracker.
> >> +    </p>
> >> +
> >> +    <p>
> >>        This table also has a priority-110 flow with the match
> >>        <code>eth.src == <var>E</var></code> for all logical switch
> >>        datapaths to move traffic to the next table. Where <var>E</var>
> >> diff --git a/northd/ovn-northd.c b/northd/ovn-northd.c
> >> index dc45929..1ff9190 100644
> >> --- a/northd/ovn-northd.c
> >> +++ b/northd/ovn-northd.c
> >> @@ -211,6 +211,7 @@ enum ovn_stage {
> >>  #define REGBIT_DNS_LOOKUP_RESULT "reg0[4]"
> >>  #define REGBIT_ND_RA_OPTS_RESULT "reg0[5]"
> >>  #define REGBIT_HAIRPIN           "reg0[6]"
> >> +#define REGBIT_SKIP_ACL_CT       "reg0[7]"
> >>
> >>  /* Register definitions for switches and routers. */
> >>
> >> @@ -245,11 +246,11 @@ enum ovn_stage {
> >>   * OVS register usage:
> >>   *
> >>   * Logical Switch pipeline:
> >> - * +---------+-------------------------------------+
> >> - * | R0      | REGBIT_{CONNTRACK/DHCP/DNS/HAIRPIN} |
> >> - * +---------+-------------------------------------+
> >> - * | R1 - R9 |              UNUSED                 |
> >> - * +---------+-------------------------------------+
> >> + * +---------+-------------------------------------------------+
> >> + * | R0      | REGBIT_{CONNTRACK/DHCP/DNS/HAIRPIN/SKIP_ACL_CT} |
> >> + * +---------+-------------------------------------------------+
> >> + * | R1 - R9 |              UNUSED                             |
> >> + * +---------+-------------------------------------------------+
> >>   *
> >>   * Logical Router pipeline:
> >>   * 
> >> +-----+--------------------------+---+-----------------+---+---------------+
> >> @@ -4713,6 +4714,18 @@ has_stateful_acl(struct ovn_datapath *od)
> >>      return false;
> >>  }
> >>
> >> +static const char *
> >> +get_stateful_acl_bypass(struct ovn_datapath *od)
> >> +{
> >> +    return smap_get(&od->nbs->other_config, "acl-stateful-bypass");
> >> +}
> >> +
> >> +static bool
> >> +has_stateful_acl_bypass(struct ovn_datapath *od)
> >> +{
> >> +    return !!get_stateful_acl_bypass(od);
> >> +}
> >> +
> >>  static void
> >>  build_lswitch_input_port_sec(struct hmap *ports, struct hmap *datapaths,
> >>                               struct hmap *lflows)
> >> @@ -4926,6 +4939,19 @@ build_pre_acls(struct ovn_datapath *od, struct hmap 
> >> *lflows)
> >>                        "nd || nd_rs || nd_ra || "
> >>                        "(udp && udp.src == 546 && udp.dst == 547)", 
> >> "next;");
> >>
> >> +        /* Ingress and Egress Pre-ACL Table (Priority 105).
> >> +         *
> >> +         * If the logical switch is configured to bypass conntrack for
> >> +         * specific types of traffic, skip conntrack for that traffic.
> >> +         */
> >> +        const char *stateful_bypass = get_stateful_acl_bypass(od);
> >> +        if (stateful_bypass) {
> >> +            ovn_lflow_add(lflows, od, S_SWITCH_IN_PRE_ACL, 105,
> >> +                          stateful_bypass, REGBIT_SKIP_ACL_CT" = 1; 
> >> next;");
> >> +            ovn_lflow_add(lflows, od, S_SWITCH_OUT_PRE_ACL, 105,
> >> +                          stateful_bypass, REGBIT_SKIP_ACL_CT" = 1; 
> >> next;");
> >> +        }
> >> +
> >>          /* Ingress and Egress Pre-ACL Table (Priority 100).
> >>           *
> >>           * Regardless of whether the ACL is "from-lport" or "to-lport",
> >> @@ -5260,7 +5286,8 @@ build_reject_acl_rules(struct ovn_datapath *od, 
> >> struct hmap *lflows,
> >>
> >>  static void
> >>  consider_acl(struct hmap *lflows, struct ovn_datapath *od,
> >> -             struct nbrec_acl *acl, bool has_stateful)
> >> +             struct nbrec_acl *acl, bool has_stateful,
> >> +             bool has_stateful_bypass)
> >>  {
> >>      bool ingress = !strcmp(acl->direction, "from-lport") ? true :false;
> >>      enum ovn_stage stage = ingress ? S_SWITCH_IN_ACL : S_SWITCH_OUT_ACL;
> >> @@ -5285,7 +5312,19 @@ consider_acl(struct hmap *lflows, struct 
> >> ovn_datapath *od,
> >>              struct ds match = DS_EMPTY_INITIALIZER;
> >>              struct ds actions = DS_EMPTY_INITIALIZER;
> >>
> >> -            /* Commit the connection tracking entry if it's a new
> >> +            /* If traffic matched the acl-stateful-bypass rule, we don't
> >> +             * need to commit the connection tracking entry.
> >> +             */
> >> +            if (has_stateful_bypass) {
> >> +                ds_put_format(&match, "(" REGBIT_SKIP_ACL_CT "== 1 && 
> >> (%s)",
> >> +                              acl->match);
> >> +                build_acl_log(&actions, acl);
> >> +                ds_put_format(&actions, "next;");
> >> +                ds_clear(&match);
> >> +                ds_clear(&actions);
> >> +            }
> >> +
> >> +            /* Otherwise commit the connection tracking entry if it's a 
> >> new
> >>               * connection that matches this ACL.  After this commit,
> >>               * the reply traffic is allowed by a flow we create at
> >>               * priority 65535, defined earlier.
> >> @@ -5297,10 +5336,11 @@ consider_acl(struct hmap *lflows, struct 
> >> ovn_datapath *od,
> >>               * by ct_commit in the "stateful" stage) to indicate that the
> >>               * connection should be allowed to resume.
> >>               */
> >> -            ds_put_format(&match, "((ct.new && !ct.est)"
> >> -                                  " || (!ct.new && ct.est && !ct.rpl "
> >> -                                       "&& ct_label.blocked == 1)) "
> >> -                                  "&& (%s)", acl->match);
> >> +            ds_put_format(&match, REGBIT_SKIP_ACL_CT " == 0 "
> >> +                          "&& ((ct.new && !ct.est)"
> >> +                          " || (!ct.new && ct.est && !ct.rpl "
> >> +                               "&& ct_label.blocked == 1)) "
> >> +                          "&& (%s)", acl->match);
> >>              ds_put_cstr(&actions, REGBIT_CONNTRACK_COMMIT" = 1; ");
> >>              build_acl_log(&actions, acl);
> >>              ds_put_cstr(&actions, "next;");
> >> @@ -5315,11 +5355,16 @@ consider_acl(struct hmap *lflows, struct 
> >> ovn_datapath *od,
> >>               * deletion.  There is no need to commit here, so we can just
> >>               * proceed to the next table. We use this to ensure that this
> >>               * connection is still allowed by the currently defined
> >> -             * policy. Match untracked packets too. */
> >> +             * policy. Match untracked packets too.
> >> +             *
> >> +             * This flow also allows traffic that matches the
> >> +             * acl-stateful-bypass rule.
> >> +             */
> >>              ds_clear(&match);
> >>              ds_clear(&actions);
> >>              ds_put_format(&match,
> >> -                          "(!ct.trk || (!ct.new && ct.est && !ct.rpl"
> >> +                          "(" REGBIT_SKIP_ACL_CT " == 1 || !ct.trk || "
> >> +                          "(!ct.new && ct.est && !ct.rpl"
> >>                            " && ct_label.blocked == 0)) && (%s)",
> >>                            acl->match);
> >>
> >> @@ -5346,7 +5391,7 @@ consider_acl(struct hmap *lflows, struct 
> >> ovn_datapath *od,
> >>              /* If the packet is not tracked or not part of an established
> >>               * connection, then we can simply reject/drop it. */
> >>              ds_put_cstr(&match,
> >> -                        "(!ct.trk || !ct.est"
> >> +                        "(" REGBIT_SKIP_ACL_CT " == 1 || !ct.trk || 
> >> !ct.est"
> >>                          " || (ct.est && ct_label.blocked == 1))");
> >>              if (!strcmp(acl->action, "reject")) {
> >>                  build_reject_acl_rules(od, lflows, stage, acl, &match,
> >> @@ -5373,7 +5418,8 @@ consider_acl(struct hmap *lflows, struct 
> >> ovn_datapath *od,
> >>               */
> >>              ds_clear(&match);
> >>              ds_clear(&actions);
> >> -            ds_put_cstr(&match, "ct.est && ct_label.blocked == 0");
> >> +            ds_put_cstr(&match, REGBIT_SKIP_ACL_CT " == 0 "
> >> +                        "&& ct.est && ct_label.blocked == 0");
> >>              ds_put_cstr(&actions, "ct_commit { ct_label.blocked = 1; }; 
> >> ");
> >>              if (!strcmp(acl->action, "reject")) {
> >>                  build_reject_acl_rules(od, lflows, stage, acl, &match,
> >> @@ -5478,6 +5524,7 @@ build_acls(struct ovn_datapath *od, struct hmap 
> >> *lflows,
> >>             struct hmap *port_groups)
> >>  {
> >>      bool has_stateful = has_stateful_acl(od);
> >> +    bool has_stateful_bypass = has_stateful_acl_bypass(od);
> >>
> >>      /* Ingress and Egress ACL Table (Priority 0): Packets are allowed by
> >>       * default.  A related rule at priority 1 is added below if there
> >> @@ -5508,11 +5555,15 @@ build_acls(struct ovn_datapath *od, struct hmap 
> >> *lflows,
> >>           * Subsequent packets will hit the flow at priority 0 that just
> >>           * uses "next;". */
> >>          ovn_lflow_add(lflows, od, S_SWITCH_IN_ACL, 1,
> >> -                      "ip && (!ct.est || (ct.est && ct_label.blocked == 
> >> 1))",
> >> -                       REGBIT_CONNTRACK_COMMIT" = 1; next;");
> >> +                      REGBIT_SKIP_ACL_CT " == 0 "
> >> +                      "&& ip "
> >> +                      "&& (!ct.est || (ct.est && ct_label.blocked == 1))",
> >> +                      REGBIT_CONNTRACK_COMMIT" = 1; next;");
> >>          ovn_lflow_add(lflows, od, S_SWITCH_OUT_ACL, 1,
> >> -                      "ip && (!ct.est || (ct.est && ct_label.blocked == 
> >> 1))",
> >> -                       REGBIT_CONNTRACK_COMMIT" = 1; next;");
> >> +                      REGBIT_SKIP_ACL_CT " == 0 "
> >> +                      "&& ip "
> >> +                      "&& (!ct.est || (ct.est && ct_label.blocked == 1))",
> >> +                      REGBIT_CONNTRACK_COMMIT" = 1; next;");
> >>
> >>          /* Ingress and Egress ACL Table (Priority 65535).
> >>           *
> >> @@ -5522,10 +5573,14 @@ build_acls(struct ovn_datapath *od, struct hmap 
> >> *lflows,
> >>           *
> >>           * This is enforced at a higher priority than ACLs can be 
> >> defined. */
> >>          ovn_lflow_add(lflows, od, S_SWITCH_IN_ACL, UINT16_MAX,
> >> -                      "ct.inv || (ct.est && ct.rpl && ct_label.blocked == 
> >> 1)",
> >> +                      REGBIT_SKIP_ACL_CT " == 0"
> >> +                      "&& (ct.inv "
> >> +                           "|| (ct.est && ct.rpl && ct_label.blocked == 
> >> 1))",
> >>                        "drop;");
> >>          ovn_lflow_add(lflows, od, S_SWITCH_OUT_ACL, UINT16_MAX,
> >> -                      "ct.inv || (ct.est && ct.rpl && ct_label.blocked == 
> >> 1)",
> >> +                      REGBIT_SKIP_ACL_CT " == 0 "
> >> +                      "&& (ct.inv "
> >> +                           "|| (ct.est && ct.rpl && ct_label.blocked == 
> >> 1))",
> >>                        "drop;");
> >>
> >>          /* Ingress and Egress ACL Table (Priority 65535).
> >> @@ -5538,11 +5593,13 @@ build_acls(struct ovn_datapath *od, struct hmap 
> >> *lflows,
> >>           *
> >>           * This is enforced at a higher priority than ACLs can be 
> >> defined. */
> >>          ovn_lflow_add(lflows, od, S_SWITCH_IN_ACL, UINT16_MAX,
> >> -                      "ct.est && !ct.rel && !ct.new && !ct.inv "
> >> +                      REGBIT_SKIP_ACL_CT "== 0 "
> >> +                      "&& ct.est && !ct.rel && !ct.new && !ct.inv "
> >>                        "&& ct.rpl && ct_label.blocked == 0",
> >>                        "next;");
> >>          ovn_lflow_add(lflows, od, S_SWITCH_OUT_ACL, UINT16_MAX,
> >> -                      "ct.est && !ct.rel && !ct.new && !ct.inv "
> >> +                      REGBIT_SKIP_ACL_CT "== 0 "
> >> +                      "&& ct.est && !ct.rel && !ct.new && !ct.inv "
> >>                        "&& ct.rpl && ct_label.blocked == 0",
> >>                        "next;");
> >>
> >> @@ -5558,11 +5615,13 @@ build_acls(struct ovn_datapath *od, struct hmap 
> >> *lflows,
> >>           * related traffic such as an ICMP Port Unreachable through
> >>           * that's generated from a non-listening UDP port.  */
> >>          ovn_lflow_add(lflows, od, S_SWITCH_IN_ACL, UINT16_MAX,
> >> -                      "!ct.est && ct.rel && !ct.new && !ct.inv "
> >> +                      REGBIT_SKIP_ACL_CT "== 0 "
> >> +                      "&& !ct.est && ct.rel && !ct.new && !ct.inv "
> >>                        "&& ct_label.blocked == 0",
> >>                        "next;");
> >>          ovn_lflow_add(lflows, od, S_SWITCH_OUT_ACL, UINT16_MAX,
> >> -                      "!ct.est && ct.rel && !ct.new && !ct.inv "
> >> +                      REGBIT_SKIP_ACL_CT "== 0 "
> >> +                      "&& !ct.est && ct.rel && !ct.new && !ct.inv "
> >>                        "&& ct_label.blocked == 0",
> >>                        "next;");
> >>
> >> @@ -5578,13 +5637,14 @@ build_acls(struct ovn_datapath *od, struct hmap 
> >> *lflows,
> >>      /* Ingress or Egress ACL Table (Various priorities). */
> >>      for (size_t i = 0; i < od->nbs->n_acls; i++) {
> >>          struct nbrec_acl *acl = od->nbs->acls[i];
> >> -        consider_acl(lflows, od, acl, has_stateful);
> >> +        consider_acl(lflows, od, acl, has_stateful, has_stateful_bypass);
> >>      }
> >>      struct ovn_port_group *pg;
> >>      HMAP_FOR_EACH (pg, key_node, port_groups) {
> >>          if (ovn_port_group_ls_find(pg, &od->nbs->header_.uuid)) {
> >>              for (size_t i = 0; i < pg->nb_pg->n_acls; i++) {
> >> -                consider_acl(lflows, od, pg->nb_pg->acls[i], 
> >> has_stateful);
> >> +                consider_acl(lflows, od, pg->nb_pg->acls[i], has_stateful,
> >> +                             has_stateful_bypass);
> >>              }
> >>          }
> >>      }
> >> diff --git a/ovn-nb.xml b/ovn-nb.xml
> >> index 9f3621d..f3e368e 100644
> >> --- a/ovn-nb.xml
> >> +++ b/ovn-nb.xml
> >> @@ -271,9 +271,30 @@
> >>        ip addresses.
> >>      </column>
> >>
> >> -    <column name="acls">
> >> -      Access control rules that apply to packets within the logical 
> >> switch.
> >> -    </column>
> >> +    <group title="ACL processing">
> >> +      <column name="acls">
> >> +        Access control rules that apply to packets within the logical 
> >> switch.
> >> +      </column>
> >> +
> >> +      <column name="other_config" key="acl-stateful-bypass">
> >> +        Set this value to a match expression to be used to determine for 
> >> which
> >> +        traffic the ACL processing should be stateless, without 
> >> recirculating
> >> +        through connection tracking, regardless of the type of ACL that is
> >> +        hit.
> >> +
> >> +        In normal operation, whenever an ACL associated to a 
> >> Logical_Switch
> >> +        has action <code>allow-related</code>, all IP traffic gets sent
> >> +        to conntrack and related traffic is allowed.
> >> +
> >> +        If <ref column="other_config" key="acl-stateful-bypass"/> is set 
> >> to
> >> +        <code>E</code> all <code>allow</code> and 
> >> <code>allow-related</code>
> >> +        ACLs that match packets for which <code>E</code> is true are 
> >> applied
> >> +        in a stateless way, without recirculating through connection 
> >> tracking.
> >> +
> >> +        This is useful when some specific types of traffic do not need
> >> +        stateful processing.
> >> +      </column>
> >> +    </group>
> >>
> >>      <column name="qos_rules">
> >>        QoS marking and metering rules that apply to packets within the
> >> diff --git a/tests/ovn-northd.at b/tests/ovn-northd.at
> >> index 8344c7f..7aa13a1 100644
> >> --- a/tests/ovn-northd.at
> >> +++ b/tests/ovn-northd.at
> >> @@ -1781,3 +1781,132 @@ AT_CHECK([ovn-sbctl lflow-list | grep 
> >> "ls_out_pre_lb.*priority=100" | grep reg0
> >>  ])
> >>
> >>  AT_CLEANUP
> >> +
> >> +AT_SETUP([ovn -- ACL Stateful Bypass])
> >> +ovn_start
> >> +
> >> +ovn-nbctl ls-add ls
> >> +ovn-nbctl lsp-add ls lsp1
> >> +ovn-nbctl lsp-set-addresses lsp1 00:00:00:00:00:01
> >> +ovn-nbctl lsp-add ls lsp2
> >> +ovn-nbctl lsp-set-addresses lsp2 00:00:00:00:00:02
> >> +
> >> +ovn-nbctl acl-add ls from-lport 3 "tcp" allow
> >> +ovn-nbctl acl-add ls from-lport 2 "udp" allow-related
> >> +ovn-nbctl acl-add ls from-lport 1 "ip" drop
> >> +ovn-nbctl --wait=sb sync
> >> +
> >> +flow_eth='eth.src == 00:00:00:00:00:01 && eth.dst == 00:00:00:00:00:02'
> >> +flow_ip='ip.ttl==64 && ip4.src == 42.42.42.1 && ip4.dst == 66.66.66.66'
> >> +flow_tcp='tcp && tcp.dst == 80'
> >> +flow_udp='udp && udp.dst == 80'
> >> +
> >> +# TCP packets should go to conntrack.
> >> +flow="inport == \"lsp1\" && ${flow_eth} && ${flow_ip} && ${flow_tcp}"
> >> +AT_CHECK([ovn-trace --ct new --ct new --minimal ls "${flow}"], [0], [dnl
> >> +# 
> >> tcp,reg14=0x1,vlan_tci=0x0000,dl_src=00:00:00:00:00:01,dl_dst=00:00:00:00:00:02,nw_src=42.42.42.1,nw_dst=66.66.66.66,nw_tos=0,nw_ecn=0,nw_ttl=64,tp_src=0,tp_dst=80,tcp_flags=0
> >> +ct_next(ct_state=new|trk) {
> >> +    ct_next(ct_state=new|trk) {
> >> +        output("lsp2");
> >> +    };
> >> +};
> >> +])
> >> +
> >> +# UDP packets should go to conntrack.
> >> +flow="inport == \"lsp1\" && ${flow_eth} && ${flow_ip} && ${flow_udp}"
> >> +AT_CHECK([ovn-trace --ct new --ct new --minimal ls "${flow}"], [0], [dnl
> >> +# 
> >> udp,reg14=0x1,vlan_tci=0x0000,dl_src=00:00:00:00:00:01,dl_dst=00:00:00:00:00:02,nw_src=42.42.42.1,nw_dst=66.66.66.66,nw_tos=0,nw_ecn=0,nw_ttl=64,tp_src=0,tp_dst=80
> >> +ct_next(ct_state=new|trk) {
> >> +    ct_next(ct_state=new|trk) {
> >> +        output("lsp2");
> >> +    };
> >> +};
> >> +])
> >> +
> >> +# Enable Stateful Bypass for TCP.
> >> +ovn-nbctl set logical_switch ls other_config:acl-stateful-bypass="tcp"
> >> +
> >> +# TCP packets should not go to conntrack anymore.
> >> +flow="inport == \"lsp1\" && ${flow_eth} && ${flow_ip} && ${flow_tcp}"
> >> +AT_CHECK([ovn-trace --minimal ls "${flow}"], [0], [dnl
> >> +# 
> >> tcp,reg14=0x1,vlan_tci=0x0000,dl_src=00:00:00:00:00:01,dl_dst=00:00:00:00:00:02,nw_src=42.42.42.1,nw_dst=66.66.66.66,nw_tos=0,nw_ecn=0,nw_ttl=64,tp_src=0,tp_dst=80,tcp_flags=0
> >> +output("lsp2");
> >> +])
> >> +
> >> +# UDP packets still go to conntrack.
> >> +flow="inport == \"lsp1\" && ${flow_eth} && ${flow_ip} && ${flow_udp}"
> >> +AT_CHECK([ovn-trace --ct new --ct new --minimal ls "${flow}"], [0], [dnl
> >> +# 
> >> udp,reg14=0x1,vlan_tci=0x0000,dl_src=00:00:00:00:00:01,dl_dst=00:00:00:00:00:02,nw_src=42.42.42.1,nw_dst=66.66.66.66,nw_tos=0,nw_ecn=0,nw_ttl=64,tp_src=0,tp_dst=80
> >> +ct_next(ct_state=new|trk) {
> >> +    ct_next(ct_state=new|trk) {
> >> +        output("lsp2");
> >> +    };
> >> +};
> >> +])
> >> +
> >> +# Add a load balancer.
> >> +ovn-nbctl lb-add lb-tcp 66.66.66.66:80 42.42.42.2:8080 tcp
> >> +ovn-nbctl lb-add lb-udp 66.66.66.66:80 42.42.42.2:8080 udp
> >> +ovn-nbctl ls-lb-add ls lb-tcp
> >> +ovn-nbctl ls-lb-add ls lb-udp
> >> +
> >> +# Disable Stateful Bypass for TCP.
> >> +ovn-nbctl remove logical_switch ls other_config acl-stateful-bypass
> >> +ovn-nbctl --wait=sb sync
> >> +
> >> +# TCP packets should go to conntrack.
> >> +flow="inport == \"lsp1\" && ${flow_eth} && ${flow_ip} && ${flow_tcp}"
> >> +AT_CHECK([ovn-trace --ct new --ct new --minimal ls "${flow}"], [0], [dnl
> >> +# 
> >> tcp,reg14=0x1,vlan_tci=0x0000,dl_src=00:00:00:00:00:01,dl_dst=00:00:00:00:00:02,nw_src=42.42.42.1,nw_dst=66.66.66.66,nw_tos=0,nw_ecn=0,nw_ttl=64,tp_src=0,tp_dst=80,tcp_flags=0
> >> +ct_next(ct_state=new|trk) {
> >> +    ct_lb {
> >> +        ct_next(ct_state=new|trk) {
> >> +            output("lsp2");
> >> +        };
> >> +    };
> >> +};
> >> +])
> >> +
> >> +# UDP packets should go to conntrack.
> >> +flow="inport == \"lsp1\" && ${flow_eth} && ${flow_ip} && ${flow_udp}"
> >> +AT_CHECK([ovn-trace --ct new --ct new --minimal ls "${flow}"], [0], [dnl
> >> +# 
> >> udp,reg14=0x1,vlan_tci=0x0000,dl_src=00:00:00:00:00:01,dl_dst=00:00:00:00:00:02,nw_src=42.42.42.1,nw_dst=66.66.66.66,nw_tos=0,nw_ecn=0,nw_ttl=64,tp_src=0,tp_dst=80
> >> +ct_next(ct_state=new|trk) {
> >> +    ct_lb {
> >> +        ct_next(ct_state=new|trk) {
> >> +            output("lsp2");
> >> +        };
> >> +    };
> >> +};
> >> +])
> >> +
> >> +# Enable Stateful Bypass for TCP.
> >> +ovn-nbctl set logical_switch ls other_config:acl-stateful-bypass="tcp"
> >> +
> >> +# TCP packets should go to conntrack for load balancing.
> >> +flow="inport == \"lsp1\" && ${flow_eth} && ${flow_ip} && ${flow_tcp}"
> >> +AT_CHECK([ovn-trace --ct new --ct new --minimal ls "${flow}"], [0], [dnl
> >> +# 
> >> tcp,reg14=0x1,vlan_tci=0x0000,dl_src=00:00:00:00:00:01,dl_dst=00:00:00:00:00:02,nw_src=42.42.42.1,nw_dst=66.66.66.66,nw_tos=0,nw_ecn=0,nw_ttl=64,tp_src=0,tp_dst=80,tcp_flags=0
> >> +ct_next(ct_state=new|trk) {
> >> +    ct_lb {
> >> +        ct_next(ct_state=new|trk) {
> >> +            output("lsp2");
> >> +        };
> >> +    };
> >> +};
> >> +])
> >> +
> >> +# UDP packets still go to conntrack.
> >> +flow="inport == \"lsp1\" && ${flow_eth} && ${flow_ip} && ${flow_udp}"
> >> +AT_CHECK([ovn-trace --ct new --ct new --minimal ls "${flow}"], [0], [dnl
> >> +# 
> >> udp,reg14=0x1,vlan_tci=0x0000,dl_src=00:00:00:00:00:01,dl_dst=00:00:00:00:00:02,nw_src=42.42.42.1,nw_dst=66.66.66.66,nw_tos=0,nw_ecn=0,nw_ttl=64,tp_src=0,tp_dst=80
> >> +ct_next(ct_state=new|trk) {
> >> +    ct_lb {
> >> +        ct_next(ct_state=new|trk) {
> >> +            output("lsp2");
> >> +        };
> >> +    };
> >> +};
> >> +])
> >> +
> >> +AT_CLEANUP
> >> diff --git a/tests/system-common-macros.at b/tests/system-common-macros.at
> >> index c8fa6f0..65904ed 100644
> >> --- a/tests/system-common-macros.at
> >> +++ b/tests/system-common-macros.at
> >> @@ -234,6 +234,14 @@ m4_define([FORMAT_PING], [grep "transmitted" | sed 
> >> 's/time.*ms$/time 0ms/'])
> >>  #
> >>  m4_define([STRIP_MONITOR_CSUM], [grep "csum:" | sed 's/csum:.*/csum: 
> >> <skip>/'])
> >>
> >> +# FORMAT_CT_STATE([ip-addr])
> >> +#
> >> +# Strip content from the piped input which would differ from test to test
> >> +# and limit the output to the rows containing 'ip-addr'. Don't strip 
> >> state.
> >> +#
> >> +m4_define([FORMAT_CT_STATE],
> >> +    [[grep "dst=$1" | sed -e 's/port=[0-9]*/port=<cleared>/g' -e 
> >> 's/id=[0-9]*/id=<cleared>/g' | sort | uniq]])
> >> +
> >>  # FORMAT_CT([ip-addr])
> >>  #
> >>  # Strip content from the piped input which would differ from test to test
> >> diff --git a/tests/system-ovn.at b/tests/system-ovn.at
> >> index 40ba6e4..583bca4 100644
> >> --- a/tests/system-ovn.at
> >> +++ b/tests/system-ovn.at
> >> @@ -5397,3 +5397,114 @@ as
> >>  OVS_TRAFFIC_VSWITCHD_STOP(["/.*error receiving.*/d
> >>  /.*terminating with signal 15.*/d"])
> >>  AT_CLEANUP
> >> +
> >> +AT_SETUP([ovn -- ACL Stateful Bypass + Load balancer])
> >> +AT_SKIP_IF([test $HAVE_NC = no])
> >> +AT_KEYWORDS([lb])
> >> +AT_KEYWORDS([conntrack])
> >> +ovn_start
> >> +
> >> +OVS_TRAFFIC_VSWITCHD_START()
> >> +ADD_BR([br-int])
> >> +
> >> +# Set external-ids in br-int needed for ovn-controller
> >> +ovs-vsctl \
> >> +        -- set Open_vSwitch . external-ids:system-id=hv1 \
> >> +        -- set Open_vSwitch . 
> >> external-ids:ovn-remote=unix:$ovs_base/ovn-sb/ovn-sb.sock \
> >> +        -- set Open_vSwitch . external-ids:ovn-encap-type=geneve \
> >> +        -- set Open_vSwitch . external-ids:ovn-encap-ip=169.0.0.1 \
> >> +        -- set bridge br-int fail-mode=secure 
> >> other-config:disable-in-band=true
> >> +
> >> +# Start ovn-controller
> >> +start_daemon ovn-controller
> >> +
> >> +# Logical network:
> >> +# One logical switch with a load balancer with one backend.
> >> +# On the LS we add "allow" ACLs for TCP and "allow-related" ACLs for UDP.
> >> +# The "allow-related" ACL normally forces all traffic to go to conntrack.
> >> +# We enable ACL stateful bypass for TCP so TCP traffic should not be
> >> +# sent to conntrack for ACLs (only for LB).
> >> +
> >> +ovn-nbctl ls-add ls
> >> +ovn-nbctl lsp-add ls lsp1
> >> +ovn-nbctl lsp-set-addresses lsp1 00:00:00:00:00:01
> >> +ovn-nbctl lsp-add ls lsp2
> >> +ovn-nbctl lsp-set-addresses lsp2 00:00:00:00:00:02
> >> +
> >> +ovn-nbctl acl-add ls from-lport 3 "tcp" allow
> >> +ovn-nbctl acl-add ls from-lport 2 "udp" allow-related
> >> +ovn-nbctl acl-add ls from-lport 1 "ip" drop
> >> +
> >> +ovn-nbctl lr-add rtr
> >> +ovn-nbctl lrp-add rtr rtr-ls 00:00:00:00:01:00 42.42.42.254/24
> >> +ovn-nbctl lsp-add ls ls-rtr                       \
> >> +    -- lsp-set-type ls-rtr router                 \
> >> +    -- lsp-set-addresses ls-rtr 00:00:00:00:01:00 \
> >> +    -- lsp-set-options ls-rtr router-port=rtr-ls
> >> +
> >> +# Add a load balancer.
> >> +ovn-nbctl lb-add lb-tcp 66.66.66.66:80 42.42.42.2:8080 tcp
> >> +ovn-nbctl lb-add lb-udp 66.66.66.66:80 42.42.42.2:8080 udp
> >> +ovn-nbctl ls-lb-add ls lb-tcp
> >> +ovn-nbctl ls-lb-add ls lb-udp
> >> +
> >> +# Enable Stateful Bypass for TCP.
> >> +ovn-nbctl set logical_switch ls other_config:acl-stateful-bypass="tcp"
> >> +
> >> +ADD_NAMESPACES(lsp1)
> >> +ADD_VETH(lsp1, lsp1, br-int, "42.42.42.1/24", "00:00:00:00:00:01", \
> >> +         "42.42.42.254")
> >> +
> >> +ADD_NAMESPACES(lsp2)
> >> +ADD_VETH(lsp2, lsp2, br-int, "42.42.42.2/24", "00:00:00:00:00:02", \
> >> +         "42.42.42.254")
> >> +
> >> +ovn-nbctl --wait=hv sync
> >> +
> >> +# Start a UDP server on lsp2.
> >> +NETNS_DAEMONIZE([lsp2], [nc -l --no-shutdown -u 42.42.42.2 8080], 
> >> [nc2.pid])
> >> +
> >> +# Start a UDP connection.
> >> +NS_CHECK_EXEC([lsp1], [echo "foo" | nc --no-shutdown -u 66.66.66.66 80])
> >> +
> >> +# There should be 2 UDP conntrack entries:
> >> +# - one for the allow-related ACL.
> >> +# - one for the LB dnat.
> >> +OVS_WAIT_UNTIL([test "$(ovs-appctl dpctl/dump-conntrack | grep udp | grep 
> >> '42.42.42.1' -c)" = "2"])
> >> +
> >> +AT_CHECK([ovs-appctl dpctl/dump-conntrack | FORMAT_CT_STATE(42.42.42.1) | 
> >> grep udp | \
> >> +sed -e 's/zone=[[0-9]]*/zone=<cleared>/'], [0], [dnl
> >> +udp,orig=(src=42.42.42.1,dst=42.42.42.2,sport=<cleared>,dport=<cleared>),reply=(src=42.42.42.2,dst=42.42.42.1,sport=<cleared>,dport=<cleared>),zone=<cleared>
> >> +udp,orig=(src=42.42.42.1,dst=66.66.66.66,sport=<cleared>,dport=<cleared>),reply=(src=42.42.42.2,dst=42.42.42.1,sport=<cleared>,dport=<cleared>),zone=<cleared>,labels=0x2
> >> +])
> >> +
> >> +# Start a TCP server on lsp2.
> >> +NETNS_DAEMONIZE([lsp2], [nc -l --no-shutdown 42.42.42.2 8080], [nc0.pid])
> >> +
> >> +# Start a TCP connection.
> >> +NETNS_DAEMONIZE([lsp1], [nc --no-shutdown 66.66.66.66 80], [nc1.pid])
> >> +
> >> +OVS_WAIT_UNTIL([test "$(ovs-appctl dpctl/dump-conntrack | grep tcp | grep 
> >> '42.42.42.1' -c)" = "1"])
> >> +
> >> +# There should be only one TCP conntrack entry, for the LB dnat.
> >> +AT_CHECK([ovs-appctl dpctl/dump-conntrack | FORMAT_CT_STATE(42.42.42.1) | 
> >> grep tcp | \
> >> +sed -e 's/zone=[[0-9]]*/zone=<cleared>/'], [0], [dnl
> >> +tcp,orig=(src=42.42.42.1,dst=66.66.66.66,sport=<cleared>,dport=<cleared>),reply=(src=42.42.42.2,dst=42.42.42.1,sport=<cleared>,dport=<cleared>),zone=<cleared>,labels=0x2,protoinfo=(state=ESTABLISHED)
> >> +])
> >> +
> >> +OVS_APP_EXIT_AND_WAIT([ovn-controller])
> >> +
> >> +as ovn-sb
> >> +OVS_APP_EXIT_AND_WAIT([ovsdb-server])
> >> +
> >> +as ovn-nb
> >> +OVS_APP_EXIT_AND_WAIT([ovsdb-server])
> >> +
> >> +as northd
> >> +OVS_APP_EXIT_AND_WAIT([ovn-northd])
> >> +
> >> +as
> >> +OVS_TRAFFIC_VSWITCHD_STOP(["/failed to query port patch-.*/d
> >> +/connection dropped.*/d"])
> >> +
> >> +AT_CLEANUP
> >> diff --git a/utilities/ovn-nbctl.c b/utilities/ovn-nbctl.c
> >> index d7bb4b4..695b0c3 100644
> >> --- a/utilities/ovn-nbctl.c
> >> +++ b/utilities/ovn-nbctl.c
> >> @@ -2079,6 +2079,16 @@ nbctl_acl_list(struct ctl_context *ctx)
> >>          return;
> >>      }
> >>
> >> +    if (ls) {
> >> +        const char *stateful_bypass = smap_get(&ls->other_config,
> >> +                                               "acl-stateful-bypass");
> >> +
> >> +        if (stateful_bypass) {
> >> +            ds_put_format(&ctx->output, "Stateful bypass rule match: 
> >> %s\n",
> >> +                          stateful_bypass);
> >> +        }
> >> +    }
> >> +
> >>      size_t n_acls = pg ? pg->n_acls : ls->n_acls;
> >>      struct nbrec_acl **nb_acls = pg ? pg->acls : ls->acls;
> >>
> >> --
> >> 1.8.3.1
> >>
> >> _______________________________________________
> >> dev mailing list
> >> [email protected]
> >> https://mail.openvswitch.org/mailman/listinfo/ovs-dev
> >>
> >
>
> _______________________________________________
> dev mailing list
> [email protected]
> https://mail.openvswitch.org/mailman/listinfo/ovs-dev
>

_______________________________________________
dev mailing list
[email protected]
https://mail.openvswitch.org/mailman/listinfo/ovs-dev

Reply via email to