On 12 Sep 2024, at 14:51, Farhan Tariq wrote:

> This patch introduces exact-match offload support in OVS to enhance
> optimization for hardware datapath. It's being submitted as a RFC for early
> feedback.
>
> Problem Statement:
> Currently, OVS offloads megaflows using tc-flower. However, hardware
> support for megaflows has certain limitations:
> a. TCAMs are used, which aren't scalable and support fewer rules.
> b. Using hash tables to emulate TCAMs results in complex and slow
> performance with many tables.
>
> Solution:
> This patch aims to add exact-match offload support to OVS, addressing
> these issues. This feature is an optional enhancement and is disabled by
> default.
>
> Implementation Details:
> 1. Added a new flag, "exact-match-offload" to "ovs-vsctl...other_config",
>    which is disabled by default but can be enabled at runtime. It works only
>    if megaflows are disabled.
> 2. Introduced the "enable_exact_match_offload()" API in
>    "/lib/netdev-offload-tc.c" to zero-out unused masks.
> 3. Moved "enable_megaflows" from "ofproto/ofproto-dpif-upcall.c" to "lib/-
>    netdev-offload-tc.c" to make it visible in the "netdev-offload-tc.c"
>    file. An alternate implementation could be to use a getter api.
>
> Testing results:
> Verified the basic functionality of the patch with different flows.
>
> Future Work:
> 1. Add support for connection tracking.
> 2. Add support for OVS-DPDK rte_flow.
>
> Note: This feature does not affect user OpenFlow tables.
> ---
> Potential Issue:
> This patch might not handle short-lived large numbers of new connections
> effectively. The last proposal (i.e. proposal #4 listed below) can address
> this issue. For now this feature is beneficial for many use cases, with
> further optimization for handling large numbers of short-lived connections
> planned for future work.
>
> Other Possible Solutions:
> Several approaches to implementing exact-match offload are being
> considered. Comments are welcome on the best approach:
> 1. Offload all fields except mutable and unpredictable ones (e.g., ttl,
>    tos, ip_frag, checksum). Megaflows are disabled.
> 2. Offload only essential fields (e.g., 5-tuple + VLAN IDs + tunnel info
>    etc). Megaflows are disabled.
> 3. Provide configuration options for users to select which fields to
>    offload. Megaflows are disabled.
> 4. Introduce an offloading path from the OVS datapath (kernel) module to
>    integrate megaflows and exact-match offload, addressing short-lived new
>    connection performance. Suggestions on this approach are welcome.
> --

Thanks, Farhat, for submitting the RFC series. I noticed something that might 
be a mistake in the subject line—it says "patch 1/3,” but only patch 1 was sent.

Regarding your patch, I see that it relies on disabling mega-flow support. 
However, this isn’t something typically done in a production environment. The 
‘ovs-appctl upcall/disable-megaflows’ command was originally introduced for 
debugging purposes when megaflows were first implemented. Since no one runs or 
tests with this disabled, it’s generally considered a debug-only command.

While I understand the value of your patch in simplifying hardware offloading 
of datapath flows, disabling megaflows doesn’t seem like a practical solution 
for production use. Perhaps it would be beneficial to collaborate with other 
hardware vendors within the OVS community to find a more viable approach. It 
might be worth reaching out to other vendors like, Broadcom, Marvell, NVIDIA, 
etc. on this mailing list.

Cheers,

Eelco

> Co-authored-by: Farhat Ullah 
> <[email protected]<mailto:[email protected]>>
> Signed-off-by: Farhat Ullah 
> <[email protected]<mailto:[email protected]>>
> Signed-off-by: Farhan Tariq 
> <[email protected]<mailto:[email protected]>>
>
> ---
>  Documentation/howto/tc-offload.rst |  56 +++++++
>  NEWS                               |   6 +
>  lib/netdev-offload-tc.c            | 236 +++++++++++++++++++----------
>  lib/netdev-offload.c               |   5 +
>  lib/netdev-offload.h               |   2 +
>  ofproto/ofproto-dpif-upcall.c      |   2 +-
>  tests/system-offloads-traffic.at<http://system-offloads-traffic.at>   |  24 
> +++
>  vswitchd/vswitch.xml               |  19 +++
>  8 files changed, 266 insertions(+), 84 deletions(-)
>
> diff --git a/Documentation/howto/tc-offload.rst 
> b/Documentation/howto/tc-offload.rst
> index ee7f73f8a..7e6e0e9d1 100644
> --- a/Documentation/howto/tc-offload.rst
> +++ b/Documentation/howto/tc-offload.rst
> @@ -37,6 +37,50 @@ The flow hardware offload is disabled by default and can 
> be enabled by::
>  TC flower has one additional configuration option caled ``tc-policy``.
>  For more details see ``man ovs-vswitchd.conf.db``.
>
> +
> +Exact Match Flow Hardware Offload
> +---------------------------------
> +
> +The exact match flow hardware offload is disabled by default and can be
> +enabled by::
> +
> +    $ ovs-appctl upcall/disable-megaflows
> +    $ ovs-vsctl set Open_vSwitch . other_config:hw-offload=true
> +    $ ovs-vsctl set Open_vSwitch . other_config:exact-match-offload=true
> +
> +For more details see ``man ovs-vswitchd.conf.db``.
> +
> +
> +Supported Fields
> +~~~~~~~~~~~~~~~~
> +
> +Not all fields are offloaded in the exact match flow. The list of supported
> +fields is:
> +
> +    ETH src_mac
> +    ETH dst_mac
> +    ETH eth_type
> +    ARP arp_sha
> +    ARP arp_tha
> +    ARP arp_sip
> +    ARP arp_tip
> +    ARP arp_op
> +    MPLS mpls_label
> +    VLAN eth_type
> +    VLAN id
> +    IPv4/v6 src_ip
> +    IPv4/v6 dst_ip
> +    IPv4/v6 ip_proto
> +    ICMPv4/v6 type
> +    ICMPv4/v6 code
> +    TCP/UDP/SCTP dst_port
> +    TCP/UDP/SCTP src_port
> +    TUNNEL enc_key_id
> +    TUNNEL enc_src_ip
> +    TUNNEL enc_dst_ip
> +    TUNNEL enc_dst_port
> +
> +
>  TC Meter Offload
>  ----------------
>
> @@ -123,3 +167,15 @@ might run through a TC rule, which internally will not 
> call the conntrack
>  helper required.
>
>  So if ALG support is required, tc offload must be disabled.
> +
> +Exact Match Flow Offload
> +~~~~~~~~~~~~~~~~~~~~~~~~
> +
> +This section describes the limitations related to the implementation of exact
> +match flow offload.
> +
> +Conntrack 'ct_state' field not offloaded
> +++++++++++++++++++++++++++++++++++++++++
> +
> +Currently, the exact match offloaded flow does not contain the ``ct_state``,
> +field. This will be addressed in a future patch.
> \ No newline at end of file
> diff --git a/NEWS b/NEWS
> index 944c95a8d..37d22630f 100644
> --- a/NEWS
> +++ b/NEWS
> @@ -1,3 +1,9 @@
> +Patch: v3.4.0 - 12 Sep 2024
> +---------------------------
> +  - Linux TC offload:
> +    * Add support for offloading exact-match flows.
> +
> +
>  v3.4.0 - 15 Aug 2024
>  --------------------
>     - Option '--mlockall' now only locks memory pages on fault, if possible.
> diff --git a/lib/netdev-offload-tc.c b/lib/netdev-offload-tc.c
> index 3be1c08d2..fe1334d2c 100644
> --- a/lib/netdev-offload-tc.c
> +++ b/lib/netdev-offload-tc.c
> @@ -32,6 +32,7 @@
>  #include "netdev-linux.h"
>  #include "netdev-offload-provider.h"
>  #include "netdev-provider.h"
> +#include "netdev-offload.h"
>  #include "netdev-vport.h"
>  #include "netlink.h"
>  #include "netlink-socket.h"
> @@ -44,6 +45,8 @@
>
>  VLOG_DEFINE_THIS_MODULE(netdev_offload_tc);
>
> +atomic_bool enable_megaflows = true;
> +
>  static struct vlog_rate_limit error_rl = VLOG_RATE_LIMIT_INIT(60, 5);
>  static struct vlog_rate_limit warn_rl = VLOG_RATE_LIMIT_INIT(10, 2);
>
> @@ -1682,6 +1685,41 @@ is_ipv6_fragment_and_masked(const struct flow *key, 
> const struct flow *mask)
>      return false;
>  }
>
> +/* Check and clear unused masks for exact-match offloads */
> +static int
> +test_key_and_clear_mask(struct match *match)
> +{
> +    const struct flow *key = &match->flow;
> +    struct flow *mask = &match->wc.masks;
> +
> +    for (int i = 1; i < FLOW_MAX_MPLS_LABELS; i++) {
> +        if (key->mpls_lse[i]) {
> +            return EOPNOTSUPP;
> +        }
> +    }
> +
> +    for (int i = 1; i < FLOW_N_REGS; i++) {
> +        if (key->regs[i]) {
> +            return EOPNOTSUPP;
> +        }
> +    }
> +
> +    if (is_ipv6_fragment_and_masked(key, mask)) {
> +        return EOPNOTSUPP;
> +    }
> +
> +    if (key->dp_hash || key->conj_id ||
> +        key->skb_priority || key->actset_output ||
> +        key->packet_type || key->metadata ||
> +        (key->dl_type == htons(ETH_TYPE_IP) &&
> +            key->nw_proto == IPPROTO_IGMP)) {
> +        return EOPNOTSUPP;
> +    }
> +
> +    memset(mask, 0, sizeof *mask);
> +    return 0;
> +}
> +
>  static int
>  test_key_and_mask(struct match *match)
>  {
> @@ -2220,6 +2258,9 @@ netdev_tc_flow_put(struct netdev *netdev, struct match 
> *match,
>      const struct flow_tnl *tnl = &match->flow.tunnel;
>      struct flow_tnl *tnl_mask = &mask->tunnel;
>      struct dpif_flow_stats adjust_stats;
> +    bool exact_match_offload = false;
> +    bool microflow = false;
> +    bool megaflow = false;
>      bool recirc_act = false;
>      uint32_t block_id = 0;
>      struct tcf_id id;
> @@ -2235,6 +2276,11 @@ netdev_tc_flow_put(struct netdev *netdev, struct match 
> *match,
>          return -ifindex;
>      }
>
> +    atomic_read_relaxed(&enable_megaflows, &megaflow);
> +    atomic_read_relaxed(&netdev_exact_match_offload, &microflow);
> +
> +    exact_match_offload = !megaflow & microflow;
> +
>      memset(&flower, 0, sizeof flower);
>
>      chain = key->recirc_id;
> @@ -2253,57 +2299,60 @@ netdev_tc_flow_put(struct netdev *netdev, struct 
> match *match,
>          flower.key.tunnel.ipv4.ipv4_dst = tnl->ip_dst;
>          flower.key.tunnel.ipv6.ipv6_src = tnl->ipv6_src;
>          flower.key.tunnel.ipv6.ipv6_dst = tnl->ipv6_dst;
> -        flower.key.tunnel.tos = tnl->ip_tos;
> -        flower.key.tunnel.ttl = tnl->ip_ttl;
> -        flower.key.tunnel.tp_src = tnl->tp_src;
>          flower.key.tunnel.tp_dst = tnl->tp_dst;
> -        flower.key.tunnel.gbp.id<http://flower.key.tunnel.gbp.id> = 
> tnl->gbp_id;
> -        flower.key.tunnel.gbp.flags = tnl->gbp_flags;
> -        flower.key.tunnel.gbp.id_present = !!tnl_mask->gbp_id;
> -
> +        flower.mask.tunnel.id<http://flower.mask.tunnel.id> = (tnl->flags & 
> FLOW_TNL_F_KEY) ? tnl_mask->tun_id : 0;
>          flower.mask.tunnel.ipv4.ipv4_src = tnl_mask->ip_src;
>          flower.mask.tunnel.ipv4.ipv4_dst = tnl_mask->ip_dst;
>          flower.mask.tunnel.ipv6.ipv6_src = tnl_mask->ipv6_src;
>          flower.mask.tunnel.ipv6.ipv6_dst = tnl_mask->ipv6_dst;
> -        flower.mask.tunnel.tos = tnl_mask->ip_tos;
> -        flower.mask.tunnel.ttl = tnl_mask->ip_ttl;
> -        flower.mask.tunnel.tp_src = tnl_mask->tp_src;
>          /* XXX: We should be setting the mask from 'tnl_mask->tp_dst' here, 
> but
>           * some hardware drivers (mlx5) doesn't support masked matches and 
> will
>           * refuse to offload such flows keeping them in software path.
>           * Degrading the flow down to exact match for now as a workaround. */
>          flower.mask.tunnel.tp_dst = OVS_BE16_MAX;
> -        flower.mask.tunnel.id<http://flower.mask.tunnel.id> = (tnl->flags & 
> FLOW_TNL_F_KEY) ? tnl_mask->tun_id : 0;
> -        flower.mask.tunnel.gbp.id<http://flower.mask.tunnel.gbp.id> = 
> tnl_mask->gbp_id;
> -        flower.mask.tunnel.gbp.flags = tnl_mask->gbp_flags;
> -        flower.mask.tunnel.gbp.id_present = !!tnl_mask->gbp_id;
> -
> -        memset(&tnl_mask->ip_src, 0, sizeof tnl_mask->ip_src);
> -        memset(&tnl_mask->ip_dst, 0, sizeof tnl_mask->ip_dst);
> -        memset(&tnl_mask->ipv6_src, 0, sizeof tnl_mask->ipv6_src);
> -        memset(&tnl_mask->ipv6_dst, 0, sizeof tnl_mask->ipv6_dst);
> -        memset(&tnl_mask->ip_tos, 0, sizeof tnl_mask->ip_tos);
> -        memset(&tnl_mask->ip_ttl, 0, sizeof tnl_mask->ip_ttl);
> -        memset(&tnl_mask->tp_src, 0, sizeof tnl_mask->tp_src);
> -        memset(&tnl_mask->tp_dst, 0, sizeof tnl_mask->tp_dst);
> -
> -        memset(&tnl_mask->tun_id, 0, sizeof tnl_mask->tun_id);
> -        memset(&tnl_mask->gbp_id, 0, sizeof tnl_mask->gbp_id);
> -        memset(&tnl_mask->gbp_flags, 0, sizeof tnl_mask->gbp_flags);
> -        tnl_mask->flags &= ~FLOW_TNL_F_KEY;
> -
> -        /* XXX: This is wrong!  We're ignoring DF and CSUM flags 
> configuration
> -         * requested by the user.  However, TC for now has no way to pass
> -         * these flags in a flower key and their masks are set by default,
> -         * meaning tunnel offloading will not work at all if not cleared.
> -         * Keeping incorrect behavior for now. */
> -        tnl_mask->flags &= ~(FLOW_TNL_F_DONT_FRAGMENT | FLOW_TNL_F_CSUM);
>
> -        if (!strcmp(netdev_get_type(netdev), "geneve")) {
> -            err = flower_match_to_tun_opt(&flower, tnl, tnl_mask);
> -            if (err) {
> -                VLOG_WARN_RL(&warn_rl, "Unable to parse geneve options");
> -                return err;
> +        if (!exact_match_offload) {
> +            flower.key.tunnel.tos = tnl->ip_tos;
> +            flower.key.tunnel.ttl = tnl->ip_ttl;
> +            flower.key.tunnel.tp_src = tnl->tp_src;
> +            flower.key.tunnel.gbp.id<http://flower.key.tunnel.gbp.id> = 
> tnl->gbp_id;
> +            flower.key.tunnel.gbp.flags = tnl->gbp_flags;
> +            flower.key.tunnel.gbp.id_present = !!tnl_mask->gbp_id;
> +
> +            flower.mask.tunnel.tos = tnl_mask->ip_tos;
> +            flower.mask.tunnel.ttl = tnl_mask->ip_ttl;
> +            flower.mask.tunnel.tp_src = tnl_mask->tp_src;
> +            flower.mask.tunnel.gbp.id<http://flower.mask.tunnel.gbp.id> = 
> tnl_mask->gbp_id;
> +            flower.mask.tunnel.gbp.flags = tnl_mask->gbp_flags;
> +            flower.mask.tunnel.gbp.id_present = !!tnl_mask->gbp_id;
> +
> +            memset(&tnl_mask->ip_src, 0, sizeof tnl_mask->ip_src);
> +            memset(&tnl_mask->ip_dst, 0, sizeof tnl_mask->ip_dst);
> +            memset(&tnl_mask->ipv6_src, 0, sizeof tnl_mask->ipv6_src);
> +            memset(&tnl_mask->ipv6_dst, 0, sizeof tnl_mask->ipv6_dst);
> +            memset(&tnl_mask->ip_tos, 0, sizeof tnl_mask->ip_tos);
> +            memset(&tnl_mask->ip_ttl, 0, sizeof tnl_mask->ip_ttl);
> +            memset(&tnl_mask->tp_src, 0, sizeof tnl_mask->tp_src);
> +            memset(&tnl_mask->tp_dst, 0, sizeof tnl_mask->tp_dst);
> +
> +            memset(&tnl_mask->tun_id, 0, sizeof tnl_mask->tun_id);
> +            memset(&tnl_mask->gbp_id, 0, sizeof tnl_mask->gbp_id);
> +            memset(&tnl_mask->gbp_flags, 0, sizeof tnl_mask->gbp_flags);
> +            tnl_mask->flags &= ~FLOW_TNL_F_KEY;
> +
> +            /* XXX: This is wrong!  We're ignoring DF and CSUM flags 
> configuration
> +            * requested by the user.  However, TC for now has no way to pass
> +            * these flags in a flower key and their masks are set by default,
> +            * meaning tunnel offloading will not work at all if not cleared.
> +            * Keeping incorrect behavior for now. */
> +            tnl_mask->flags &= ~(FLOW_TNL_F_DONT_FRAGMENT | FLOW_TNL_F_CSUM);
> +
> +            if (!strcmp(netdev_get_type(netdev), "geneve")) {
> +                err = flower_match_to_tun_opt(&flower, tnl, tnl_mask);
> +                if (err) {
> +                    VLOG_WARN_RL(&warn_rl, "Unable to parse geneve options");
> +                    return err;
> +                }
>              }
>          }
>          flower.tunnel = true;
> @@ -2319,6 +2368,12 @@ netdev_tc_flow_put(struct netdev *netdev, struct match 
> *match,
>          flower.key.mpls_lse = key->mpls_lse[0];
>          flower.mask.mpls_lse = mask->mpls_lse[0];
>          flower.key.encap_eth_type[0] = flower.key.eth_type;
> +
> +        if (exact_match_offload) {
> +            /* Enable mpls_label only */
> +            flower.key.mpls_lse &= ntohl(MPLS_LABEL_MASK);
> +            flower.mask.mpls_lse &= ntohl(MPLS_LABEL_MASK);
> +        }
>      }
>      mask->mpls_lse[0] = 0;
>
> @@ -2342,7 +2397,8 @@ netdev_tc_flow_put(struct netdev *netdev, struct match 
> *match,
>                  flower.mask.vlan_id[0] = vlan_tci_to_vid(mask->vlans[0].tci);
>                  VLOG_DBG_RL(&rl, "vlan_id[0]: %d\n", flower.key.vlan_id[0]);
>              }
> -            if (pcp_mask) {
> +            /* Skip vlan_prio in case of exact-match offload */
> +            if (pcp_mask && !exact_match_offload) {
>                  flower.key.vlan_prio[0] = vlan_tci_to_pcp(key->vlans[0].tci);
>                  flower.mask.vlan_prio[0] = 
> vlan_tci_to_pcp(mask->vlans[0].tci);
>                  VLOG_DBG_RL(&rl, "vlan_prio[0]: %d\n",
> @@ -2377,7 +2433,7 @@ netdev_tc_flow_put(struct netdev *netdev, struct match 
> *match,
>                  flower.mask.vlan_id[1] = vlan_tci_to_vid(mask->vlans[1].tci);
>                  VLOG_DBG_RL(&rl, "vlan_id[1]: %d", flower.key.vlan_id[1]);
>              }
> -            if (pcp_mask) {
> +            if (pcp_mask && !exact_match_offload) {
>                  flower.key.vlan_prio[1] = vlan_tci_to_pcp(key->vlans[1].tci);
>                  flower.mask.vlan_prio[1] = 
> vlan_tci_to_pcp(mask->vlans[1].tci);
>                  VLOG_DBG_RL(&rl, "vlan_prio[1]: %d", 
> flower.key.vlan_prio[1]);
> @@ -2421,44 +2477,34 @@ netdev_tc_flow_put(struct netdev *netdev, struct 
> match *match,
>      }
>
>      if (is_ip_any(key) && !is_ipv6_fragment_and_masked(key, mask)) {
> +
> +        if (key->dl_type == htons(ETH_P_IP)) {
> +            flower.key.ipv4.ipv4_src = key->nw_src;
> +            flower.mask.ipv4.ipv4_src = mask->nw_src;
> +            flower.key.ipv4.ipv4_dst = key->nw_dst;
> +            flower.mask.ipv4.ipv4_dst = mask->nw_dst;
> +            mask->nw_src = 0;
> +            mask->nw_dst = 0;
> +        } else if (key->dl_type == htons(ETH_P_IPV6)) {
> +            flower.key.ipv6.ipv6_src = key->ipv6_src;
> +            flower.mask.ipv6.ipv6_src = mask->ipv6_src;
> +            flower.key.ipv6.ipv6_dst = key->ipv6_dst;
> +            flower.mask.ipv6.ipv6_dst = mask->ipv6_dst;
> +            memset(&mask->ipv6_src, 0, sizeof mask->ipv6_src);
> +            memset(&mask->ipv6_dst, 0, sizeof mask->ipv6_dst);
> +        }
> +
>          flower.key.ip_proto = key->nw_proto;
>          flower.mask.ip_proto = mask->nw_proto;
>          mask->nw_proto = 0;
> -        flower.key.ip_tos = key->nw_tos;
> -        flower.mask.ip_tos = mask->nw_tos;
> -        mask->nw_tos = 0;
> -        flower.key.ip_ttl = key->nw_ttl;
> -        flower.mask.ip_ttl = mask->nw_ttl;
> -        mask->nw_ttl = 0;
> -
> -        if (mask->nw_frag & FLOW_NW_FRAG_ANY) {
> -            flower.mask.flags |= TCA_FLOWER_KEY_FLAGS_IS_FRAGMENT;
> -
> -            if (key->nw_frag & FLOW_NW_FRAG_ANY) {
> -                flower.key.flags |= TCA_FLOWER_KEY_FLAGS_IS_FRAGMENT;
> -
> -                if (mask->nw_frag & FLOW_NW_FRAG_LATER) {
> -                    flower.mask.flags |= TCA_FLOWER_KEY_FLAGS_FRAG_IS_FIRST;
> -
> -                    if (!(key->nw_frag & FLOW_NW_FRAG_LATER)) {
> -                        flower.key.flags |= 
> TCA_FLOWER_KEY_FLAGS_FRAG_IS_FIRST;
> -                    }
> -                }
> -            }
> -
> -            mask->nw_frag = 0;
> -        }
>
>          if (key->nw_proto == IPPROTO_TCP) {
>              flower.key.tcp_dst = key->tp_dst;
>              flower.mask.tcp_dst = mask->tp_dst;
>              flower.key.tcp_src = key->tp_src;
>              flower.mask.tcp_src = mask->tp_src;
> -            flower.key.tcp_flags = key->tcp_flags;
> -            flower.mask.tcp_flags = mask->tcp_flags;
>              mask->tp_src = 0;
>              mask->tp_dst = 0;
> -            mask->tcp_flags = 0;
>          } else if (key->nw_proto == IPPROTO_UDP) {
>              flower.key.udp_dst = key->tp_dst;
>              flower.mask.udp_dst = mask->tp_dst;
> @@ -2483,20 +2529,37 @@ netdev_tc_flow_put(struct netdev *netdev, struct 
> match *match,
>              mask->tp_dst = 0;
>          }
>
> -        if (key->dl_type == htons(ETH_P_IP)) {
> -            flower.key.ipv4.ipv4_src = key->nw_src;
> -            flower.mask.ipv4.ipv4_src = mask->nw_src;
> -            flower.key.ipv4.ipv4_dst = key->nw_dst;
> -            flower.mask.ipv4.ipv4_dst = mask->nw_dst;
> -            mask->nw_src = 0;
> -            mask->nw_dst = 0;
> -        } else if (key->dl_type == htons(ETH_P_IPV6)) {
> -            flower.key.ipv6.ipv6_src = key->ipv6_src;
> -            flower.mask.ipv6.ipv6_src = mask->ipv6_src;
> -            flower.key.ipv6.ipv6_dst = key->ipv6_dst;
> -            flower.mask.ipv6.ipv6_dst = mask->ipv6_dst;
> -            memset(&mask->ipv6_src, 0, sizeof mask->ipv6_src);
> -            memset(&mask->ipv6_dst, 0, sizeof mask->ipv6_dst);
> +        if (!exact_match_offload) {
> +            flower.key.ip_tos = key->nw_tos;
> +            flower.mask.ip_tos = mask->nw_tos;
> +            flower.key.ip_ttl = key->nw_ttl;
> +            flower.mask.ip_ttl = mask->nw_ttl;
> +            mask->nw_tos = 0;
> +            mask->nw_ttl = 0;
> +
> +            if (mask->nw_frag & FLOW_NW_FRAG_ANY) {
> +                flower.mask.flags |= TCA_FLOWER_KEY_FLAGS_IS_FRAGMENT;
> +
> +                if (key->nw_frag & FLOW_NW_FRAG_ANY) {
> +                    flower.key.flags |= TCA_FLOWER_KEY_FLAGS_IS_FRAGMENT;
> +
> +                    if (mask->nw_frag & FLOW_NW_FRAG_LATER) {
> +                        flower.mask.flags |= 
> TCA_FLOWER_KEY_FLAGS_FRAG_IS_FIRST;
> +
> +                        if (!(key->nw_frag & FLOW_NW_FRAG_LATER)) {
> +                            flower.key.flags |= 
> TCA_FLOWER_KEY_FLAGS_FRAG_IS_FIRST;
> +                        }
> +                    }
> +                }
> +
> +                mask->nw_frag = 0;
> +            }
> +
> +            if (key->nw_proto == IPPROTO_TCP) {
> +                flower.key.tcp_flags = key->tcp_flags;
> +                flower.mask.tcp_flags = mask->tcp_flags;
> +                mask->tcp_flags = 0;
> +            }
>          }
>      }
>
> @@ -2507,6 +2570,13 @@ netdev_tc_flow_put(struct netdev *netdev, struct match 
> *match,
>          mask->pkt_mark = 0;
>      }
>
> +    if (exact_match_offload) {
> +        err = test_key_and_clear_mask(match);
> +        if (err) {
> +            return err;
> +        }
> +    }
> +
>      err = test_key_and_mask(match);
>      if (err) {
>          return err;
> diff --git a/lib/netdev-offload.c b/lib/netdev-offload.c
> index 8a9d36555..349abcf9e 100644
> --- a/lib/netdev-offload.c
> +++ b/lib/netdev-offload.c
> @@ -60,6 +60,7 @@ VLOG_DEFINE_THIS_MODULE(netdev_offload);
>
>  static struct vlog_rate_limit rl = VLOG_RATE_LIMIT_INIT(1, 5);
>  static bool netdev_flow_api_enabled = false;
> +atomic_bool netdev_exact_match_offload = false;
>
>  #define DEFAULT_OFFLOAD_THREAD_NB 1
>  #define MAX_OFFLOAD_THREAD_NB 10
> @@ -902,6 +903,10 @@ netdev_set_flow_api_enabled(const struct smap 
> *ovs_other_config)
>                  netdev_offload_rebalance_policy = true;
>              }
>
> +            if (smap_get_bool(ovs_other_config, "exact-match-offload", 
> false)) {
> +                atomic_store_relaxed(&netdev_exact_match_offload, true);
> +            }
> +
>              netdev_ports_flow_init();
>
>              ovsthread_once_done(&once);
> diff --git a/lib/netdev-offload.h b/lib/netdev-offload.h
> index 47f8e6f48..004d6feaa 100644
> --- a/lib/netdev-offload.h
> +++ b/lib/netdev-offload.h
> @@ -43,6 +43,8 @@ struct smap;
>  struct sset;
>  struct ovs_action_push_tnl;
>
> +extern atomic_bool enable_megaflows;
> +extern atomic_bool netdev_exact_match_offload;
>
>  /* Offload-capable (HW) netdev information */
>  struct netdev_hw_info {
> diff --git a/ofproto/ofproto-dpif-upcall.c b/ofproto/ofproto-dpif-upcall.c
> index 4d39bc5a7..72a6a303c 100644
> --- a/ofproto/ofproto-dpif-upcall.c
> +++ b/ofproto/ofproto-dpif-upcall.c
> @@ -23,6 +23,7 @@
>  #include "coverage.h"
>  #include "cmap.h"
>  #include "lib/dpif-provider.h"
> +#include "lib/netdev-offload.h"
>  #include "dpif.h"
>  #include "openvswitch/dynamic-string.h"
>  #include "fail-open.h"
> @@ -446,7 +447,6 @@ static int udpif_flow_unprogram(struct udpif *udpif, 
> struct udpif_key *ukey,
>  static upcall_callback upcall_cb;
>  static dp_purge_callback dp_purge_cb;
>
> -static atomic_bool enable_megaflows = true;
>  static atomic_bool enable_ufid = true;
>
>  void
> diff --git 
> a/tests/system-offloads-traffic.at<http://system-offloads-traffic.at> 
> b/tests/system-offloads-traffic.at<http://system-offloads-traffic.at>
> index d1da33d96..345161f82 100644
> --- a/tests/system-offloads-traffic.at<http://system-offloads-traffic.at>
> +++ b/tests/system-offloads-traffic.at<http://system-offloads-traffic.at>
> @@ -93,6 +93,30 @@ AT_CHECK([ovs-appctl upcall/show | grep -E "offloaded 
> flows : [[1-9]]"], [0], [i
>  OVS_TRAFFIC_VSWITCHD_STOP
>  AT_CLEANUP
>
> +AT_SETUP([offloads - ping between two ports - offloads enabled 
> (exact-match)])
> +OVS_TRAFFIC_VSWITCHD_START([], [], [-- set Open_vSwitch . 
> other_config:hw-offload=true other_config:exact-match-offload=true])
> +   AT_CHECK([ovs-appctl upcall/disable-megaflows], [0], [megaflows disabled
> +], [])
> +
> +AT_CHECK([ovs-appctl vlog/set dpif:dbg dpif_netdev:dbg 
> netdev_offload_tc:file:dbg ofproto_dpif_upcall:DBG dpif_netlink:DBG])
> +
> +AT_CHECK([ovs-ofctl add-flow br0 "actions=normal"])
> +
> +ADD_NAMESPACES(at_ns0, at_ns1)
> +
> +ADD_VETH(p0, at_ns0, br0, "10.1.1.1/24<http://10.1.1.1/24> ")
> +ADD_VETH(p1, at_ns1, br0, "10.1.1.2/24<http://10.1.1.2/24> ")
> +
> +NS_CHECK_EXEC([at_ns0], [ping -q -c 3 -i 0.1 -w 2 10.1.1.2 | FORMAT_PING], 
> [0], [ignore])
> +
> +AT_CHECK([ovs-appctl dpctl/dump-flows type=tc | grep "eth_type(0x0800)" | 
> strip_stats | DUMP_CLEAN_SORTED], [0], [dnl
> +in_port(2),ct_zone(0),ct_mark(0),ct_label(0),eth(macs),eth_type(0x0800),ipv4(src=10.1.1.1,dst=10.1.1.2,proto=1),icmp(type=8,code=0),
>  packets:0, bytes:0, used:0.001s, actions:output
> +in_port(3),ct_zone(0),ct_mark(0),ct_label(0),eth(macs),eth_type(0x0800),ipv4(src=10.1.1.2,dst=10.1.1.1,proto=1),icmp(type=0,code=0),
>  packets:0, bytes:0, used:0.001s, actions:output
> +])
> +
> +OVS_TRAFFIC_VSWITCHD_STOP
> +AT_CLEANUP
> +
>  AT_SETUP([offloads - set ingress_policing_rate and ingress_policing_burst - 
> offloads disabled])
>  AT_KEYWORDS([ingress_policing])
>  OVS_CHECK_TC_QDISC()
> diff --git a/vswitchd/vswitch.xml b/vswitchd/vswitch.xml
> index 36cb4e495..90844b428 100644
> --- a/vswitchd/vswitch.xml
> +++ b/vswitchd/vswitch.xml
> @@ -254,6 +254,25 @@
>          </p>
>        </column>
>
> +      <column name="other_config" key="exact-match-offload"
> +              type='{"type": "boolean"}'>
> +        <p>
> +          Set this value to <code>true</code> to enable exact match netdev 
> flow
> +          offload.
> +        </p>
> +        <p>
> +          The default value is <code>false</code>.
> +        </p>
> +        <p>
> +          To offload exact match flows, the user needs to enable this
> +          feature along with disabling the megaflows.
> +        </p>
> +        <p>
> +          This is only relevant if
> +          <ref column="other_config" key="hw-offload"/> is enabled.
> +        </p>
> +      </column>
> +
>        <column name="other_config" key="n-offload-threads"
>                type='{"type": "integer", "minInteger": 1, "maxInteger": 10}'>
>          <p>
> --
> 2.27.0
>
> _______________________________________________
> dev mailing list
> [email protected]
> https://mail.openvswitch.org/mailman/listinfo/ovs-dev

_______________________________________________
dev mailing list
[email protected]
https://mail.openvswitch.org/mailman/listinfo/ovs-dev

Reply via email to