On 25 Sep 2024, at 11:46, Farhat Ullah wrote:

> ________________________________
> From: dev <[email protected]> on behalf of Eelco Chaudron 
> <[email protected]>
> Sent: Friday, September 20, 2024 4:46 PM
> To: Farhan Tariq <[email protected]>
> Cc: [email protected] <[email protected]>
> Subject: Re: [ovs-dev] [RFC PATCH 1/3] exact-match-offload: Introducing 
> Exact-Match HW Offload Support for OVS
>
>
>
> On 12 Sep 2024, at 14:51, Farhan Tariq wrote:
>
>> This patch introduces exact-match offload support in OVS to enhance
>> optimization for hardware datapath. It's being submitted as a RFC for early
>> feedback.
>>
>> Problem Statement:
>> Currently, OVS offloads megaflows using tc-flower. However, hardware
>> support for megaflows has certain limitations:
>> a. TCAMs are used, which aren't scalable and support fewer rules.
>> b. Using hash tables to emulate TCAMs results in complex and slow
>> performance with many tables.
>>
>> Solution:
>> This patch aims to add exact-match offload support to OVS, addressing
>> these issues. This feature is an optional enhancement and is disabled by
>> default.
>>
>> Implementation Details:
>> 1. Added a new flag, "exact-match-offload" to "ovs-vsctl...other_config",
>>    which is disabled by default but can be enabled at runtime. It works only
>>    if megaflows are disabled.
>> 2. Introduced the "enable_exact_match_offload()" API in
>>    "/lib/netdev-offload-tc.c" to zero-out unused masks.
>> 3. Moved "enable_megaflows" from "ofproto/ofproto-dpif-upcall.c" to "lib/-
>>    netdev-offload-tc.c" to make it visible in the "netdev-offload-tc.c"
>>    file. An alternate implementation could be to use a getter api.
>>
>> Testing results:
>> Verified the basic functionality of the patch with different flows.
>>
>> Future Work:
>> 1. Add support for connection tracking.
>> 2. Add support for OVS-DPDK rte_flow.
>>
>> Note: This feature does not affect user OpenFlow tables.
>> ---
>> Potential Issue:
>> This patch might not handle short-lived large numbers of new connections
>> effectively. The last proposal (i.e. proposal #4 listed below) can address
>> this issue. For now this feature is beneficial for many use cases, with
>> further optimization for handling large numbers of short-lived connections
>> planned for future work.
>>
>> Other Possible Solutions:
>> Several approaches to implementing exact-match offload are being
>> considered. Comments are welcome on the best approach:
>> 1. Offload all fields except mutable and unpredictable ones (e.g., ttl,
>>    tos, ip_frag, checksum). Megaflows are disabled.
>> 2. Offload only essential fields (e.g., 5-tuple + VLAN IDs + tunnel info
>>    etc). Megaflows are disabled.
>> 3. Provide configuration options for users to select which fields to
>>    offload. Megaflows are disabled.
>> 4. Introduce an offloading path from the OVS datapath (kernel) module to
>>    integrate megaflows and exact-match offload, addressing short-lived new
>>    connection performance. Suggestions on this approach are welcome.
>> --
>
> Thanks, Eelco,
>
> Thanks, Farhat, for submitting the RFC series. I noticed something that might 
> be a mistake in the subject line—it says "patch 1/3,” but only patch 1 was 
> sent.
> We are planning to submit 2 more patches in the future. I think I 
> misunderstood the use of series of patches.
>
> Regarding your patch, I see that it relies on disabling mega-flow support. 
> However, this isn’t something typically done in a production environment. The 
> ‘ovs-appctl upcall/disable-megaflows’ command was originally introduced for 
> debugging purposes when megaflows were first implemented. Since no one runs 
> or tests with this disabled, it’s generally considered a debug-only command.
>
> While I understand the value of your patch in simplifying hardware offloading 
> of datapath flows, disabling megaflows doesn’t seem like a practical solution 
> for production use. Perhaps it would be beneficial to collaborate with other 
> hardware vendors within the OVS community to find a more viable approach. It 
> might be worth reaching out to other vendors like, Broadcom, Marvell, NVIDIA, 
> etc. on this mailing list.

> What's your take on introducing offload path from megaflows to microflow and 
> microflow to tc-flower in the datapath (kernel) module.

I’m not entirely sure what you're proposing from an OVS architecture 
perspective, but it seems like it might require a new dpif implementation. 
Alternatively, we could consider adding this as a type of offload. That said, 
adding this to the existing netdev-offload-tc.c implementation might lead to a 
hacky solution, so you need a new type.

However, to support multiple offload types per netdev, a significant 
architectural change is needed to separate the offload framework. You might 
find it helpful to refer to the long discussion on the SFlow patch series 
between Ilya and Chris. I've CCed Ilya in case he wants to provide additional 
input.

> Regards,
> Farhat
>
> Cheers,
>
> Eelco
>
>> Co-authored-by: Farhat Ullah 
>> <[email protected]<mailto:[email protected]>>
>> Signed-off-by: Farhat Ullah 
>> <[email protected]<mailto:[email protected]>>
>> Signed-off-by: Farhan Tariq 
>> <[email protected]<mailto:[email protected]>>
>>
>> ---
>>  Documentation/howto/tc-offload.rst |  56 +++++++
>>  NEWS                               |   6 +
>>  lib/netdev-offload-tc.c            | 236 +++++++++++++++++++----------
>>  lib/netdev-offload.c               |   5 +
>>  lib/netdev-offload.h               |   2 +
>>  ofproto/ofproto-dpif-upcall.c      |   2 +-
>>  tests/system-offloads-traffic.at<http://system-offloads-traffic.at>   |  24 
>> +++
>>  vswitchd/vswitch.xml               |  19 +++
>>  8 files changed, 266 insertions(+), 84 deletions(-)
>>
>> diff --git a/Documentation/howto/tc-offload.rst 
>> b/Documentation/howto/tc-offload.rst
>> index ee7f73f8a..7e6e0e9d1 100644
>> --- a/Documentation/howto/tc-offload.rst
>> +++ b/Documentation/howto/tc-offload.rst
>> @@ -37,6 +37,50 @@ The flow hardware offload is disabled by default and can 
>> be enabled by::
>>  TC flower has one additional configuration option caled ``tc-policy``.
>>  For more details see ``man ovs-vswitchd.conf.db``.
>>
>> +
>> +Exact Match Flow Hardware Offload
>> +---------------------------------
>> +
>> +The exact match flow hardware offload is disabled by default and can be
>> +enabled by::
>> +
>> +    $ ovs-appctl upcall/disable-megaflows
>> +    $ ovs-vsctl set Open_vSwitch . other_config:hw-offload=true
>> +    $ ovs-vsctl set Open_vSwitch . other_config:exact-match-offload=true
>> +
>> +For more details see ``man ovs-vswitchd.conf.db``.
>> +
>> +
>> +Supported Fields
>> +~~~~~~~~~~~~~~~~
>> +
>> +Not all fields are offloaded in the exact match flow. The list of supported
>> +fields is:
>> +
>> +    ETH src_mac
>> +    ETH dst_mac
>> +    ETH eth_type
>> +    ARP arp_sha
>> +    ARP arp_tha
>> +    ARP arp_sip
>> +    ARP arp_tip
>> +    ARP arp_op
>> +    MPLS mpls_label
>> +    VLAN eth_type
>> +    VLAN id
>> +    IPv4/v6 src_ip
>> +    IPv4/v6 dst_ip
>> +    IPv4/v6 ip_proto
>> +    ICMPv4/v6 type
>> +    ICMPv4/v6 code
>> +    TCP/UDP/SCTP dst_port
>> +    TCP/UDP/SCTP src_port
>> +    TUNNEL enc_key_id
>> +    TUNNEL enc_src_ip
>> +    TUNNEL enc_dst_ip
>> +    TUNNEL enc_dst_port
>> +
>> +
>>  TC Meter Offload
>>  ----------------
>>
>> @@ -123,3 +167,15 @@ might run through a TC rule, which internally will not 
>> call the conntrack
>>  helper required.
>>
>>  So if ALG support is required, tc offload must be disabled.
>> +
>> +Exact Match Flow Offload
>> +~~~~~~~~~~~~~~~~~~~~~~~~
>> +
>> +This section describes the limitations related to the implementation of 
>> exact
>> +match flow offload.
>> +
>> +Conntrack 'ct_state' field not offloaded
>> +++++++++++++++++++++++++++++++++++++++++
>> +
>> +Currently, the exact match offloaded flow does not contain the ``ct_state``,
>> +field. This will be addressed in a future patch.
>> \ No newline at end of file
>> diff --git a/NEWS b/NEWS
>> index 944c95a8d..37d22630f 100644
>> --- a/NEWS
>> +++ b/NEWS
>> @@ -1,3 +1,9 @@
>> +Patch: v3.4.0 - 12 Sep 2024
>> +---------------------------
>> +  - Linux TC offload:
>> +    * Add support for offloading exact-match flows.
>> +
>> +
>>  v3.4.0 - 15 Aug 2024
>>  --------------------
>>     - Option '--mlockall' now only locks memory pages on fault, if possible.
>> diff --git a/lib/netdev-offload-tc.c b/lib/netdev-offload-tc.c
>> index 3be1c08d2..fe1334d2c 100644
>> --- a/lib/netdev-offload-tc.c
>> +++ b/lib/netdev-offload-tc.c
>> @@ -32,6 +32,7 @@
>>  #include "netdev-linux.h"
>>  #include "netdev-offload-provider.h"
>>  #include "netdev-provider.h"
>> +#include "netdev-offload.h"
>>  #include "netdev-vport.h"
>>  #include "netlink.h"
>>  #include "netlink-socket.h"
>> @@ -44,6 +45,8 @@
>>
>>  VLOG_DEFINE_THIS_MODULE(netdev_offload_tc);
>>
>> +atomic_bool enable_megaflows = true;
>> +
>>  static struct vlog_rate_limit error_rl = VLOG_RATE_LIMIT_INIT(60, 5);
>>  static struct vlog_rate_limit warn_rl = VLOG_RATE_LIMIT_INIT(10, 2);
>>
>> @@ -1682,6 +1685,41 @@ is_ipv6_fragment_and_masked(const struct flow *key, 
>> const struct flow *mask)
>>      return false;
>>  }
>>
>> +/* Check and clear unused masks for exact-match offloads */
>> +static int
>> +test_key_and_clear_mask(struct match *match)
>> +{
>> +    const struct flow *key = &match->flow;
>> +    struct flow *mask = &match->wc.masks;
>> +
>> +    for (int i = 1; i < FLOW_MAX_MPLS_LABELS; i++) {
>> +        if (key->mpls_lse[i]) {
>> +            return EOPNOTSUPP;
>> +        }
>> +    }
>> +
>> +    for (int i = 1; i < FLOW_N_REGS; i++) {
>> +        if (key->regs[i]) {
>> +            return EOPNOTSUPP;
>> +        }
>> +    }
>> +
>> +    if (is_ipv6_fragment_and_masked(key, mask)) {
>> +        return EOPNOTSUPP;
>> +    }
>> +
>> +    if (key->dp_hash || key->conj_id ||
>> +        key->skb_priority || key->actset_output ||
>> +        key->packet_type || key->metadata ||
>> +        (key->dl_type == htons(ETH_TYPE_IP) &&
>> +            key->nw_proto == IPPROTO_IGMP)) {
>> +        return EOPNOTSUPP;
>> +    }
>> +
>> +    memset(mask, 0, sizeof *mask);
>> +    return 0;
>> +}
>> +
>>  static int
>>  test_key_and_mask(struct match *match)
>>  {
>> @@ -2220,6 +2258,9 @@ netdev_tc_flow_put(struct netdev *netdev, struct match 
>> *match,
>>      const struct flow_tnl *tnl = &match->flow.tunnel;
>>      struct flow_tnl *tnl_mask = &mask->tunnel;
>>      struct dpif_flow_stats adjust_stats;
>> +    bool exact_match_offload = false;
>> +    bool microflow = false;
>> +    bool megaflow = false;
>>      bool recirc_act = false;
>>      uint32_t block_id = 0;
>>      struct tcf_id id;
>> @@ -2235,6 +2276,11 @@ netdev_tc_flow_put(struct netdev *netdev, struct 
>> match *match,
>>          return -ifindex;
>>      }
>>
>> +    atomic_read_relaxed(&enable_megaflows, &megaflow);
>> +    atomic_read_relaxed(&netdev_exact_match_offload, &microflow);
>> +
>> +    exact_match_offload = !megaflow & microflow;
>> +
>>      memset(&flower, 0, sizeof flower);
>>
>>      chain = key->recirc_id;
>> @@ -2253,57 +2299,60 @@ netdev_tc_flow_put(struct netdev *netdev, struct 
>> match *match,
>>          flower.key.tunnel.ipv4.ipv4_dst = tnl->ip_dst;
>>          flower.key.tunnel.ipv6.ipv6_src = tnl->ipv6_src;
>>          flower.key.tunnel.ipv6.ipv6_dst = tnl->ipv6_dst;
>> -        flower.key.tunnel.tos = tnl->ip_tos;
>> -        flower.key.tunnel.ttl = tnl->ip_ttl;
>> -        flower.key.tunnel.tp_src = tnl->tp_src;
>>          flower.key.tunnel.tp_dst = tnl->tp_dst;
>> -        flower.key.tunnel.gbp.id<http://flower.key.tunnel.gbp.id> = 
>> tnl->gbp_id;
>> -        flower.key.tunnel.gbp.flags = tnl->gbp_flags;
>> -        flower.key.tunnel.gbp.id_present = !!tnl_mask->gbp_id;
>> -
>> +        flower.mask.tunnel.id<http://flower.mask.tunnel.id> = (tnl->flags & 
>> FLOW_TNL_F_KEY) ? tnl_mask->tun_id : 0;
>>          flower.mask.tunnel.ipv4.ipv4_src = tnl_mask->ip_src;
>>          flower.mask.tunnel.ipv4.ipv4_dst = tnl_mask->ip_dst;
>>          flower.mask.tunnel.ipv6.ipv6_src = tnl_mask->ipv6_src;
>>          flower.mask.tunnel.ipv6.ipv6_dst = tnl_mask->ipv6_dst;
>> -        flower.mask.tunnel.tos = tnl_mask->ip_tos;
>> -        flower.mask.tunnel.ttl = tnl_mask->ip_ttl;
>> -        flower.mask.tunnel.tp_src = tnl_mask->tp_src;
>>          /* XXX: We should be setting the mask from 'tnl_mask->tp_dst' here, 
>> but
>>           * some hardware drivers (mlx5) doesn't support masked matches and 
>> will
>>           * refuse to offload such flows keeping them in software path.
>>           * Degrading the flow down to exact match for now as a workaround. 
>> */
>>          flower.mask.tunnel.tp_dst = OVS_BE16_MAX;
>> -        flower.mask.tunnel.id<http://flower.mask.tunnel.id> = (tnl->flags & 
>> FLOW_TNL_F_KEY) ? tnl_mask->tun_id : 0;
>> -        flower.mask.tunnel.gbp.id<http://flower.mask.tunnel.gbp.id> = 
>> tnl_mask->gbp_id;
>> -        flower.mask.tunnel.gbp.flags = tnl_mask->gbp_flags;
>> -        flower.mask.tunnel.gbp.id_present = !!tnl_mask->gbp_id;
>> -
>> -        memset(&tnl_mask->ip_src, 0, sizeof tnl_mask->ip_src);
>> -        memset(&tnl_mask->ip_dst, 0, sizeof tnl_mask->ip_dst);
>> -        memset(&tnl_mask->ipv6_src, 0, sizeof tnl_mask->ipv6_src);
>> -        memset(&tnl_mask->ipv6_dst, 0, sizeof tnl_mask->ipv6_dst);
>> -        memset(&tnl_mask->ip_tos, 0, sizeof tnl_mask->ip_tos);
>> -        memset(&tnl_mask->ip_ttl, 0, sizeof tnl_mask->ip_ttl);
>> -        memset(&tnl_mask->tp_src, 0, sizeof tnl_mask->tp_src);
>> -        memset(&tnl_mask->tp_dst, 0, sizeof tnl_mask->tp_dst);
>> -
>> -        memset(&tnl_mask->tun_id, 0, sizeof tnl_mask->tun_id);
>> -        memset(&tnl_mask->gbp_id, 0, sizeof tnl_mask->gbp_id);
>> -        memset(&tnl_mask->gbp_flags, 0, sizeof tnl_mask->gbp_flags);
>> -        tnl_mask->flags &= ~FLOW_TNL_F_KEY;
>> -
>> -        /* XXX: This is wrong!  We're ignoring DF and CSUM flags 
>> configuration
>> -         * requested by the user.  However, TC for now has no way to pass
>> -         * these flags in a flower key and their masks are set by default,
>> -         * meaning tunnel offloading will not work at all if not cleared.
>> -         * Keeping incorrect behavior for now. */
>> -        tnl_mask->flags &= ~(FLOW_TNL_F_DONT_FRAGMENT | FLOW_TNL_F_CSUM);
>>
>> -        if (!strcmp(netdev_get_type(netdev), "geneve")) {
>> -            err = flower_match_to_tun_opt(&flower, tnl, tnl_mask);
>> -            if (err) {
>> -                VLOG_WARN_RL(&warn_rl, "Unable to parse geneve options");
>> -                return err;
>> +        if (!exact_match_offload) {
>> +            flower.key.tunnel.tos = tnl->ip_tos;
>> +            flower.key.tunnel.ttl = tnl->ip_ttl;
>> +            flower.key.tunnel.tp_src = tnl->tp_src;
>> +            flower.key.tunnel.gbp.id<http://flower.key.tunnel.gbp.id> = 
>> tnl->gbp_id;
>> +            flower.key.tunnel.gbp.flags = tnl->gbp_flags;
>> +            flower.key.tunnel.gbp.id_present = !!tnl_mask->gbp_id;
>> +
>> +            flower.mask.tunnel.tos = tnl_mask->ip_tos;
>> +            flower.mask.tunnel.ttl = tnl_mask->ip_ttl;
>> +            flower.mask.tunnel.tp_src = tnl_mask->tp_src;
>> +            flower.mask.tunnel.gbp.id<http://flower.mask.tunnel.gbp.id> = 
>> tnl_mask->gbp_id;
>> +            flower.mask.tunnel.gbp.flags = tnl_mask->gbp_flags;
>> +            flower.mask.tunnel.gbp.id_present = !!tnl_mask->gbp_id;
>> +
>> +            memset(&tnl_mask->ip_src, 0, sizeof tnl_mask->ip_src);
>> +            memset(&tnl_mask->ip_dst, 0, sizeof tnl_mask->ip_dst);
>> +            memset(&tnl_mask->ipv6_src, 0, sizeof tnl_mask->ipv6_src);
>> +            memset(&tnl_mask->ipv6_dst, 0, sizeof tnl_mask->ipv6_dst);
>> +            memset(&tnl_mask->ip_tos, 0, sizeof tnl_mask->ip_tos);
>> +            memset(&tnl_mask->ip_ttl, 0, sizeof tnl_mask->ip_ttl);
>> +            memset(&tnl_mask->tp_src, 0, sizeof tnl_mask->tp_src);
>> +            memset(&tnl_mask->tp_dst, 0, sizeof tnl_mask->tp_dst);
>> +
>> +            memset(&tnl_mask->tun_id, 0, sizeof tnl_mask->tun_id);
>> +            memset(&tnl_mask->gbp_id, 0, sizeof tnl_mask->gbp_id);
>> +            memset(&tnl_mask->gbp_flags, 0, sizeof tnl_mask->gbp_flags);
>> +            tnl_mask->flags &= ~FLOW_TNL_F_KEY;
>> +
>> +            /* XXX: This is wrong!  We're ignoring DF and CSUM flags 
>> configuration
>> +            * requested by the user.  However, TC for now has no way to pass
>> +            * these flags in a flower key and their masks are set by 
>> default,
>> +            * meaning tunnel offloading will not work at all if not cleared.
>> +            * Keeping incorrect behavior for now. */
>> +            tnl_mask->flags &= ~(FLOW_TNL_F_DONT_FRAGMENT | 
>> FLOW_TNL_F_CSUM);
>> +
>> +            if (!strcmp(netdev_get_type(netdev), "geneve")) {
>> +                err = flower_match_to_tun_opt(&flower, tnl, tnl_mask);
>> +                if (err) {
>> +                    VLOG_WARN_RL(&warn_rl, "Unable to parse geneve 
>> options");
>> +                    return err;
>> +                }
>>              }
>>          }
>>          flower.tunnel = true;
>> @@ -2319,6 +2368,12 @@ netdev_tc_flow_put(struct netdev *netdev, struct 
>> match *match,
>>          flower.key.mpls_lse = key->mpls_lse[0];
>>          flower.mask.mpls_lse = mask->mpls_lse[0];
>>          flower.key.encap_eth_type[0] = flower.key.eth_type;
>> +
>> +        if (exact_match_offload) {
>> +            /* Enable mpls_label only */
>> +            flower.key.mpls_lse &= ntohl(MPLS_LABEL_MASK);
>> +            flower.mask.mpls_lse &= ntohl(MPLS_LABEL_MASK);
>> +        }
>>      }
>>      mask->mpls_lse[0] = 0;
>>
>> @@ -2342,7 +2397,8 @@ netdev_tc_flow_put(struct netdev *netdev, struct match 
>> *match,
>>                  flower.mask.vlan_id[0] = 
>> vlan_tci_to_vid(mask->vlans[0].tci);
>>                  VLOG_DBG_RL(&rl, "vlan_id[0]: %d\n", flower.key.vlan_id[0]);
>>              }
>> -            if (pcp_mask) {
>> +            /* Skip vlan_prio in case of exact-match offload */
>> +            if (pcp_mask && !exact_match_offload) {
>>                  flower.key.vlan_prio[0] = 
>> vlan_tci_to_pcp(key->vlans[0].tci);
>>                  flower.mask.vlan_prio[0] = 
>> vlan_tci_to_pcp(mask->vlans[0].tci);
>>                  VLOG_DBG_RL(&rl, "vlan_prio[0]: %d\n",
>> @@ -2377,7 +2433,7 @@ netdev_tc_flow_put(struct netdev *netdev, struct match 
>> *match,
>>                  flower.mask.vlan_id[1] = 
>> vlan_tci_to_vid(mask->vlans[1].tci);
>>                  VLOG_DBG_RL(&rl, "vlan_id[1]: %d", flower.key.vlan_id[1]);
>>              }
>> -            if (pcp_mask) {
>> +            if (pcp_mask && !exact_match_offload) {
>>                  flower.key.vlan_prio[1] = 
>> vlan_tci_to_pcp(key->vlans[1].tci);
>>                  flower.mask.vlan_prio[1] = 
>> vlan_tci_to_pcp(mask->vlans[1].tci);
>>                  VLOG_DBG_RL(&rl, "vlan_prio[1]: %d", 
>> flower.key.vlan_prio[1]);
>> @@ -2421,44 +2477,34 @@ netdev_tc_flow_put(struct netdev *netdev, struct 
>> match *match,
>>      }
>>
>>      if (is_ip_any(key) && !is_ipv6_fragment_and_masked(key, mask)) {
>> +
>> +        if (key->dl_type == htons(ETH_P_IP)) {
>> +            flower.key.ipv4.ipv4_src = key->nw_src;
>> +            flower.mask.ipv4.ipv4_src = mask->nw_src;
>> +            flower.key.ipv4.ipv4_dst = key->nw_dst;
>> +            flower.mask.ipv4.ipv4_dst = mask->nw_dst;
>> +            mask->nw_src = 0;
>> +            mask->nw_dst = 0;
>> +        } else if (key->dl_type == htons(ETH_P_IPV6)) {
>> +            flower.key.ipv6.ipv6_src = key->ipv6_src;
>> +            flower.mask.ipv6.ipv6_src = mask->ipv6_src;
>> +            flower.key.ipv6.ipv6_dst = key->ipv6_dst;
>> +            flower.mask.ipv6.ipv6_dst = mask->ipv6_dst;
>> +            memset(&mask->ipv6_src, 0, sizeof mask->ipv6_src);
>> +            memset(&mask->ipv6_dst, 0, sizeof mask->ipv6_dst);
>> +        }
>> +
>>          flower.key.ip_proto = key->nw_proto;
>>          flower.mask.ip_proto = mask->nw_proto;
>>          mask->nw_proto = 0;
>> -        flower.key.ip_tos = key->nw_tos;
>> -        flower.mask.ip_tos = mask->nw_tos;
>> -        mask->nw_tos = 0;
>> -        flower.key.ip_ttl = key->nw_ttl;
>> -        flower.mask.ip_ttl = mask->nw_ttl;
>> -        mask->nw_ttl = 0;
>> -
>> -        if (mask->nw_frag & FLOW_NW_FRAG_ANY) {
>> -            flower.mask.flags |= TCA_FLOWER_KEY_FLAGS_IS_FRAGMENT;
>> -
>> -            if (key->nw_frag & FLOW_NW_FRAG_ANY) {
>> -                flower.key.flags |= TCA_FLOWER_KEY_FLAGS_IS_FRAGMENT;
>> -
>> -                if (mask->nw_frag & FLOW_NW_FRAG_LATER) {
>> -                    flower.mask.flags |= TCA_FLOWER_KEY_FLAGS_FRAG_IS_FIRST;
>> -
>> -                    if (!(key->nw_frag & FLOW_NW_FRAG_LATER)) {
>> -                        flower.key.flags |= 
>> TCA_FLOWER_KEY_FLAGS_FRAG_IS_FIRST;
>> -                    }
>> -                }
>> -            }
>> -
>> -            mask->nw_frag = 0;
>> -        }
>>
>>          if (key->nw_proto == IPPROTO_TCP) {
>>              flower.key.tcp_dst = key->tp_dst;
>>              flower.mask.tcp_dst = mask->tp_dst;
>>              flower.key.tcp_src = key->tp_src;
>>              flower.mask.tcp_src = mask->tp_src;
>> -            flower.key.tcp_flags = key->tcp_flags;
>> -            flower.mask.tcp_flags = mask->tcp_flags;
>>              mask->tp_src = 0;
>>              mask->tp_dst = 0;
>> -            mask->tcp_flags = 0;
>>          } else if (key->nw_proto == IPPROTO_UDP) {
>>              flower.key.udp_dst = key->tp_dst;
>>              flower.mask.udp_dst = mask->tp_dst;
>> @@ -2483,20 +2529,37 @@ netdev_tc_flow_put(struct netdev *netdev, struct 
>> match *match,
>>              mask->tp_dst = 0;
>>          }
>>
>> -        if (key->dl_type == htons(ETH_P_IP)) {
>> -            flower.key.ipv4.ipv4_src = key->nw_src;
>> -            flower.mask.ipv4.ipv4_src = mask->nw_src;
>> -            flower.key.ipv4.ipv4_dst = key->nw_dst;
>> -            flower.mask.ipv4.ipv4_dst = mask->nw_dst;
>> -            mask->nw_src = 0;
>> -            mask->nw_dst = 0;
>> -        } else if (key->dl_type == htons(ETH_P_IPV6)) {
>> -            flower.key.ipv6.ipv6_src = key->ipv6_src;
>> -            flower.mask.ipv6.ipv6_src = mask->ipv6_src;
>> -            flower.key.ipv6.ipv6_dst = key->ipv6_dst;
>> -            flower.mask.ipv6.ipv6_dst = mask->ipv6_dst;
>> -            memset(&mask->ipv6_src, 0, sizeof mask->ipv6_src);
>> -            memset(&mask->ipv6_dst, 0, sizeof mask->ipv6_dst);
>> +        if (!exact_match_offload) {
>> +            flower.key.ip_tos = key->nw_tos;
>> +            flower.mask.ip_tos = mask->nw_tos;
>> +            flower.key.ip_ttl = key->nw_ttl;
>> +            flower.mask.ip_ttl = mask->nw_ttl;
>> +            mask->nw_tos = 0;
>> +            mask->nw_ttl = 0;
>> +
>> +            if (mask->nw_frag & FLOW_NW_FRAG_ANY) {
>> +                flower.mask.flags |= TCA_FLOWER_KEY_FLAGS_IS_FRAGMENT;
>> +
>> +                if (key->nw_frag & FLOW_NW_FRAG_ANY) {
>> +                    flower.key.flags |= TCA_FLOWER_KEY_FLAGS_IS_FRAGMENT;
>> +
>> +                    if (mask->nw_frag & FLOW_NW_FRAG_LATER) {
>> +                        flower.mask.flags |= 
>> TCA_FLOWER_KEY_FLAGS_FRAG_IS_FIRST;
>> +
>> +                        if (!(key->nw_frag & FLOW_NW_FRAG_LATER)) {
>> +                            flower.key.flags |= 
>> TCA_FLOWER_KEY_FLAGS_FRAG_IS_FIRST;
>> +                        }
>> +                    }
>> +                }
>> +
>> +                mask->nw_frag = 0;
>> +            }
>> +
>> +            if (key->nw_proto == IPPROTO_TCP) {
>> +                flower.key.tcp_flags = key->tcp_flags;
>> +                flower.mask.tcp_flags = mask->tcp_flags;
>> +                mask->tcp_flags = 0;
>> +            }
>>          }
>>      }
>>
>> @@ -2507,6 +2570,13 @@ netdev_tc_flow_put(struct netdev *netdev, struct 
>> match *match,
>>          mask->pkt_mark = 0;
>>      }
>>
>> +    if (exact_match_offload) {
>> +        err = test_key_and_clear_mask(match);
>> +        if (err) {
>> +            return err;
>> +        }
>> +    }
>> +
>>      err = test_key_and_mask(match);
>>      if (err) {
>>          return err;
>> diff --git a/lib/netdev-offload.c b/lib/netdev-offload.c
>> index 8a9d36555..349abcf9e 100644
>> --- a/lib/netdev-offload.c
>> +++ b/lib/netdev-offload.c
>> @@ -60,6 +60,7 @@ VLOG_DEFINE_THIS_MODULE(netdev_offload);
>>
>>  static struct vlog_rate_limit rl = VLOG_RATE_LIMIT_INIT(1, 5);
>>  static bool netdev_flow_api_enabled = false;
>> +atomic_bool netdev_exact_match_offload = false;
>>
>>  #define DEFAULT_OFFLOAD_THREAD_NB 1
>>  #define MAX_OFFLOAD_THREAD_NB 10
>> @@ -902,6 +903,10 @@ netdev_set_flow_api_enabled(const struct smap 
>> *ovs_other_config)
>>                  netdev_offload_rebalance_policy = true;
>>              }
>>
>> +            if (smap_get_bool(ovs_other_config, "exact-match-offload", 
>> false)) {
>> +                atomic_store_relaxed(&netdev_exact_match_offload, true);
>> +            }
>> +
>>              netdev_ports_flow_init();
>>
>>              ovsthread_once_done(&once);
>> diff --git a/lib/netdev-offload.h b/lib/netdev-offload.h
>> index 47f8e6f48..004d6feaa 100644
>> --- a/lib/netdev-offload.h
>> +++ b/lib/netdev-offload.h
>> @@ -43,6 +43,8 @@ struct smap;
>>  struct sset;
>>  struct ovs_action_push_tnl;
>>
>> +extern atomic_bool enable_megaflows;
>> +extern atomic_bool netdev_exact_match_offload;
>>
>>  /* Offload-capable (HW) netdev information */
>>  struct netdev_hw_info {
>> diff --git a/ofproto/ofproto-dpif-upcall.c b/ofproto/ofproto-dpif-upcall.c
>> index 4d39bc5a7..72a6a303c 100644
>> --- a/ofproto/ofproto-dpif-upcall.c
>> +++ b/ofproto/ofproto-dpif-upcall.c
>> @@ -23,6 +23,7 @@
>>  #include "coverage.h"
>>  #include "cmap.h"
>>  #include "lib/dpif-provider.h"
>> +#include "lib/netdev-offload.h"
>>  #include "dpif.h"
>>  #include "openvswitch/dynamic-string.h"
>>  #include "fail-open.h"
>> @@ -446,7 +447,6 @@ static int udpif_flow_unprogram(struct udpif *udpif, 
>> struct udpif_key *ukey,
>>  static upcall_callback upcall_cb;
>>  static dp_purge_callback dp_purge_cb;
>>
>> -static atomic_bool enable_megaflows = true;
>>  static atomic_bool enable_ufid = true;
>>
>>  void
>> diff --git 
>> a/tests/system-offloads-traffic.at<http://system-offloads-traffic.at> 
>> b/tests/system-offloads-traffic.at<http://system-offloads-traffic.at>
>> index d1da33d96..345161f82 100644
>> --- a/tests/system-offloads-traffic.at<http://system-offloads-traffic.at>
>> +++ b/tests/system-offloads-traffic.at<http://system-offloads-traffic.at>
>> @@ -93,6 +93,30 @@ AT_CHECK([ovs-appctl upcall/show | grep -E "offloaded 
>> flows : [[1-9]]"], [0], [i
>>  OVS_TRAFFIC_VSWITCHD_STOP
>>  AT_CLEANUP
>>
>> +AT_SETUP([offloads - ping between two ports - offloads enabled 
>> (exact-match)])
>> +OVS_TRAFFIC_VSWITCHD_START([], [], [-- set Open_vSwitch . 
>> other_config:hw-offload=true other_config:exact-match-offload=true])
>> +   AT_CHECK([ovs-appctl upcall/disable-megaflows], [0], [megaflows disabled
>> +], [])
>> +
>> +AT_CHECK([ovs-appctl vlog/set dpif:dbg dpif_netdev:dbg 
>> netdev_offload_tc:file:dbg ofproto_dpif_upcall:DBG dpif_netlink:DBG])
>> +
>> +AT_CHECK([ovs-ofctl add-flow br0 "actions=normal"])
>> +
>> +ADD_NAMESPACES(at_ns0, at_ns1)
>> +
>> +ADD_VETH(p0, at_ns0, br0, "10.1.1.1/24<http://10.1.1.1/24> ")
>> +ADD_VETH(p1, at_ns1, br0, "10.1.1.2/24<http://10.1.1.2/24> ")
>> +
>> +NS_CHECK_EXEC([at_ns0], [ping -q -c 3 -i 0.1 -w 2 10.1.1.2 | FORMAT_PING], 
>> [0], [ignore])
>> +
>> +AT_CHECK([ovs-appctl dpctl/dump-flows type=tc | grep "eth_type(0x0800)" | 
>> strip_stats | DUMP_CLEAN_SORTED], [0], [dnl
>> +in_port(2),ct_zone(0),ct_mark(0),ct_label(0),eth(macs),eth_type(0x0800),ipv4(src=10.1.1.1,dst=10.1.1.2,proto=1),icmp(type=8,code=0),
>>  packets:0, bytes:0, used:0.001s, actions:output
>> +in_port(3),ct_zone(0),ct_mark(0),ct_label(0),eth(macs),eth_type(0x0800),ipv4(src=10.1.1.2,dst=10.1.1.1,proto=1),icmp(type=0,code=0),
>>  packets:0, bytes:0, used:0.001s, actions:output
>> +])
>> +
>> +OVS_TRAFFIC_VSWITCHD_STOP
>> +AT_CLEANUP
>> +
>>  AT_SETUP([offloads - set ingress_policing_rate and ingress_policing_burst - 
>> offloads disabled])
>>  AT_KEYWORDS([ingress_policing])
>>  OVS_CHECK_TC_QDISC()
>> diff --git a/vswitchd/vswitch.xml b/vswitchd/vswitch.xml
>> index 36cb4e495..90844b428 100644
>> --- a/vswitchd/vswitch.xml
>> +++ b/vswitchd/vswitch.xml
>> @@ -254,6 +254,25 @@
>>          </p>
>>        </column>
>>
>> +      <column name="other_config" key="exact-match-offload"
>> +              type='{"type": "boolean"}'>
>> +        <p>
>> +          Set this value to <code>true</code> to enable exact match netdev 
>> flow
>> +          offload.
>> +        </p>
>> +        <p>
>> +          The default value is <code>false</code>.
>> +        </p>
>> +        <p>
>> +          To offload exact match flows, the user needs to enable this
>> +          feature along with disabling the megaflows.
>> +        </p>
>> +        <p>
>> +          This is only relevant if
>> +          <ref column="other_config" key="hw-offload"/> is enabled.
>> +        </p>
>> +      </column>
>> +
>>        <column name="other_config" key="n-offload-threads"
>>                type='{"type": "integer", "minInteger": 1, "maxInteger": 10}'>
>>          <p>
>> --
>> 2.27.0
>>
>> _______________________________________________
>> dev mailing list
>> [email protected]
>> https://mail.openvswitch.org/mailman/listinfo/ovs-dev
>
> _______________________________________________
> dev mailing list
> [email protected]
> https://mail.openvswitch.org/mailman/listinfo/ovs-dev

_______________________________________________
dev mailing list
[email protected]
https://mail.openvswitch.org/mailman/listinfo/ovs-dev

Reply via email to