On Tue, Nov 26, 2024 at 8:22 PM Mike Pattrick <[email protected]> wrote:
>
> This patch extends the support of tunneled TSO packets from just vxlan
> and geneve to also include GRE. Support for GRE checksum offloading
> isn't included as a this feature is not included in the DPDK offload

Maybe it's just me, but this sentence reads strangely.


> features enumeration.
>
> The associated documentation and news items have also been updated to
> indicate this feature.
>
> Signed-off-by: Mike Pattrick <[email protected]>

I did not test this patch yet.
Just a quick first pass for now.


> ---
>  Documentation/topics/userspace-tso.rst |  10 +--
>  NEWS                                   |   3 +
>  lib/dp-packet-gso.c                    |  40 +++++++--
>  lib/dp-packet.c                        |   2 +
>  lib/dp-packet.h                        |  22 +++++
>  lib/netdev-dpdk.c                      |  14 ++++
>  lib/netdev-native-tnl.c                |  17 +++-
>  lib/netdev-provider.h                  |   1 +
>  lib/netdev.c                           |  12 ++-
>  tests/dpif-netdev.at                   |  85 ++++++++++++++++---
>  tests/system-traffic.at                | 110 ++++++++++++++++++++++++-
>  11 files changed, 285 insertions(+), 31 deletions(-)
>
> diff --git a/Documentation/topics/userspace-tso.rst 
> b/Documentation/topics/userspace-tso.rst
> index a21bb2b5d..a67421024 100644
> --- a/Documentation/topics/userspace-tso.rst
> +++ b/Documentation/topics/userspace-tso.rst
> @@ -110,8 +110,8 @@ Limitations
>  ~~~~~~~~~~~
>
>  The current OvS userspace `TSO` implementation supports flat and VLAN 
> networks
> -only (i.e. no support for `TSO` over tunneled connection [VxLAN, GRE, IPinIP,
> -etc.]).
> +only, as well as a limited selection of protocols. Currently VxLAN, Geneve,
> +and GRE are supported.

Strictly speaking, updating the doc about VxLAN/Geneve + TSO should be
a separate fix.


>
>  The NIC driver must support and advertise checksum offload for TCP and UDP.
>  However, SCTP is not mandatory because very few drivers advertised support
> @@ -120,12 +120,6 @@ in Open vSwitch. Currently, if the NIC supports that, 
> then the feature is
>  enabled, otherwise TSO can still be enabled but SCTP packets sent to the NIC
>  will be dropped.
>
> -There is no software implementation of TSO, so all ports attached to the
> -datapath must support TSO or packets using that feature will be dropped
> -on ports without TSO support.  That also means guests using vhost-user
> -in client mode will receive TSO packet regardless of TSO being enabled
> -or disabled within the guest.
> -

Idem.


>  All kernel devices that use the raw socket interface (veth, for example)
>  require the kernel commit 9d2f67e43b73 ("net/packet: fix packet drop as of
>  virtio gso") in order to work properly. This commit was merged in upstream
> diff --git a/NEWS b/NEWS
> index 200a8bb40..977085402 100644
> --- a/NEWS
> +++ b/NEWS
> @@ -5,6 +5,9 @@ Post-v3.4.0
>         that does not have a specific value defined, rather than being
>         treated as a global value, aligning the behavior with that of
>         the kernel datapath.
> +     * Packets marked for TSO and egressing on an interface which doesn't
> +       support that feature will now be segmented in software, even when
> +       tunnel encapsulated.

I don't think it is something new added in v3.5.


>     - Linux TC offload:
>       * Add support for matching tunnel flags if the kernel supports it.
>       * Add support for the "Don't Fragment" (DF) flag in the encap action,
> diff --git a/lib/dp-packet-gso.c b/lib/dp-packet-gso.c
> index 04ebb19da..a0ff02442 100644
> --- a/lib/dp-packet-gso.c
> +++ b/lib/dp-packet-gso.c
> @@ -74,6 +74,7 @@ dp_packet_gso_nr_segs(struct dp_packet *p)
>      const char *data_pos;
>
>      if (dp_packet_hwol_is_tunnel_vxlan(p) ||
> +        dp_packet_hwol_is_tunnel_gre(p) ||
>          dp_packet_hwol_is_tunnel_geneve(p)) {
>          data_pos = dp_packet_get_inner_tcp_payload(p);
>      } else {
> @@ -105,7 +106,9 @@ dp_packet_gso(struct dp_packet *p, struct dp_packet_batch 
> **batches)
>      bool outer_ipv4;
>      int hdr_len;
>      int seg_len;
> -    bool tnl;
> +    bool udp_tnl = dp_packet_hwol_is_tunnel_vxlan(p) ||
> +                   dp_packet_hwol_is_tunnel_geneve(p);
> +    bool gre_tnl = dp_packet_hwol_is_tunnel_gre(p);
>
>      tso_segsz = dp_packet_get_tso_segsz(p);
>      if (!tso_segsz) {
> @@ -114,11 +117,21 @@ dp_packet_gso(struct dp_packet *p, struct 
> dp_packet_batch **batches)
>          return false;
>      }
>
> -    if (dp_packet_hwol_is_tunnel_vxlan(p) ||
> -        dp_packet_hwol_is_tunnel_geneve(p)) {
> +    if (gre_tnl) {
> +        struct gre_base_hdr *ghdr;
> +
> +        ghdr = dp_packet_l4(p);
> +
> +        if (ghdr->flags & htons(GRE_SEQ)) {
> +            /* Cannot support TSO with sequence numbers. */
> +            VLOG_WARN_RL(&rl, "Cannot segmnet GRE packet with sequence "

segment*

> +                              "numbers.");

I am not familiar with GRE.
Is using GRE sequence numbers common?


> +        }
> +    }
> +
> +    if (udp_tnl || gre_tnl) {
>          outer_ipv4 = dp_packet_hwol_is_outer_ipv4(p);
>          tcp_hdr = dp_packet_inner_l4(p);
> -        tnl = true;
>
>          if (outer_ipv4) {
>              outer_ip_id = ntohs(((struct ip_header *) 
> dp_packet_l3(p))->ip_id);
> @@ -130,7 +143,6 @@ dp_packet_gso(struct dp_packet *p, struct dp_packet_batch 
> **batches)
>      } else {
>          outer_ipv4 = dp_packet_hwol_is_ipv4(p);
>          tcp_hdr = dp_packet_l4(p);
> -        tnl = false;
>
>          if (outer_ipv4) {
>              struct ip_header *ip_hdr = dp_packet_l3(p);
> @@ -156,13 +168,15 @@ dp_packet_gso(struct dp_packet *p, struct 
> dp_packet_batch **batches)
>          seg = dp_packet_gso_seg_new(p, hdr_len, data_pos, seg_len);
>          data_pos += seg_len;
>
> -        if (tnl) {
> +        if (udp_tnl) {
>              /* Update tunnel UDP header length. */
>              struct udp_header *tnl_hdr;
>
>              tnl_hdr = dp_packet_l4(seg);
>              tnl_hdr->udp_len = htons(dp_packet_l4_size(seg));
> +        }
>
> +        if (udp_tnl || gre_tnl) {
>              /* Update tunnel inner L3 header. */
>              if (dp_packet_hwol_is_ipv4(seg)) {
>                  struct ip_header *ip_hdr = dp_packet_inner_l3(seg);
> @@ -194,7 +208,7 @@ dp_packet_gso(struct dp_packet *p, struct dp_packet_batch 
> **batches)
>          }
>
>          /* Update L4 header. */
> -        if (tnl) {
> +        if (udp_tnl || gre_tnl) {
>              tcp_hdr = dp_packet_inner_l4(seg);
>          } else {
>              tcp_hdr = dp_packet_l4(seg);
> @@ -208,6 +222,18 @@ dp_packet_gso(struct dp_packet *p, struct 
> dp_packet_batch **batches)
>              tcp_hdr->tcp_ctl = TCP_CTL(tcp_flags, tcp_offset);
>          }
>
> +        if (gre_tnl) {
> +            struct gre_base_hdr *ghdr;
> +
> +            ghdr = dp_packet_l4(seg);
> +
> +            if (ghdr->flags & htons(GRE_CSUM)) {
> +                ovs_be16 *csum_opt = (ovs_be16 *) (ghdr + 1);
> +                *csum_opt = 0;
> +                *csum_opt = csum(ghdr, dp_packet_l4_size(seg));
> +            }
> +        }
> +
>          if (dp_packet_batch_is_full(curr_batch)) {
>              curr_batch++;
>          }
> diff --git a/lib/dp-packet.c b/lib/dp-packet.c
> index df7bf8e6b..dad0d7be3 100644
> --- a/lib/dp-packet.c
> +++ b/lib/dp-packet.c
> @@ -604,6 +604,8 @@ dp_packet_ol_send_prepare(struct dp_packet *p, uint64_t 
> flags)
>                         NETDEV_TX_OFFLOAD_SCTP_CKSUM |
>                         NETDEV_TX_OFFLOAD_IPV4_CKSUM);
>          }
> +    } else if (dp_packet_hwol_is_tunnel_gre(p)) {
> +        tnl_inner = true;
>      }
>
>      if (dp_packet_hwol_tx_ip_csum(p)) {
> diff --git a/lib/dp-packet.h b/lib/dp-packet.h
> index 4afbbe722..f4094daa8 100644
> --- a/lib/dp-packet.h
> +++ b/lib/dp-packet.h
> @@ -104,6 +104,9 @@ enum dp_packet_offload_mask {
>      /* Offload tunnel packet, outer header is IPv6. */
>      DEF_OL_FLAG(DP_PACKET_OL_TX_OUTER_IPV6,
>                  RTE_MBUF_F_TX_OUTER_IPV6, 0x40000),
> +    /* Offload packet is GRE tunnel. */
> +    DEF_OL_FLAG(DP_PACKET_OL_TX_TUNNEL_GRE,
> +                RTE_MBUF_F_TX_TUNNEL_GRE, 0x80000),
>
>      /* Adding new field requires adding to DP_PACKET_OL_SUPPORTED_MASK. */
>  };
> @@ -123,6 +126,7 @@ enum dp_packet_offload_mask {
>                                       DP_PACKET_OL_TX_IP_CKSUM        | \
>                                       DP_PACKET_OL_TX_TUNNEL_GENEVE   | \
>                                       DP_PACKET_OL_TX_TUNNEL_VXLAN    | \
> +                                     DP_PACKET_OL_TX_TUNNEL_GRE      | \
>                                       DP_PACKET_OL_TX_OUTER_IPV4      | \
>                                       DP_PACKET_OL_TX_OUTER_IP_CKSUM  | \
>                                       DP_PACKET_OL_TX_OUTER_UDP_CKSUM | \
> @@ -1171,6 +1175,13 @@ dp_packet_hwol_is_tunnel_vxlan(struct dp_packet *b)
>      return !!(*dp_packet_ol_flags_ptr(b) & DP_PACKET_OL_TX_TUNNEL_VXLAN);
>  }
>
> +/* Returns 'true' if packet 'b' is marked for GRE tunnel offloading. */
> +static inline bool
> +dp_packet_hwol_is_tunnel_gre(struct dp_packet *b)
> +{
> +    return !!(*dp_packet_ol_flags_ptr(b) & DP_PACKET_OL_TX_TUNNEL_GRE);
> +}
> +
>  /* Returns 'true' if packet 'b' is marked for outer IPv4 checksum offload. */
>  static inline bool
>  dp_packet_hwol_is_outer_ipv4_cksum(const struct dp_packet *b)
> @@ -1289,11 +1300,19 @@ dp_packet_hwol_set_tunnel_vxlan(struct dp_packet *b)
>      *dp_packet_ol_flags_ptr(b) |= DP_PACKET_OL_TX_TUNNEL_VXLAN;
>  }
>
> +/* Mark packet 'b' for GRE tunnel offloading. */
> +static inline void
> +dp_packet_hwol_set_tunnel_gre(struct dp_packet *b)
> +{
> +    *dp_packet_ol_flags_ptr(b) |= DP_PACKET_OL_TX_TUNNEL_GRE;
> +}
> +
>  /* Clears tunnel offloading marks. */
>  static inline void
>  dp_packet_hwol_reset_tunnel(struct dp_packet *b)
>  {
>      *dp_packet_ol_flags_ptr(b) &= ~(DP_PACKET_OL_TX_TUNNEL_VXLAN |
> +                                    DP_PACKET_OL_TX_TUNNEL_GRE |
>                                      DP_PACKET_OL_TX_TUNNEL_GENEVE);
>  }
>
> @@ -1352,6 +1371,9 @@ dp_packet_hwol_reset_tcp_seg(struct dp_packet *p)
>              ol_flags |= DP_PACKET_OL_TX_OUTER_IP_CKSUM;
>          }
>          ol_flags |= DP_PACKET_OL_TX_OUTER_UDP_CKSUM;
> +    } else if (ol_flags & (DP_PACKET_OL_TX_TUNNEL_GRE &
> +                           DP_PACKET_OL_TX_OUTER_IPV4)) {

This check is broken: '&' between two distinct flag bits evaluates to zero,
so the condition can never be true (and if '|' was intended instead, the
mere presence of DP_PACKET_OL_TX_TUNNEL_GRE alone would trigger setting
DP_PACKET_OL_TX_OUTER_IP_CKSUM).
You need to check that both bits are set.
Something like: (ol_flags & (DP_PACKET_OL_TX_TUNNEL_GRE |
DP_PACKET_OL_TX_OUTER_IPV4)) == (DP_PACKET_OL_TX_TUNNEL_GRE |
DP_PACKET_OL_TX_OUTER_IPV4)
(note the outer parentheses: '==' binds tighter than '&' in C)


> +        ol_flags |= DP_PACKET_OL_TX_OUTER_IP_CKSUM;
>      }
>
>      *dp_packet_ol_flags_ptr(p) = ol_flags;
> diff --git a/lib/netdev-dpdk.c b/lib/netdev-dpdk.c
> index e454a4a5d..36f56a243 100644
> --- a/lib/netdev-dpdk.c
> +++ b/lib/netdev-dpdk.c
> @@ -421,6 +421,7 @@ enum dpdk_hw_ol_features {
>      NETDEV_TX_GENEVE_TNL_TSO_OFFLOAD = 1 << 9,
>      NETDEV_TX_OUTER_IP_CKSUM_OFFLOAD = 1 << 10,
>      NETDEV_TX_OUTER_UDP_CKSUM_OFFLOAD = 1 << 11,
> +    NETDEV_TX_GRE_TNL_TSO_OFFLOAD = 1 << 12,
>  };
>
>  enum dpdk_rx_steer_flags {
> @@ -1084,6 +1085,8 @@ netdev_dpdk_update_netdev_flags(struct netdev_dpdk *dev)
>                                     NETDEV_TX_OFFLOAD_TCP_TSO);
>      netdev_dpdk_update_netdev_flag(dev, NETDEV_TX_VXLAN_TNL_TSO_OFFLOAD,
>                                     NETDEV_TX_VXLAN_TNL_TSO);
> +    netdev_dpdk_update_netdev_flag(dev, NETDEV_TX_GRE_TNL_TSO_OFFLOAD,
> +                                   NETDEV_TX_GRE_TNL_TSO);
>      netdev_dpdk_update_netdev_flag(dev, NETDEV_TX_GENEVE_TNL_TSO_OFFLOAD,
>                                     NETDEV_TX_GENEVE_TNL_TSO);
>      netdev_dpdk_update_netdev_flag(dev, NETDEV_TX_OUTER_IP_CKSUM_OFFLOAD,
> @@ -1152,6 +1155,10 @@ dpdk_eth_dev_port_config(struct netdev_dpdk *dev, int 
> n_rxq, int n_txq)
>          conf.txmode.offloads |= RTE_ETH_TX_OFFLOAD_GENEVE_TNL_TSO;
>      }
>
> +    if (dev->hw_ol_features & NETDEV_TX_GRE_TNL_TSO_OFFLOAD) {
> +        conf.txmode.offloads |= RTE_ETH_TX_OFFLOAD_GRE_TNL_TSO;
> +    }
> +
>      if (dev->hw_ol_features & NETDEV_TX_OUTER_IP_CKSUM_OFFLOAD) {
>          conf.txmode.offloads |= RTE_ETH_TX_OFFLOAD_OUTER_IPV4_CKSUM;
>      }
> @@ -1424,6 +1431,13 @@ dpdk_eth_dev_init(struct netdev_dpdk *dev)
>              VLOG_WARN("%s: Tx Geneve tunnel TSO offload is not supported.",
>                        netdev_get_name(&dev->up));
>          }
> +
> +        if (info.tx_offload_capa & RTE_ETH_TX_OFFLOAD_GRE_TNL_TSO) {
> +            dev->hw_ol_features |= NETDEV_TX_GRE_TNL_TSO_OFFLOAD;
> +        } else {
> +            VLOG_WARN("%s: Tx GRE tunnel TSO offload is not supported.",
> +                      netdev_get_name(&dev->up));
> +        }
>      }
>
>      n_rxq = MIN(info.max_rx_queues, dev->up.n_rxq);
> diff --git a/lib/netdev-native-tnl.c b/lib/netdev-native-tnl.c
> index 3e609cf54..b92ebe185 100644
> --- a/lib/netdev-native-tnl.c
> +++ b/lib/netdev-native-tnl.c
> @@ -174,7 +174,8 @@ netdev_tnl_push_ip_header(struct dp_packet *packet, const 
> void *header,
>          packet->l4_ofs = dp_packet_size(packet) - *ip_tot_size;
>
>          if (dp_packet_hwol_is_tunnel_geneve(packet) ||
> -            dp_packet_hwol_is_tunnel_vxlan(packet)) {
> +            dp_packet_hwol_is_tunnel_vxlan(packet) ||
> +            dp_packet_hwol_is_tunnel_gre(packet)) {
>              dp_packet_hwol_set_tx_outer_ipv6(packet);
>          } else {
>              dp_packet_hwol_set_tx_ipv6(packet);
> @@ -187,7 +188,8 @@ netdev_tnl_push_ip_header(struct dp_packet *packet, const 
> void *header,
>          ip->ip_tot_len = htons(*ip_tot_size);
>          /* Postpone checksum to when the packet is pushed to the port. */
>          if (dp_packet_hwol_is_tunnel_geneve(packet) ||
> -            dp_packet_hwol_is_tunnel_vxlan(packet)) {
> +            dp_packet_hwol_is_tunnel_vxlan(packet) ||
> +            dp_packet_hwol_is_tunnel_gre(packet)) {
>              dp_packet_hwol_set_tx_outer_ipv4(packet);
>              dp_packet_hwol_set_tx_outer_ipv4_csum(packet);
>          } else {
> @@ -510,9 +512,13 @@ netdev_gre_push_header(const struct netdev *netdev,
>                         const struct ovs_action_push_tnl *data)
>  {
>      struct netdev_vport *dev = netdev_vport_cast(netdev);
> +    uint16_t l3_ofs = packet->l3_ofs;
> +    uint16_t l4_ofs = packet->l4_ofs;
>      struct gre_base_hdr *greh;
>      int ip_tot_size;
>
> +    dp_packet_hwol_set_tunnel_gre(packet);
> +
>      greh = netdev_tnl_push_ip_header(packet, data->header, data->header_len,
>                                       &ip_tot_size, 0);
>
> @@ -528,6 +534,13 @@ netdev_gre_push_header(const struct netdev *netdev,
>              ALIGNED_CAST(ovs_16aligned_be32 *, (char *)greh + seq_ofs);
>          put_16aligned_be32(seq_opt, 
> htonl(atomic_count_inc(&dev->gre_seqno)));
>      }
> +
> +    if (l3_ofs != UINT16_MAX) {
> +        packet->inner_l3_ofs = l3_ofs + data->header_len;
> +    }
> +    if (l4_ofs != UINT16_MAX) {
> +        packet->inner_l4_ofs = l4_ofs + data->header_len;
> +    }
>  }
>
>  int
> diff --git a/lib/netdev-provider.h b/lib/netdev-provider.h
> index 22840a058..5ae379469 100644
> --- a/lib/netdev-provider.h
> +++ b/lib/netdev-provider.h
> @@ -47,6 +47,7 @@ enum netdev_ol_flags {
>      NETDEV_TX_GENEVE_TNL_TSO = 1 << 6,
>      NETDEV_TX_OFFLOAD_OUTER_IP_CKSUM = 1 << 7,
>      NETDEV_TX_OFFLOAD_OUTER_UDP_CKSUM = 1 << 8,
> +    NETDEV_TX_GRE_TNL_TSO = 1 << 9,
>  };
>
>  /* A network device (e.g. an Ethernet device).
> diff --git a/lib/netdev.c b/lib/netdev.c
> index 02beac9d0..96f46ba92 100644
> --- a/lib/netdev.c
> +++ b/lib/netdev.c
> @@ -916,10 +916,12 @@ netdev_send(struct netdev *netdev, int qid, struct 
> dp_packet_batch *batch,
>                  }
>              }
>          } else if (!(netdev_flags & (NETDEV_TX_VXLAN_TNL_TSO |
> +                                     NETDEV_TX_GRE_TNL_TSO |
>                                       NETDEV_TX_GENEVE_TNL_TSO))) {
>              DP_PACKET_BATCH_FOR_EACH (i, packet, batch) {
>                  if (dp_packet_hwol_is_tso(packet) &&
>                      (dp_packet_hwol_is_tunnel_vxlan(packet) ||
> +                     dp_packet_hwol_is_tunnel_gre(packet) ||
>                       dp_packet_hwol_is_tunnel_geneve(packet))) {
>                      return netdev_send_tso(netdev, qid, batch, 
> concurrent_txq);
>                  }
> @@ -1011,6 +1013,8 @@ netdev_push_header(const struct netdev *netdev,
>      DP_PACKET_BATCH_REFILL_FOR_EACH (i, size, packet, batch) {
>          if (OVS_UNLIKELY(data->tnl_type != OVS_VPORT_TYPE_GENEVE &&
>                           data->tnl_type != OVS_VPORT_TYPE_VXLAN &&
> +                         data->tnl_type != OVS_VPORT_TYPE_GRE &&
> +                         data->tnl_type != OVS_VPORT_TYPE_IP6GRE &&
>                           dp_packet_hwol_is_tso(packet))) {
>              COVERAGE_INC(netdev_push_header_drops);
>              dp_packet_delete(packet);
> @@ -1019,16 +1023,19 @@ netdev_push_header(const struct netdev *netdev,
>                           netdev_get_name(netdev), netdev_get_type(netdev));
>          } else {
>              if (data->tnl_type != OVS_VPORT_TYPE_GENEVE &&
> -                data->tnl_type != OVS_VPORT_TYPE_VXLAN) {
> +                data->tnl_type != OVS_VPORT_TYPE_VXLAN &&
> +                data->tnl_type != OVS_VPORT_TYPE_GRE &&
> +                data->tnl_type != OVS_VPORT_TYPE_IP6GRE) {
>                  dp_packet_ol_send_prepare(packet, 0);
>              } else if (dp_packet_hwol_is_tunnel_geneve(packet) ||
> +                       dp_packet_hwol_is_tunnel_gre(packet) ||
>                         dp_packet_hwol_is_tunnel_vxlan(packet)) {
>                  if (dp_packet_hwol_is_tso(packet)) {
>                      COVERAGE_INC(netdev_push_header_drops);
>                      dp_packet_delete(packet);
>                      VLOG_WARN_RL(&rl, "%s: Tunneling packets with TSO is not 
> "
>                                        "supported with multiple levels of "
> -                                      "VXLAN or GENEVE encapsulation.",
> +                                      "VXLAN, GENEVE, or GRE encapsulation.",
>                                   netdev_get_name(netdev));
>                      continue;
>                  }
> @@ -1480,6 +1487,7 @@ netdev_get_status(const struct netdev *netdev, struct 
> smap *smap)
>          OL_ADD_STAT("sctp_csum", NETDEV_TX_OFFLOAD_SCTP_CKSUM);
>          OL_ADD_STAT("tcp_seg", NETDEV_TX_OFFLOAD_TCP_TSO);
>          OL_ADD_STAT("vxlan_tso", NETDEV_TX_VXLAN_TNL_TSO);
> +        OL_ADD_STAT("gre_tso", NETDEV_TX_GRE_TNL_TSO);
>          OL_ADD_STAT("geneve_tso", NETDEV_TX_GENEVE_TNL_TSO);
>          OL_ADD_STAT("out_ip_csum", NETDEV_TX_OFFLOAD_OUTER_IP_CKSUM);
>          OL_ADD_STAT("out_udp_csum", NETDEV_TX_OFFLOAD_OUTER_UDP_CKSUM);

[snip]


-- 
David Marchand

_______________________________________________
dev mailing list
[email protected]
https://mail.openvswitch.org/mailman/listinfo/ovs-dev
