On Tue, Nov 26, 2024 at 8:22 PM Mike Pattrick <[email protected]> wrote:
>
> This patch extends the support of tunneled TSO packets from just vxlan
> and geneve to also include GRE. Support for GRE checksum offloading
> isn't included as a this feature is not included in the DPDK offload
Maybe it's just me, but this sentence reads strangely.

> features enumeration.
>
> The associated documentation and news items have also been updated to
> indicate this feature.
>
> Signed-off-by: Mike Pattrick <[email protected]>

I did not test this patch yet. Just a quick first pass for now.

> ---
>  Documentation/topics/userspace-tso.rst |  10 +--
>  NEWS                                   |   3 +
>  lib/dp-packet-gso.c                    |  40 +++++++--
>  lib/dp-packet.c                        |   2 +
>  lib/dp-packet.h                        |  22 +++++
>  lib/netdev-dpdk.c                      |  14 ++++
>  lib/netdev-native-tnl.c                |  17 +++-
>  lib/netdev-provider.h                  |   1 +
>  lib/netdev.c                           |  12 ++-
>  tests/dpif-netdev.at                   |  85 ++++++++++++++++---
>  tests/system-traffic.at                | 110 ++++++++++++++++++++++++-
>  11 files changed, 285 insertions(+), 31 deletions(-)
>
> diff --git a/Documentation/topics/userspace-tso.rst
> b/Documentation/topics/userspace-tso.rst
> index a21bb2b5d..a67421024 100644
> --- a/Documentation/topics/userspace-tso.rst
> +++ b/Documentation/topics/userspace-tso.rst
> @@ -110,8 +110,8 @@ Limitations
>  ~~~~~~~~~~~
>
>  The current OvS userspace `TSO` implementation supports flat and VLAN
> networks
> -only (i.e. no support for `TSO` over tunneled connection [VxLAN, GRE, IPinIP,
> -etc.]).
> +only, as well as a limited selection of protocols. Currently VxLAN, Geneve,
> +and GRE are supported.

Strictly speaking, updating the doc about VxLAN/Geneve + TSO should be a
separate fix.

>
>  The NIC driver must support and advertise checksum offload for TCP and UDP.
>  However, SCTP is not mandatory because very few drivers advertised support
> @@ -120,12 +120,6 @@ in Open vSwitch. Currently, if the NIC supports that,
> then the feature is
>  enabled, otherwise TSO can still be enabled but SCTP packets sent to the NIC
>  will be dropped.
>
> -There is no software implementation of TSO, so all ports attached to the
> -datapath must support TSO or packets using that feature will be dropped
> -on ports without TSO support. That also means guests using vhost-user
> -in client mode will receive TSO packet regardless of TSO being enabled
> -or disabled within the guest.
> -

Idem.

>  All kernel devices that use the raw socket interface (veth, for example)
>  require the kernel commit 9d2f67e43b73 ("net/packet: fix packet drop as of
>  virtio gso") in order to work properly. This commit was merged in upstream
> diff --git a/NEWS b/NEWS
> index 200a8bb40..977085402 100644
> --- a/NEWS
> +++ b/NEWS
> @@ -5,6 +5,9 @@ Post-v3.4.0
>        that does not have a specific value defined, rather than being
>        treated as a global value, aligning the behavior with that of
>        the kernel datapath.
> +    * Packets marked for TSO and egressing on an interface which doesn't
> +      support that feature will now be segmented in software, even when
> +      tunnel encapsulated.

I don't think this is something newly added in v3.5.

>    - Linux TC offload:
>      * Add support for matching tunnel flags if the kernel supports it.
>      * Add support for the "Don't Fragment" (DF) flag in the encap action,
> diff --git a/lib/dp-packet-gso.c b/lib/dp-packet-gso.c
> index 04ebb19da..a0ff02442 100644
> --- a/lib/dp-packet-gso.c
> +++ b/lib/dp-packet-gso.c
> @@ -74,6 +74,7 @@ dp_packet_gso_nr_segs(struct dp_packet *p)
>      const char *data_pos;
>
>      if (dp_packet_hwol_is_tunnel_vxlan(p) ||
> +        dp_packet_hwol_is_tunnel_gre(p) ||
>          dp_packet_hwol_is_tunnel_geneve(p)) {
>          data_pos = dp_packet_get_inner_tcp_payload(p);
>      } else {
> @@ -105,7 +106,9 @@ dp_packet_gso(struct dp_packet *p, struct dp_packet_batch
> **batches)
>      bool outer_ipv4;
>      int hdr_len;
>      int seg_len;
> -    bool tnl;
> +    bool udp_tnl = dp_packet_hwol_is_tunnel_vxlan(p) ||
> +                   dp_packet_hwol_is_tunnel_geneve(p);
> +    bool gre_tnl = dp_packet_hwol_is_tunnel_gre(p);
>
>      tso_segsz = dp_packet_get_tso_segsz(p);
>      if (!tso_segsz) {
> @@ -114,11 +117,21 @@ dp_packet_gso(struct dp_packet *p, struct
> dp_packet_batch **batches)
>          return false;
>      }
>
> -    if (dp_packet_hwol_is_tunnel_vxlan(p) ||
> -        dp_packet_hwol_is_tunnel_geneve(p)) {
> +    if (gre_tnl) {
> +        struct gre_base_hdr *ghdr;
> +
> +        ghdr = dp_packet_l4(p);
> +
> +        if (ghdr->flags & htons(GRE_SEQ)) {
> +            /* Cannot support TSO with sequence numbers. */
> +            VLOG_WARN_RL(&rl, "Cannot segmnet GRE packet with sequence "

segment*

> +                         "numbers.");

I am not familiar with GRE. Is using GRE sequence numbers common?

> +        }
> +    }
> +
> +    if (udp_tnl || gre_tnl) {
>          outer_ipv4 = dp_packet_hwol_is_outer_ipv4(p);
>          tcp_hdr = dp_packet_inner_l4(p);
> -        tnl = true;
>
>          if (outer_ipv4) {
>              outer_ip_id = ntohs(((struct ip_header *)
>                                   dp_packet_l3(p))->ip_id);
> @@ -130,7 +143,6 @@ dp_packet_gso(struct dp_packet *p, struct dp_packet_batch
> **batches)
>      } else {
>          outer_ipv4 = dp_packet_hwol_is_ipv4(p);
>          tcp_hdr = dp_packet_l4(p);
> -        tnl = false;
>
>          if (outer_ipv4) {
>              struct ip_header *ip_hdr = dp_packet_l3(p);
> @@ -156,13 +168,15 @@ dp_packet_gso(struct dp_packet *p, struct
> dp_packet_batch **batches)
>          seg = dp_packet_gso_seg_new(p, hdr_len, data_pos, seg_len);
>          data_pos += seg_len;
>
> -        if (tnl) {
> +        if (udp_tnl) {
>              /* Update tunnel UDP header length. */
>              struct udp_header *tnl_hdr;
>
>              tnl_hdr = dp_packet_l4(seg);
>              tnl_hdr->udp_len = htons(dp_packet_l4_size(seg));
> +        }
>
> +        if (udp_tnl || gre_tnl) {
>              /* Update tunnel inner L3 header. */
>              if (dp_packet_hwol_is_ipv4(seg)) {
>                  struct ip_header *ip_hdr = dp_packet_inner_l3(seg);
> @@ -194,7 +208,7 @@ dp_packet_gso(struct dp_packet *p, struct dp_packet_batch
> **batches)
>          }
>
>          /* Update L4 header. */
> -        if (tnl) {
> +        if (udp_tnl || gre_tnl) {
>              tcp_hdr = dp_packet_inner_l4(seg);
>          } else {
>              tcp_hdr = dp_packet_l4(seg);
> @@ -208,6 +222,18 @@ dp_packet_gso(struct dp_packet *p, struct
> dp_packet_batch **batches)
>              tcp_hdr->tcp_ctl = TCP_CTL(tcp_flags, tcp_offset);
>          }
>
> +        if (gre_tnl) {
> +            struct gre_base_hdr *ghdr;
> +
> +            ghdr = dp_packet_l4(seg);
> +
> +            if (ghdr->flags & htons(GRE_CSUM)) {
> +                ovs_be16 *csum_opt = (ovs_be16 *) (ghdr + 1);
> +                *csum_opt = 0;
> +                *csum_opt = csum(ghdr, dp_packet_l4_size(seg));
> +            }
> +        }
> +
>          if (dp_packet_batch_is_full(curr_batch)) {
>              curr_batch++;
>          }
> diff --git a/lib/dp-packet.c b/lib/dp-packet.c
> index df7bf8e6b..dad0d7be3 100644
> --- a/lib/dp-packet.c
> +++ b/lib/dp-packet.c
> @@ -604,6 +604,8 @@ dp_packet_ol_send_prepare(struct dp_packet *p, uint64_t
> flags)
>                                 NETDEV_TX_OFFLOAD_SCTP_CKSUM |
>                                 NETDEV_TX_OFFLOAD_IPV4_CKSUM);
>          }
> +    } else if (dp_packet_hwol_is_tunnel_gre(p)) {
> +        tnl_inner = true;
>      }
>
>      if (dp_packet_hwol_tx_ip_csum(p)) {
> diff --git a/lib/dp-packet.h b/lib/dp-packet.h
> index 4afbbe722..f4094daa8 100644
> --- a/lib/dp-packet.h
> +++ b/lib/dp-packet.h
> @@ -104,6 +104,9 @@ enum dp_packet_offload_mask {
>      /* Offload tunnel packet, outer header is IPv6. */
>      DEF_OL_FLAG(DP_PACKET_OL_TX_OUTER_IPV6,
>                  RTE_MBUF_F_TX_OUTER_IPV6, 0x40000),
> +    /* Offload packet is GRE tunnel. */
> +    DEF_OL_FLAG(DP_PACKET_OL_TX_TUNNEL_GRE,
> +                RTE_MBUF_F_TX_TUNNEL_GRE, 0x80000),
>
>      /* Adding new field requires adding to DP_PACKET_OL_SUPPORTED_MASK. */
>  };
> @@ -123,6 +126,7 @@ enum dp_packet_offload_mask {
>       DP_PACKET_OL_TX_IP_CKSUM |                \
>       DP_PACKET_OL_TX_TUNNEL_GENEVE |           \
>       DP_PACKET_OL_TX_TUNNEL_VXLAN |            \
> +     DP_PACKET_OL_TX_TUNNEL_GRE |              \
>       DP_PACKET_OL_TX_OUTER_IPV4 |              \
>       DP_PACKET_OL_TX_OUTER_IP_CKSUM |          \
>       DP_PACKET_OL_TX_OUTER_UDP_CKSUM |         \
> @@ -1171,6 +1175,13 @@ dp_packet_hwol_is_tunnel_vxlan(struct dp_packet *b)
>      return !!(*dp_packet_ol_flags_ptr(b) & DP_PACKET_OL_TX_TUNNEL_VXLAN);
>  }
>
> +/* Returns 'true' if packet 'b' is marked for GRE tunnel offloading. */
> +static inline bool
> +dp_packet_hwol_is_tunnel_gre(struct dp_packet *b)
> +{
> +    return !!(*dp_packet_ol_flags_ptr(b) & DP_PACKET_OL_TX_TUNNEL_GRE);
> +}
> +
>  /* Returns 'true' if packet 'b' is marked for outer IPv4 checksum offload. */
>  static inline bool
>  dp_packet_hwol_is_outer_ipv4_cksum(const struct dp_packet *b)
> @@ -1289,11 +1300,19 @@ dp_packet_hwol_set_tunnel_vxlan(struct dp_packet *b)
>      *dp_packet_ol_flags_ptr(b) |= DP_PACKET_OL_TX_TUNNEL_VXLAN;
>  }
>
> +/* Mark packet 'b' for GRE tunnel offloading. */
> +static inline void
> +dp_packet_hwol_set_tunnel_gre(struct dp_packet *b)
> +{
> +    *dp_packet_ol_flags_ptr(b) |= DP_PACKET_OL_TX_TUNNEL_GRE;
> +}
> +
>  /* Clears tunnel offloading marks. */
>  static inline void
>  dp_packet_hwol_reset_tunnel(struct dp_packet *b)
>  {
>      *dp_packet_ol_flags_ptr(b) &= ~(DP_PACKET_OL_TX_TUNNEL_VXLAN |
> +                                    DP_PACKET_OL_TX_TUNNEL_GRE |
>                                      DP_PACKET_OL_TX_TUNNEL_GENEVE);
>  }
>
> @@ -1352,6 +1371,9 @@ dp_packet_hwol_reset_tcp_seg(struct dp_packet *p)
>              ol_flags |= DP_PACKET_OL_TX_OUTER_IP_CKSUM;
>          }
>          ol_flags |= DP_PACKET_OL_TX_OUTER_UDP_CKSUM;
> +    } else if (ol_flags & (DP_PACKET_OL_TX_TUNNEL_GRE &
> +                           DP_PACKET_OL_TX_OUTER_IPV4)) {

Combining the two flag masks with a bitwise AND yields 0 here, so this
condition can never be true. You need to check that both bits are set.
Something like:

    (ol_flags & (DP_PACKET_OL_TX_TUNNEL_GRE | DP_PACKET_OL_TX_OUTER_IPV4))
        == (DP_PACKET_OL_TX_TUNNEL_GRE | DP_PACKET_OL_TX_OUTER_IPV4)

(note the outer parentheses: '==' binds tighter than '&' in C).

> +        ol_flags |= DP_PACKET_OL_TX_OUTER_IP_CKSUM;
>      }
>
>      *dp_packet_ol_flags_ptr(p) = ol_flags;
> diff --git a/lib/netdev-dpdk.c b/lib/netdev-dpdk.c
> index e454a4a5d..36f56a243 100644
> --- a/lib/netdev-dpdk.c
> +++ b/lib/netdev-dpdk.c
> @@ -421,6 +421,7 @@ enum dpdk_hw_ol_features {
>      NETDEV_TX_GENEVE_TNL_TSO_OFFLOAD = 1 << 9,
>      NETDEV_TX_OUTER_IP_CKSUM_OFFLOAD = 1 << 10,
>      NETDEV_TX_OUTER_UDP_CKSUM_OFFLOAD = 1 << 11,
> +    NETDEV_TX_GRE_TNL_TSO_OFFLOAD = 1 << 12,
>  };
>
>  enum dpdk_rx_steer_flags {
> @@ -1084,6 +1085,8 @@ netdev_dpdk_update_netdev_flags(struct netdev_dpdk *dev)
>                                     NETDEV_TX_OFFLOAD_TCP_TSO);
>      netdev_dpdk_update_netdev_flag(dev, NETDEV_TX_VXLAN_TNL_TSO_OFFLOAD,
>                                     NETDEV_TX_VXLAN_TNL_TSO);
> +    netdev_dpdk_update_netdev_flag(dev, NETDEV_TX_GRE_TNL_TSO_OFFLOAD,
> +                                   NETDEV_TX_GRE_TNL_TSO);
>      netdev_dpdk_update_netdev_flag(dev, NETDEV_TX_GENEVE_TNL_TSO_OFFLOAD,
>                                     NETDEV_TX_GENEVE_TNL_TSO);
>      netdev_dpdk_update_netdev_flag(dev, NETDEV_TX_OUTER_IP_CKSUM_OFFLOAD,
> @@ -1152,6 +1155,10 @@ dpdk_eth_dev_port_config(struct netdev_dpdk *dev, int
> n_rxq, int n_txq)
>          conf.txmode.offloads |= RTE_ETH_TX_OFFLOAD_GENEVE_TNL_TSO;
>      }
>
> +    if (dev->hw_ol_features & NETDEV_TX_GRE_TNL_TSO_OFFLOAD) {
> +        conf.txmode.offloads |= RTE_ETH_TX_OFFLOAD_GRE_TNL_TSO;
> +    }
> +
>      if (dev->hw_ol_features & NETDEV_TX_OUTER_IP_CKSUM_OFFLOAD) {
>          conf.txmode.offloads |= RTE_ETH_TX_OFFLOAD_OUTER_IPV4_CKSUM;
>      }
> @@ -1424,6 +1431,13 @@ dpdk_eth_dev_init(struct netdev_dpdk *dev)
>              VLOG_WARN("%s: Tx Geneve tunnel TSO offload is not supported.",
>                        netdev_get_name(&dev->up));
>          }
> +
> +        if (info.tx_offload_capa & RTE_ETH_TX_OFFLOAD_GRE_TNL_TSO) {
> +            dev->hw_ol_features |= NETDEV_TX_GRE_TNL_TSO_OFFLOAD;
> +        } else {
> +            VLOG_WARN("%s: Tx GRE tunnel TSO offload is not supported.",
> +                      netdev_get_name(&dev->up));
> +        }
>      }
>
>      n_rxq = MIN(info.max_rx_queues, dev->up.n_rxq);
> diff --git a/lib/netdev-native-tnl.c b/lib/netdev-native-tnl.c
> index 3e609cf54..b92ebe185 100644
> --- a/lib/netdev-native-tnl.c
> +++ b/lib/netdev-native-tnl.c
> @@ -174,7 +174,8 @@ netdev_tnl_push_ip_header(struct dp_packet *packet, const
> void *header,
>      packet->l4_ofs = dp_packet_size(packet) - *ip_tot_size;
>
>      if (dp_packet_hwol_is_tunnel_geneve(packet) ||
> -        dp_packet_hwol_is_tunnel_vxlan(packet)) {
> +        dp_packet_hwol_is_tunnel_vxlan(packet) ||
> +        dp_packet_hwol_is_tunnel_gre(packet)) {
>          dp_packet_hwol_set_tx_outer_ipv6(packet);
>      } else {
>          dp_packet_hwol_set_tx_ipv6(packet);
> @@ -187,7 +188,8 @@ netdev_tnl_push_ip_header(struct dp_packet *packet, const
> void *header,
>      ip->ip_tot_len = htons(*ip_tot_size);
>      /* Postpone checksum to when the packet is pushed to the port. */
>      if (dp_packet_hwol_is_tunnel_geneve(packet) ||
> -        dp_packet_hwol_is_tunnel_vxlan(packet)) {
> +        dp_packet_hwol_is_tunnel_vxlan(packet) ||
> +        dp_packet_hwol_is_tunnel_gre(packet)) {
>          dp_packet_hwol_set_tx_outer_ipv4(packet);
>          dp_packet_hwol_set_tx_outer_ipv4_csum(packet);
>      } else {
> @@ -510,9 +512,13 @@ netdev_gre_push_header(const struct netdev *netdev,
>                         const struct ovs_action_push_tnl *data)
>  {
>      struct netdev_vport *dev = netdev_vport_cast(netdev);
> +    uint16_t l3_ofs = packet->l3_ofs;
> +    uint16_t l4_ofs = packet->l4_ofs;
>      struct gre_base_hdr *greh;
>      int ip_tot_size;
>
> +    dp_packet_hwol_set_tunnel_gre(packet);
> +
>      greh = netdev_tnl_push_ip_header(packet, data->header, data->header_len,
>                                       &ip_tot_size, 0);
>
> @@ -528,6 +534,13 @@ netdev_gre_push_header(const struct netdev *netdev,
>              ALIGNED_CAST(ovs_16aligned_be32 *, (char *)greh + seq_ofs);
>          put_16aligned_be32(seq_opt,
>                             htonl(atomic_count_inc(&dev->gre_seqno)));
>      }
> +
> +    if (l3_ofs != UINT16_MAX) {
> +        packet->inner_l3_ofs = l3_ofs + data->header_len;
> +    }
> +    if (l4_ofs != UINT16_MAX) {
> +        packet->inner_l4_ofs = l4_ofs + data->header_len;
> +    }
>  }
>
>  int
> diff --git a/lib/netdev-provider.h b/lib/netdev-provider.h
> index 22840a058..5ae379469 100644
> --- a/lib/netdev-provider.h
> +++ b/lib/netdev-provider.h
> @@ -47,6 +47,7 @@ enum netdev_ol_flags {
>      NETDEV_TX_GENEVE_TNL_TSO = 1 << 6,
>      NETDEV_TX_OFFLOAD_OUTER_IP_CKSUM = 1 << 7,
>      NETDEV_TX_OFFLOAD_OUTER_UDP_CKSUM = 1 << 8,
> +    NETDEV_TX_GRE_TNL_TSO = 1 << 9,
>  };
>
>  /* A network device (e.g. an Ethernet device).
> diff --git a/lib/netdev.c b/lib/netdev.c
> index 02beac9d0..96f46ba92 100644
> --- a/lib/netdev.c
> +++ b/lib/netdev.c
> @@ -916,10 +916,12 @@ netdev_send(struct netdev *netdev, int qid, struct
> dp_packet_batch *batch,
>              }
>          }
>      } else if (!(netdev_flags & (NETDEV_TX_VXLAN_TNL_TSO |
> +                                 NETDEV_TX_GRE_TNL_TSO |
>                                   NETDEV_TX_GENEVE_TNL_TSO))) {
>          DP_PACKET_BATCH_FOR_EACH (i, packet, batch) {
>              if (dp_packet_hwol_is_tso(packet) &&
>                  (dp_packet_hwol_is_tunnel_vxlan(packet) ||
> +                 dp_packet_hwol_is_tunnel_gre(packet) ||
>                   dp_packet_hwol_is_tunnel_geneve(packet))) {
>                  return netdev_send_tso(netdev, qid, batch,
>                                         concurrent_txq);
>              }
> @@ -1011,6 +1013,8 @@ netdev_push_header(const struct netdev *netdev,
>      DP_PACKET_BATCH_REFILL_FOR_EACH (i, size, packet, batch) {
>          if (OVS_UNLIKELY(data->tnl_type != OVS_VPORT_TYPE_GENEVE &&
>                           data->tnl_type != OVS_VPORT_TYPE_VXLAN &&
> +                         data->tnl_type != OVS_VPORT_TYPE_GRE &&
> +                         data->tnl_type != OVS_VPORT_TYPE_IP6GRE &&
>                           dp_packet_hwol_is_tso(packet))) {
>              COVERAGE_INC(netdev_push_header_drops);
>              dp_packet_delete(packet);
> @@ -1019,16 +1023,19 @@ netdev_push_header(const struct netdev *netdev,
>                           netdev_get_name(netdev), netdev_get_type(netdev));
>          } else {
>              if (data->tnl_type != OVS_VPORT_TYPE_GENEVE &&
> -                data->tnl_type != OVS_VPORT_TYPE_VXLAN) {
> +                data->tnl_type != OVS_VPORT_TYPE_VXLAN &&
> +                data->tnl_type != OVS_VPORT_TYPE_GRE &&
> +                data->tnl_type != OVS_VPORT_TYPE_IP6GRE) {
>                  dp_packet_ol_send_prepare(packet, 0);
>              } else if (dp_packet_hwol_is_tunnel_geneve(packet) ||
> +                       dp_packet_hwol_is_tunnel_gre(packet) ||
>                         dp_packet_hwol_is_tunnel_vxlan(packet)) {
>                  if (dp_packet_hwol_is_tso(packet)) {
>                      COVERAGE_INC(netdev_push_header_drops);
>                      dp_packet_delete(packet);
>                      VLOG_WARN_RL(&rl, "%s: Tunneling packets with TSO is not "
>                                   "supported with multiple levels of "
> -                                 "VXLAN or GENEVE encapsulation.",
> +                                 "VXLAN, GENEVE, or GRE encapsulation.",
>                                   netdev_get_name(netdev));
>                      continue;
>                  }
> @@ -1480,6 +1487,7 @@ netdev_get_status(const struct netdev *netdev, struct
> smap *smap)
>          OL_ADD_STAT("sctp_csum", NETDEV_TX_OFFLOAD_SCTP_CKSUM);
>          OL_ADD_STAT("tcp_seg", NETDEV_TX_OFFLOAD_TCP_TSO);
>          OL_ADD_STAT("vxlan_tso", NETDEV_TX_VXLAN_TNL_TSO);
> +        OL_ADD_STAT("gre_tso", NETDEV_TX_GRE_TNL_TSO);
>          OL_ADD_STAT("geneve_tso", NETDEV_TX_GENEVE_TNL_TSO);
>          OL_ADD_STAT("out_ip_csum", NETDEV_TX_OFFLOAD_OUTER_IP_CKSUM);
>          OL_ADD_STAT("out_udp_csum", NETDEV_TX_OFFLOAD_OUTER_UDP_CKSUM);

[snip]

--
David Marchand

_______________________________________________
dev mailing list
[email protected]
https://mail.openvswitch.org/mailman/listinfo/ovs-dev
