On 3/27/23 12:50, Mike Pattrick wrote:
> From: Flavio Leitner <[email protected]>
>
> The netdev receiving packets is supposed to provide the flags
> indicating if the IP checksum was verified and it is GOOD or BAD,
> otherwise the stack will check when appropriate by software.
>
> If the packet comes with good checksum, then postpone the
> checksum calculation to the egress device if needed.
>
> When encapsulate a packet with that flag, set the checksum
> of the inner IP header since that is not yet supported.
>
> Calculate the IP checksum when the packet is going to be sent over
> a device that doesn't support the feature.
>
> Linux devices don't support IP checksum offload alone, so the
> support is not enabled.
>
> Signed-off-by: Flavio Leitner <[email protected]>
> Co-authored-by: Mike Pattrick <[email protected]>
> Signed-off-by: Mike Pattrick <[email protected]>
> --
> Since v9:
> - Removed duplicative field tx_ip_csum_offload from netdev-dpdk.c
> - Left rx_csum_offload field as it is not duplicative
> - Moved system-userspace-offload.at tests to dpif-netdev.at
> - Various visual changes
> - Extended miniflow_extract changes into avx512 code
> Since v10:
> - avx512 checksum length corrected
> ---
> lib/conntrack.c | 19 ++++----
> lib/dp-packet.c | 15 ++++++
> lib/dp-packet.h | 62 +++++++++++++++++++++++--
> lib/dpif-netdev-extract-avx512.c | 5 ++
> lib/dpif-netdev.c | 2 +
> lib/flow.c | 15 ++++--
> lib/ipf.c | 11 +++--
> lib/netdev-dpdk.c | 71 +++++++++++++++++++----------
> lib/netdev-dummy.c | 23 ++++++++++
> lib/netdev-native-tnl.c | 21 ++++++---
> lib/netdev.c | 16 +++++++
> lib/odp-execute-avx512.c | 19 +++++---
> lib/odp-execute.c | 21 +++++++--
> lib/packets.c | 34 +++++++++++---
> tests/dpif-netdev.at | 78 ++++++++++++++++++++++++++++++++
> 15 files changed, 345 insertions(+), 67 deletions(-)
>
<snip>
> diff --git a/lib/netdev-native-tnl.c b/lib/netdev-native-tnl.c
> index b89dfdd52..53055a254 100644
> --- a/lib/netdev-native-tnl.c
> +++ b/lib/netdev-native-tnl.c
> @@ -88,7 +88,10 @@ netdev_tnl_ip_extract_tnl_md(struct dp_packet *packet,
> struct flow_tnl *tnl,
>
> ovs_be32 ip_src, ip_dst;
>
> - if (OVS_UNLIKELY(!dp_packet_ip_checksum_valid(packet))) {
> + /* A packet coming from a network device might have the
> + * csum already checked. In this case, skip the check. */
> + if (OVS_UNLIKELY(!dp_packet_ip_checksum_good(packet))
> + && !dp_packet_hwol_tx_ip_csum(packet)) {
> if (csum(ip, IP_IHL(ip->ip_ihl_ver) * 4)) {
> VLOG_WARN_RL(&err_rl, "ip packet has invalid checksum");
> return NULL;
> @@ -142,7 +145,8 @@ netdev_tnl_ip_extract_tnl_md(struct dp_packet *packet,
> struct flow_tnl *tnl,
> *
> * This function sets the IP header's ip_tot_len field (which should be
> zeroed
> * as part of 'header') and puts its value into '*ip_tot_size' as well. Also
> - * updates IP header checksum, as well as the l3 and l4 offsets in 'packet'.
> + * updates IP header checksum if not offloaded, as well as the l3 and l4
> + * offsets in the 'packet'.
> *
> * Return pointer to the L4 header added to 'packet'. */
> void *
> @@ -167,11 +171,16 @@ netdev_tnl_push_ip_header(struct dp_packet *packet,
> *ip_tot_size -= IPV6_HEADER_LEN;
> ip6->ip6_plen = htons(*ip_tot_size);
> packet->l4_ofs = dp_packet_size(packet) - *ip_tot_size;
> + dp_packet_hwol_set_tx_ipv6(packet);
> + dp_packet_ol_reset_ip_csum_good(packet);
> return ip6 + 1;
> } else {
> ip = netdev_tnl_ip_hdr(eth);
> ip->ip_tot_len = htons(*ip_tot_size);
> - ip->ip_csum = recalc_csum16(ip->ip_csum, 0, ip->ip_tot_len);
> + /* Postpone checksum to when the packet is pushed to the port. */
> + dp_packet_hwol_set_tx_ipv4(packet);
> + dp_packet_hwol_set_tx_ip_csum(packet);
> + dp_packet_ol_reset_ip_csum_good(packet);
> *ip_tot_size -= IP_HEADER_LEN;
> packet->l4_ofs = dp_packet_size(packet) - *ip_tot_size;
> return ip + 1;
> @@ -190,7 +199,7 @@ udp_extract_tnl_md(struct dp_packet *packet, struct
> flow_tnl *tnl,
> }
>
> if (udp->udp_csum) {
> - if (OVS_UNLIKELY(!dp_packet_l4_checksum_valid(packet))) {
> + if (OVS_UNLIKELY(!dp_packet_l4_checksum_good(packet))) {
> uint32_t csum;
> if (netdev_tnl_is_header_ipv6(dp_packet_data(packet))) {
> csum = packet_csum_pseudoheader6(dp_packet_l3(packet));
> @@ -297,8 +306,8 @@ netdev_tnl_ip_build_header(struct ovs_action_push_tnl
> *data,
> ip->ip_frag_off = (params->flow->tunnel.flags &
> FLOW_TNL_F_DONT_FRAGMENT) ?
> htons(IP_DF) : 0;
>
> - /* Checksum has already been zeroed by eth_build_header. */
> - ip->ip_csum = csum(ip, sizeof *ip);
> + /* The checksum will be calculated when the headers are pushed
> + * to the packet if offloading is not enabled. */
It seems like we're assuming here that the actual header push will be
performed by netdev_tnl_push_ip_header(). But that will not be the
case for the rte_flow tunnel offloading. In that case the actual
tunnel push will be performed by the hardware. It will work for vxlan,
but it will not work for the RTE_FLOW_ACTION_TYPE_RAW_ENCAP, because
HW will just prepend the raw header that doesn't have correct checksums.
Is my understanding correct?
P.S. I suppose IPv6 tunnels are not working with RAW_ENCAP even without
the change due to UDP checksum being required and there is no way to
calculate it without the packet. But IPv4 tunnels should work.
Best regards, Ilya Maximets.
_______________________________________________
dev mailing list
[email protected]
https://mail.openvswitch.org/mailman/listinfo/ovs-dev