The outer UDP checksum of TSO packets can be computed in software
regardless of how segmentation is done (SW or HW).
This property comes from nested checksums: a valid inner L4 checksum
cancels the contribution of the inner L4 payload to the outer checksum.
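
To illustrate the nested checksum property, here is a minimal Python sketch.
It is not the code from this series: the helper names (ones_sum,
outer_csum_fast, ...) and the placeholder header bytes are made up for the
example, and it assumes an inner TCP segment with a valid checksum that
starts at an even offset in the outer UDP payload. Under those assumptions,
the inner L4 bytes contribute the one's complement of the inner
pseudo-header sum, so the payload never needs to be read.

```python
import struct


def ones_sum(data: bytes, acc: int = 0) -> int:
    """16-bit one's-complement sum of data, added to acc and folded."""
    if len(data) % 2:
        data += b"\x00"  # pad odd-length input with a trailing zero byte
    for (word,) in struct.iter_unpack("!H", data):
        acc += word
    while acc >> 16:
        acc = (acc & 0xFFFF) + (acc >> 16)
    return acc


def csum(data: bytes) -> int:
    """Internet checksum: complement of the one's-complement sum."""
    return ~ones_sum(data) & 0xFFFF


def pseudo_header(src: bytes, dst: bytes, proto: int, l4_len: int) -> bytes:
    """IPv4 pseudo-header used for L4 checksums."""
    return src + dst + struct.pack("!BBH", 0, proto, l4_len)


def tcp_segment(pseudo: bytes, payload: bytes) -> bytes:
    """Hypothetical 20-byte TCP header plus payload, with a valid checksum."""
    hdr = bytearray(20)
    struct.pack_into("!HHIIBBH", hdr, 0, 1234, 80, 1, 0, 5 << 4, 0x10, 65535)
    # The checksum field sits at offset 16 and is zero while summing.
    struct.pack_into("!H", hdr, 16, csum(pseudo + bytes(hdr) + payload))
    return bytes(hdr) + payload


def outer_csum_full(outer_hdrs: bytes, inner_l4: bytes) -> int:
    """Reference: sum every byte covered by the outer UDP checksum."""
    return csum(outer_hdrs + inner_l4)


def outer_csum_fast(outer_hdrs: bytes, inner_pseudo: bytes) -> int:
    """Shortcut: a valid inner L4 checksum makes pseudo-header + segment sum
    to 0xFFFF, so the segment alone sums to ~sum(pseudo-header)."""
    inner_l4_sum = ~ones_sum(inner_pseudo) & 0xFFFF
    return ~ones_sum(outer_hdrs, inner_l4_sum) & 0xFFFF
```

Here outer_hdrs stands for everything the outer UDP checksum covers before
the inner L4 header (outer pseudo-header, outer UDP header, tunnel header,
inner Ethernet and inner IP headers). It depends on the segment length but
not on the payload content, which is why the result holds for each segment
whether segmentation happens in software or in hardware.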

Performance was evaluated using iperf between two virtual machines hosted
by two RHEL 9 hypervisors running OVS-DPDK.
The main limiting factor is the receiving endpoint, as the receiving
virtual machine seems too slow to dequeue packets coming from the
hypervisor (tx retries/tx drops seen on the vhost-user port).

A little warning on those numbers, which should be taken simply as an
indication of the improvement and not as absolute numbers:
- the VMs and OVS-DPDK instances were running from numa 0 while the E810
  nic is placed in numa 1,
- the random drops on the receiving side mean that there is some variance
  in the numbers between runs.

Main branch:
Switching to 0000:3b:00.0 (mlx5_core)
IPv4/IPv4 [  5]   0.00-1.00   sec   753 MBytes  6.31 Gbits/sec  816   1.31 MBytes
IPv4/IPv6 [  5]   0.00-1.00   sec   761 MBytes  6.39 Gbits/sec    0   3.10 MBytes
IPv6/IPv4 [  5]   0.00-1.00   sec   751 MBytes  6.30 Gbits/sec  422   1.19 MBytes
IPv6/IPv6 [  5]   0.00-1.00   sec   757 MBytes  6.35 Gbits/sec    0   3.12 MBytes
Switching to 0000:5e:00.0 (i40e)
IPv4/IPv4 [  5]   0.00-1.00   sec   637 MBytes  5.35 Gbits/sec    0   2.06 MBytes
IPv4/IPv6 [  5]   0.00-1.00   sec   655 MBytes  5.49 Gbits/sec    0   2.52 MBytes
IPv6/IPv4 [  5]   0.00-1.00   sec   649 MBytes  5.44 Gbits/sec  256   1.68 MBytes
IPv6/IPv6 [  5]   0.00-1.00   sec   642 MBytes  5.38 Gbits/sec    0   2.16 MBytes
Switching to 0000:d8:00.0 (ice)
IPv4/IPv4 [  5]   0.00-1.00   sec  1.24 GBytes  10.6 Gbits/sec   60   1.51 MBytes
IPv4/IPv6 [  5]   0.00-1.00   sec  1.18 GBytes  10.2 Gbits/sec   93   2.29 MBytes
IPv6/IPv4 [  5]   0.00-1.00   sec  1.11 GBytes  9.52 Gbits/sec   98   1.19 MBytes
IPv6/IPv6 [  5]   0.00-1.00   sec  1.12 GBytes  9.62 Gbits/sec  122   2.30 MBytes

After series:
Switching to 0000:3b:00.0 (mlx5_core)
IPv4/IPv4 [  5]   0.00-1.00   sec  1.15 GBytes  9.87 Gbits/sec  167   1.07 MBytes
IPv4/IPv6 [  5]   0.00-1.00   sec  1.22 GBytes  10.5 Gbits/sec  164   1.37 MBytes
IPv6/IPv4 [  5]   0.00-1.00   sec  1.23 GBytes  10.5 Gbits/sec  862    998 KBytes
IPv6/IPv6 [  5]   0.00-1.00   sec  1.20 GBytes  10.3 Gbits/sec  140   1.29 MBytes
Switching to 0000:5e:00.0 (i40e)
IPv4/IPv4 [  5]   0.00-1.00   sec  1.10 GBytes  9.44 Gbits/sec  172   1.36 MBytes
IPv4/IPv6 [  5]   0.00-1.00   sec  1.16 GBytes  9.94 Gbits/sec  520   1.04 MBytes
IPv6/IPv4 [  5]   0.00-1.00   sec  1.20 GBytes  10.3 Gbits/sec  389   1.11 MBytes
IPv6/IPv6 [  5]   0.00-1.00   sec  1.20 GBytes  10.3 Gbits/sec  396    835 KBytes
Switching to 0000:d8:00.0 (ice)
IPv4/IPv4 [  5]   0.00-1.00   sec  1.22 GBytes  10.5 Gbits/sec   64   1.45 MBytes
IPv4/IPv6 [  5]   0.00-1.00   sec  1.19 GBytes  10.2 Gbits/sec  120   1.84 MBytes
IPv6/IPv4 [  5]   0.00-1.00   sec  1.21 GBytes  10.4 Gbits/sec 1876   1.15 MBytes
IPv6/IPv6 [  5]   0.00-1.00   sec  1.10 GBytes  9.47 Gbits/sec   96   1.66 MBytes

-- 
David Marchand

Changes since v2:
- dropped dependency on scapy (Ilya, Mike),
- factorised csum=false/csum=true tests,
- enhanced coverage in tunnel tests (caught by Claude),
- removed some asserts in optimised udp csum helper (Mike),
- removed special case of inner udp in optimised udp csum helper (Kevin),
- fixed tunnel flag stripping for TSO packets (Kevin),
- added some documentation on partial segmentation (Kevin),

Changes since v1:
- dropped patch 1, it may be worked on in a separate series, in the future,
- refactored checksum tests,
- added coverage for checksum with tunnels,
- fixed bugs in the optimised helper for null, unknown and bad inner IP or
  L4 checksums,

David Marchand (11):
  netdev-dpdk: Fix rx queue fill level with QoS.
  netdev-dpdk: Enforce mono-segment mbufs.
  netdev-dpdk: Fix TSO packet length check for tunnels.
  dpif-netdev.at: Rename checksum offloads tests.
  dpif-netdev.at: Add helpers for checksum tests.
  dpif-netdev: Enhance checksum coverage for tunnels.
  dp-packet-gso: Request UDP checksum when needed.
  dp-packet: Strip tunnel info when unneeded.
  dp-packet: Optimize outer checksum for nested checksums.
  dp-packet-gso: Refactor software segmentation code.
  netdev: Use HW segmentation without outer UDP checksum.

 Documentation/topics/userspace-tso.rst |   26 +
 lib/dp-packet-gso.c                    |  304 +--
 lib/dp-packet-gso.h                    |    4 +-
 lib/dp-packet.c                        |   37 +-
 lib/dp-packet.h                        |   24 +
 lib/netdev-dpdk.c                      |  119 +-
 lib/netdev-dummy.c                     |   83 +-
 lib/netdev.c                           |   36 +-
 lib/packets.c                          |  140 ++
 lib/packets.h                          |    1 +
 tests/dpif-netdev.at                   | 2698 +++++++++++++++++-------
 11 files changed, 2459 insertions(+), 1013 deletions(-)

-- 
2.51.1
