On 12/3/2019 12:15 PM, Ilya Maximets wrote:
On 02.12.2019 14:44, Flavio Leitner wrote:
TCP Segmentation Offload (TSO) is a feature which enables the network
stack to delegate the TCP segmentation to the NIC, reducing the per-packet
CPU overhead.

A guest using vhost-user interface with TSO enabled can send TCP packets
much bigger than the MTU, which saves CPU cycles normally used to break
the packets down to MTU size and to calculate checksums.

It also saves CPU cycles used to parse multiple packets/headers during
the packet processing inside virtual switch.

If the destination of the packet is another guest in the same host, then
the same big packet can be sent through a vhost-user interface skipping
the segmentation completely. However, if the destination is not local,
the NIC hardware is instructed to do the TCP segmentation and checksum
calculation.
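
This is roughly how such a packet is described to the NIC via the DPDK mbuf
offload metadata (a minimal sketch using the DPDK 19.11 flag names; the
helper below is only illustrative and not taken verbatim from the patches):

    #include <rte_ether.h>
    #include <rte_ip.h>
    #include <rte_mbuf.h>
    #include <rte_tcp.h>

    /* Mark an IPv4/TCP mbuf so the NIC performs the segmentation and
     * checksum calculation instead of the CPU.  Assumes a TCP header
     * without options. */
    static void
    mark_mbuf_for_tso(struct rte_mbuf *m, uint16_t mtu)
    {
        m->l2_len = sizeof(struct rte_ether_hdr);
        m->l3_len = sizeof(struct rte_ipv4_hdr);
        m->l4_len = sizeof(struct rte_tcp_hdr);
        /* Each resulting segment carries at most this much TCP payload. */
        m->tso_segsz = mtu - m->l3_len - m->l4_len;

        m->ol_flags |= PKT_TX_IPV4 | PKT_TX_IP_CKSUM
                       | PKT_TX_TCP_CKSUM | PKT_TX_TCP_SEG;
    }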

The first 3 patches are not really part of TSO support, but they are
required to make sure everything works.

TSO is only enabled on vhost-user ports in client mode, which means DPDK
is required. The other ports (dpdk or linux) can receive those packets.

This patchset is based on branch dpdk-latest (v19.11 required).

The numbers look good and I have noticed no regression so far. However, my
environment is tuned but not finely tuned, so consider these ballpark
numbers only.

This is VM-to-external host (Gbits/sec)
                | No patch  |  TSO off  |  TSO on
               ---------------------------------------------
TCPv4          |    10     |    10     |   38  (line rate)
-----------------------------------------------------------
TCPv6          |     9     |     9     |   38  (line rate)
-----------------------------------------------------------
V4 Conntrack   |     1     |     1     |   33


This is VM-to-VM in the same host (Gbits/sec)
                | No patch  |  TSO off  |  TSO on
               ---------------------------------------------
TCPv4          |     9.4   |    9.4    |   31
-----------------------------------------------------------
TCPv6          |     8     |    8      |   30
-----------------------------------------------------------

There are good improvements when sending to veth pairs or tap devices too.

Flavio Leitner (4):
   dp-packet: preserve headroom when cloning a pkt batch
   vhost: Disable multi-segmented buffers
   dp-packet: handle new dpdk buffer flags
   netdev-dpdk: Add TCP Segmentation Offload support

  Documentation/automake.mk           |   1 +
  Documentation/topics/dpdk/index.rst |   1 +
  Documentation/topics/dpdk/tso.rst   |  83 +++++++++++++++++++++
  NEWS                                |   1 +
  lib/automake.mk                     |   2 +
  lib/dp-packet.c                     |  50 ++++++++++++-
  lib/dp-packet.h                     |  38 ++++++++--
  lib/netdev-bsd.c                    |   7 ++
  lib/netdev-dpdk.c                   | 112 +++++++++++++++++++++++-----
  lib/netdev-dummy.c                  |   7 ++
  lib/netdev-linux.c                  |  71 ++++++++++++++++--
  lib/tso.c                           |  54 ++++++++++++++
  lib/tso.h                           |  23 ++++++
  vswitchd/bridge.c                   |   2 +
  vswitchd/vswitch.xml                |  14 ++++
  15 files changed, 430 insertions(+), 36 deletions(-)
  create mode 100644 Documentation/topics/dpdk/tso.rst
  create mode 100644 lib/tso.c
  create mode 100644 lib/tso.h



Hi Flavio,

Thanks for working on this.
I didn't read the code carefully, just a couple of points from a first look:

* Patch-set needs to have 'dpdk-latest' in the subject prefix.


FYI, I'm creating a patch today to move master to 19.11, just validating at the moment. Will send this evening.

Ian

* First patch seems to be OK even without TSO; batch_clone usually
   happens on a clone action, which implies further packet modifications,
   so it might make sense to have headroom for that.

* Maybe we need to enable LINEARBUF for the usual vhost-user too, just to
   avoid receiving multi-segment mbufs?
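   A minimal sketch of what that could look like when registering the
   vhost-user socket, using the DPDK 19.11 vhost library flags (the helper
   name and socket path are only illustrative):

       #include <rte_vhost.h>

       /* Ask the vhost library to deliver only linear, single-segment
        * mbufs; EXTBUF lets oversized packets still arrive via an
        * externally attached buffer instead of an mbuf chain. */
       static int
       register_vhost_linearbuf(const char *sock_path)
       {
           uint64_t flags = RTE_VHOST_USER_CLIENT
                            | RTE_VHOST_USER_LINEARBUF_SUPPORT
                            | RTE_VHOST_USER_EXTBUF_SUPPORT;

           return rte_vhost_driver_register(sock_path, flags);
       }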

* Third patch is not needed.  Similar patch was already merged recently.

* I don't see the call to rte_eth_tx_prepare().  It is required to prepare
   packets for TSO on Intel NICs that need the pseudo-header checksum
   precalculated.
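   Something along these lines on the send path would cover it (a minimal
   sketch; error handling and the queue bookkeeping in netdev-dpdk are more
   involved):

       #include <rte_ethdev.h>
       #include <rte_mbuf.h>

       /* Run the driver's prepare hook (e.g. pseudo-header checksum
        * fixups) before the actual burst; drop whatever the device
        * cannot handle. */
       static uint16_t
       send_burst(uint16_t port_id, uint16_t queue_id,
                  struct rte_mbuf **pkts, uint16_t n)
       {
           uint16_t nb_prep = rte_eth_tx_prepare(port_id, queue_id, pkts, n);

           if (nb_prep != n) {
               uint16_t i;

               /* pkts[nb_prep] was rejected by the device. */
               for (i = nb_prep; i < n; i++) {
                   rte_pktmbuf_free(pkts[i]);
               }
           }

           return rte_eth_tx_burst(port_id, queue_id, pkts, nb_prep);
       }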

* The guest is allowed to disable offloading.  In this case we're not
   allowed to send incomplete packets (no checksum, oversized); they will
   just be dropped by the guest.

* A NIC might not support TSO.  I see that you're moving this issue to the
   user by forcing them to make sure that TSO is supported.  Most HW NICs
   nowadays should support TSO, so it might not be a big deal, but what
   about the software NICs that DPDK implements?  I'm not sure.

* Don't include virtio headers in the dp-packet code.  This will break
   non-Linux builds.  In fact, construction of the virtio-net header is
   only needed for netdev-linux, so it should be implemented there.
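   For illustration, a rough sketch of building that header with the Linux
   UAPI definitions (the helper is hypothetical and assumes IPv4/TCP
   without options):

       #include <linux/tcp.h>
       #include <linux/virtio_net.h>
       #include <stddef.h>
       #include <stdint.h>
       #include <string.h>

       /* Fill a virtio-net header for an IPv4 TCP packet that needs
        * checksum and GSO handling; 'l4_ofs' is the offset of the TCP
        * header within the packet and 'mss' the desired segment size. */
       static void
       build_virtio_net_hdr(struct virtio_net_hdr *vnet,
                            uint16_t l4_ofs, uint16_t mss)
       {
           memset(vnet, 0, sizeof *vnet);
           vnet->flags = VIRTIO_NET_HDR_F_NEEDS_CSUM;
           vnet->gso_type = VIRTIO_NET_HDR_GSO_TCPV4;
           vnet->gso_size = mss;
           vnet->hdr_len = l4_ofs + sizeof(struct tcphdr);
           vnet->csum_start = l4_ofs;
           vnet->csum_offset = offsetof(struct tcphdr, check);
       }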

* netdev-afxdp and netdev-windows are not handled by the patches.

* I'm not sure if we need separate tso.{c,h} files.

* Docs and log messages are very confusing because they give the
   impression that the system datapath doesn't support TSO.  Also, why
   can't we have TSO without DPDK?  You have a virtio header for
   netdev-linux, so we can use it.

* Is it safe to just drop the virtio header on packet receive?

* Not sure if it's fully safe to say that a checksum is valid while it's
   not calculated.  Have you checked this from the ipf and conntrack points
   of view?

Best regards, Ilya Maximets.
