Currently vsock only supports sending and receiving small packets, so
it cannot achieve high performance. As previously discussed with Jason
Wang, I revisited the mergeable rx buffer idea from vhost-net and
implemented mergeable rx buffers in vhost-vsock. This allows a big
packet to be scattered into different buffers and improves performance
significantly.
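
As a rough illustration of the scattering idea (not code from this
series), the userspace sketch below splits one packet across fixed-size
rx buffers. RX_BUF_SIZE, rx_bufs_needed() and scatter_packet() are
hypothetical names, and the packet header is ignored for simplicity.

```c
#include <stddef.h>
#include <string.h>

/* Assumed fixed rx buffer size; stands in for PAGE_SIZE on a 4K-page host. */
#define RX_BUF_SIZE 4096u

/* Number of rx buffers needed for a payload of len bytes (at least one). */
static size_t rx_bufs_needed(size_t len)
{
	if (len == 0)
		return 1;
	return (len + RX_BUF_SIZE - 1) / RX_BUF_SIZE;
}

/*
 * Scatter len bytes from src across consecutive fixed-size buffers.
 * Returns the number of buffers used, or 0 if nbufs is too small.
 */
static size_t scatter_packet(const unsigned char *src, size_t len,
			     unsigned char bufs[][RX_BUF_SIZE], size_t nbufs)
{
	size_t need = rx_bufs_needed(len);
	size_t off = 0;
	size_t i;

	if (need > nbufs)
		return 0;

	for (i = 0; i < need; i++) {
		size_t chunk = len - off;

		if (chunk > RX_BUF_SIZE)
			chunk = RX_BUF_SIZE;
		memcpy(bufs[i], src + off, chunk);
		off += chunk;
	}
	return need;
}
```

Under these assumptions a 64K payload occupies 16 buffers, while a small
packet still occupies only one, which is where the memory saving comes
from.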

This patch series mainly does three things:
- implement mergeable rx buffers
- increase the max send pkt size
- add used buffers and signal the guest in batches
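
To make the third item concrete, here is a toy userspace model (an
assumption for illustration, not the vhost API and not code from this
series) that counts how many guest notifications are raised when used
buffers are completed one at a time versus in one batch. struct toy_vq
and all toy_* helpers are hypothetical.

```c
#include <stddef.h>

/* Toy stand-in for a virtqueue: counts used entries and guest signals. */
struct toy_vq {
	size_t used;
	size_t signals;
};

/* Publish n used buffers to the (toy) used ring. */
static void toy_add_used(struct toy_vq *vq, size_t n)
{
	vq->used += n;
}

/* Notify the guest once. */
static void toy_signal(struct toy_vq *vq)
{
	vq->signals++;
}

/* Unbatched: one add-used plus one signal per buffer. */
static void complete_one_by_one(struct toy_vq *vq, size_t nbufs)
{
	size_t i;

	for (i = 0; i < nbufs; i++) {
		toy_add_used(vq, 1);
		toy_signal(vq);
	}
}

/* Batched: publish all buffers of a packet, then signal once. */
static void complete_batched(struct toy_vq *vq, size_t nbufs)
{
	if (nbufs == 0)
		return;
	toy_add_used(vq, nbufs);
	toy_signal(vq);
}
```

In this model a packet spread over 16 buffers costs 16 signals when
completed one by one and a single signal when batched.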

I also wrote a tool to test vhost-vsock performance, mainly by sending
big packets (64K) in both directions, Guest->Host and Host->Guest. I
tested each stage independently, and the results are as follows:

Performance before this series:
              Single socket            Multiple sockets(Max Bandwidth)
Guest->Host   ~400MB/s                 ~480MB/s
Host->Guest   ~1450MB/s                ~1600MB/s

Performance with only the mergeable rx buffer patches applied:
              Single socket            Multiple sockets(Max Bandwidth)
Guest->Host   ~400MB/s                 ~480MB/s
Host->Guest   ~1280MB/s                ~1350MB/s

In this case the max send pkt size is still limited to 4K, so
Host->Guest performance is worse than before.
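
The effect of the send size limit can be sketched with simple
arithmetic. pkts_needed() below is a hypothetical helper, not part of
this series; it only counts how many packets a message is split into
for a given maximum payload size.

```c
#include <stddef.h>

/*
 * Packets needed to carry msg_len bytes when each packet holds at most
 * max_pkt_len payload bytes. Returns 0 if max_pkt_len is 0.
 */
static size_t pkts_needed(size_t msg_len, size_t max_pkt_len)
{
	if (max_pkt_len == 0)
		return 0;
	return (msg_len + max_pkt_len - 1) / max_pkt_len;
}
```

With a 4K limit a 64K message is split into 16 packets; with a 64K
limit it goes out as one.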

Performance after increasing the max send pkt size to 64K:
              Single socket            Multiple sockets(Max Bandwidth)
Guest->Host   ~1700MB/s                ~2900MB/s
Host->Guest   ~1500MB/s                ~2440MB/s

Performance with all patches applied:
              Single socket            Multiple sockets(Max Bandwidth)
Guest->Host   ~1700MB/s                ~2900MB/s
Host->Guest   ~1700MB/s                ~2900MB/s

From the test results, performance improves significantly, and guest
memory is not wasted.

In addition, to support mergeable rx buffers in virtio-vsock, a qemu
patch is needed to support parsing the feature.

---
v1 -> v2:
 * Addressed comments from Jason Wang.
 * Added independent performance test results.
 * Use skb_page_frag_refill(), which can use high-order pages and reduce
   the stress on the page allocator.
 * Still use a fixed size (PAGE_SIZE) to fill rx buffers, because with
   too small a size the ring can't hold one full packet; we only have
   128 vq entries now.
 * Use an iovec to replace buf in struct virtio_vsock_pkt, to keep tx
   and rx consistent.
 * Add a virtio_transport op to get the max pkt len, in order to stay
   compatible with old versions.
---

Yiwen Jiang (5):
  VSOCK: support fill mergeable rx buffer in guest
  VSOCK: support fill data to mergeable rx buffer in host
  VSOCK: support receive mergeable rx buffer in guest
  VSOCK: increase send pkt len in mergeable mode to improve performance
  VSOCK: batch sending rx buffer to increase bandwidth

 drivers/vhost/vsock.c                   | 183 ++++++++++++++++++++-----
 include/linux/virtio_vsock.h            |  13 +-
 include/uapi/linux/virtio_vsock.h       |   5 +
 net/vmw_vsock/virtio_transport.c        | 229 +++++++++++++++++++++++++++-----
 net/vmw_vsock/virtio_transport_common.c |  66 ++++++---
 5 files changed, 411 insertions(+), 85 deletions(-)

-- 
1.8.3.1
