On Mon, Jul 25, 2022 at 11:55:26PM +0300, Andrey Zhadchenko wrote:
> Although QEMU virtio-blk is quite fast, there is still some room for
> improvement. Disk latency can be reduced if we handle virtio-blk requests
> in the host kernel, avoiding a lot of syscalls and context switches.
> 
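For context, the control path this builds on is the generic vhost one. Below
is a minimal userspace setup sketch; the /dev/vhost-blk node and the
VHOST_BLK_SET_BACKEND ioctl are assumptions modelled on vhost-net and
vhost-scsi (the series only shows a 3-line addition to linux/vhost.h). The
other VHOST_* ioctls are the existing generic ones, and a complete setup
would also issue VHOST_SET_VRING_BASE and VHOST_SET_VRING_ADDR to describe
the rings themselves.

/*
 * Sketch only: /dev/vhost-blk and VHOST_BLK_SET_BACKEND are assumptions;
 * the generic VHOST_* ioctls below exist in linux/vhost.h.  Return values
 * of the later ioctls are left unchecked for brevity.
 */
#include <fcntl.h>
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>
#include <sys/eventfd.h>
#include <sys/ioctl.h>
#include <unistd.h>
#include <linux/vhost.h>

int main(void)
{
    int vhost_fd = open("/dev/vhost-blk", O_RDWR);    /* assumed node */
    if (vhost_fd < 0) {
        perror("open /dev/vhost-blk");
        return 1;
    }

    /* Claim the vhost device for this process. */
    if (ioctl(vhost_fd, VHOST_SET_OWNER, NULL) < 0) {
        perror("VHOST_SET_OWNER");
        return 1;
    }

    /* Feature negotiation (a real VMM masks against its own feature set). */
    uint64_t features;
    ioctl(vhost_fd, VHOST_GET_FEATURES, &features);
    ioctl(vhost_fd, VHOST_SET_FEATURES, &features);

    /*
     * Tell the kernel how to translate guest physical addresses.  A real
     * VMM registers the guest RAM mappings; an anonymous buffer stands in
     * for guest memory here.
     */
    size_t mem_size = 16 << 20;
    void *guest_mem = malloc(mem_size);
    struct vhost_memory *mem = calloc(1, sizeof(*mem) +
                                         sizeof(struct vhost_memory_region));
    mem->nregions = 1;
    mem->regions[0].guest_phys_addr = 0;
    mem->regions[0].memory_size     = mem_size;
    mem->regions[0].userspace_addr  = (uint64_t)(uintptr_t)guest_mem;
    ioctl(vhost_fd, VHOST_SET_MEM_TABLE, mem);

    /*
     * Virtqueue 0: ring size plus kick (guest->host) and call (host->guest)
     * eventfds, which a real VMM wires to KVM's ioeventfd/irqfd.
     */
    struct vhost_vring_state num  = { .index = 0, .num = 256 };
    struct vhost_vring_file  kick = { .index = 0, .fd = eventfd(0, 0) };
    struct vhost_vring_file  call = { .index = 0, .fd = eventfd(0, 0) };
    ioctl(vhost_fd, VHOST_SET_VRING_NUM,  &num);
    ioctl(vhost_fd, VHOST_SET_VRING_KICK, &kick);
    ioctl(vhost_fd, VHOST_SET_VRING_CALL, &call);

    /*
     * Hypothetical backend ioctl (named by analogy with
     * VHOST_NET_SET_BACKEND / VHOST_SCSI_SET_ENDPOINT): hand the module
     * the fd of the raw image or block device to submit bios to.
     */
#ifdef VHOST_BLK_SET_BACKEND
    struct vhost_vring_file backend = {
        .index = 0,
        .fd = open("disk.img", O_RDWR),
    };
    ioctl(vhost_fd, VHOST_BLK_SET_BACKEND, &backend);
#endif

    close(vhost_fd);
    return 0;
}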
> The biggest disadvantage of this vhost-blk flavor is that it supports only
> the raw format. Luckily, Kirill Tkhai has proposed a device-mapper driver
> for the QCOW2 format that attaches files as block devices:
> https://www.spinics.net/lists/kernel/msg4292965.html
> 
> Also, by using kernel modules we can bypass the iothread limitation and
> finally scale block requests with the number of CPUs for high-performance
> devices. This is planned to be implemented in the next version.
> 
> Linux kernel module part:
> https://lore.kernel.org/kvm/20220725202753.298725-1-andrey.zhadche...@virtuozzo.com/
> 
> Test setups and results:
> fio --direct=1 --rw=randread  --bs=4k  --ioengine=libaio --iodepth=128
> QEMU drive options: cache=none
> filesystem: xfs
> 
> SSD:
>                | randread, IOPS | randwrite, IOPS |
> Host           |          95.8k |           85.3k |
> QEMU virtio    |          57.5k |           79.4k |
> QEMU vhost-blk |          95.6k |           84.3k |
> 
> RAMDISK (vq == vcpu):
>                  | randread, IOPS | randwrite, IOPS |
> virtio, 1vcpu    |           123k |            129k |
> virtio, 2vcpu    |      253k (??) |       250k (??) |
> virtio, 4vcpu    |           158k |            154k |
> vhost-blk, 1vcpu |           110k |            113k |
> vhost-blk, 2vcpu |           247k |            252k |
> vhost-blk, 4vcpu |           576k |            567k |
> 
> Andrey Zhadchenko (1):
>   block: add vhost-blk backend
> 
>  configure                     |  13 ++
>  hw/block/Kconfig              |   5 +
>  hw/block/meson.build          |   1 +
>  hw/block/vhost-blk.c          | 395 ++++++++++++++++++++++++++++++++++
>  hw/virtio/meson.build         |   1 +
>  hw/virtio/vhost-blk-pci.c     | 102 +++++++++
>  include/hw/virtio/vhost-blk.h |  44 ++++
>  linux-headers/linux/vhost.h   |   3 +
>  8 files changed, 564 insertions(+)
>  create mode 100644 hw/block/vhost-blk.c
>  create mode 100644 hw/virtio/vhost-blk-pci.c
>  create mode 100644 include/hw/virtio/vhost-blk.h

vhost-blk has been tried several times in the past. That doesn't mean it
cannot be merged this time, but past arguments should be addressed:

- What makes it necessary to move the code into the kernel? In the past
  the performance results were not very convincing. The fastest
  implementations actually tend to be userspace NVMe PCI drivers that
  bypass the kernel! Bypassing the VFS and submitting block requests
  directly was not a huge boost. The syscall/context switch argument
  sounds okay but the numbers didn't really show that kernel block I/O
  is much faster than userspace block I/O.

  I've asked for more details on the QEMU command-line to understand
  what your numbers show. Maybe something has changed since previous
  times when vhost-blk has been tried.

  The only argument I see is QEMU's current 1 IOThread per virtio-blk
  device limitation, which is being worked on. If that's the only reason
  for vhost-blk, is it worth doing all the work of getting vhost-blk
  shipped (kernel, QEMU, and libvirt changes)? It seems like a
  short-term solution.

- The security impact of bugs in kernel vhost-blk code is more serious
  than bugs in a QEMU userspace process.

- The management stack needs to be changed to use vhost-blk whereas
  QEMU can be optimized without affecting other layers.

Stefan
