Do not merge this series. The performance effects are not significant. I am
sharing this mainly to archive the patches and in case someone has ideas on
how to improve this.

Bernd Schubert mentioned io_uring_setup(2) flags that may improve performance:
- IORING_SETUP_SINGLE_ISSUER: optimization when only 1 thread uses an io_uring
  context
- IORING_SETUP_COOP_TASKRUN: avoids IPIs
- IORING_SETUP_TASKRUN_FLAG: makes COOP_TASKRUN work with userspace CQ ring
  polling

Suraj Shirvankar already started work on SINGLE_ISSUER in the past:
https://lore.kernel.org/qemu-devel/174293621917.22751.1138131986510202996...@git.sr.ht/

Where this differs from Suraj's previous work is that I have worked around the
need for the main loop AioContext to be shared by multiple threads (vCPU
threads and the migration thread).

Here are the performance numbers for fio bs=4k in a 4 vCPU guest with 1
IOThread using a virtio-blk disk backed by a local NVMe drive:

                       IOPS               IOPS
Benchmark              SINGLE_ISSUER      SINGLE_ISSUER|COOP_TASKRUN|TASKRUN_FLAG
randread  iodepth=1     54,045 (+1.2%)     54,189 (+1.5%)
randread  iodepth=64   318,135 (+0.1%)    315,632 (-0.68%)
randwrite iodepth=1    141,918 (-0.44%)   143,337 (+0.55%)
randwrite iodepth=64   323,948 (-0.015%)  322,755 (-0.38%)

You can find detailed benchmarking results here including the fio output, fio
command-line, and guest libvirt domain XML:
https://gitlab.com/stefanha/virt-playbooks/-/tree/io_uring-flags/notebook/fio-output
https://gitlab.com/stefanha/virt-playbooks/-/blob/io_uring-flags/files/fio.sh
https://gitlab.com/stefanha/virt-playbooks/-/blob/io_uring-flags/files/test.xml.j2

Stefan Hajnoczi (3):
  iothread: create AioContext in iothread_run()
  aio-posix: enable IORING_SETUP_SINGLE_ISSUER
  aio-posix: enable IORING_SETUP_COOP_TASKRUN | IORING_SETUP_TASKRUN_FLAG

 include/system/iothread.h |   1 -
 iothread.c                | 140 +++++++++++++++++++++-----------------
 util/fdmon-io_uring.c     |  26 ++++++-
 3 files changed, 101 insertions(+), 66 deletions(-)

-- 
2.50.1
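
For anyone who wants to experiment with the flags listed above: kernels that
predate a flag reject it in io_uring_setup(2) with -EINVAL, so one possible
approach is to try the most optimized combination first and fall back to fewer
flags. The following is only a rough sketch of that idea and is not taken from
the patches in this series. ring_setup_fn, ring_setup_with_fallback() and
fake_setup() are hypothetical names, and the flag values are copied from
<linux/io_uring.h> so that the sketch builds without recent kernel headers.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Values as defined in <linux/io_uring.h>; repeated here (assumption: the
 * build host may lack recent headers) so the sketch is self-contained. */
#ifndef IORING_SETUP_COOP_TASKRUN
#define IORING_SETUP_COOP_TASKRUN  (1U << 8)
#endif
#ifndef IORING_SETUP_TASKRUN_FLAG
#define IORING_SETUP_TASKRUN_FLAG  (1U << 9)
#endif
#ifndef IORING_SETUP_SINGLE_ISSUER
#define IORING_SETUP_SINGLE_ISSUER (1U << 12)
#endif

/* Hypothetical callback type: returns 0 on success or a negative errno.
 * A real caller would wrap its ring creation call in such a function. */
typedef int (*ring_setup_fn)(unsigned entries, unsigned flags, void *opaque);

/*
 * Try flag combinations from most to least optimized. Only -EINVAL triggers
 * a retry; any other error is returned unchanged. On success the flags that
 * were accepted are stored in *flags_out.
 */
static int ring_setup_with_fallback(ring_setup_fn setup, unsigned entries,
                                    void *opaque, unsigned *flags_out)
{
    static const unsigned candidates[] = {
        IORING_SETUP_SINGLE_ISSUER | IORING_SETUP_COOP_TASKRUN |
            IORING_SETUP_TASKRUN_FLAG,
        /* TASKRUN_FLAG is only valid together with COOP_TASKRUN here */
        IORING_SETUP_COOP_TASKRUN | IORING_SETUP_TASKRUN_FLAG,
        0,
    };
    int ret = -EINVAL;

    assert(setup != NULL && flags_out != NULL);

    for (size_t i = 0; i < sizeof(candidates) / sizeof(candidates[0]); i++) {
        ret = setup(entries, candidates[i], opaque);
        if (ret == 0) {
            *flags_out = candidates[i];
            return 0;
        }
        if (ret != -EINVAL) {
            return ret; /* unrelated failure, do not retry */
        }
    }
    return ret;
}

/* Stand-in for the kernel, used only to exercise the fallback: opaque points
 * to a mask of "supported" flags and anything else is rejected. */
static int fake_setup(unsigned entries, unsigned flags, void *opaque)
{
    unsigned supported = *(unsigned *)opaque;

    (void)entries;
    return (flags & ~supported) ? -EINVAL : 0;
}
```

With liburing, the callback could be a thin wrapper around
io_uring_queue_init(entries, ring, flags), which also returns 0 or a negative
errno. Whether falling back silently is desirable is a separate question,
since SINGLE_ISSUER changes which thread is allowed to submit.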