v3:
- Add assertions documenting that ADD and REMOVE flags cannot be present
  together with DELETE_AIO_HANDLER [Kevin]

v2:
- Performance improvements
- Fix pre_sqe -> prep_sqe typo [Eric]
- Add #endif terminator comment [Eric]
- Fix spacing in aio_ctx_finalize() argument list [Eric]
- Add new "block/io_uring: use non-vectored read/write when possible"
  patch [Eric]
- Drop Patch 1 because multi-shot POLL_ADD has edge-triggered semantics
  instead of the level-triggered semantics required by QEMU's AioContext
  APIs. The qemu-iotests 308 test case was hanging because
  block/export/fuse.c relies on level-triggered semantics. Luckily the
  performance reason for switching from one-shot to multi-shot has been
  solved by Patch 2 ("aio-posix: keep polling enabled with
  fdmon-io_uring.c"), so it's okay to use single-shot.
- Add a new Patch 1. It's a bug fix for a use-after-free in
  fdmon-io_uring.c triggered by qemu-iotests iothreads-nbd-export.

This patch series contains io_uring improvements:

1. Support the glib event loop in fdmon-io_uring.
   - aio-posix: fix race between io_uring CQE and AioHandler deletion
   - aio-posix: keep polling enabled with fdmon-io_uring.c
   - tests/unit: skip test-nested-aio-poll with io_uring
   - aio-posix: integrate fdmon into glib event loop

2. Enable fdmon-io_uring on hosts where io_uring is available at runtime.
   Otherwise continue using ppoll(2) or epoll(7).
   - aio: remove aio_context_use_g_source()

3. Add the new aio_add_sqe() API for submitting io_uring requests in the
   QEMU event loop.
   - aio: free AioContext when aio_context_new() fails
   - aio: add errp argument to aio_context_setup()
   - aio-posix: gracefully handle io_uring_queue_init() failure
   - aio-posix: add aio_add_sqe() API for user-defined io_uring requests
   - aio-posix: avoid EventNotifier for cqe_handler_bh

4. Use aio_add_sqe() in block/io_uring.c instead of creating a dedicated
   io_uring context for --blockdev aio=io_uring. This simplifies the
   code, reduces the number of file descriptors, and demonstrates the
   aio_add_sqe() API.
   - block/io_uring: use aio_add_sqe()
   - block/io_uring: use non-vectored read/write when possible

The highlight is aio_add_sqe(), which is needed for the FUSE-over-io_uring
Google Summer of Code project and other future QEMU features that natively
use Linux io_uring functionality.

rw         bs  iodepth  aio       iothread  before   after    diff
randread   4k        1  native           0   78353   84860   +8.3%
randread   4k       64  native           0  262370  269823   +2.8%
randwrite  4k        1  native           0  142703  144348   +1.2%
randwrite  4k       64  native           0  259947  263895   +1.5%
randread   4k        1  io_uring         0   76883   78270   +1.8%
randread   4k       64  io_uring         0  269712  250513   -7.1%
randwrite  4k        1  io_uring         0  143657  131481   -8.5%
randwrite  4k       64  io_uring         0  274461  264785   -3.5%
randread   4k        1  native           1   84080   84097    0.0%
randread   4k       64  native           1  314650  311193   -1.1%
randwrite  4k        1  native           1  172463  159993   -7.2%
randwrite  4k       64  native           1  303091  299726   -1.1%
randread   4k        1  io_uring         1   83415   84081   +0.8%
randread   4k       64  io_uring         1  324797  318429   -2.0%
randwrite  4k        1  io_uring         1  174421  172809   -0.9%
randwrite  4k       64  io_uring         1  323394  312286   -3.4%

Performance is in the same ballpark as without fdmon-io_uring. Results
vary from run to run due to the timing/batching of requests (even with
iodepth=1 due to 8 vCPUs using a single IOThread).
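
For readers checking the numbers: the diff column appears to be the
relative change of "after" versus "before", i.e. (after - before) / before,
rounded to one decimal place. That reading is an assumption, but it matches
the rows above. A minimal sketch follows; the helper names
iops_diff_percent() and format_iops_diff() are hypothetical and not part of
this series.

```c
#include <assert.h>
#include <stdio.h>

/*
 * Relative change of `after` versus `before`, in percent.
 * Assumes the table's diff column is (after - before) / before.
 */
static double iops_diff_percent(double before, double after)
{
    assert(before > 0.0); /* a zero baseline has no meaningful ratio */
    return (after - before) / before * 100.0;
}

/*
 * Format the change with an explicit sign and one decimal place,
 * e.g. "+8.3%". Returns the snprintf() result (characters needed,
 * excluding the terminating NUL).
 */
static int format_iops_diff(char *buf, size_t len,
                            double before, double after)
{
    return snprintf(buf, len, "%+.1f%%", iops_diff_percent(before, after));
}
```

For the first row, format_iops_diff(buf, sizeof(buf), 78353, 84860) gives
"+8.3%". The format string always prints a sign, so a near-zero change
comes out as "+0.0%" rather than the bare "0.0%" shown in the table.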

Here is the performance from v1 for reference:

rw         bs  iodepth  aio       iothread  before   after    diff
randread   4k        1  native           0   76281   79707   +4.5%
randread   4k       64  native           0  255078  247293   -3.1%
randwrite  4k        1  native           0  132706  123337   -7.1%
randwrite  4k       64  native           0  275589  245192    -11%
randread   4k        1  io_uring         0   75284   78023   +3.5%
randread   4k       64  io_uring         0  254637  248222   -2.5%
randwrite  4k        1  io_uring         0  126519  128641   +1.7%
randwrite  4k       64  io_uring         0  258967  249266   -3.7%
randread   4k        1  native           1   90557   88436   -2.3%
randread   4k       64  native           1  290673  280456   -3.5%
randwrite  4k        1  native           1  183015  169106   -7.6%
randwrite  4k       64  native           1  281316  280078   -0.4%
randread   4k        1  io_uring         1   92479   86983   -5.9%
randread   4k       64  io_uring         1  304229  257730  -15.3%
randwrite  4k        1  io_uring         1  183983  157425  -14.4%
randwrite  4k       64  io_uring         1  299979  264156  -11.9%

This series replaces the following older series that were held off from
merging until the QEMU 10.1 development window opened and the performance
results were collected:
- "[PATCH 0/3] [RESEND] block: unify block and fdmon io_uring"
- "[PATCH 0/4] aio-posix: integrate fdmon into glib event loop"

Stefan Hajnoczi (12):
  aio-posix: fix race between io_uring CQE and AioHandler deletion
  aio-posix: keep polling enabled with fdmon-io_uring.c
  tests/unit: skip test-nested-aio-poll with io_uring
  aio-posix: integrate fdmon into glib event loop
  aio: remove aio_context_use_g_source()
  aio: free AioContext when aio_context_new() fails
  aio: add errp argument to aio_context_setup()
  aio-posix: gracefully handle io_uring_queue_init() failure
  aio-posix: add aio_add_sqe() API for user-defined io_uring requests
  aio-posix: avoid EventNotifier for cqe_handler_bh
  block/io_uring: use aio_add_sqe()
  block/io_uring: use non-vectored read/write when possible

 include/block/aio.h               | 136 +++++++-
 include/block/raw-aio.h           |   5 -
 util/aio-posix.h                  |  18 +-
 block/file-posix.c                |  40 +--
 block/io_uring.c                  | 508 ++++++++----------------------
 stubs/io_uring.c                  |  32 --
 tests/unit/test-aio.c             |   7 +-
 tests/unit/test-nested-aio-poll.c |  13 +-
 util/aio-posix.c                  | 143 +++++----
 util/aio-win32.c                  |   6 +-
 util/async.c                      |  55 +---
 util/fdmon-epoll.c                |  52 ++-
 util/fdmon-io_uring.c             | 226 ++++++++++---
 util/fdmon-poll.c                 |  88 +++++-
 block/trace-events                |  12 +-
 stubs/meson.build                 |   3 -
 util/trace-events                 |   4 +
 17 files changed, 710 insertions(+), 638 deletions(-)
 delete mode 100644 stubs/io_uring.c

-- 
2.50.1