On Thu, May 08, 2014 at 04:34:33PM +0200, Stefan Hajnoczi wrote:
> v3:
>  * Add assertion if win32_aio_cleanup() is called without detaching [Paolo]
>  * Stash away NFSClient pointer for future use instead of AioContext pointer
>    [Peter Lieven]
>  * Use NFSClient->aio_context where possible instead of
>    bdrv_get_aio_context() [Peter Lieven]
>  * Stash away IscsiLun pointer for future use instead of AioContext pointer
>    [Peter Lieven]
>  * Use IscsiLun->aio_context where possible instead of bdrv_get_aio_context()
>  * Add Benoit's Reviewed-by: for unchanged quorum patch from v1
>
> v2:
>  * Currently unaddressed: libiscsi nop timer re-entrancy, is it a problem?
>    [Peter Lieven]
>  * Protect bdrv_*_all() with AioContext acquire/release [Christian
>    Borntraeger]
>  * Convert raw-win32 and aio-win32 [Paolo]
>  * Rebase on latest curl.c changes [Fam]
>  * Add curl_attach_aio_context() assert since it shouldn't be called twice
>    [Fam]
>  * Drop curl comment that doesn't make sense in the new code [Fam]
>  * Replace g_slice_new()+memset() with g_slice_new0() in
>    dataplane/virtio-blk.c [Fam]
>
> This patch series switches virtio-blk data-plane from a custom Linux AIO
> request queue to the QEMU block layer. The previous "raw files only"
> limitation is lifted. All image formats and protocols can now be used with
> virtio-blk data-plane.
>
> How to review this series
> -------------------------
> I CCed the maintainer of each block driver that I modified. You probably
> don't need to review the entire series, just your patch.
>
> From now on fd handlers, timers, BHs, and event loop wait must explicitly use
> BlockDriverState's AioContext instead of the main loop. Use
> bdrv_get_aio_context(bs) to get the AioContext.
> The following function calls need to be converted:
>
>  * qemu_aio_set_fd_handler() -> aio_set_fd_handler()
>  * timer_new*() -> aio_timer_new()
>  * qemu_bh_new() -> aio_bh_new()
>  * qemu_aio_wait() -> aio_poll(aio_context, true)
>
> For simple block drivers this modification suffices and it is now safe to use
> outside the QEMU global mutex.
>
> Block drivers that keep fd handlers, timers, or BHs registered when requests
> have been drained need a little bit more work. Examples of this are network
> block drivers with keepalive timers, like iSCSI.
>
> This series adds a new bdrv_set_aio_context(bs, aio_context) function that
> moves a BlockDriverState into a new AioContext. This function calls the block
> driver's optional .bdrv_detach_aio_context() and .bdrv_attach_aio_context()
> functions. Implement detach/attach to move the fd handlers, timers, or BHs to
> the new AioContext.
>
> Finally, block drivers that manage their own child nodes also need to
> implement detach/attach because the generic block layer doesn't know about
> their children. Both ->file and ->backing_hd are automatically taken care of
> but blkverify, quorum, and VMDK need to manually propagate detach/attach to
> their children.
>
> I have audited and modified all block drivers. Block driver maintainers,
> please check I did it correctly and didn't break your code.
>
> Background
> ----------
> The block layer is currently tied to the QEMU main loop for fd handlers, timer
> callbacks, and BHs. This means that even on hosts with many cores, parts of
> block I/O processing happen in one thread and depend on the QEMU global mutex.
>
> virtio-blk data-plane has shown that 1,000,000 IOPS is achievable if we use
> additional threads that are not under the QEMU global mutex.
>
> It is necessary to make the QEMU block layer aware that there may be more than
> one event loop. This way BlockDriverState can be used from a thread without
> contention on the QEMU global mutex.
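
For anyone reading this in the archive, the detach/attach description quoted
above can be sketched as a small, self-contained C model. This is only an
illustration under stated assumptions, not code from the series: AioContext,
BlockDriver and BlockDriverState below are toy stand-ins rather than the QEMU
definitions, and toy_detach(), toy_attach(), toy_set_aio_context(),
toy_demo(), the extra_child field and the num_handlers counter are
hypothetical names. The recursion order is also my assumption. The sketch
only shows that ->file and ->backing_hd are propagated generically while a
driver with its own child has to forward the calls itself.

```c
#include <assert.h>
#include <stddef.h>

/* Toy stand-ins for illustration only; these are NOT the QEMU definitions. */
typedef struct AioContext {
    const char *name;
    int num_handlers;  /* registered fd handlers, timers, or BHs */
} AioContext;

typedef struct BlockDriverState BlockDriverState;

typedef struct BlockDriver {
    /* Optional hooks, named after the ones in the cover letter. */
    void (*bdrv_detach_aio_context)(BlockDriverState *bs);
    void (*bdrv_attach_aio_context)(BlockDriverState *bs, AioContext *ctx);
} BlockDriver;

struct BlockDriverState {
    const BlockDriver *drv;
    AioContext *aio_context;
    BlockDriverState *file;         /* propagated generically */
    BlockDriverState *backing_hd;   /* propagated generically */
    BlockDriverState *extra_child;  /* hypothetical driver-managed child */
};

static void toy_detach(BlockDriverState *bs)
{
    if (!bs) {
        return;
    }
    if (bs->drv && bs->drv->bdrv_detach_aio_context) {
        bs->drv->bdrv_detach_aio_context(bs);  /* old context still set */
    }
    toy_detach(bs->file);
    toy_detach(bs->backing_hd);
    bs->aio_context = NULL;
}

static void toy_attach(BlockDriverState *bs, AioContext *ctx)
{
    if (!bs) {
        return;
    }
    bs->aio_context = ctx;
    toy_attach(bs->backing_hd, ctx);
    toy_attach(bs->file, ctx);
    if (bs->drv && bs->drv->bdrv_attach_aio_context) {
        bs->drv->bdrv_attach_aio_context(bs, ctx);
    }
}

/* Model of moving a node: detach from the old context, attach to the new. */
static void toy_set_aio_context(BlockDriverState *bs, AioContext *ctx)
{
    toy_detach(bs);
    toy_attach(bs, ctx);
}

/* Network-style driver: one keepalive timer lives in the AioContext. */
static void net_detach(BlockDriverState *bs)
{
    assert(bs->aio_context != NULL);
    bs->aio_context->num_handlers--;
}

static void net_attach(BlockDriverState *bs, AioContext *ctx)
{
    (void)bs;
    ctx->num_handlers++;
}

/* Filter-style driver: must forward detach/attach to its own child. */
static void filter_detach(BlockDriverState *bs)
{
    toy_detach(bs->extra_child);
}

static void filter_attach(BlockDriverState *bs, AioContext *ctx)
{
    toy_attach(bs->extra_child, ctx);
}

static const BlockDriver net_driver = { net_detach, net_attach };
static const BlockDriver filter_driver = { filter_detach, filter_attach };

/* Build filter -> { file, extra_child }, then move it between contexts. */
static void toy_demo(AioContext *from, AioContext *to)
{
    BlockDriverState a = { &net_driver, NULL, NULL, NULL, NULL };
    BlockDriverState b = { &net_driver, NULL, NULL, NULL, NULL };
    BlockDriverState top = { &filter_driver, NULL, &a, NULL, &b };

    toy_attach(&top, from);         /* both timers start in 'from' */
    toy_set_aio_context(&top, to);  /* both timers end up in 'to' */
}
```

Calling toy_demo() from a main() with two zero-initialised contexts should
leave no handlers counted on the first and two on the second. In this model a
filter driver without the two hooks would never touch extra_child during the
move, which is the situation the quoted text warns about for blkverify,
quorum, and VMDK.
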
>
> This series builds on the aio_context_acquire/release() interface that allows
> a thread to temporarily grab an AioContext. We add bdrv_set_aio_context(bs,
> aio_context) for changing which AioContext a BlockDriverState uses.
>
> The final patches convert virtio-blk data-plane to use the QEMU block layer
> and let the BlockDriverState run in the IOThread AioContext.
>
> What's next?
> ------------
> I have already made block I/O throttling work in another AioContext and will
> send the series out next week.
>
> In order to keep this series reviewable, I'm holding back those patches for
> now. One could say, "throttling" them.
>
> Thank you, thank you, I'll be here all night!
>
> Stefan Hajnoczi (25):
>   block: use BlockDriverState AioContext
>   block: acquire AioContext in bdrv_*_all()
>   block: acquire AioContext in bdrv_drain_all()
>   block: add bdrv_set_aio_context()
>   blkdebug: use BlockDriverState's AioContext
>   blkverify: implement .bdrv_detach/attach_aio_context()
>   curl: implement .bdrv_detach/attach_aio_context()
>   gluster: use BlockDriverState's AioContext
>   iscsi: implement .bdrv_detach/attach_aio_context()
>   nbd: implement .bdrv_detach/attach_aio_context()
>   nfs: implement .bdrv_detach/attach_aio_context()
>   qed: use BlockDriverState's AioContext
>   quorum: implement .bdrv_detach/attach_aio_context()
>   block/raw-posix: implement .bdrv_detach/attach_aio_context()
>   block/linux-aio: fix memory and fd leak
>   block/raw-win32: create one QEMUWin32AIOState per BDRVRawState
>   block/raw-win32: implement .bdrv_detach/attach_aio_context()
>   rbd: use BlockDriverState's AioContext
>   sheepdog: implement .bdrv_detach/attach_aio_context()
>   ssh: use BlockDriverState's AioContext
>   vmdk: implement .bdrv_detach/attach_aio_context()
>   dataplane: use the QEMU block layer for I/O
>   dataplane: delete IOQueue since it is no longer used
>   dataplane: implement async flush
>   raw-posix: drop raw_get_aio_fd() since it is no longer used
>
>  block.c                          | 133 +++++++++++++++++-----
>  block/blkdebug.c                 |   2 +-
>  block/blkverify.c                |  47 +++++---
>  block/curl.c                     | 192 +++++++++++++++++++-------------
>  block/gluster.c                  |   7 +-
>  block/iscsi.c                    |  80 ++++++++++----
>  block/linux-aio.c                |  24 +++-
>  block/nbd-client.c               |  24 +++-
>  block/nbd-client.h               |   4 +
>  block/nbd.c                      |  87 +++++++++------
>  block/nfs.c                      |  81 ++++++++++----
>  block/qed-table.c                |   8 +-
>  block/qed.c                      |  35 +++++-
>  block/quorum.c                   |  48 ++++++--
>  block/raw-aio.h                  |   8 ++
>  block/raw-posix.c                |  82 ++++++++------
>  block/raw-win32.c                |  54 ++++++---
>  block/rbd.c                      |   5 +-
>  block/sheepdog.c                 | 118 +++++++++++++-------
>  block/ssh.c                      |  36 +++---
>  block/vmdk.c                     |  23 ++++
>  block/win32-aio.c                |  27 ++++-
>  hw/block/dataplane/Makefile.objs |   2 +-
>  hw/block/dataplane/ioq.c         | 117 --------------------
>  hw/block/dataplane/ioq.h         |  57 ----------
>  hw/block/dataplane/virtio-blk.c  | 232 +++++++++++++++------------------------
>  include/block/block.h            |  20 ++--
>  include/block/block_int.h        |  36 ++++++
>  28 files changed, 929 insertions(+), 660 deletions(-)
>  delete mode 100644 hw/block/dataplane/ioq.c
>  delete mode 100644 hw/block/dataplane/ioq.h

Applied to my block tree:
https://github.com/stefanha/qemu/commits/block

Stefan