Re: [PATCH 05/12] block: remove AioContext locking
On Thu, Nov 30, 2023 at 03:31:37PM -0600, Eric Blake wrote: > On Wed, Nov 29, 2023 at 02:55:46PM -0500, Stefan Hajnoczi wrote: > > This is the big patch that removes > > aio_context_acquire()/aio_context_release() from the block layer and > > affected block layer users. > > > > There isn't a clean way to split this patch and the reviewers are likely > > the same group of people, so I decided to do it in one patch. > > > > Signed-off-by: Stefan Hajnoczi > > --- > > > +++ b/block.c > > @@ -7585,29 +7433,12 @@ void coroutine_fn bdrv_co_leave(BlockDriverState > > *bs, AioContext *old_ctx) > > > > void coroutine_fn bdrv_co_lock(BlockDriverState *bs) > > { > > -AioContext *ctx = bdrv_get_aio_context(bs); > > - > > -/* In the main thread, bs->aio_context won't change concurrently */ > > -assert(qemu_get_current_aio_context() == qemu_get_aio_context()); > > - > > -/* > > - * We're in coroutine context, so we already hold the lock of the main > > - * loop AioContext. Don't lock it twice to avoid deadlocks. > > - */ > > -assert(qemu_in_coroutine()); > > Is this assertion worth keeping in the short term?... Probably not because coroutine vs non-coroutine functions don't change in this patch series, so it's unlikely that this will break. > > > -if (ctx != qemu_get_aio_context()) { > > -aio_context_acquire(ctx); > > -} > > +/* TODO removed in next patch */ > > } > > ...I guess I'll see in the next patch. > > > > > void coroutine_fn bdrv_co_unlock(BlockDriverState *bs) > > { > > -AioContext *ctx = bdrv_get_aio_context(bs); > > - > > -assert(qemu_in_coroutine()); > > -if (ctx != qemu_get_aio_context()) { > > -aio_context_release(ctx); > > -} > > +/* TODO removed in next patch */ > > } > > Same comment. > > > +++ b/blockdev.c > > @@ -1395,7 +1352,6 @@ static void > > external_snapshot_action(TransactionAction *action, > > /* File name of the new image (for 'blockdev-snapshot-sync') */ > > const char *new_image_file; > > ExternalSnapshotState *state = g_new0(ExternalSnapshotState, 1); > > -AioContext *aio_context; > > uint64_t perm, shared; > > > > /* TODO We'll eventually have to take a writer lock in this function */ > > I'm guessing removal of the locking gets us one step closer to > implementing this TODO at a later time? Or is it now a stale comment? > Either way, it doesn't affect this patch. I'm not sure. Kevin can answer questions about the graph lock. > > +++ b/tests/unit/test-blockjob.c > > > -static void test_complete_in_standby(void) > > -{ > > > @@ -531,13 +402,5 @@ int main(int argc, char **argv) > > g_test_add_func("/blockjob/cancel/standby", test_cancel_standby); > > g_test_add_func("/blockjob/cancel/pending", test_cancel_pending); > > g_test_add_func("/blockjob/cancel/concluded", test_cancel_concluded); > > - > > -/* > > - * This test is flaky and sometimes fails in CI and otherwise: > > - * don't run unless user opts in via environment variable. > > - */ > > -if (getenv("QEMU_TEST_FLAKY_TESTS")) { > > -g_test_add_func("/blockjob/complete_in_standby", > > test_complete_in_standby); > > -} > > Looks like you ripped out this entire test, because it is no longer > viable. I might have mentioned it in the commit message, or squashed > the removal of this test into the earlier 02/12 patch. I have sent a separate patch to remove this test and once it's merged this hunk will disappear this patch series: https://lore.kernel.org/qemu-devel/20231127170210.422728-1-stefa...@redhat.com/ Stefan signature.asc Description: PGP signature
Re: [PATCH 05/12] block: remove AioContext locking
On Mon, Dec 04, 2023 at 03:33:57PM +0100, Kevin Wolf wrote: > Am 29.11.2023 um 20:55 hat Stefan Hajnoczi geschrieben: > > This is the big patch that removes > > aio_context_acquire()/aio_context_release() from the block layer and > > affected block layer users. > > > > There isn't a clean way to split this patch and the reviewers are likely > > the same group of people, so I decided to do it in one patch. > > > > Signed-off-by: Stefan Hajnoczi > > > @@ -7585,29 +7433,12 @@ void coroutine_fn bdrv_co_leave(BlockDriverState > > *bs, AioContext *old_ctx) > > > > void coroutine_fn bdrv_co_lock(BlockDriverState *bs) > > { > > -AioContext *ctx = bdrv_get_aio_context(bs); > > - > > -/* In the main thread, bs->aio_context won't change concurrently */ > > -assert(qemu_get_current_aio_context() == qemu_get_aio_context()); > > - > > -/* > > - * We're in coroutine context, so we already hold the lock of the main > > - * loop AioContext. Don't lock it twice to avoid deadlocks. > > - */ > > -assert(qemu_in_coroutine()); > > -if (ctx != qemu_get_aio_context()) { > > -aio_context_acquire(ctx); > > -} > > +/* TODO removed in next patch */ > > } > > It's still there at the end of the series. Will fix in v2. Thanks! signature.asc Description: PGP signature
Re: [PATCH 05/12] block: remove AioContext locking
Am 29.11.2023 um 20:55 hat Stefan Hajnoczi geschrieben: > This is the big patch that removes > aio_context_acquire()/aio_context_release() from the block layer and > affected block layer users. > > There isn't a clean way to split this patch and the reviewers are likely > the same group of people, so I decided to do it in one patch. > > Signed-off-by: Stefan Hajnoczi > @@ -7585,29 +7433,12 @@ void coroutine_fn bdrv_co_leave(BlockDriverState *bs, > AioContext *old_ctx) > > void coroutine_fn bdrv_co_lock(BlockDriverState *bs) > { > -AioContext *ctx = bdrv_get_aio_context(bs); > - > -/* In the main thread, bs->aio_context won't change concurrently */ > -assert(qemu_get_current_aio_context() == qemu_get_aio_context()); > - > -/* > - * We're in coroutine context, so we already hold the lock of the main > - * loop AioContext. Don't lock it twice to avoid deadlocks. > - */ > -assert(qemu_in_coroutine()); > -if (ctx != qemu_get_aio_context()) { > -aio_context_acquire(ctx); > -} > +/* TODO removed in next patch */ > } It's still there at the end of the series. > void coroutine_fn bdrv_co_unlock(BlockDriverState *bs) > { > -AioContext *ctx = bdrv_get_aio_context(bs); > - > -assert(qemu_in_coroutine()); > -if (ctx != qemu_get_aio_context()) { > -aio_context_release(ctx); > -} > +/* TODO removed in next patch */ > } This one, too. Reviewed-by: Kevin Wolf
Re: [PATCH 05/12] block: remove AioContext locking
On Wed, Nov 29, 2023 at 02:55:46PM -0500, Stefan Hajnoczi wrote: > This is the big patch that removes > aio_context_acquire()/aio_context_release() from the block layer and > affected block layer users. > > There isn't a clean way to split this patch and the reviewers are likely > the same group of people, so I decided to do it in one patch. > > Signed-off-by: Stefan Hajnoczi > --- > +++ b/block.c > @@ -7585,29 +7433,12 @@ void coroutine_fn bdrv_co_leave(BlockDriverState *bs, > AioContext *old_ctx) > > void coroutine_fn bdrv_co_lock(BlockDriverState *bs) > { > -AioContext *ctx = bdrv_get_aio_context(bs); > - > -/* In the main thread, bs->aio_context won't change concurrently */ > -assert(qemu_get_current_aio_context() == qemu_get_aio_context()); > - > -/* > - * We're in coroutine context, so we already hold the lock of the main > - * loop AioContext. Don't lock it twice to avoid deadlocks. > - */ > -assert(qemu_in_coroutine()); Is this assertion worth keeping in the short term?... > -if (ctx != qemu_get_aio_context()) { > -aio_context_acquire(ctx); > -} > +/* TODO removed in next patch */ > } ...I guess I'll see in the next patch. > > void coroutine_fn bdrv_co_unlock(BlockDriverState *bs) > { > -AioContext *ctx = bdrv_get_aio_context(bs); > - > -assert(qemu_in_coroutine()); > -if (ctx != qemu_get_aio_context()) { > -aio_context_release(ctx); > -} > +/* TODO removed in next patch */ > } Same comment. > +++ b/blockdev.c > @@ -1395,7 +1352,6 @@ static void external_snapshot_action(TransactionAction > *action, > /* File name of the new image (for 'blockdev-snapshot-sync') */ > const char *new_image_file; > ExternalSnapshotState *state = g_new0(ExternalSnapshotState, 1); > -AioContext *aio_context; > uint64_t perm, shared; > > /* TODO We'll eventually have to take a writer lock in this function */ I'm guessing removal of the locking gets us one step closer to implementing this TODO at a later time? Or is it now a stale comment? Either way, it doesn't affect this patch. > +++ b/migration/block.c > @@ -270,7 +270,6 @@ static int mig_save_device_bulk(QEMUFile *f, > BlkMigDevState *bmds) > > if (bmds->shared_base) { > qemu_mutex_lock_iothread(); > -aio_context_acquire(blk_get_aio_context(bb)); ... > @@ -313,22 +311,10 @@ static int mig_save_device_bulk(QEMUFile *f, > BlkMigDevState *bmds) > block_mig_state.submitted++; > blk_mig_unlock(); > > -/* We do not know if bs is under the main thread (and thus does > - * not acquire the AioContext when doing AIO) or rather under > - * dataplane. Thus acquire both the iothread mutex and the > - * AioContext. > - * > - * This is ugly and will disappear when we make bdrv_* thread-safe, > - * without the need to acquire the AioContext. > - */ > -qemu_mutex_lock_iothread(); > -aio_context_acquire(blk_get_aio_context(bmds->blk)); Will conflict, but with trivial resolution, with your other thread renaming things to qemu_bql_lock(). > +++ b/tests/unit/test-blockjob.c > -static void test_complete_in_standby(void) > -{ > @@ -531,13 +402,5 @@ int main(int argc, char **argv) > g_test_add_func("/blockjob/cancel/standby", test_cancel_standby); > g_test_add_func("/blockjob/cancel/pending", test_cancel_pending); > g_test_add_func("/blockjob/cancel/concluded", test_cancel_concluded); > - > -/* > - * This test is flaky and sometimes fails in CI and otherwise: > - * don't run unless user opts in via environment variable. > - */ > -if (getenv("QEMU_TEST_FLAKY_TESTS")) { > -g_test_add_func("/blockjob/complete_in_standby", > test_complete_in_standby); > -} Looks like you ripped out this entire test, because it is no longer viable. I might have mentioned it in the commit message, or squashed the removal of this test into the earlier 02/12 patch. Reviewed-by: Eric Blake -- Eric Blake, Principal Software Engineer Red Hat, Inc. Virtualization: qemu.org | libguestfs.org
Re: [PATCH 05/12] block: remove AioContext locking
On 29/11/2023 19:55, Stefan Hajnoczi wrote: This is the big patch that removes aio_context_acquire()/aio_context_release() from the block layer and affected block layer users. There isn't a clean way to split this patch and the reviewers are likely the same group of people, so I decided to do it in one patch. Signed-off-by: Stefan Hajnoczi --- include/block/block-global-state.h | 9 +- include/block/block-io.h | 3 +- include/block/snapshot.h | 2 - block.c| 234 +- block/block-backend.c | 14 -- block/copy-before-write.c | 22 +-- block/export/export.c | 22 +-- block/io.c | 45 + block/mirror.c | 19 -- block/monitor/bitmap-qmp-cmds.c| 20 +- block/monitor/block-hmp-cmds.c | 29 --- block/qapi-sysemu.c| 27 +-- block/qapi.c | 18 +- block/raw-format.c | 5 - block/replication.c| 58 +- block/snapshot.c | 22 +-- block/write-threshold.c| 6 - blockdev.c | 307 + blockjob.c | 18 -- hw/block/dataplane/virtio-blk.c| 10 - hw/block/dataplane/xen-block.c | 17 +- hw/block/virtio-blk.c | 45 + hw/core/qdev-properties-system.c | 9 - job.c | 16 -- migration/block.c | 33 +--- migration/migration-hmp-cmds.c | 3 - migration/savevm.c | 22 --- net/colo-compare.c | 2 - qemu-img.c | 4 - qemu-io.c | 10 +- qemu-nbd.c | 2 - replay/replay-debugging.c | 4 - tests/unit/test-bdrv-drain.c | 51 + tests/unit/test-bdrv-graph-mod.c | 6 - tests/unit/test-block-iothread.c | 31 --- tests/unit/test-blockjob.c | 137 - tests/unit/test-replication.c | 11 -- util/async.c | 4 - util/vhost-user-server.c | 3 - scripts/block-coroutine-wrapper.py | 3 - tests/tsan/suppressions.tsan | 1 - 41 files changed, 102 insertions(+), 1202 deletions(-) Reviewed-by: Paul Durrant