Re: [PATCH 05/12] block: remove AioContext locking

2023-12-04 Thread Stefan Hajnoczi
On Thu, Nov 30, 2023 at 03:31:37PM -0600, Eric Blake wrote:
> On Wed, Nov 29, 2023 at 02:55:46PM -0500, Stefan Hajnoczi wrote:
> > This is the big patch that removes
> > aio_context_acquire()/aio_context_release() from the block layer and
> > affected block layer users.
> > 
> > There isn't a clean way to split this patch and the reviewers are likely
> > the same group of people, so I decided to do it in one patch.
> > 
> > Signed-off-by: Stefan Hajnoczi 
> > ---
> 
> > +++ b/block.c
> > @@ -7585,29 +7433,12 @@ void coroutine_fn bdrv_co_leave(BlockDriverState 
> > *bs, AioContext *old_ctx)
> >  
> >  void coroutine_fn bdrv_co_lock(BlockDriverState *bs)
> >  {
> > -AioContext *ctx = bdrv_get_aio_context(bs);
> > -
> > -/* In the main thread, bs->aio_context won't change concurrently */
> > -assert(qemu_get_current_aio_context() == qemu_get_aio_context());
> > -
> > -/*
> > - * We're in coroutine context, so we already hold the lock of the main
> > - * loop AioContext. Don't lock it twice to avoid deadlocks.
> > - */
> > -assert(qemu_in_coroutine());
> 
> Is this assertion worth keeping in the short term?...

Probably not because coroutine vs non-coroutine functions don't change
in this patch series, so it's unlikely that this will break.

> 
> > -if (ctx != qemu_get_aio_context()) {
> > -aio_context_acquire(ctx);
> > -}
> > +/* TODO removed in next patch */
> >  }
> 
> ...I guess I'll see in the next patch.
> 
> >  
> >  void coroutine_fn bdrv_co_unlock(BlockDriverState *bs)
> >  {
> > -AioContext *ctx = bdrv_get_aio_context(bs);
> > -
> > -assert(qemu_in_coroutine());
> > -if (ctx != qemu_get_aio_context()) {
> > -aio_context_release(ctx);
> > -}
> > +/* TODO removed in next patch */
> >  }
> 
> Same comment.
> 
> > +++ b/blockdev.c
> > @@ -1395,7 +1352,6 @@ static void 
> > external_snapshot_action(TransactionAction *action,
> >  /* File name of the new image (for 'blockdev-snapshot-sync') */
> >  const char *new_image_file;
> >  ExternalSnapshotState *state = g_new0(ExternalSnapshotState, 1);
> > -AioContext *aio_context;
> >  uint64_t perm, shared;
> >  
> >  /* TODO We'll eventually have to take a writer lock in this function */
> 
> I'm guessing removal of the locking gets us one step closer to
> implementing this TODO at a later time?  Or is it now a stale comment?
> Either way, it doesn't affect this patch.

I'm not sure. Kevin can answer questions about the graph lock.

> > +++ b/tests/unit/test-blockjob.c
> 
> > -static void test_complete_in_standby(void)
> > -{
> 
> > @@ -531,13 +402,5 @@ int main(int argc, char **argv)
> >  g_test_add_func("/blockjob/cancel/standby", test_cancel_standby);
> >  g_test_add_func("/blockjob/cancel/pending", test_cancel_pending);
> >  g_test_add_func("/blockjob/cancel/concluded", test_cancel_concluded);
> > -
> > -/*
> > - * This test is flaky and sometimes fails in CI and otherwise:
> > - * don't run unless user opts in via environment variable.
> > - */
> > -if (getenv("QEMU_TEST_FLAKY_TESTS")) {
> > -g_test_add_func("/blockjob/complete_in_standby", 
> > test_complete_in_standby);
> > -}
> 
> Looks like you ripped out this entire test, because it is no longer
> viable.  I might have mentioned it in the commit message, or squashed
> the removal of this test into the earlier 02/12 patch.

I have sent a separate patch to remove this test and once it's merged
this hunk will disappear this patch series:
https://lore.kernel.org/qemu-devel/20231127170210.422728-1-stefa...@redhat.com/

Stefan


signature.asc
Description: PGP signature


Re: [PATCH 05/12] block: remove AioContext locking

2023-12-04 Thread Stefan Hajnoczi
On Mon, Dec 04, 2023 at 03:33:57PM +0100, Kevin Wolf wrote:
> Am 29.11.2023 um 20:55 hat Stefan Hajnoczi geschrieben:
> > This is the big patch that removes
> > aio_context_acquire()/aio_context_release() from the block layer and
> > affected block layer users.
> > 
> > There isn't a clean way to split this patch and the reviewers are likely
> > the same group of people, so I decided to do it in one patch.
> > 
> > Signed-off-by: Stefan Hajnoczi 
> 
> > @@ -7585,29 +7433,12 @@ void coroutine_fn bdrv_co_leave(BlockDriverState 
> > *bs, AioContext *old_ctx)
> >  
> >  void coroutine_fn bdrv_co_lock(BlockDriverState *bs)
> >  {
> > -AioContext *ctx = bdrv_get_aio_context(bs);
> > -
> > -/* In the main thread, bs->aio_context won't change concurrently */
> > -assert(qemu_get_current_aio_context() == qemu_get_aio_context());
> > -
> > -/*
> > - * We're in coroutine context, so we already hold the lock of the main
> > - * loop AioContext. Don't lock it twice to avoid deadlocks.
> > - */
> > -assert(qemu_in_coroutine());
> > -if (ctx != qemu_get_aio_context()) {
> > -aio_context_acquire(ctx);
> > -}
> > +/* TODO removed in next patch */
> >  }
> 
> It's still there at the end of the series.

Will fix in v2. Thanks!


signature.asc
Description: PGP signature


Re: [PATCH 05/12] block: remove AioContext locking

2023-12-04 Thread Kevin Wolf
Am 29.11.2023 um 20:55 hat Stefan Hajnoczi geschrieben:
> This is the big patch that removes
> aio_context_acquire()/aio_context_release() from the block layer and
> affected block layer users.
> 
> There isn't a clean way to split this patch and the reviewers are likely
> the same group of people, so I decided to do it in one patch.
> 
> Signed-off-by: Stefan Hajnoczi 

> @@ -7585,29 +7433,12 @@ void coroutine_fn bdrv_co_leave(BlockDriverState *bs, 
> AioContext *old_ctx)
>  
>  void coroutine_fn bdrv_co_lock(BlockDriverState *bs)
>  {
> -AioContext *ctx = bdrv_get_aio_context(bs);
> -
> -/* In the main thread, bs->aio_context won't change concurrently */
> -assert(qemu_get_current_aio_context() == qemu_get_aio_context());
> -
> -/*
> - * We're in coroutine context, so we already hold the lock of the main
> - * loop AioContext. Don't lock it twice to avoid deadlocks.
> - */
> -assert(qemu_in_coroutine());
> -if (ctx != qemu_get_aio_context()) {
> -aio_context_acquire(ctx);
> -}
> +/* TODO removed in next patch */
>  }

It's still there at the end of the series.

>  void coroutine_fn bdrv_co_unlock(BlockDriverState *bs)
>  {
> -AioContext *ctx = bdrv_get_aio_context(bs);
> -
> -assert(qemu_in_coroutine());
> -if (ctx != qemu_get_aio_context()) {
> -aio_context_release(ctx);
> -}
> +/* TODO removed in next patch */
>  }

This one, too.

Reviewed-by: Kevin Wolf 




Re: [PATCH 05/12] block: remove AioContext locking

2023-11-30 Thread Eric Blake
On Wed, Nov 29, 2023 at 02:55:46PM -0500, Stefan Hajnoczi wrote:
> This is the big patch that removes
> aio_context_acquire()/aio_context_release() from the block layer and
> affected block layer users.
> 
> There isn't a clean way to split this patch and the reviewers are likely
> the same group of people, so I decided to do it in one patch.
> 
> Signed-off-by: Stefan Hajnoczi 
> ---

> +++ b/block.c
> @@ -7585,29 +7433,12 @@ void coroutine_fn bdrv_co_leave(BlockDriverState *bs, 
> AioContext *old_ctx)
>  
>  void coroutine_fn bdrv_co_lock(BlockDriverState *bs)
>  {
> -AioContext *ctx = bdrv_get_aio_context(bs);
> -
> -/* In the main thread, bs->aio_context won't change concurrently */
> -assert(qemu_get_current_aio_context() == qemu_get_aio_context());
> -
> -/*
> - * We're in coroutine context, so we already hold the lock of the main
> - * loop AioContext. Don't lock it twice to avoid deadlocks.
> - */
> -assert(qemu_in_coroutine());

Is this assertion worth keeping in the short term?...

> -if (ctx != qemu_get_aio_context()) {
> -aio_context_acquire(ctx);
> -}
> +/* TODO removed in next patch */
>  }

...I guess I'll see in the next patch.

>  
>  void coroutine_fn bdrv_co_unlock(BlockDriverState *bs)
>  {
> -AioContext *ctx = bdrv_get_aio_context(bs);
> -
> -assert(qemu_in_coroutine());
> -if (ctx != qemu_get_aio_context()) {
> -aio_context_release(ctx);
> -}
> +/* TODO removed in next patch */
>  }

Same comment.

> +++ b/blockdev.c
> @@ -1395,7 +1352,6 @@ static void external_snapshot_action(TransactionAction 
> *action,
>  /* File name of the new image (for 'blockdev-snapshot-sync') */
>  const char *new_image_file;
>  ExternalSnapshotState *state = g_new0(ExternalSnapshotState, 1);
> -AioContext *aio_context;
>  uint64_t perm, shared;
>  
>  /* TODO We'll eventually have to take a writer lock in this function */

I'm guessing removal of the locking gets us one step closer to
implementing this TODO at a later time?  Or is it now a stale comment?
Either way, it doesn't affect this patch.

> +++ b/migration/block.c
> @@ -270,7 +270,6 @@ static int mig_save_device_bulk(QEMUFile *f, 
> BlkMigDevState *bmds)
>  
>  if (bmds->shared_base) {
>  qemu_mutex_lock_iothread();
> -aio_context_acquire(blk_get_aio_context(bb));
...
> @@ -313,22 +311,10 @@ static int mig_save_device_bulk(QEMUFile *f, 
> BlkMigDevState *bmds)
>  block_mig_state.submitted++;
>  blk_mig_unlock();
>  
> -/* We do not know if bs is under the main thread (and thus does
> - * not acquire the AioContext when doing AIO) or rather under
> - * dataplane.  Thus acquire both the iothread mutex and the
> - * AioContext.
> - *
> - * This is ugly and will disappear when we make bdrv_* thread-safe,
> - * without the need to acquire the AioContext.
> - */
> -qemu_mutex_lock_iothread();
> -aio_context_acquire(blk_get_aio_context(bmds->blk));

Will conflict, but with trivial resolution, with your other thread
renaming things to qemu_bql_lock().


> +++ b/tests/unit/test-blockjob.c

> -static void test_complete_in_standby(void)
> -{

> @@ -531,13 +402,5 @@ int main(int argc, char **argv)
>  g_test_add_func("/blockjob/cancel/standby", test_cancel_standby);
>  g_test_add_func("/blockjob/cancel/pending", test_cancel_pending);
>  g_test_add_func("/blockjob/cancel/concluded", test_cancel_concluded);
> -
> -/*
> - * This test is flaky and sometimes fails in CI and otherwise:
> - * don't run unless user opts in via environment variable.
> - */
> -if (getenv("QEMU_TEST_FLAKY_TESTS")) {
> -g_test_add_func("/blockjob/complete_in_standby", 
> test_complete_in_standby);
> -}

Looks like you ripped out this entire test, because it is no longer
viable.  I might have mentioned it in the commit message, or squashed
the removal of this test into the earlier 02/12 patch.

Reviewed-by: Eric Blake 

-- 
Eric Blake, Principal Software Engineer
Red Hat, Inc.
Virtualization:  qemu.org | libguestfs.org




Re: [PATCH 05/12] block: remove AioContext locking

2023-11-30 Thread Paul Durrant

On 29/11/2023 19:55, Stefan Hajnoczi wrote:

This is the big patch that removes
aio_context_acquire()/aio_context_release() from the block layer and
affected block layer users.

There isn't a clean way to split this patch and the reviewers are likely
the same group of people, so I decided to do it in one patch.

Signed-off-by: Stefan Hajnoczi 
---
  include/block/block-global-state.h |   9 +-
  include/block/block-io.h   |   3 +-
  include/block/snapshot.h   |   2 -
  block.c| 234 +-
  block/block-backend.c  |  14 --
  block/copy-before-write.c  |  22 +--
  block/export/export.c  |  22 +--
  block/io.c |  45 +
  block/mirror.c |  19 --
  block/monitor/bitmap-qmp-cmds.c|  20 +-
  block/monitor/block-hmp-cmds.c |  29 ---
  block/qapi-sysemu.c|  27 +--
  block/qapi.c   |  18 +-
  block/raw-format.c |   5 -
  block/replication.c|  58 +-
  block/snapshot.c   |  22 +--
  block/write-threshold.c|   6 -
  blockdev.c | 307 +
  blockjob.c |  18 --
  hw/block/dataplane/virtio-blk.c|  10 -
  hw/block/dataplane/xen-block.c |  17 +-
  hw/block/virtio-blk.c  |  45 +
  hw/core/qdev-properties-system.c   |   9 -
  job.c  |  16 --
  migration/block.c  |  33 +---
  migration/migration-hmp-cmds.c |   3 -
  migration/savevm.c |  22 ---
  net/colo-compare.c |   2 -
  qemu-img.c |   4 -
  qemu-io.c  |  10 +-
  qemu-nbd.c |   2 -
  replay/replay-debugging.c  |   4 -
  tests/unit/test-bdrv-drain.c   |  51 +
  tests/unit/test-bdrv-graph-mod.c   |   6 -
  tests/unit/test-block-iothread.c   |  31 ---
  tests/unit/test-blockjob.c | 137 -
  tests/unit/test-replication.c  |  11 --
  util/async.c   |   4 -
  util/vhost-user-server.c   |   3 -
  scripts/block-coroutine-wrapper.py |   3 -
  tests/tsan/suppressions.tsan   |   1 -
  41 files changed, 102 insertions(+), 1202 deletions(-)



Reviewed-by: Paul Durrant