Re: [PATCH 1/8] block: Call .bdrv_co_create(_opts) unlocked

2023-05-15 Thread Eric Blake


On Mon, May 15, 2023 at 06:19:41PM +0200, Kevin Wolf wrote:
> > > @@ -3724,8 +3726,10 @@ qcow2_co_create(BlockdevCreateOptions 
> > > *create_options, Error **errp)
> > >  goto out;
> > >  }
> > >  
> > > +bdrv_graph_co_rdlock();
> > >  ret = qcow2_alloc_clusters(blk_bs(blk), 3 * cluster_size);
> > >  if (ret < 0) {
> > > +bdrv_graph_co_rdunlock();
> > >  error_setg_errno(errp, -ret, "Could not allocate clusters for 
> > > qcow2 "
> > >   "header and refcount table");
> > >  goto out;
> > > @@ -3743,6 +3747,8 @@ qcow2_co_create(BlockdevCreateOptions 
> > > *create_options, Error **errp)
> > >  
> > >  /* Create a full header (including things like feature table) */
> > >  ret = qcow2_update_header(blk_bs(blk));
> > > +bdrv_graph_co_rdunlock();
> > > +
> > 
> > If we ever inject any 'goto out' in the elided lines, we're in
> > trouble.  Would this be any safer by wrapping the intervening
> > statements in a scope-guarded lock?
> 
> TSA doesn't understand these guards, which is why they are only
> annotated as assertions (I think we talked about this in my previous
> series), at the cost of leaving unlocking unchecked. So in cases where
> the scope isn't the full function, individual calls are better at the
> moment. Once clang implements support for __attribute__((cleanup)), we
> can maybe change this.

Yeah, LOTS of people are waiting on clang __attribute__((cleanup))
analysis sanity ;)

> 
> Of course, TSA solves the very maintenance problem you're concerned
> about: With a 'goto out' added, compilation on clang fails because it
> sees that there is a code path that doesn't unlock. So at least it makes
> the compromise not terrible.
> 
> For example, if I comment out the unlock in the error case in the first,
> this is what I get:
> 
> ../block/qcow2.c:3825:5: error: mutex 'graph_lock' is not held on every path 
> through here [-Werror,-Wthread-safety-analysis]
> blk_co_unref(blk);
> ^
> ../block/qcow2.c:3735:5: note: mutex acquired here
> bdrv_graph_co_rdlock();
> ^
> 1 error generated.

I'm sold!  The very reason you can't use a cleanup scope-guard
(because clang can't see through the annotation) is also the reason
why clang is able to flag your error if you don't properly clean up by
hand.  So while it is more tedious to maintain, we've finally got
compiler assistance to something that used to be human-only, which is
a step forwards even if it is more to type in the short term.

Thanks for testing that.

-- 
Eric Blake, Principal Software Engineer
Red Hat, Inc.   +1-919-301-3266
Virtualization:  qemu.org | libvirt.org




Re: [PATCH 1/8] block: Call .bdrv_co_create(_opts) unlocked

2023-05-15 Thread Kevin Wolf
Am 12.05.2023 um 18:12 hat Eric Blake geschrieben:
> 
> On Wed, May 10, 2023 at 10:35:54PM +0200, Kevin Wolf wrote:
> > 
> > These are functions that modify the graph, so they must be able to take
> > a writer lock. This is impossible if they already hold the reader lock.
> > If they need a reader lock for some of their operations, they should
> > take it internally.
> > 
> > Many of them go through blk_*(), which will always take the lock itself.
> > Direct calls of bdrv_*() need to take the reader lock. Note that while
> > locking for bdrv_co_*() calls is checked by TSA, this is not the case
> > for the mixed_coroutine_fns bdrv_*(). Holding the lock is still required
> > when they are called from coroutine context like here!
> > 
> > This effectively reverts 4ec8df0183, but adds some internal locking
> > instead.
> > 
> > Signed-off-by: Kevin Wolf 
> > ---
> 
> > +++ b/block/qcow2.c
> 
> > -static int coroutine_fn
> > +static int coroutine_fn GRAPH_UNLOCKED
> >  qcow2_co_create(BlockdevCreateOptions *create_options, Error **errp)
> >  {
> >  BlockdevCreateOptionsQcow2 *qcow2_opts;
> > @@ -3724,8 +3726,10 @@ qcow2_co_create(BlockdevCreateOptions 
> > *create_options, Error **errp)
> >  goto out;
> >  }
> >  
> > +bdrv_graph_co_rdlock();
> >  ret = qcow2_alloc_clusters(blk_bs(blk), 3 * cluster_size);
> >  if (ret < 0) {
> > +bdrv_graph_co_rdunlock();
> >  error_setg_errno(errp, -ret, "Could not allocate clusters for 
> > qcow2 "
> >   "header and refcount table");
> >  goto out;
> > @@ -3743,6 +3747,8 @@ qcow2_co_create(BlockdevCreateOptions 
> > *create_options, Error **errp)
> >  
> >  /* Create a full header (including things like feature table) */
> >  ret = qcow2_update_header(blk_bs(blk));
> > +bdrv_graph_co_rdunlock();
> > +
> 
> If we ever inject any 'goto out' in the elided lines, we're in
> trouble.  Would this be any safer by wrapping the intervening
> statements in a scope-guarded lock?

TSA doesn't understand these guards, which is why they are only
annotated as assertions (I think we talked about this in my previous
series), at the cost of leaving unlocking unchecked. So in cases where
the scope isn't the full function, individual calls are better at the
moment. Once clang implements support for __attribute__((cleanup)), we
can maybe change this.

Of course, TSA solves the very maintenance problem you're concerned
about: With a 'goto out' added, compilation on clang fails because it
sees that there is a code path that doesn't unlock. So at least it makes
the compromise not terrible.

For example, if I comment out the unlock in the error case in the first,
this is what I get:

../block/qcow2.c:3825:5: error: mutex 'graph_lock' is not held on every path 
through here [-Werror,-Wthread-safety-analysis]
blk_co_unref(blk);
^
../block/qcow2.c:3735:5: note: mutex acquired here
bdrv_graph_co_rdlock();
^
1 error generated.

Kevin




Re: [PATCH 1/8] block: Call .bdrv_co_create(_opts) unlocked

2023-05-12 Thread Eric Blake


On Wed, May 10, 2023 at 10:35:54PM +0200, Kevin Wolf wrote:
> 
> These are functions that modify the graph, so they must be able to take
> a writer lock. This is impossible if they already hold the reader lock.
> If they need a reader lock for some of their operations, they should
> take it internally.
> 
> Many of them go through blk_*(), which will always take the lock itself.
> Direct calls of bdrv_*() need to take the reader lock. Note that while
> locking for bdrv_co_*() calls is checked by TSA, this is not the case
> for the mixed_coroutine_fns bdrv_*(). Holding the lock is still required
> when they are called from coroutine context like here!
> 
> This effectively reverts 4ec8df0183, but adds some internal locking
> instead.
> 
> Signed-off-by: Kevin Wolf 
> ---

> +++ b/block/qcow2.c

> -static int coroutine_fn
> +static int coroutine_fn GRAPH_UNLOCKED
>  qcow2_co_create(BlockdevCreateOptions *create_options, Error **errp)
>  {
>  BlockdevCreateOptionsQcow2 *qcow2_opts;
> @@ -3724,8 +3726,10 @@ qcow2_co_create(BlockdevCreateOptions *create_options, 
> Error **errp)
>  goto out;
>  }
>  
> +bdrv_graph_co_rdlock();
>  ret = qcow2_alloc_clusters(blk_bs(blk), 3 * cluster_size);
>  if (ret < 0) {
> +bdrv_graph_co_rdunlock();
>  error_setg_errno(errp, -ret, "Could not allocate clusters for qcow2 "
>   "header and refcount table");
>  goto out;
> @@ -3743,6 +3747,8 @@ qcow2_co_create(BlockdevCreateOptions *create_options, 
> Error **errp)
>  
>  /* Create a full header (including things like feature table) */
>  ret = qcow2_update_header(blk_bs(blk));
> +bdrv_graph_co_rdunlock();
> +

If we ever inject any 'goto out' in the elided lines, we're in
trouble.  Would this be any safer by wrapping the intervening
statements in a scope-guarded lock?

But what you have is correct, despite my worries about potential
maintenance concerns.

Reviewed-by: Eric Blake 

-- 
Eric Blake, Principal Software Engineer
Red Hat, Inc.   +1-919-301-3266
Virtualization:  qemu.org | libvirt.org




[PATCH 1/8] block: Call .bdrv_co_create(_opts) unlocked

2023-05-10 Thread Kevin Wolf
These are functions that modify the graph, so they must be able to take
a writer lock. This is impossible if they already hold the reader lock.
If they need a reader lock for some of their operations, they should
take it internally.

Many of them go through blk_*(), which will always take the lock itself.
Direct calls of bdrv_*() need to take the reader lock. Note that while
locking for bdrv_co_*() calls is checked by TSA, this is not the case
for the mixed_coroutine_fns bdrv_*(). Holding the lock is still required
when they are called from coroutine context like here!

This effectively reverts 4ec8df0183, but adds some internal locking
instead.

Signed-off-by: Kevin Wolf 
---
 include/block/block-global-state.h |  8 +++
 include/block/block_int-common.h   |  4 ++--
 block.c|  1 -
 block/create.c |  1 -
 block/crypto.c | 25 ++--
 block/parallels.c  |  6 ++---
 block/qcow.c   |  6 ++---
 block/qcow2.c  | 37 +++---
 block/qed.c|  6 ++---
 block/raw-format.c |  2 +-
 block/vdi.c| 11 +
 block/vhdx.c   |  8 ---
 block/vmdk.c   | 27 ++
 block/vpc.c|  6 ++---
 14 files changed, 78 insertions(+), 70 deletions(-)

diff --git a/include/block/block-global-state.h 
b/include/block/block-global-state.h
index 2d93423d35..f347199bff 100644
--- a/include/block/block-global-state.h
+++ b/include/block/block-global-state.h
@@ -58,14 +58,14 @@ BlockDriver *bdrv_find_protocol(const char *filename,
 Error **errp);
 BlockDriver *bdrv_find_format(const char *format_name);
 
-int coroutine_fn GRAPH_RDLOCK
+int coroutine_fn GRAPH_UNLOCKED
 bdrv_co_create(BlockDriver *drv, const char *filename, QemuOpts *opts,
Error **errp);
 
-int co_wrapper_bdrv_rdlock bdrv_create(BlockDriver *drv, const char *filename,
-   QemuOpts *opts, Error **errp);
+int co_wrapper bdrv_create(BlockDriver *drv, const char *filename,
+   QemuOpts *opts, Error **errp);
 
-int coroutine_fn GRAPH_RDLOCK
+int coroutine_fn GRAPH_UNLOCKED
 bdrv_co_create_file(const char *filename, QemuOpts *opts, Error **errp);
 
 BlockDriverState *bdrv_new(void);
diff --git a/include/block/block_int-common.h b/include/block/block_int-common.h
index 4909876756..a19b85a6e1 100644
--- a/include/block/block_int-common.h
+++ b/include/block/block_int-common.h
@@ -245,10 +245,10 @@ struct BlockDriver {
 BlockDriverState *bs, QDict *options, int flags, Error **errp);
 void (*bdrv_close)(BlockDriverState *bs);
 
-int coroutine_fn GRAPH_RDLOCK_PTR (*bdrv_co_create)(
+int coroutine_fn GRAPH_UNLOCKED_PTR (*bdrv_co_create)(
 BlockdevCreateOptions *opts, Error **errp);
 
-int coroutine_fn GRAPH_RDLOCK_PTR (*bdrv_co_create_opts)(
+int coroutine_fn GRAPH_UNLOCKED_PTR (*bdrv_co_create_opts)(
 BlockDriver *drv, const char *filename, QemuOpts *opts, Error **errp);
 
 int (*bdrv_amend_options)(BlockDriverState *bs,
diff --git a/block.c b/block.c
index dad9a4fa43..e8b51a4391 100644
--- a/block.c
+++ b/block.c
@@ -533,7 +533,6 @@ int coroutine_fn bdrv_co_create(BlockDriver *drv, const 
char *filename,
 int ret;
 GLOBAL_STATE_CODE();
 ERRP_GUARD();
-assert_bdrv_graph_readable();
 
 if (!drv->bdrv_co_create_opts) {
 error_setg(errp, "Driver '%s' does not support image creation",
diff --git a/block/create.c b/block/create.c
index bf67b9947c..6b23a21675 100644
--- a/block/create.c
+++ b/block/create.c
@@ -43,7 +43,6 @@ static int coroutine_fn blockdev_create_run(Job *job, Error 
**errp)
 int ret;
 
 GLOBAL_STATE_CODE();
-GRAPH_RDLOCK_GUARD();
 
 job_progress_set_remaining(&s->common, 1);
 ret = s->drv->bdrv_co_create(s->opts, errp);
diff --git a/block/crypto.c b/block/crypto.c
index 30093cff9b..6ee8d46d30 100644
--- a/block/crypto.c
+++ b/block/crypto.c
@@ -99,12 +99,10 @@ struct BlockCryptoCreateData {
 };
 
 
-static int block_crypto_create_write_func(QCryptoBlock *block,
-  size_t offset,
-  const uint8_t *buf,
-  size_t buflen,
-  void *opaque,
-  Error **errp)
+static int coroutine_fn GRAPH_UNLOCKED
+block_crypto_create_write_func(QCryptoBlock *block, size_t offset,
+   const uint8_t *buf, size_t buflen, void *opaque,
+   Error **errp)
 {
 struct BlockCryptoCreateData *data = opaque;
 ssize_t ret;
@@ -117,10 +115,9 @@ static int block_crypto_create_write_func(QCryptoBlock 
*block,
 return 0;
 }
 
-static int block_c