Suggestions hold pattern.. programmers obviously understand control logic.
Systems built with operational flaws. Will freeze up when u start repairing
it while it's running.

On Thu, Jan 9, 2025, 4:48 AM Kevin Wolf <kw...@redhat.com> wrote:

> Am 08.01.2025 um 13:46 hat Fiona Ebner geschrieben:
> > Setting blk->root is a graph change operation and thus needs to be
> > protected by the block graph write lock in blk_remove_bs(). The
> > assignment to blk->root in blk_insert_bs() is already protected by
> > the block graph write lock.
>
> Hm, if that's the case, then we should also enforce this in the
> declaration of BlockBackend:
>
>     BdrvChild * GRAPH_RDLOCK_PTR root;
>
> However, this results in more compiler failures that we need to fix. You
> caught the only remaining writer, but the lock is only fully effective
> if all readers take it, too.
>
> > In particular, the graph read lock in blk_co_do_flush() could
> > previously not ensure that blk_bs(blk) would always return the same
> > value during the locked section, which could lead to a segfault [0] in
> > combination with migration [1].
> >
> > From the user-provided backtraces in the forum thread [1], it seems
> > like blk_co_do_flush() managed to get past the
> > blk_co_is_available(blk) check, meaning that blk_bs(blk) returned a
> > non-NULL value during the check, but then, when calling
> > bdrv_co_flush(), blk_bs(blk) returned NULL.
> >
> > [0]:
> >
> > > 0  bdrv_primary_child (bs=bs@entry=0x0) at ../block.c:8287
> > > 1  bdrv_co_flush (bs=0x0) at ../block/io.c:2948
> > > 2  bdrv_co_flush_entry (opaque=0x7a610affae90) at block/block-gen.c:901
> >
> > [1]: https://forum.proxmox.com/threads/158072
> >
> > Cc: qemu-sta...@nongnu.org
> > Signed-off-by: Fiona Ebner <f.eb...@proxmox.com>
> > ---
> >  block/block-backend.c | 2 +-
> >  1 file changed, 1 insertion(+), 1 deletion(-)
> >
> > diff --git a/block/block-backend.c b/block/block-backend.c
> > index c93a7525ad..9678615318 100644
> > --- a/block/block-backend.c
> > +++ b/block/block-backend.c
> > @@ -887,9 +887,9 @@ void blk_remove_bs(BlockBackend *blk)
> >       */
> >      blk_drain(blk);
> >      root = blk->root;
> > -    blk->root = NULL;
> >
> >      bdrv_graph_wrlock();
> > +    blk->root = NULL;
> >      bdrv_root_unref_child(root);
> >      bdrv_graph_wrunlock();
> >  }
>
> I think the 'root = blk->root' needs to be inside the locked section,
> too. Otherwise blk->root could change during bdrv_graph_wrlock() (which
> has a nested event loop) and root would be stale. I assume clang would
> complain about this with the added GRAPH_RDLOCK_PTR.
>
> Kevin
>
>
>

Reply via email to