Am 08.01.2025 um 13:46 hat Fiona Ebner geschrieben: > Setting blk->root is a graph change operation and thus needs to be > protected by the block graph write lock in blk_remove_bs(). The > assignment to blk->root in blk_insert_bs() is already protected by > the block graph write lock.
Hm, if that's the case, then we should also enforce this in the declaration of BlockBackend: BdrvChild * GRAPH_RDLOCK_PTR root; However, this results in more compiler failures that we need to fix. You caught the only remaining writer, but the lock is only fully effective if all readers take it, too. > In particular, the graph read lock in blk_co_do_flush() could > previously not ensure that blk_bs(blk) would always return the same > value during the locked section, which could lead to a segfault [0] in > combination with migration [1]. > > From the user-provided backtraces in the forum thread [1], it seems > like blk_co_do_flush() managed to get past the > blk_co_is_available(blk) check, meaning that blk_bs(blk) returned a > non-NULL value during the check, but then, when calling > bdrv_co_flush(), blk_bs(blk) returned NULL. > > [0]: > > > 0 bdrv_primary_child (bs=bs@entry=0x0) at ../block.c:8287 > > 1 bdrv_co_flush (bs=0x0) at ../block/io.c:2948 > > 2 bdrv_co_flush_entry (opaque=0x7a610affae90) at block/block-gen.c:901 > > [1]: https://forum.proxmox.com/threads/158072 > > Cc: qemu-sta...@nongnu.org > Signed-off-by: Fiona Ebner <f.eb...@proxmox.com> > --- > block/block-backend.c | 2 +- > 1 file changed, 1 insertion(+), 1 deletion(-) > > diff --git a/block/block-backend.c b/block/block-backend.c > index c93a7525ad..9678615318 100644 > --- a/block/block-backend.c > +++ b/block/block-backend.c > @@ -887,9 +887,9 @@ void blk_remove_bs(BlockBackend *blk) > */ > blk_drain(blk); > root = blk->root; > - blk->root = NULL; > > bdrv_graph_wrlock(); > + blk->root = NULL; > bdrv_root_unref_child(root); > bdrv_graph_wrunlock(); > } I think the 'root = blk->root' needs to be inside the locked section, too. Otherwise blk->root could change during bdrv_graph_wrlock() (which has a nested event loop) and root would be stale. I assume clang would complain about this with the added GRAPH_RDLOCK_PTR. Kevin