On 27.03.2023 at 13:39, Kevin Wolf wrote:
> blk_get_geometry() eventually calls bdrv_nb_sectors(), which is a
> co_wrapper_mixed_bdrv_rdlock. This means that when it is called from
> coroutine context, it already assumes that the graph is locked.
>
> However, virtio_blk_sect_range_ok() in block/export/virtio-blk-handler.c
> (used by vhost-user-blk and VDUSE exports) runs in a coroutine, but
> doesn't take the graph lock - blk_*() functions are generally expected
> to do that internally. This causes an assertion failure when accessing
> an export for the first time if it runs in an iothread.
>
> This is an example of the crash:
>
> $ ./storage-daemon/qemu-storage-daemon --object iothread,id=th0 --blockdev
> file,filename=/home/kwolf/images/hd.img,node-name=disk --export
> vhost-user-blk,addr.type=unix,addr.path=/tmp/vhost.sock,node-name=disk,id=exp0,iothread=th0
> qemu-storage-daemon: ../block/graph-lock.c:268: void
> assert_bdrv_graph_readable(void): Assertion `qemu_in_main_thread() ||
> reader_count()' failed.
>
> (gdb) bt

Oops, git helpfully removed the "comments"...

(gdb) bt
#0  0x00007ffff6eafe5c in __pthread_kill_implementation () from /lib64/libc.so.6
#1  0x00007ffff6e5fa76 in raise () from /lib64/libc.so.6
#2  0x00007ffff6e497fc in abort () from /lib64/libc.so.6
#3  0x00007ffff6e4971b in __assert_fail_base.cold () from /lib64/libc.so.6
#4  0x00007ffff6e58656 in __assert_fail () from /lib64/libc.so.6
#5  0x00005555556337a3 in assert_bdrv_graph_readable () at ../block/graph-lock.c:268
#6  0x00005555555fd5a2 in bdrv_co_nb_sectors (bs=0x5555564c5ef0) at ../block.c:5847
#7  0x00005555555ee949 in bdrv_nb_sectors (bs=0x5555564c5ef0) at block/block-gen.c:256
#8  0x00005555555fd6b9 in bdrv_get_geometry (bs=0x5555564c5ef0, nb_sectors_ptr=0x7fffef7fedd0) at ../block.c:5884
#9  0x000055555562ad6d in blk_get_geometry (blk=0x5555564cb200, nb_sectors_ptr=0x7fffef7fedd0) at ../block/block-backend.c:1624
#10 0x00005555555ddb74 in virtio_blk_sect_range_ok (blk=0x5555564cb200, block_size=512, sector=0, size=512) at ../block/export/virtio-blk-handler.c:44
#11 0x00005555555dd80d in virtio_blk_process_req (handler=0x5555564cbb98, in_iov=0x7fffe8003830, out_iov=0x7fffe8003860, in_num=1, out_num=0) at ../block/export/virtio-blk-handler.c:189
#12 0x00005555555dd546 in vu_blk_virtio_process_req (opaque=0x7fffe8003800) at ../block/export/vhost-user-blk-server.c:66
#13 0x00005555557bf4a1 in coroutine_trampoline (i0=-402635264, i1=32767) at ../util/coroutine-ucontext.c:177
#14 0x00007ffff6e75c20 in ?? () from /lib64/libc.so.6
#15 0x00007fffefffa870 in ?? ()
#16 0x0000000000000000 in ?? ()

I'm adding this back while applying (but with indentation this time so
that git doesn't interpret it as comments).

Kevin

> Fix this by creating a new blk_co_get_geometry() that takes the lock,
> and changing blk_get_geometry() to be a co_wrapper_mixed around it.
>
> To make the resulting code cleaner, virtio-blk-handler.c can directly
> call the coroutine version now (though that wouldn't be necessary for
> fixing the bug, taking the lock in blk_co_get_geometry() is what fixes
> it).
>
> Fixes: 8ab8140a04cf771d63e9754d6ba6c1e676bfe507
> Reported-by: Lukáš Doktor <[email protected]>
> Signed-off-by: Kevin Wolf <[email protected]>
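
For readers who don't have the tree at hand, the shape of the problem and
of the fix can be sketched with a toy model. This is not QEMU code: every
toy_* name below is hypothetical, the reader count is a plain counter
rather than the real graph lock, and the "assertion" is modelled as a -1
return value so that the failing path can be observed without aborting.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Toy stand-ins only; none of these are QEMU's real definitions. */
static bool toy_in_main_thread;     /* false: pretend we run in an iothread */
static unsigned toy_reader_count;   /* number of active graph readers */

static void toy_graph_rdlock(void)
{
    toy_reader_count++;
}

static void toy_graph_rdunlock(void)
{
    assert(toy_reader_count > 0);
    toy_reader_count--;
}

/* Same condition as the failed assertion in the report above. */
static bool toy_graph_readable(void)
{
    return toy_in_main_thread || toy_reader_count > 0;
}

typedef struct ToyNode {
    int64_t nb_sectors;
} ToyNode;

/*
 * Models a function that expects its caller to hold the graph lock.
 * Returns -1 where the real code would hit the assertion.
 */
static int64_t toy_node_nb_sectors(const ToyNode *bs)
{
    if (!toy_graph_readable()) {
        return -1;
    }
    return bs->nb_sectors;
}

/* Before the fix: the wrapper calls down without taking the lock. */
static int64_t toy_get_geometry_unlocked(const ToyNode *bs)
{
    return toy_node_nb_sectors(bs);
}

/* After the fix: the coroutine-side wrapper takes the lock itself. */
static int64_t toy_co_get_geometry(const ToyNode *bs)
{
    int64_t n;

    toy_graph_rdlock();
    n = toy_node_nb_sectors(bs);
    toy_graph_rdunlock();
    return n;
}
```

With toy_in_main_thread left at false, toy_get_geometry_unlocked() takes
the path that corresponds to the assertion failure, while
toy_co_get_geometry() succeeds and leaves the reader count balanced.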
