On Thu, Sep 11, 2025 at 02:13:47PM +0200, Kevin Wolf wrote:
> On 11.09.2025 at 13:21, Thomas Huth wrote:
> > On 10/09/2025 18.08, Kevin Wolf wrote:
> > > On 10.09.2025 at 17:16, Thomas Huth wrote:
> > > >
> > > > Hi,
> > > >
> > > > when running "./check -luks" in the qemu-iotests directory,
> > > > some tests are failing for me:
> > > >
> > > > 295 296 inactive-node-nbd luks-detached-header
> > > >
> > > > Is that a known problem already?
> > >
> > > Not to me anyway.
> > >
> > > > FWIW, 295 is failing with the following output:
> > > >
> > > > 295 fail [17:03:01] [17:03:17] 15.7s failed,
> > > > exit status 1
> > > > [...]
> > > > +EWARNING:qemu.machine.machine:qemu received signal 6; command:
> > > > "/home/thuth/tmp/qemu-build/qemu-system-x86_64 -display none -vga none
> > > > -chardev socket,id=mon,fd=5 -mon chardev=mon,mode=control -chardev
> > > > socket,id=qtest,fd=3 -qtest chardev:qtest -accel qtest -nodefaults
> > > > -display none -accel qtest"
> > > > +EEWARNING:qemu.machine.machine:qemu received signal 6; command:
> > > > "/home/thuth/tmp/qemu-build/qemu-system-x86_64 -display none -vga none
> > > > -chardev socket,id=mon,fd=6 -mon chardev=mon,mode=control -chardev
> > > > socket,id=qtest,fd=3 -qtest chardev:qtest -accel qtest -nodefaults
> > > > -display none -accel qtest"
> > > > +EEWARNING:qemu.machine.machine:qemu received signal 6; command:
> > > > "/home/thuth/tmp/qemu-build/qemu-system-x86_64 -display none -vga none
> > > > -chardev socket,id=mon,fd=10 -mon chardev=mon,mode=control -chardev
> > > > socket,id=qtest,fd=3 -qtest chardev:qtest -accel qtest -nodefaults
> > > > -display none -accel qtest"
> > > > +E
> > > > [...]
> > > >
> > > > etc.
> > > >
> > > > 296 looks very similar (also a "qemu received signal 6" error),
> > > > but the others look like this:
> > >
> > > When it gets signal 6 (i.e. SIGABRT), that usually means that you should
> > > have a look at the coredump.
> >
> > With "-p" I additionally get this error message in the log:
> >
> > qemu-system-x86_64: ../../devel/qemu/block/graph-lock.c:294:
> > bdrv_graph_rdlock_main_loop: Assertion `!qemu_in_coroutine()' failed.
> >
> > With -gdb I can get a back trace that looks like this:
> >
> > Thread 1 "qemu-system-x86" received signal SIGABRT, Aborted.
> > 0x00007ffff4ba7e9c in __pthread_kill_implementation () from
> > target:/lib64/libc.so.6
> > --Type <RET> for more, q to quit, c to continue without paging--
> > #0 0x00007ffff4ba7e9c in __pthread_kill_implementation () from
> > target:/lib64/libc.so.6
> > #1 0x00007ffff4b4df3e in raise () from target:/lib64/libc.so.6
> > #2 0x00007ffff4b356d0 in abort () from target:/lib64/libc.so.6
> > #3 0x00007ffff4b35639 in __assert_fail_base.cold () from
> > target:/lib64/libc.so.6
> > #4 0x0000555555574eae in bdrv_graph_rdlock_main_loop () at
> > ../../devel/qemu/block/graph-lock.c:294
> > #5 0x0000555555aa2f43 in graph_lockable_auto_lock_mainloop (x=<optimized
> > out>) at /home/thuth/devel/qemu/include/block/graph-lock.h:275
> > #6 block_crypto_read_func (block=<optimized out>, offset=4096,
> > buf=0x555558324100 "", buflen=256000, opaque=0x555558a259d0,
> > errp=0x555558a8c370)
> > at ../../devel/qemu/block/crypto.c:71
> > #7 0x0000555555a5a308 in qcrypto_block_luks_load_key
> > (block=block@entry=0x555558686ec0, slot_idx=slot_idx@entry=0,
> > password=password@entry=0x555558626050 "hunter0",
> > masterkey=masterkey@entry=0x55555886b2a0 "",
> > readfunc=readfunc@entry=0x555555aa2f10 <block_crypto_read_func>,
> > opaque=opaque@entry=0x555558a259d0, errp=0x555558a8c370)
> > at ../../devel/qemu/crypto/block-luks.c:927
> > #8 0x0000555555a5ba7e in qcrypto_block_luks_find_key
> > (block=0x555558686ec0, password=0x555558626050 "hunter0",
> > masterkey=0x55555886b2a0 "",
> > readfunc=0x555555aa2f10 <block_crypto_read_func>,
> > opaque=0x555558a259d0, errp=0x555558a8c370) at
> > ../../devel/qemu/crypto/block-luks.c:1045
> > #9 qcrypto_block_luks_amend_add_keyslot (block=0x555558686ec0,
> > readfunc=0x555555aa2f10 <block_crypto_read_func>,
> > writefunc=0x555555aa2e50 <block_crypto_write_func>,
> > opaque=0x555558a259d0, opts_luks=0x7fffec5fff38, force=<optimized out>,
> > errp=0x555558a8c370)
> > at ../../devel/qemu/crypto/block-luks.c:1673
> > #10 qcrypto_block_luks_amend_options (block=0x555558686ec0,
> > readfunc=0x555555aa2f10 <block_crypto_read_func>,
> > writefunc=0x555555aa2e50 <block_crypto_write_func>,
> > opaque=0x555558a259d0, options=0x7fffec5fff30, force=<optimized out>,
> > errp=0x555558a8c370)
> > at ../../devel/qemu/crypto/block-luks.c:1865
> > #11 0x0000555555aa3852 in block_crypto_amend_options_generic_luks
> > (bs=<optimized out>, amend_options=<optimized out>, force=<optimized out>,
> > errp=<optimized out>) at ../../devel/qemu/block/crypto.c:949
> > #12 0x0000555555aa38e9 in block_crypto_co_amend_luks (bs=<optimized out>,
> > opts=<optimized out>, force=<optimized out>, errp=<optimized out>)
> > at ../../devel/qemu/block/crypto.c:1008
> > #13 0x0000555555a96030 in blockdev_amend_run (job=0x555558a8c2b0,
> > errp=0x555558a8c370) at ../../devel/qemu/block/amend.c:52
> > #14 0x0000555555a874ad in job_co_entry (opaque=0x555558a8c2b0) at
> > ../../devel/qemu/job.c:1112
> > #15 0x0000555555bdc41b in coroutine_trampoline (i0=<optimized out>,
> > i1=<optimized out>) at ../../devel/qemu/util/coroutine-ucontext.c:175
> > #16 0x00007ffff4b68f70 in ?? () from target:/lib64/libc.so.6
> > #17 0x00007fffffffc310 in ?? ()
> > #18 0x0000000000000000 in ?? ()
>
> Hm, so block_crypto_read_func() isn't prepared to be called in coroutine
> context, but block_crypto_co_amend_luks() still calls it from a
> coroutine. The indirection of going through QCrypto won't make it easier
> to fix this properly.
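
For illustration only, the failure mode in the backtrace can be modelled in a few lines of standalone C. This is a sketch under stated assumptions, not QEMU code: every demo_* name is hypothetical, a plain flag stands in for qemu_in_coroutine(), and the guarded callback returns -1 where the real code asserts and aborts.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for qemu_in_coroutine(); this is not QEMU code. */
static bool demo_in_coroutine;

/* Callback type: the caller passes a function pointer plus opaque data. */
typedef int (*demo_io_func)(void *opaque);

/*
 * A callback that is only valid outside coroutine context. The real code
 * asserts and aborts; this sketch returns -1 so the outcome can be checked.
 */
static int demo_guarded_read(void *opaque)
{
    (void)opaque;
    if (demo_in_coroutine) {
        return -1;
    }
    return 0;
}

/* Intermediate layer: forwards to the callback, unaware of the context. */
static int demo_amend(demo_io_func readfunc, void *opaque)
{
    assert(readfunc != NULL);
    return readfunc(opaque);
}

/* Entry point modelling a call made from the main loop. */
static int demo_call_from_main_loop(demo_io_func readfunc, void *opaque)
{
    demo_in_coroutine = false;
    return demo_amend(readfunc, opaque);
}

/* Entry point modelling a call made from inside a coroutine (a job). */
static int demo_call_from_coroutine(demo_io_func readfunc, void *opaque)
{
    int ret;

    demo_in_coroutine = true;
    ret = demo_amend(readfunc, opaque);
    demo_in_coroutine = false;
    return ret;
}
```

The same callback succeeds through demo_call_from_main_loop() and fails through demo_call_from_coroutine(), although demo_amend() in the middle is identical in both cases. That is roughly the shape of the problem: the layer in between cannot see which context it was entered from.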

Historically block_crypto_read_func() didn't care/know whether it
was in a coroutine or not. Bisect tells me the regression was caused
by

  commit 1f051dcbdf2e4b6f518db731c84e304b2b9d15ce
  Author: Kevin Wolf <kw...@redhat.com>
  Date:   Fri Oct 27 17:53:33 2023 +0200

    block: Protect bs->file with graph_lock

which added

    GLOBAL_STATE_CODE();
    GRAPH_RDLOCK_GUARD_MAINLOOP();

> It seems to me that while block_crypto_read/write_func are effectively
> no_coroutine_fn, qcrypto_block_amend_options() should really take
> function pointers that can be called from coroutines. It is called from
> both coroutine and non-coroutine code paths, so should the function
> pointers be coroutine_mixed_fn or do we want to change the callers?
>
> Either way, we should add the appropriate coroutine markers to the
> QCrypto interfaces to show the intention at least.

I'm unclear why QCrypto needs to know about coroutines at all?
It just wants a function pointer that will send or recv a blob
of data. In the case of the block layer these functions end up
doing I/O via the block APIs, but QCrypto doesn't care about this
impl detail.

With regards,
Daniel
-- 
|: https://berrange.com      -o-    https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org         -o-            https://fstop138.berrange.com :|
|: https://entangle-photo.org    -o-    https://www.instagram.com/dberrange :|
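
To make the "function pointer that will send or recv a blob of data" point concrete, here is a minimal standalone sketch. It is not the QCrypto API: the demo_* names and the exact signature are hypothetical, only loosely modelled on the offset/buf/buflen/opaque arguments visible in frame #6 of the backtrace. The consumer sees a callback and an opaque pointer and nothing about how the bytes are fetched.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical callback type; an assumption, not the QCrypto typedef. */
typedef int (*demo_blob_read_func)(size_t offset, unsigned char *buf,
                                   size_t buflen, void *opaque);

/* One possible backend: an in-memory image passed via the opaque pointer. */
struct demo_mem_image {
    const unsigned char *data;
    size_t len;
};

/* Copies buflen bytes starting at offset; returns -1 if out of range. */
static int demo_mem_read(size_t offset, unsigned char *buf,
                         size_t buflen, void *opaque)
{
    const struct demo_mem_image *img = opaque;

    if (offset > img->len || buflen > img->len - offset) {
        return -1;
    }
    memcpy(buf, img->data + offset, buflen);
    return 0;
}

/*
 * Consumer side: knows only the callback and the opaque pointer. Whether
 * the backend does memory copies, file I/O or block-layer I/O is hidden.
 */
static int demo_load_blob(demo_blob_read_func readfunc, void *opaque,
                          size_t offset, unsigned char *out, size_t outlen)
{
    assert(readfunc != NULL);
    return readfunc(offset, out, outlen, opaque);
}
```

Swapping demo_mem_read() for a backend that performs real I/O would not change demo_load_blob() at all, which is the sense in which the consumer does not care about the implementation behind the pointer.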