On Wed, Oct 22, 2025 at 11:02:28AM +0200, Kevin Wolf wrote:
> On 21.10.2025 at 21:10, Stefan Hajnoczi wrote:
> > On Thu, Oct 09, 2025 at 06:59:20PM +0200, Kevin Wolf wrote:
> > > On 09.10.2025 at 17:46, Kevin Wolf wrote:
> > > > On 10.09.2025 at 19:56, Stefan Hajnoczi wrote:
> > > > > There is no need for aio_context_use_g_source() now that epoll(7) and
> > > > > io_uring(7) file descriptor monitoring works with the glib event loop.
> > > > > AioContext doesn't need to be notified that GSource is being used.
> > > > >
> > > > > Signed-off-by: Stefan Hajnoczi <[email protected]>
> > > > > Reviewed-by: Eric Blake <[email protected]>
> > > >
> > > > We should probably mention in the commit message that this causes the
> > > > default fdmon on Linux to change from poll to io_uring. It's a small
> > > > code change, but it makes QEMU use a completely different code path by
> > > > default.
> > >
> > > Just to make sure, I ran 'make check' after this patch and it's failing
> > > for me:
> > >
> > > 10/401 qemu:qtest+qtest-x86_64 / qtest-x86_64/ahci-test
> > >   TIMEOUT 150.02s killed by signal 15 SIGTERM
> > > 133/401 qemu:unit / test-aio
> > >   TIMEOUT 30.01s killed by signal 15 SIGTERM
> > > 137/401 qemu:unit / test-bdrv-drain
> > >   TIMEOUT 30.01s killed by signal 15 SIGTERM
> > > 142/401 qemu:unit / test-block-iothread
> > >   TIMEOUT 30.01s killed by signal 15 SIGTERM
> > > 192/401 qemu:doc+rust / rust-bql-rs-doctests
> > >   FAIL 0.84s exit status 101
> > > 311/401 qemu:block / io-qcow2-267
> > >   ERROR 3.20s exit status 1
> > > 321/401 qemu:block / io-qcow2-copy-before-write
> > >   TIMEOUT 180.01s killed by signal 15 SIGTERM
> > >
> > > Some of them look unrelated, but I have confirmed that the three unit
> > > tests still pass before this patch (and still hang after the complete
> > > series).
> >
> > I can't reproduce these failures, regardless of whether sysctl
> > kernel.io_uring_disabled is 0 or 1.
> >
> > Can you launch the unit tests from your terminal and post the output?
> >
> > $ cd qemu
> > $ build/tests/unit/test-aio
>
> TAP version 14
> # random seed: R02S48dcdde28634143f18bad3947c52d334
> 1..27
> # Start of aio tests
> # Start of bh tests
> ok 1 /aio/bh/schedule
> ok 2 /aio/bh/schedule10
> ok 3 /aio/bh/cancel
> ok 4 /aio/bh/delete
> ok 5 /aio/bh/flush
> # Start of callback-delete tests
> ok 6 /aio/bh/callback-delete/one
> ok 7 /aio/bh/callback-delete/many
> # End of callback-delete tests
> # End of bh tests
> # Start of event tests
> ok 8 /aio/event/add-remove
> ok 9 /aio/event/wait
> ok 10 /aio/event/flush
> # Start of wait tests
> ok 11 /aio/event/wait/no-flush-cb
> # End of wait tests
> # End of event tests
> # Start of timer tests
>
> > $ build/tests/unit/test-bdrv-drain
>
> TAP version 14
> # random seed: R02S7d6ba0fc81d5b90d323813d680a30644
> 1..30
> # Start of bdrv-drain tests
> ok 1 /bdrv-drain/nested
> ok 2 /bdrv-drain/set_aio_context
> # Start of driver-cb tests
>
>
> > $ build/tests/unit/test-block-iothread
>
> TAP version 14
> # random seed: R02Sf81baf68887daa9b86be5c72b99df589
> 1..22
> # Start of sync-op tests
> ok 1 /sync-op/pread
> ok 2 /sync-op/pwrite
> ok 3 /sync-op/preadv
> ok 4 /sync-op/pwritev
> ok 5 /sync-op/preadv_part
> ok 6 /sync-op/pwritev_part
> ok 7 /sync-op/pwrite_compressed
> ok 8 /sync-op/pwrite_zeroes
> ok 9 /sync-op/load_vmstate
> ok 10 /sync-op/save_vmstate
> ok 11 /sync-op/pdiscard
> ok 12 /sync-op/truncate
> ok 13 /sync-op/block_status
> ok 14 /sync-op/flush
> ok 15 /sync-op/check
> ok 16 /sync-op/activate
> # End of sync-op tests
> # Start of attach tests
>
> > That will show exactly which sub-test case is hanging.
> > Other information that might help: your host kernel version and liburing
> > version.
>
> This is a F42 system.
>
> kernel-6.16.12-200.fc42.x86_64
> liburing-2.9-1.fc42.x86_64
>
> If you can't reproduce or find a hypothesis what's happening, I can try
> to debug one of the hanging processes.
Unfortunately I haven't been able to reproduce it on my system. It's
also an F42 machine with the same package versions as yours.
The test-aio timer tests look like a good candidate for debugging, since
your output stops right after "# Start of timer tests". The test is
likely either spinning forever in a do {} while (!aio_poll(ctx, false))
loop or blocked in an aio_poll(ctx, true) call that never returns.
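
To illustrate the first of those two cases, here is a toy model, not QEMU
code. FakeCtx, fake_aio_poll() and spin_until_progress() are made-up
stand-ins; the only thing carried over from the real API is the assumption
that a non-blocking aio_poll() returns true only when it made progress. The
loop is capped so the sketch terminates where the real test would spin.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for AioContext: just enough state for the model. */
typedef struct {
    int idle_polls; /* polls that find nothing to do; < 0 means "forever" */
} FakeCtx;

/* Hypothetical stand-in for aio_poll(ctx, false): returns true only when
 * progress was made, e.g. an expired timer callback was run. */
static bool fake_aio_poll(FakeCtx *ctx, bool blocking)
{
    (void)blocking; /* the model only covers the non-blocking case */
    if (ctx->idle_polls < 0) {
        return false; /* the event never becomes ready */
    }
    if (ctx->idle_polls == 0) {
        return true; /* the event is ready: progress */
    }
    ctx->idle_polls--;
    return false;
}

/* Capped version of: do {} while (!aio_poll(ctx, false));
 * Returns the number of polls needed, or -1 if the cap was reached,
 * which corresponds to the uncapped loop spinning forever. */
static int spin_until_progress(FakeCtx *ctx, int max_polls)
{
    int polls = 0;

    assert(max_polls > 0);
    do {
        if (polls == max_polls) {
            return -1;
        }
        polls++;
    } while (!fake_aio_poll(ctx, false));
    return polls;
}
```

With idle_polls = 3 the loop finishes after 4 polls. With idle_polls = -1
it reaches the cap and returns -1, which is the model's equivalent of the
timer never firing and the test hanging until the harness kills it.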
Thanks for your help with debugging!
Stefan
