Re: [PATCH 3/3] monitor: only run coroutine commands in qemu_aio_context

2024-02-13 Thread Stefan Hajnoczi
On Sat, 3 Feb 2024 at 06:30, Michael Tokarev  wrote:
>
> 03.02.2024 12:01, Michael Tokarev wrote:
> ...
> > This change broke something in 7.2. I'm still debugging it, will
> > come with a follow-up once some more details are found, I'll also
> > check current master with and without this commit.
> >
> > The prob happens with multiple suspend-resume cycles, - with this
> > change applied, guest does not work as expected after *second*
> > suspend-resume.
>
> So, it turned out the prob here exists on master too, and manifests
> itself the same way on 7.2.9 or on 8.2.1, - in all cases where we
> have this change applied it works (or breaks) equally.
>
> A (simple) reproducer so far is a hibernate test, - it fails *only*
> after suspend-to-ram, but works fine after just hibernate.
>
> I used just an initrd (with a drive image used for swap -
> for hibernation space).
>
>   qemu-img create s.img 256M
>   mkswap s.img
>   qemu-system-x86_64 \
>-serial stdio -vga none -display none -parallel none -net none \
>-machine q35 \
>-drive file=s.img,if=ide,format=raw \
>-m 256 \
>-monitor unix:ttyS0,server,nowait \
>-kernel /boot/vmlinuz-6.1.0-15-amd64 \
>-initrd /boot/initrd.img-6.1.0-15-amd64 \
>-append "shell=/bin/sh console=ttyS0 root=none"
>
>   There, in the guest (it has busybox only here):
>   # swapon /dev/sda
>   # echo mem > /sys/power/state
>   (system_wakeup on the monitor)
>   # echo disk > /sys/power/state
>
> The system will hibernate but *not* turn off power, qemu
> will continue running, while all console messages are the
> same as when it works fine.  qemu process is spinning up
> with 100% cpu usage at this stage.
>
> Without the intermediate suspend-to-ram or without the
> commit in question, qemu process will exit normally at
> this stage.
>
> This is a somewhat patalogical test case, but I see it as an
> indicator of something else being wrong, like we aren't saving
> or restoring some state now which we should do.
>
> The tight loop also suggests we're not having success in there.

I'm unable to reproduce this. QEMU v8.2.0, v8.1.0, and even v7.2.0 all
spin with 100% CPU usage on my machine. It looks to me like this
behavior is not a regression caused by this commit.

Do v7.2.0, v8.1.0, and v8.2.0 terminate QEMU on your machine?

Stefan



Re: [PATCH 3/3] monitor: only run coroutine commands in qemu_aio_context

2024-02-03 Thread Michael Tokarev

03.02.2024 12:01, Michael Tokarev wrote:
...

This change broke something in 7.2. I'm still debugging it, will
come with a follow-up once some more details are found, I'll also
check current master with and without this commit.

The prob happens with multiple suspend-resume cycles, - with this
change applied, guest does not work as expected after *second*
suspend-resume.


So, it turned out the prob here exists on master too, and manifests
itself the same way on 7.2.9 or on 8.2.1, - in all cases where we
have this change applied it works (or breaks) equally.

A (simple) reproducer so far is a hibernate test, - it fails *only*
after suspend-to-ram, but works fine after just hibernate.

I used just an initrd (with a drive image used for swap -
for hibernation space).

 qemu-img create s.img 256M
 mkswap s.img
 qemu-system-x86_64 \
  -serial stdio -vga none -display none -parallel none -net none \
  -machine q35 \
  -drive file=s.img,if=ide,format=raw \
  -m 256 \
  -monitor unix:ttyS0,server,nowait \
  -kernel /boot/vmlinuz-6.1.0-15-amd64 \
  -initrd /boot/initrd.img-6.1.0-15-amd64 \
  -append "shell=/bin/sh console=ttyS0 root=none"

 There, in the guest (it has busybox only here):
 # swapon /dev/sda
 # echo mem > /sys/power/state
 (system_wakeup on the monitor)
 # echo disk > /sys/power/state

The system will hibernate but *not* turn off power, qemu
will continue running, while all console messages are the
same as when it works fine.  qemu process is spinning up
with 100% cpu usage at this stage.

Without the intermediate suspend-to-ram or without the
commit in question, qemu process will exit normally at
this stage.

This is a somewhat patalogical test case, but I see it as an
indicator of something else being wrong, like we aren't saving
or restoring some state now which we should do.

The tight loop also suggests we're not having success in there.

Thanks,

/mjt



Re: [PATCH 3/3] monitor: only run coroutine commands in qemu_aio_context

2024-02-03 Thread Michael Tokarev

16.01.2024 22:00, Stefan Hajnoczi пишет:

monitor_qmp_dispatcher_co() runs in the iohandler AioContext that is not
polled during nested event loops. The coroutine currently reschedules
itself in the main loop's qemu_aio_context AioContext, which is polled
during nested event loops. One known problem is that QMP device-add
calls drain_call_rcu(), which temporarily drops the BQL, leading to all
sorts of havoc like other vCPU threads re-entering device emulation code
while another vCPU thread is waiting in device emulation code with
aio_poll().

Paolo Bonzini suggested running non-coroutine QMP handlers in the
iohandler AioContext. This avoids trouble with nested event loops. His
original idea was to move coroutine rescheduling to
monitor_qmp_dispatch(), but I resorted to moving it to qmp_dispatch()
because we don't know if the QMP handler needs to run in coroutine
context in monitor_qmp_dispatch(). monitor_qmp_dispatch() would have
been nicer since it's associated with the monitor implementation and not
as general as qmp_dispatch(), which is also used by qemu-ga.

A number of qemu-iotests need updated .out files because the order of
QMP events vs QMP responses has changed.

Solves Issue #1933.

Fixes: 7bed89958bfbf40df9ca681cefbdca63abdde39d ("device_core: use drain_call_rcu in 
in qmp_device_add")
Buglink: https://bugzilla.redhat.com/show_bug.cgi?id=2215192
Buglink: https://bugzilla.redhat.com/show_bug.cgi?id=2214985
Buglink: https://issues.redhat.com/browse/RHEL-17369
Signed-off-by: Stefan Hajnoczi 


This change broke something in 7.2. I'm still debugging it, will
come with a follow-up once some more details are found, I'll also
check current master with and without this commit.

The prob happens with multiple suspend-resume cycles, - with this
change applied, guest does not work as expected after *second*
suspend-resume.

JFYI for now.

Thanks,

/mjt



Re: [PATCH 3/3] monitor: only run coroutine commands in qemu_aio_context

2024-01-18 Thread Fiona Ebner
Am 16.01.24 um 20:00 schrieb Stefan Hajnoczi:
> monitor_qmp_dispatcher_co() runs in the iohandler AioContext that is not
> polled during nested event loops. The coroutine currently reschedules
> itself in the main loop's qemu_aio_context AioContext, which is polled
> during nested event loops. One known problem is that QMP device-add
> calls drain_call_rcu(), which temporarily drops the BQL, leading to all
> sorts of havoc like other vCPU threads re-entering device emulation code
> while another vCPU thread is waiting in device emulation code with
> aio_poll().
> 
> Paolo Bonzini suggested running non-coroutine QMP handlers in the
> iohandler AioContext. This avoids trouble with nested event loops. His
> original idea was to move coroutine rescheduling to
> monitor_qmp_dispatch(), but I resorted to moving it to qmp_dispatch()
> because we don't know if the QMP handler needs to run in coroutine
> context in monitor_qmp_dispatch(). monitor_qmp_dispatch() would have
> been nicer since it's associated with the monitor implementation and not
> as general as qmp_dispatch(), which is also used by qemu-ga.
> 
> A number of qemu-iotests need updated .out files because the order of
> QMP events vs QMP responses has changed.
> 
> Solves Issue #1933.
> 
> Fixes: 7bed89958bfbf40df9ca681cefbdca63abdde39d ("device_core: use 
> drain_call_rcu in in qmp_device_add")
> Buglink: https://bugzilla.redhat.com/show_bug.cgi?id=2215192
> Buglink: https://bugzilla.redhat.com/show_bug.cgi?id=2214985
> Buglink: https://issues.redhat.com/browse/RHEL-17369
> Signed-off-by: Stefan Hajnoczi 

With the patch I can no longer see any do_qmp_dispatch_bh() calls run in
vCPU threads.

I also did a bit of smoke testing with some other QMP and QGA commands
and didn't find any obvious breakage, so:

Tested-by: Fiona Ebner 

P.S.

Unfortunately, the patch does not solve the other issue I came across
back then [0] with snapshot_save_job_bh() being executed during a vCPU
thread's aio_poll(). See also [1].

I suppose this would need some other mechanism to solve or could it also
be scheduled to the iohandler AioContext? It's not directly related to
your patch of course, just mentioning it, because it's a similar theme.

[0]:
https://lore.kernel.org/qemu-devel/31757c45-695d-4408-468c-c2de560af...@proxmox.com/
[1]: https://gitlab.com/qemu-project/qemu/-/issues/2111

Best Regards,
Fiona




Re: [PATCH 3/3] monitor: only run coroutine commands in qemu_aio_context

2024-01-17 Thread Kevin Wolf
Am 16.01.2024 um 20:00 hat Stefan Hajnoczi geschrieben:
> monitor_qmp_dispatcher_co() runs in the iohandler AioContext that is not
> polled during nested event loops. The coroutine currently reschedules
> itself in the main loop's qemu_aio_context AioContext, which is polled
> during nested event loops. One known problem is that QMP device-add
> calls drain_call_rcu(), which temporarily drops the BQL, leading to all
> sorts of havoc like other vCPU threads re-entering device emulation code
> while another vCPU thread is waiting in device emulation code with
> aio_poll().
> 
> Paolo Bonzini suggested running non-coroutine QMP handlers in the
> iohandler AioContext. This avoids trouble with nested event loops. His
> original idea was to move coroutine rescheduling to
> monitor_qmp_dispatch(), but I resorted to moving it to qmp_dispatch()
> because we don't know if the QMP handler needs to run in coroutine
> context in monitor_qmp_dispatch(). monitor_qmp_dispatch() would have
> been nicer since it's associated with the monitor implementation and not
> as general as qmp_dispatch(), which is also used by qemu-ga.
> 
> A number of qemu-iotests need updated .out files because the order of
> QMP events vs QMP responses has changed.
> 
> Solves Issue #1933.
> 
> Fixes: 7bed89958bfbf40df9ca681cefbdca63abdde39d ("device_core: use 
> drain_call_rcu in in qmp_device_add")
> Buglink: https://bugzilla.redhat.com/show_bug.cgi?id=2215192
> Buglink: https://bugzilla.redhat.com/show_bug.cgi?id=2214985
> Buglink: https://issues.redhat.com/browse/RHEL-17369
> Signed-off-by: Stefan Hajnoczi 

Reviewed-by: Kevin Wolf 




[PATCH 3/3] monitor: only run coroutine commands in qemu_aio_context

2024-01-16 Thread Stefan Hajnoczi
monitor_qmp_dispatcher_co() runs in the iohandler AioContext that is not
polled during nested event loops. The coroutine currently reschedules
itself in the main loop's qemu_aio_context AioContext, which is polled
during nested event loops. One known problem is that QMP device-add
calls drain_call_rcu(), which temporarily drops the BQL, leading to all
sorts of havoc like other vCPU threads re-entering device emulation code
while another vCPU thread is waiting in device emulation code with
aio_poll().

Paolo Bonzini suggested running non-coroutine QMP handlers in the
iohandler AioContext. This avoids trouble with nested event loops. His
original idea was to move coroutine rescheduling to
monitor_qmp_dispatch(), but I resorted to moving it to qmp_dispatch()
because we don't know if the QMP handler needs to run in coroutine
context in monitor_qmp_dispatch(). monitor_qmp_dispatch() would have
been nicer since it's associated with the monitor implementation and not
as general as qmp_dispatch(), which is also used by qemu-ga.

A number of qemu-iotests need updated .out files because the order of
QMP events vs QMP responses has changed.

Solves Issue #1933.

Fixes: 7bed89958bfbf40df9ca681cefbdca63abdde39d ("device_core: use 
drain_call_rcu in in qmp_device_add")
Buglink: https://bugzilla.redhat.com/show_bug.cgi?id=2215192
Buglink: https://bugzilla.redhat.com/show_bug.cgi?id=2214985
Buglink: https://issues.redhat.com/browse/RHEL-17369
Signed-off-by: Stefan Hajnoczi 
---
 monitor/qmp.c | 17 
 qapi/qmp-dispatch.c   | 24 +-
 tests/qemu-iotests/060.out|  4 +-
 tests/qemu-iotests/071.out|  4 +-
 tests/qemu-iotests/081.out| 16 ++--
 tests/qemu-iotests/087.out| 12 +--
 tests/qemu-iotests/108.out|  2 +-
 tests/qemu-iotests/109|  4 +-
 tests/qemu-iotests/109.out| 78 ---
 tests/qemu-iotests/117.out|  2 +-
 tests/qemu-iotests/120.out|  2 +-
 tests/qemu-iotests/127.out|  2 +-
 tests/qemu-iotests/140.out|  2 +-
 tests/qemu-iotests/143.out|  2 +-
 tests/qemu-iotests/156.out|  2 +-
 tests/qemu-iotests/176.out| 16 ++--
 tests/qemu-iotests/182.out|  2 +-
 tests/qemu-iotests/183.out|  4 +-
 tests/qemu-iotests/184.out| 32 
 tests/qemu-iotests/185|  6 +-
 tests/qemu-iotests/185.out| 45 +--
 tests/qemu-iotests/191.out| 16 ++--
 tests/qemu-iotests/195.out| 16 ++--
 tests/qemu-iotests/223.out| 16 ++--
 tests/qemu-iotests/227.out| 32 
 tests/qemu-iotests/247.out|  2 +-
 tests/qemu-iotests/273.out|  8 +-
 tests/qemu-iotests/308|  4 +-
 tests/qemu-iotests/308.out|  4 +-
 tests/qemu-iotests/tests/file-io-error|  5 +-
 tests/qemu-iotests/tests/iothreads-resize.out |  2 +-
 tests/qemu-iotests/tests/qsd-jobs.out |  4 +-
 32 files changed, 207 insertions(+), 180 deletions(-)

diff --git a/monitor/qmp.c b/monitor/qmp.c
index 6eee450fe4..a239945e8d 100644
--- a/monitor/qmp.c
+++ b/monitor/qmp.c
@@ -321,14 +321,6 @@ void coroutine_fn monitor_qmp_dispatcher_co(void *data)
 qemu_coroutine_yield();
 }
 
-/*
- * Move the coroutine from iohandler_ctx to qemu_aio_context for
- * executing the command handler so that it can make progress if it
- * involves an AIO_WAIT_WHILE().
- */
-aio_co_schedule(qemu_get_aio_context(), qmp_dispatcher_co);
-qemu_coroutine_yield();
-
 /* Process request */
 if (req_obj->req) {
 if (trace_event_get_state(TRACE_MONITOR_QMP_CMD_IN_BAND)) {
@@ -355,15 +347,6 @@ void coroutine_fn monitor_qmp_dispatcher_co(void *data)
 }
 
 qmp_request_free(req_obj);
-
-/*
- * Yield and reschedule so the main loop stays responsive.
- *
- * Move back to iohandler_ctx so that nested event loops for
- * qemu_aio_context don't start new monitor commands.
- */
-aio_co_schedule(iohandler_get_aio_context(), qmp_dispatcher_co);
-qemu_coroutine_yield();
 }
 qatomic_set(_dispatcher_co, NULL);
 }
diff --git a/qapi/qmp-dispatch.c b/qapi/qmp-dispatch.c
index 28b6bb..176b549473 100644
--- a/qapi/qmp-dispatch.c
+++ b/qapi/qmp-dispatch.c
@@ -206,9 +206,31 @@ QDict *coroutine_mixed_fn qmp_dispatch(const 
QmpCommandList *cmds, QObject *requ
 assert(!(oob && qemu_in_coroutine()));
 assert(monitor_cur() == NULL);
 if (!!(cmd->options & QCO_COROUTINE) == qemu_in_coroutine()) {
+if