Re: [PATCH 0/3] monitor: only run coroutine commands in qemu_aio_context

2024-02-02 Thread YangHang Liu
It's easily for me to encounter " ../block/qcow2.c:5263:
ImageInfoSpecific *qcow2_get_specific_info(BlockDriverState *, Error
**): Assertion `false' failed" issue during 1Q vhost-user interface +
RT VM + post-copy migration

After applying this patch, the issue is still not reproduced even if I
repeat the same migration test for 60 times.

Tested-by: Yanghang Liu 


Best Regards,
YangHang Liu

On Mon, Jan 29, 2024 at 7:39 PM Peter Maydell  wrote:
>
> On Tue, 16 Jan 2024 at 19:01, Stefan Hajnoczi  wrote:
> >
> > Several bugs have been reported related to how QMP commands are rescheduled 
> > in
> > qemu_aio_context:
> > - https://gitlab.com/qemu-project/qemu/-/issues/1933
> > - https://issues.redhat.com/browse/RHEL-17369
> > - https://bugzilla.redhat.com/show_bug.cgi?id=2215192
> > - https://bugzilla.redhat.com/show_bug.cgi?id=2214985
> >
> > The first instance of the bug interacted with drain_call_rcu() temporarily
> > dropping the BQL and resulted in vCPU threads entering device emulation code
> > simultaneously (something that should never happen). I set out to make
> > drain_call_rcu() safe to use in this environment, but Paolo and Kevin 
> > discussed
> > the possibility of avoiding rescheduling the monitor_qmp_dispatcher_co()
> > coroutine for non-coroutine commands. This would prevent monitor commands 
> > from
> > running during vCPU thread aio_poll() entirely and addresses the root cause.
> >
> > This patch series implements this idea. qemu-iotests is sensitive to the 
> > exact
> > order in which QMP events and responses are emitted. Running QMP handlers in
> > the iohandler AioContext causes some QMP events to be ordered differently 
> > than
> > before. It is therefore necessary to adjust the reference output in many 
> > test
> > cases. The actual QMP code change is small and everything else is just to 
> > make
> > qemu-iotests happy.
>
> Hi; we have a suspicion that this change has resulted in a flaky-CI
> test: iotest-144 sometimes fails, apparently because a "return"
> result from QMP isn't always returned at the same place in relation
> to other QMP events. Could you have a look at it?
>
> https://gitlab.com/qemu-project/qemu/-/issues/2126
>
> thanks
> -- PMM
>




Re: [PATCH 0/3] monitor: only run coroutine commands in qemu_aio_context

2024-01-29 Thread Peter Maydell
On Tue, 16 Jan 2024 at 19:01, Stefan Hajnoczi  wrote:
>
> Several bugs have been reported related to how QMP commands are rescheduled in
> qemu_aio_context:
> - https://gitlab.com/qemu-project/qemu/-/issues/1933
> - https://issues.redhat.com/browse/RHEL-17369
> - https://bugzilla.redhat.com/show_bug.cgi?id=2215192
> - https://bugzilla.redhat.com/show_bug.cgi?id=2214985
>
> The first instance of the bug interacted with drain_call_rcu() temporarily
> dropping the BQL and resulted in vCPU threads entering device emulation code
> simultaneously (something that should never happen). I set out to make
> drain_call_rcu() safe to use in this environment, but Paolo and Kevin 
> discussed
> the possibility of avoiding rescheduling the monitor_qmp_dispatcher_co()
> coroutine for non-coroutine commands. This would prevent monitor commands from
> running during vCPU thread aio_poll() entirely and addresses the root cause.
>
> This patch series implements this idea. qemu-iotests is sensitive to the exact
> order in which QMP events and responses are emitted. Running QMP handlers in
> the iohandler AioContext causes some QMP events to be ordered differently than
> before. It is therefore necessary to adjust the reference output in many test
> cases. The actual QMP code change is small and everything else is just to make
> qemu-iotests happy.

Hi; we have a suspicion that this change has resulted in a flaky-CI
test: iotest-144 sometimes fails, apparently because a "return"
result from QMP isn't always returned at the same place in relation
to other QMP events. Could you have a look at it?

https://gitlab.com/qemu-project/qemu/-/issues/2126

thanks
-- PMM



Re: [PATCH 0/3] monitor: only run coroutine commands in qemu_aio_context

2024-01-18 Thread Michael Tokarev

16.01.2024 22:00, Stefan Hajnoczi wrote:

Several bugs have been reported related to how QMP commands are rescheduled in
qemu_aio_context:
- https://gitlab.com/qemu-project/qemu/-/issues/1933
- https://issues.redhat.com/browse/RHEL-17369
- https://bugzilla.redhat.com/show_bug.cgi?id=2215192
- https://bugzilla.redhat.com/show_bug.cgi?id=2214985

The first instance of the bug interacted with drain_call_rcu() temporarily
dropping the BQL and resulted in vCPU threads entering device emulation code
simultaneously (something that should never happen). I set out to make
drain_call_rcu() safe to use in this environment, but Paolo and Kevin discussed
the possibility of avoiding rescheduling the monitor_qmp_dispatcher_co()
coroutine for non-coroutine commands. This would prevent monitor commands from
running during vCPU thread aio_poll() entirely and addresses the root cause.

This patch series implements this idea. qemu-iotests is sensitive to the exact
order in which QMP events and responses are emitted. Running QMP handlers in
the iohandler AioContext causes some QMP events to be ordered differently than
before. It is therefore necessary to adjust the reference output in many test
cases. The actual QMP code change is small and everything else is just to make
qemu-iotests happy.


This seems to be -stable material too, once it is applied to master.
It's difficult for me to catch the issue (#1933 @gitlab) locally, but I did
have some success there, and it always work after this patch is applied.
It also seem to work fine on 7.2, not only on 8.2, fwiw.

Thanks,

/mjt



[PATCH 0/3] monitor: only run coroutine commands in qemu_aio_context

2024-01-16 Thread Stefan Hajnoczi
Several bugs have been reported related to how QMP commands are rescheduled in
qemu_aio_context:
- https://gitlab.com/qemu-project/qemu/-/issues/1933
- https://issues.redhat.com/browse/RHEL-17369
- https://bugzilla.redhat.com/show_bug.cgi?id=2215192
- https://bugzilla.redhat.com/show_bug.cgi?id=2214985

The first instance of the bug interacted with drain_call_rcu() temporarily
dropping the BQL and resulted in vCPU threads entering device emulation code
simultaneously (something that should never happen). I set out to make
drain_call_rcu() safe to use in this environment, but Paolo and Kevin discussed
the possibility of avoiding rescheduling the monitor_qmp_dispatcher_co()
coroutine for non-coroutine commands. This would prevent monitor commands from
running during vCPU thread aio_poll() entirely and addresses the root cause.

This patch series implements this idea. qemu-iotests is sensitive to the exact
order in which QMP events and responses are emitted. Running QMP handlers in
the iohandler AioContext causes some QMP events to be ordered differently than
before. It is therefore necessary to adjust the reference output in many test
cases. The actual QMP code change is small and everything else is just to make
qemu-iotests happy.

If you have bugs related to the same issue, please retest them with these
patches. Thanks!

Stefan Hajnoczi (3):
  iotests: add filter_qmp_generated_node_ids()
  iotests: port 141 to Python for reliable QMP testing
  monitor: only run coroutine commands in qemu_aio_context

 monitor/qmp.c |  17 -
 qapi/qmp-dispatch.c   |  24 +-
 tests/qemu-iotests/060.out|   4 +-
 tests/qemu-iotests/071.out|   4 +-
 tests/qemu-iotests/081.out|  16 +-
 tests/qemu-iotests/087.out|  12 +-
 tests/qemu-iotests/108.out|   2 +-
 tests/qemu-iotests/109|   4 +-
 tests/qemu-iotests/109.out|  78 ++---
 tests/qemu-iotests/117.out|   2 +-
 tests/qemu-iotests/120.out|   2 +-
 tests/qemu-iotests/127.out|   2 +-
 tests/qemu-iotests/140.out|   2 +-
 tests/qemu-iotests/141| 297 +++---
 tests/qemu-iotests/141.out| 190 +++
 tests/qemu-iotests/143.out|   2 +-
 tests/qemu-iotests/156.out|   2 +-
 tests/qemu-iotests/176.out|  16 +-
 tests/qemu-iotests/182.out|   2 +-
 tests/qemu-iotests/183.out|   4 +-
 tests/qemu-iotests/184.out|  32 +-
 tests/qemu-iotests/185|   6 +-
 tests/qemu-iotests/185.out|  45 ++-
 tests/qemu-iotests/191.out|  16 +-
 tests/qemu-iotests/195.out|  16 +-
 tests/qemu-iotests/223.out|  16 +-
 tests/qemu-iotests/227.out|  32 +-
 tests/qemu-iotests/247.out|   2 +-
 tests/qemu-iotests/273.out|   8 +-
 tests/qemu-iotests/308|   4 +-
 tests/qemu-iotests/308.out|   4 +-
 tests/qemu-iotests/iotests.py |   7 +
 tests/qemu-iotests/tests/file-io-error|   5 +-
 tests/qemu-iotests/tests/iothreads-resize.out |   2 +-
 tests/qemu-iotests/tests/qsd-jobs.out |   4 +-
 35 files changed, 375 insertions(+), 506 deletions(-)

-- 
2.43.0