On 03/19/2018 05:10 PM, Stefan Hajnoczi wrote: > On Mon, Mar 19, 2018 at 9:35 AM, QingFeng Hao <[email protected]> > wrote: >> Test case 185 failed since commit 4486e89c219 --- "vl: introduce >> vm_shutdown()". >> It's because of the newly introduced function vm_shutdown calls >> bdrv_drain_all, >> which is called later by bdrv_close_all. bdrv_drain_all resumes the jobs >> that doubles the speed and offset is doubled. >> Some jobs' status are changed as well. >> >> Thus, let's not call bdrv_drain_all in vm_shutdown. >> >> Signed-off-by: QingFeng Hao <[email protected]> >> --- >> cpus.c | 5 +++-- >> 1 file changed, 3 insertions(+), 2 deletions(-) >> >> diff --git a/cpus.c b/cpus.c >> index 2e6701795b..ae2962508c 100644 >> --- a/cpus.c >> +++ b/cpus.c >> @@ -1006,8 +1006,9 @@ static int do_vm_stop(RunState state, bool send_stop) >> qapi_event_send_stop(&error_abort); >> } >> } >> - >> - bdrv_drain_all(); >> + if (send_stop) { >> + bdrv_drain_all(); >> + } > > Thanks for looking into this bug! > > This if statement breaks the contract of the function when send_stop > is false. Drain ensures that pending I/O completes and that device > emulation finishes before this function returns. This is important > for live migration, where there must be no more guest-related activity > after vm_stop(). > > This patch relies on the caller invoking bdrv_close_all() immediately > after do_vm_stop(), which is not documented and could lead to more > bugs in the future. > > I suggest a different solution: > > Test 185 relies on internal QEMU behavior which can change from time > to time. Buffer sizes and block job iteration counts are > implementation details that are not part of qapi-schema.json or other > documented behavior. > > In fact, the existing test case was already broken in IOThread mode > since iothread_stop_all() -> bdrv_set_aio_context() also includes a > bdrv_drain() with the side-effect of an extra blockjob iteration. > > After 4486e89c219 ("vl: introduce vm_shutdown()") both IOThread and > non-IOThread mode perform the drain. This has caused the test > failure. > > Instead of modifying QEMU to keep the same arbitrary internal behavior > as before, please send a patch that updates tests/qemu-iotests/185.out > with the latest output.
Wouldnt it be better if the test actually masks out the values the are not really part of the "agreed upon" behaviour? Wouldnt block_job_offset from common.filter be what we want?
