On 07.10.2020 12:57, Philippe Mathieu-Daudé wrote:
On 10/7/20 10:51 AM, Pavel Dovgalyuk wrote:
On 07.10.2020 11:23, Thomas Huth wrote:
On 07/10/2020 09.13, Philippe Mathieu-Daudé wrote:
On 10/7/20 7:20 AM, Philippe Mathieu-Daudé wrote:
On 10/7/20 1:07 AM, John Snow wrote:
I'm seeing this gitlab test fail quite often in my Python work; I don't
*think* this has anything to do with my patches, but maybe I need to try
and bisect this more aggressively.
[...]
w.r.t. the error in your build, I told Thomas about the
test_ppc_mac99/day15/invaders.elf test timing out, but he said this is
not his area. Richard looked into it yesterday to see if it is
a TCG regression, and said the test either finished or crashed, raising
SIGCHLD, but the Avocado parent is still waiting for a timeout, so the
child becomes a zombie and the test hangs.
Expected output:
Quiescing Open Firmware ...
Booting Linux via __start() @ 0x01000000 ...
But QEMU exits in replay_char_write_event_load():
Quiescing Open Firmware ...
qemu-system-ppc: Missing character write event in the replay log
$ echo $?
1
The latest events are CHECKPOINT CHECKPOINT INTERRUPT INTERRUPT INTERRUPT.
The replay file is ~22MiB. The recording was ended using
"system_powerdown + quit" in HMP.
I guess we have 2 bugs:
- replay log
- Avocado doesn't catch the child's exit(1)
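To illustrate the second bug, here is a minimal POSIX sketch of a parent that notices a child's exit(1) as soon as it happens instead of sitting until a job timeout. It is not how Avocado is implemented; the names toy_spawn_child and toy_wait_child are hypothetical and exist only for this illustration.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <time.h>
#include <unistd.h>

/* Hypothetical helper: fork a child that exits immediately with exit_code.
 * Returns the child's pid in the parent, or -1 if fork() failed. */
pid_t toy_spawn_child(int exit_code)
{
    pid_t pid = fork();
    if (pid == 0) {
        _exit(exit_code);
    }
    return pid;
}

/* Hypothetical helper: poll for the child's exit for up to timeout_ms.
 * Returns 0 if the child was reaped (*exit_code is filled in),
 * 1 if the timeout expired first, and -1 on a waitpid() error. */
int toy_wait_child(pid_t pid, int timeout_ms, int *exit_code)
{
    const struct timespec tick = { 0, 10 * 1000 * 1000 }; /* 10 ms */

    for (int waited = 0; waited <= timeout_ms; waited += 10) {
        int status;
        pid_t r = waitpid(pid, &status, WNOHANG);

        if (r == pid) {
            /* Reaping the child here is what prevents a zombie. */
            if (WIFEXITED(status)) {
                *exit_code = WEXITSTATUS(status);
            } else {
                /* Killed by a signal: use the shell's 128 + signo convention. */
                *exit_code = 128 + WTERMSIG(status);
            }
            return 0;
        }
        if (r < 0) {
            return -1;
        }
        nanosleep(&tick, NULL);
    }
    return 1;
}
```

With something like this in the parent, a child that exits with status 1 is reported within one polling interval, and the timeout only matters for a child that really hangs.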
Quick reproducer:
$ make qemu-system-ppc check-venv
$ tests/venv/bin/python -m \
avocado --show=app,console,replay \
run --job-timeout 300 -t machine:mac99 \
tests/acceptance/replay_kernel.py
Thanks, that was helpful. ... and the winner is:
commit 55adb3c45620c31f29978f209e2a44a08d34e2da
Author: John Snow <js...@redhat.com>
Date: Fri Jul 24 01:23:00 2020 -0400
Subject: ide: cancel pending callbacks on SRST
... starting with this commit, the test starts failing. John, any idea what
might be causing this?
This patch includes the following lines:
+ aio_bh_schedule_oneshot(qemu_get_aio_context(),
+ ide_bus_perform_srst, bus);
replay_bh_schedule_oneshot_event should be used instead of this
function, because it synchronizes non-deterministic BHs.
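In the patch itself this would presumably be a one-line change, calling replay_bh_schedule_oneshot_event() with the same three arguments (assuming it takes them in the same order as aio_bh_schedule_oneshot()). To show why the extra layer matters, here is a self-contained toy model, not QEMU code: every toy_* name is hypothetical. When recording, it logs the instruction count at which the BH ran; when replaying, it holds the BH back until the same instruction count is reached, so the guest sees the callback at the same point in both runs.

```c
#include <assert.h>
#include <stddef.h>

#define TOY_LOG_MAX 16

typedef void toy_bh_func(void *opaque);

enum toy_mode { TOY_MODE_NONE, TOY_MODE_RECORD, TOY_MODE_REPLAY };

/* Toy stand-in for the record/replay state; not a QEMU structure. */
struct toy_replay {
    enum toy_mode mode;
    unsigned long icount;            /* guest instructions executed so far */
    unsigned long log[TOY_LOG_MAX];  /* icount at which each BH ran */
    size_t log_len;                  /* number of valid log entries */
    size_t log_pos;                  /* next entry to consume when replaying */
    toy_bh_func *pending_cb;         /* the one-shot BH waiting to run */
    void *pending_opaque;
};

/* Queue a one-shot BH through the toy replay layer. */
void toy_bh_schedule_oneshot_event(struct toy_replay *r,
                                   toy_bh_func *cb, void *opaque)
{
    r->pending_cb = cb;
    r->pending_opaque = opaque;
}

/* Called by the "main loop" at arbitrary, non-deterministic times.
 * Returns 1 if the pending BH ran, 0 if nothing ran. */
int toy_run_pending(struct toy_replay *r)
{
    if (!r->pending_cb) {
        return 0;
    }
    if (r->mode == TOY_MODE_REPLAY) {
        /* Only run at the exact point that was recorded. */
        if (r->log_pos >= r->log_len || r->log[r->log_pos] != r->icount) {
            return 0;
        }
        r->log_pos++;
    } else if (r->mode == TOY_MODE_RECORD) {
        assert(r->log_len < TOY_LOG_MAX);
        r->log[r->log_len++] = r->icount;
    }

    toy_bh_func *cb = r->pending_cb;
    r->pending_cb = NULL;
    cb(r->pending_opaque);
    return 1;
}

/* Example callback: counts how many times it has run. */
void toy_mark_ran(void *opaque)
{
    ++*(int *)opaque;
}
```

A BH scheduled directly on the AioContext bypasses this kind of bookkeeping, so on replay it can run at a different point than it did during recording, and the log no longer matches what the guest does.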
Why do we have 2 different functions? BH are already complex
enough, and we need to also think about the replay API...
There is a note about it in docs/devel/replay.txt
It's hard to protect the guest state from incorrect operations.
There was a similar problem with icount: everyone who modifies the
translation modules needs to take it into account.
But now we have record/replay tests that assert that patches do not
break icount/rr.
What about the other cases, such as vhost-user (blk/net) and virtio-blk?
I do not know much about these modules.
The main idea is the following: if the code works with the guest state,
it should deal with icount and rr functions.
Pavel Dovgalyuk