On 1/31/19 3:37 AM, Cleber Rosa wrote:
> On 1/22/19 11:07 AM, Alex Bennée wrote:
>> Cleber Rosa <cr...@redhat.com> writes:
>>
>>> Just like the previous tests, boots a Linux kernel on a ppc64 target
>>> using the pseries machine.
>>
>> So running this on my rather slow SynQuacer I get:
>>
>> (04/16)
>> /home/alex/lsrc/qemu.git/tests/acceptance/boot_linux_console.py:BootLinuxConsole.test_ppc64_pseries:
>> INTERRUPTED: Test reported status but did not finish\nRunner error
>> occurred: Timeout reached\nOriginal status: ERROR\n{'name':
>> '04-/home/alex/lsrc/qemu.git/tests/acceptance/boot_linux_console.py:BootLinuxConsole.test_ppc64_pseries',
>> 'logdir': '/home/alex/lsrc/qemu.git/te... (60.93 s)
>>
>> which I'm guessing is a timeout occurring.
>>
>
> Yes, that's what's happening. It's hard to pinpoint, and control, where
> such a test slows down when running in a different environment. For
> this one execution, I do trust your assessment, and it's most certainly
> caused by your "slow SynQuacer" spending too much time running
> emulation code.
>
> But I'd like to mention that there are other possibilities. One is
> that you're hitting an "asset fetcher bug" that I recently fixed in
> Avocado [1] (fix to be available in 68.0, to be released next Monday,
> Feb 4th).
>
> Even with that bug fixed, I feel it's unfair for test code to spend
> its time waiting to download a file when it's not testing *the file
> download itself*. Because of that, there are plans to add an (optional)
> job pre-processing step that will make sure the needed assets are in
> the cache ahead of time [2][3].
>
>> I wonder if that means we should:
>>
>> a) set timeouts longer for when running on TCG

I hit the same problem with VM tests, and suggested a poor "increase
timeout" patch [1]. Then Peter sent a different patch [2] which happens
to inadvertently resolve my problem, since the longest a VM took to
boot on the Cavium ThunderX I have access to is 288 seconds, which is
just below the 300-second limit =) My understanding was that nobody
really cared about testing the x86 TCG backend this way, so I didn't
worry much.

[1] http://lists.nongnu.org/archive/html/qemu-devel/2018-07/msg03416.html
[2] http://lists.nongnu.org/archive/html/qemu-devel/2018-08/msg04977.html
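
Regarding (a): if we do bump timeouts, Avocado lets each instrumented
test declare its own budget via a "timeout" class attribute, so only
the TCG-heavy tests need to pay for it. A minimal sketch, assuming the
avocado_qemu.Test base class the acceptance tests already use (the
900-second figure is just an illustration, not a measured value):

# Illustrative only: a per-test timeout on an Avocado instrumented test.
from avocado_qemu import Test

class BootLinuxConsole(Test):
    """
    Boots a Linux kernel and checks the console output.
    """
    # Avocado interrupts the test if it runs longer than this many
    # seconds. 900 is a made-up, generous value for slow TCG hosts.
    timeout = 900

That at least keeps the knob next to the test that needs it, rather
than in a global runner setting.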
>> or
>> b) split tests into TCG and KVM tests and select KVM tests on
>> appropriate HW
>>
>
> I wonder the same, and I believe this falls into a similar scenario
> we've seen with the setup of console devices in the QEMUMachine class.
> I started by setting the device types defined at the framework level,
> and then reverted to the machine's default devices (using '-serial'),
> because the "default" behavior of QEMU is usually what a test writer
> wants when not setting something else explicitly.
>
>> The qemu.py code has (slightly flawed) logic for detecting KVM and
>> passing --enable-kvm. Maybe we should be doing that here?
>>
>
> I'm not sure. IMO, the common question is: should we magically (at a
> framework level) configure tests based on probed host environment
> characteristics? I feel we should attempt to minimize that for the
> sake of tests being more obvious and more easily reproducible.

I agree we shouldn't randomly test different features, but rather
explicitly add 2 tests (TCG/KVM), and if it is not possible to run a
test, mark it as SKIPPED. A user with KVM available would then have to
run --filter-out=tcg, or build QEMU with --disable-tcg. See the sketch
at the end of this mail.

> And because of that, I'd go, *initially*, with an approach more
> similar to your option "b".
>
> Having said that, we don't want to rewrite most tests just to be able
> to test with either KVM or TCG, if the tests are not explicitly
> testing KVM or TCG. At this point, using KVM or TCG is test/framework
> *configuration*, and in Avocado we hope to solve this by having the
> executed tests easily identifiable and reproducible (a test ID will
> contain information about the options passed, and a replay of the job
> will apply the same configuration).
>
> For now, I think the best approach is to increase the timeout, because
> it's much worse to have to deal with false negatives (longer execution
> times that don't really indicate a failure) than to have a test
> possibly take some more time to finish.
>
> And sorry for the extremely long answer!
> - Cleber.
>
> [1] - https://github.com/avocado-framework/avocado/pull/2996
> [2] -
> https://trello.com/c/WPd4FrIy/1479-add-support-to-specify-assets-in-test-docstring
> [3] - https://trello.com/c/CKP7YS6G/1481-on-cache-check-for-asset-fetcher
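
To make the two-test idea above concrete, here is a rough sketch,
assuming Avocado's skipUnless decorator and a plain /dev/kvm probe.
The kvm_usable() helper, the tag names and the boot logic are all made
up for illustration, not existing code:

import os

from avocado import skipUnless
from avocado_qemu import Test


def kvm_usable():
    # Crude probe: can we open /dev/kvm? A real helper should also
    # check that the target architecture matches the host's.
    return os.access('/dev/kvm', os.R_OK | os.W_OK)


class BootLinuxConsole(Test):

    def boot_and_check(self):
        # Shared kernel/console checking logic; details elided.
        self.vm.launch()

    def test_boot_tcg(self):
        """
        :avocado: tags=accel:tcg
        """
        self.vm.add_args('-accel', 'tcg')
        self.boot_and_check()

    @skipUnless(kvm_usable(), 'KVM not available on this host')
    def test_boot_kvm(self):
        """
        :avocado: tags=accel:kvm
        """
        self.vm.add_args('-enable-kvm')
        self.boot_and_check()

With something like that, a user who only cares about KVM could run
"avocado run --filter-by-tags=accel:kvm", which is roughly the
filtering I was hand-waving at with --filter-out above.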