On Mon, Aug 08, 2022 at 01:57:17PM +0200, Thomas Huth wrote:
>
>  Hi!
>
> Seems like we're getting more timeouts in the CI pipelines since commit
> 2649a72555e ("Allow test to run without uffd") enabled the migration tests
> in more scenarios.
>
> For example:
>
>  https://gitlab.com/qemu-project/qemu/-/jobs/2821578332#L49
>
> You can see that the migration-test ran for more than 20 minutes for each
> target (x86 and aarch64)! I think that's way too much by default.

Definitely too much.

> I had a check whether there is one subtest taking a lot of time, but it
> rather seems like each of the migration test is taking 40 to 50 seconds in
> the CI:
>
>  https://gitlab.com/thuth/qemu/-/jobs/2825365836#L44

Normally with CI we expect a constant slowdown factor, e.g. x2. With
migration, though, I expect we're triggering behaviour whereby the guest
workload generates dirty pages faster than we can migrate them over
localhost. That balance can quickly tip to create an exponential slowdown.

> Given the fact that we're running more than 30 migration tests, this quickly
> sums up to 20 minutes and more.
>
> Could we maybe focus on running only the most important migration tests in
> quick mode, and only run the full suite under an "if (g_test_slow())"
> statement?

The GitLab shared runners in particular, I think, are going to impact the
migration tests, given that the runners are overcommitted, preemptible
instances. If we want reliability, we may need to restrict the migration
qtests to the private runners, since we have predictable compute resources
available on those.

I'm not sure 'g_test_slow' gives us enough granularity, though: if we
enable it, it'll affect the whole test suite, not just the migration tests.
I'm not sure what the best way to toggle this is.

With regards,
Daniel
-- 
|: https://berrange.com      -o-    https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org         -o-            https://fstop138.berrange.com :|
|: https://entangle-photo.org    -o-    https://www.instagram.com/dberrange :|
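
To illustrate the dirty-page point above, here is a rough sketch of a
pre-copy loop in Python. It is a deliberately simplified model, not QEMU's
actual migration code: the function name, the parameters, and all of the
numbers are hypothetical, and it assumes a constant dirty rate and a constant
bandwidth. Each round sends whatever is currently dirty, and the guest
dirties more memory while that round is in flight.

```python
def precopy_migration_time(ram_mb, bandwidth_mb_s, dirty_rate_mb_s,
                           downtime_limit_mb=1.0, max_rounds=1000):
    """Estimate seconds for a simplified pre-copy migration to converge.

    Hypothetical model: returns None if the remaining dirty memory never
    drops below downtime_limit_mb within max_rounds rounds.
    """
    remaining = float(ram_mb)  # the first round has to send all of RAM
    total = 0.0
    for _ in range(max_rounds):
        if remaining <= downtime_limit_mb:
            # Small enough to send during the final stop-and-copy phase.
            return total + remaining / bandwidth_mb_s
        round_time = remaining / bandwidth_mb_s
        total += round_time
        # Memory dirtied while this round was being sent; the guest cannot
        # dirty more than its total RAM.
        remaining = min(dirty_rate_mb_s * round_time, float(ram_mb))
    return None  # dirty rate keeps up with bandwidth: no convergence


if __name__ == "__main__":
    for dirty_rate in (0, 50, 90, 99, 100):
        print(dirty_rate, precopy_migration_time(1024, 100, dirty_rate))
```

In this model the remaining data shrinks by a factor of
dirty_rate / bandwidth per round, so the total time grows roughly like
1 / (1 - dirty_rate / bandwidth) and never converges once the ratio reaches
1. A runner that loses even a modest share of its CPU or memory bandwidth can
push a test from one side of that threshold to the other.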