On Wed, May 22, 2019 at 09:04:17PM -0400, Cleber Rosa wrote:
>
>
> ----- Original Message -----
> > From: "Eduardo Habkost" <ehabk...@redhat.com>
> > To: "Cleber Rosa" <cr...@redhat.com>
> > Cc: "Philippe Mathieu-Daudé" <f4...@amsat.org>, qemu-devel@nongnu.org,
> > "Aleksandar Rikalo" <arik...@wavecomp.com>,
> > "Aleksandar Markovic" <aleksandar.m.m...@gmail.com>,
> > "Aleksandar Markovic" <amarko...@wavecomp.com>,
> > "Aurelien Jarno" <aurel...@aurel32.net>,
> > "Wainer dos Santos Moschetta" <waine...@redhat.com>
> > Sent: Wednesday, May 22, 2019 7:07:05 PM
> > Subject: Re: [Qemu-devel] [PATCH 0/4] mips: Add more Avocado tests
> >
> > On Wed, May 22, 2019 at 05:46:06PM -0400, Cleber Rosa wrote:
> > >
> > >
> > > ----- Original Message -----
> > > > From: "Eduardo Habkost" <ehabk...@redhat.com>
> > > > To: "Philippe Mathieu-Daudé" <f4...@amsat.org>
> > > > Cc: qemu-devel@nongnu.org, "Aleksandar Rikalo" <arik...@wavecomp.com>,
> > > > "Aleksandar Markovic" <aleksandar.m.m...@gmail.com>,
> > > > "Aleksandar Markovic" <amarko...@wavecomp.com>,
> > > > "Cleber Rosa" <cr...@redhat.com>,
> > > > "Aurelien Jarno" <aurel...@aurel32.net>,
> > > > "Wainer dos Santos Moschetta" <waine...@redhat.com>
> > > > Sent: Wednesday, May 22, 2019 5:12:30 PM
> > > > Subject: Re: [Qemu-devel] [PATCH 0/4] mips: Add more Avocado tests
> > > >
> > > > On Tue, May 21, 2019 at 01:19:06AM +0200, Philippe Mathieu-Daudé wrote:
> > > > > Hi,
> > > > >
> > > > > It was a rainy week-end here, so I invested it to automate some
> > > > > of my MIPS tests.
> > > > >
> > > > > The BootLinuxSshTest is not Global warming friendly, it is not
> > > > > meant to run on a CI system but rather on a workstation before
> > > > > posting a pull request.
> > > > > It can surely be improved, but it is a good starting point.
> > > >
> > > > Until we actually have a mechanism to exclude the test case on
> > > > travis-ci, I will remove patch 4/4 from the queue.
> > > > Aleksandar, please don't merge patch 4/4 yet or it will break
> > > > travis-ci.
> > > >
> > > > Cleber, Wainer, is it already possible to make "avocado run" skip
> > > > tests tagged with "slow"?
> > > >
> > >
> > > The mechanism exists, but we haven't tagged any test so far as slow.
> > >
> > > Should we define/document criteria for a test to be slow?  Given
> > > that this is highly subjective, we have to think of:
> > >
> > > * Will we consider the average or maximum run time (the timeout
> > >   definition)?
> > >
> > > * For a single test, what is "slow"? Some rough numbers from Travis
> > >   CI[1] to help us with guidelines:
> > >   - boot_linux_console.py:BootLinuxConsole.test_x86_64_pc: PASS (6.04 s)
> > >   - boot_linux_console.py:BootLinuxConsole.test_arm_virt: PASS (2.91 s)
> > >   - linux_initrd.py:LinuxInitrd.test_with_2gib_file_should_work_with_linux_v4_16: PASS (18.14 s)
> > >   - boot_linux.py:BootLinuxAarch64.test_virt: PASS (396.88 s)
> >
> > I don't think we need to overthink this.  Whatever objective
> > criteria we choose, I'm sure we'll have to adapt them later due
> > to real world problems.
> >
> > e.g.: is 396 seconds too slow?  I don't know, it depends: does it
> > break Travis and other CI systems often because of timeouts?  If
> > yes, then we should probably tag it as slow.
> >
>
> It's not only that.  We're close to a point where we'll need to
> determine whether "make check-acceptance" will work as a generic
> enough default for most users on their environments and most CI
> systems.
>
> As an example, this job ran 5 fairly slow tests (which I'm preparing
> to send):
>
>   https://travis-ci.org/clebergnu/qemu/jobs/535967210#L3518
>
> Those are justifiably slow, given the fact that they boot a full
> Fedora 30 system using TCG.  The job has a cumulative execution time
> of ~39 minutes.  That leaves only 11 minutes to spare on the Travis
> CI environment.
> If they all exercised close to their 600s allowances (timeout), the
> Travis job would have failed.
>
> Having said that, if a CI failure is supposed to be a major breakage,
> which I believe is the right mindset and a worthy goal, we should
> limit the number of tests we run so that their *maximum* execution
> time does not exceed the maximum job time limit.
>
> > If having subjective criteria is really a problem (I don't think
> > it is), then we can call the tag "skip_travis", and stop worrying
> > about defining what exactly is "slow".
> >
> >
> > >
> > > * Do we want to set a maximum job timeout?  This way we can skip
> > >   tests after a given amount of time has passed.  Currently we
> > >   interrupt the running test when the job timeout is reached, but
> > >   it's possible to add an option so that no new tests will be
> > >   started, but currently running ones will be waited on.
> >
> > I'm not sure I understand the suggestion to skip tests.  If we
> > skip tests after a timeout, how would we differentiate a test
> > being expectedly slow from a QEMU hang?
> >
> > --
> > Eduardo
> >
>
> Basically, what I meant is that we could attempt something like:
>
> * Job "Brave"
>  - 50 tests, each with 60 seconds timeout = 50 min max
>  - 60 tests, each with 1 second timeout = 1 min max
>
> If Job "Brave" is run on a system such as Travis, it *can* fail,
> because it can go over the maximum Travis CI job limit of 50 min.
> We could set an Avocado job timeout of, say, 48 minutes, and tell
> Avocado to mark the tests it wasn't able to spawn as "SKIPPED",
> and not report an overall error condition.

Oh, that would be a nice feature.  But while we don't have it,
the following proposal would work too.

>
> But, if we want to be more conservative (which I now realize is
> the best mindset for this situation), we should stick to something
> like:
>
> * Job "Coward"
>  - 47 tests, each with 60 seconds timeout = 47 min max
>  - 60 tests, each with 1 second timeout = 1 min max
>
> So my proposal is that we should:
>
> * Give ample timeouts to tests (at least 2x their average
>   run time on Travis CI)
>
> * Define the standard job (make check-acceptance) as a set
>   of tests that can run under the Travis CI job (discounted
>   the average QEMU build time)

Agreed.

>
> This means that:
>
> * We'd tag some tests as "not-default", filtering them out
>   of "make check-acceptance"
>
> * Supposing a developer is using a machine at least as powerful
>   as the Travis CI environment, and assuming a build time of
>   10 minutes, his "make check-acceptance" maximum execution
>   time would be in the order of ~39 minutes.
>
> I can work on adding the missing Avocado features, such as
> the ability to list/count the maximum job time for the given test
> selection.  This should help us to maintain sound CI jobs, and good
> user experience.

Sounds good to me.

>
> And finally, I'm sorry that I did overthink this... but I know
> that the time for hard choices is coming fast.

The above proposals are cool; I don't think they are
overthinking.  I only meant that we shouldn't be looking for a
formal definition of "slow", because "what exactly is a slow
job?" isn't the important question we should be asking.  "How to
avoid timeouts on CI jobs" is the important question, and your
proposals above help us address that.

-- 
Eduardo
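
For reference, the worst-case arithmetic behind the "Brave" and "Coward"
jobs can be checked with a few lines of Python.  This is only a rough
sketch under stated assumptions: the helper names (worst_case_seconds,
fits) are hypothetical and are not part of Avocado or QEMU, and the
50 minute limit is simply the Travis CI figure quoted in the thread.

```python
# Hypothetical helpers for illustration; not part of Avocado or QEMU.

# Travis CI job limit as quoted in the thread (assumption: 50 minutes).
TRAVIS_JOB_LIMIT_S = 50 * 60


def worst_case_seconds(groups):
    """Return the sum of count * timeout over (count, timeout_s) pairs."""
    return sum(count * timeout_s for count, timeout_s in groups)


def fits(groups, limit_s=TRAVIS_JOB_LIMIT_S, build_s=0):
    """True if every test hitting its timeout still fits under limit_s."""
    return worst_case_seconds(groups) + build_s <= limit_s


# (number of tests, per-test timeout in seconds), as given in the thread.
brave = [(50, 60), (60, 1)]
coward = [(47, 60), (60, 1)]

print(worst_case_seconds(brave) / 60)   # 51.0 minutes
print(worst_case_seconds(coward) / 60)  # 48.0 minutes
print(fits(brave), fits(coward))        # False True
```

With these illustrative numbers, fits(coward, build_s=10 * 60) is False,
which is why the proposal discounts the QEMU build time from the budget
available to the tests themselves.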