Re: [Qemu-devel] [PATCH 0/4] mips: Add more Avocado tests

Eduardo Habkost Wed, 22 May 2019 19:04:50 -0700

On Wed, May 22, 2019 at 09:04:17PM -0400, Cleber Rosa wrote:
> 
> 
> ----- Original Message -----
> > From: "Eduardo Habkost" <ehabk...@redhat.com>
> > To: "Cleber Rosa" <cr...@redhat.com>
> > Cc: "Philippe Mathieu-Daudé" <f4...@amsat.org>, qemu-devel@nongnu.org, 
> > "Aleksandar Rikalo" <arik...@wavecomp.com>,
> > "Aleksandar Markovic" <aleksandar.m.m...@gmail.com>, "Aleksandar Markovic" 
> > <amarko...@wavecomp.com>, "Aurelien
> > Jarno" <aurel...@aurel32.net>, "Wainer dos Santos Moschetta" 
> > <waine...@redhat.com>
> > Sent: Wednesday, May 22, 2019 7:07:05 PM
> > Subject: Re: [Qemu-devel] [PATCH 0/4] mips: Add more Avocado tests
> > 
> > On Wed, May 22, 2019 at 05:46:06PM -0400, Cleber Rosa wrote:
> > > 
> > > 
> > > ----- Original Message -----
> > > > From: "Eduardo Habkost" <ehabk...@redhat.com>
> > > > To: "Philippe Mathieu-Daudé" <f4...@amsat.org>
> > > > Cc: qemu-devel@nongnu.org, "Aleksandar Rikalo" <arik...@wavecomp.com>,
> > > > "Aleksandar Markovic"
> > > > <aleksandar.m.m...@gmail.com>, "Aleksandar Markovic"
> > > > <amarko...@wavecomp.com>, "Cleber Rosa" <cr...@redhat.com>,
> > > > "Aurelien Jarno" <aurel...@aurel32.net>, "Wainer dos Santos Moschetta"
> > > > <waine...@redhat.com>
> > > > Sent: Wednesday, May 22, 2019 5:12:30 PM
> > > > Subject: Re: [Qemu-devel] [PATCH 0/4] mips: Add more Avocado tests
> > > > 
> > > > On Tue, May 21, 2019 at 01:19:06AM +0200, Philippe Mathieu-Daudé wrote:
> > > > > Hi,
> > > > > 
> > > > > It was a rainy weekend here, so I spent it automating some
> > > > > of my MIPS tests.
> > > > > 
> > > > > The BootLinuxSshTest is not global-warming friendly; it is not
> > > > > meant to run on a CI system, but rather on a workstation prior
> > > > > to posting a pull request.
> > > > > It can surely be improved, but it is a good starting point.
> > > > 
> > > > Until we actually have a mechanism to exclude the test case on
> > > > travis-ci, I will remove patch 4/4 from the queue.  Aleksandar,
> > > > please don't merge patch 4/4 yet or it will break travis-ci.
> > > > 
> > > > Cleber, Wainer, is it already possible to make "avocado run" skip
> > > > tests tagged with "slow"?
> > > > 
> > > 
> > > The mechanism exists, but we haven't tagged any test so far as slow.
> > > 
> > > Should we define/document criteria for a test to be slow?  Given
> > > that this is highly subjective, we have to think of:
> > > 
> > >  * Will we consider the average or maximum run time (the timeout
> > >    definition)?
> > >  
> > >  * For a single test, what is "slow"? Some rough numbers from Travis
> > >    CI[1] to help us with guidelines:
> > >    - boot_linux_console.py:BootLinuxConsole.test_x86_64_pc:  PASS (6.04 s)
> > >    - boot_linux_console.py:BootLinuxConsole.test_arm_virt:  PASS (2.91 s)
> > >    - linux_initrd.py:LinuxInitrd.test_with_2gib_file_should_work_with_linux_v4_16:
> > >      PASS (18.14 s)
> > >    - boot_linux.py:BootLinuxAarch64.test_virt:  PASS (396.88 s)
> > 
> > I don't think we need to overthink this.  Whatever objective
> > criteria we choose, I'm sure we'll have to adapt them later due
> > to real world problems.
> > 
> > e.g.: is 396 seconds too slow?  I don't know, it depends: does it
> > break Travis and other CI systems often because of timeouts?  If
> > yes, then we should probably tag it as slow.
> > 
> 
> It's not only that.  We're close to a point where we'll need to
> determine whether "make check-acceptance" will work as a generic
> enough default for most users in their environments and most CI
> systems.
> 
> As an example, this job ran 5 fairly slow tests (which I'm preparing
> to send):
> 
>   https://travis-ci.org/clebergnu/qemu/jobs/535967210#L3518
> 
> Those are justifiably slow, given the fact that they boot a full
> Fedora 30 system using TCG.  The job has a cumulative execution time
> of ~39 minutes.  That leaves only 11 minutes to spare on the Travis
> CI environment.  If they all exercised close to their 600s allowances
> (timeout), the Travis job would have failed. 
> 
> Having said that, if a CI failure is supposed to be a major breakage,
> which I believe is the right mindset and a worthy goal, we should
> limit the number of tests we run so that their *maximum* execution
> time does not exceed the maximum job time limit.
> 
> > If having subjective criteria is really a problem (I don't think
> > it is), then we can call the tag "skip_travis", and stop worrying
> > about defining what exactly is "slow".
> > 
> > 
> > > 
> > >  * Do we want to set a maximum job timeout?  This way we can skip
> > >    tests after a given amount of time has passed.  Currently we interrupt
> > >    the running test when the job timeout is reached, but it's possible
> > >    to add an option so that no new tests will be started, but currently
> > >    running ones will be waited on.
> > 
> > I'm not sure I understand the suggestion to skip tests.  If we
> > skip tests after a timeout, how would we differentiate a test
> > being expectedly slow from a QEMU hang?
> > 
> > --
> > Eduardo
> > 
> 
> Basically, what I meant is that we could attempt something like:
> 
>  * Job "Brave"
>   - 50 tests, each with 60 seconds timeout = 50 min max
>   - 60 tests, each with 1 second timeout  = 1 min max
> 
> If Job "Brave" is run on a system such as Travis, it *can* fail,
> because it can go over the maximum Travis CI job limit of 50 min.
> We could set an Avocado job timeout of, say, 48 minutes, and tell
> Avocado to mark the tests it wasn't able to spawn as "SKIPPED",
> and not report an overall error condition.


Oh, that would be a nice feature.  But while we don't have it,
the following proposal would work too.

> 
> But, if we want to be more conservative (which I now realize is
> the best mindset for this situation), we should stick to something
> like:
> 
>  * Job "Coward"
>   - 47 tests, each with 60 seconds timeout = 47 min max
>   - 60 tests, each with 1 second timeout  = 1 min max
> 
> So my proposal is that we should:
> 
>  * Give ample timeouts to tests (at least 2x their average
>    run time on Travis CI)
> 
>  * Define the standard job (make check-acceptance) as a set
>    of tests that can run within the Travis CI job limit (after
>    discounting the average QEMU build time)

Agreed.

> 
> This means that:
> 
>  * We'd tag some tests as "not-default", filtering them out
>    of "make check-acceptance"
> 
>  * Supposing a developer is using a machine at least as powerful
>    as the Travis CI environment, and assuming a build time of
>    10 minutes, their "make check-acceptance" maximum execution
>    time would be on the order of ~39 minutes.
> 
> I can work on adding the missing Avocado features, such as
> the ability to list/count the maximum job time for the given test
> selection. This should help us to maintain sound CI jobs, and good
> user experience.

Sounds good to me.
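
To make the "maximum job time for a given test selection" idea concrete,
here is a minimal sketch of the arithmetic, using the "Brave" and "Coward"
numbers quoted above.  The function and variable names are hypothetical
and are not part of Avocado; the 50-minute limit is the Travis CI figure
mentioned in this thread.

```python
# Sketch only: these names are hypothetical, not an Avocado API.

TRAVIS_JOB_LIMIT_S = 50 * 60  # Travis CI job limit quoted in this thread


def worst_case_seconds(groups):
    """Sum of (test count * per-test timeout) over all groups."""
    return sum(count * timeout_s for count, timeout_s in groups)


def fits(groups, limit_s=TRAVIS_JOB_LIMIT_S, build_s=0):
    """True if the worst case plus the build time stays within the limit."""
    return worst_case_seconds(groups) + build_s <= limit_s


# (number of tests, per-test timeout in seconds), as in the two example jobs
brave = [(50, 60), (60, 1)]
coward = [(47, 60), (60, 1)]

for name, job in (("Brave", brave), ("Coward", coward)):
    total = worst_case_seconds(job)
    print(f"{name}: {total} s worst case, fits: {fits(job)}")
```

Passing a non-zero build_s (e.g. 600 for the 10-minute build assumed
above) shrinks the budget accordingly, which is the "discounted build
time" part of the proposal.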

> 
> And finally, I'm sorry that I did overthink this... but I know
> that the time for hard choices is coming fast.

The above proposals are cool, I don't think they are
overthinking.

I only meant that we shouldn't be looking for a formal definition
of "slow", because "what exactly is a slow job?" isn't the
important question we should be asking.  "How to avoid timeouts
on CI jobs" is the important question, and your proposals above
help us address that.

-- 
Eduardo
