That's pretty awesome. And what nice documentation!

Clicking through https://github.com/apache/airflow/issues/10118 and
https://github.com/apache/airflow/pull/10768 it looks like the actual
quarantining / unquarantining is manual, yes? So we could reach this level
with JUnit categories for Java anyhow. We would just want a good way to get
test-level history to review, which I think the Jenkins Build History
plugin now gives us. It would be great to have automation to let us know
when a test becomes stable or flaky.
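
As a rough illustration of the kind of automation meant here, the sketch below classifies a test from its recent pass/fail history and reports tests whose status changed. It assumes the per-test history has already been exported as a list of booleans (True = passed). The names `classify` and `status_changes`, the status labels, and the 10-run window are all hypothetical and are not part of Jenkins, JUnit, or Airflow.

```python
from typing import Dict, List


def classify(history: List[bool], window: int = 10) -> str:
    """Classify a test from its most recent `window` results (True = passed).

    The window size and the labels are illustrative assumptions.
    """
    recent = history[-window:]
    if len(recent) < window:
        return "unknown"  # not enough runs to judge either way
    if all(recent):
        return "stable"   # every recent run passed
    if not any(recent):
        return "broken"   # every recent run failed
    return "flaky"        # a mix of passes and failures


def status_changes(
    previous: Dict[str, str],
    histories: Dict[str, List[bool]],
    window: int = 10,
) -> Dict[str, str]:
    """Return only the tests whose classification differs from `previous`."""
    changes = {}
    for name, history in histories.items():
        current = classify(history, window)
        if current != "unknown" and previous.get(name) != current:
            changes[name] = current
    return changes
```

A scheduled job could call `status_changes` with the statuses recorded on the previous run and post the result to the mailing list or to the tracking issue for each test.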

Kenn

On Tue, Mar 16, 2021 at 5:29 PM Tyson Hamilton <[email protected]> wrote:

> The Apache Airflow project has some interesting automation around flaky
> tests. They annotate such tests as 'quarantined'. Quarantined tests
> still run (maybe even with retries?) but won't fail a test suite.
> Quarantined tests are run in a separate scheduled job; when they start
> passing, they are no longer quarantined. GitHub issues are updated with the
> status.
>
> [1]:
> https://github.com/apache/airflow/blob/master/CI.rst#scheduled-quarantined-builds
>
> On Tue, Mar 16, 2021 at 4:06 PM Kenneth Knowles <[email protected]> wrote:
>
>> I expect the suite to be perma-red, right? Because something or
>> other would be flaking at all times.
>>
>> Kenn
>>
>> On Tue, Mar 16, 2021 at 2:13 PM Alex Amato <[email protected]> wrote:
>>
>>> Is it possible to make the presubmit auto-retry all failed tests a few
>>> times (and maybe generate a report listing the flaky tests)?
>>> Then you don't need to disable/isolate the flaky tests.
>>>
>>> If this is not possible, or hard to set up, then manually moving them to
>>> a different suite sounds like a good idea.
>>>
>>> On Tue, Mar 16, 2021 at 2:11 PM Pablo Estrada <[email protected]>
>>> wrote:
>>>
>>>> Hi all,
>>>> In Beam, we sometimes hit the issue of having one or two test cases
>>>> that are particularly flaky, and we deactivate them.
>>>> This is completely reasonable to me, because we need to keep good
>>>> testing signal on our primary suites.
>>>> The danger of deactivating these tests is that, although we have good
>>>> practices to file JIRA issues to re-enable them, it is still easy for these
>>>> issues and tests to be forgotten.
>>>> Of course, ideally, the solution is "do not forget old deactivated
>>>> tests" - and we should adopt practices to ensure that.
>>>>
>>>> I think, to strengthen our practices, we can reinforce them with a
>>>> pragmatic choice: Instead of fully deactivating tests, we can make them run
>>>> in a separate suite of Flaky tests. Why would this help?
>>>>
>>>> - It would allow us to make sure that flaky tests continue to *be able
>>>> to run*.
>>>> - It would remind us that we have flaky tests that need fixing.
>>>> - It would allow us to experiment with fixes to these tests on the Flaky
>>>> suite, and once they're reliable, move them to the main suite.
>>>>
>>>> Does this make sense to others?
>>>> Best
>>>> -P.
>>>>
>>>
