In recent months we have been triaging high frequency (>=30 times/week) 
failures in automated tests.  We find that we are fixing 35% of the bugs and 
disabling the affected tests for 23% of them.

The great news is we are fixing many of the issues.  The sad news is we are 
disabling tests, but usually only after giving a bug 2+ weeks of time to get 
fixed.

In March, we want to find a way to disable the tests that are causing the most 
pain or are most likely not to be fixed, without unduly jeopardizing the chance 
that these bugs will be fixed.  We propose:
1) all high frequency (>=30/week) intermittent failure bugs will have 2 weeks 
from initial triage to get fixed, otherwise we will disable the test case.
2) all very high frequency bugs (>=75/week) will have 1 week from initial 
triage to get fixed, otherwise we will disable the test case.
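
As a rough illustration of the two rules above, here is a minimal Python 
sketch.  The function and constant names are hypothetical and are not part of 
any existing triage tooling; only the thresholds (30 and 75 failures/week) and 
the 2-week/1-week windows come from the proposal.

```python
from datetime import date, timedelta
from typing import Optional

# Thresholds from the proposal, in failures per week.
HIGH_FREQUENCY = 30
VERY_HIGH_FREQUENCY = 75


def disable_deadline(failures_per_week: int, triaged_on: date) -> Optional[date]:
    """Return the date after which the test may be disabled, or None.

    Hypothetical helper: None means the bug is below the high frequency
    threshold, so no disable deadline applies.
    """
    if failures_per_week >= VERY_HIGH_FREQUENCY:
        # Very high frequency: 1 week from initial triage.
        return triaged_on + timedelta(weeks=1)
    if failures_per_week >= HIGH_FREQUENCY:
        # High frequency: 2 weeks from initial triage.
        return triaged_on + timedelta(weeks=2)
    return None
```

For example, under these assumptions a bug triaged on 2017-03-07 with 80 
failures/week would be eligible for disabling on 2017-03-14, while one with 40 
failures/week would have until 2017-03-21.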

We still plan to only pester once/week.  If a test has fallen out of our 
definition of high frequency, we are happy to make a note in the bug and adjust 
expectations.

Since we are changing this, we expect a few more disabled tests, but we do not 
expect the balance of fixed vs disabled to shift.  We also expect our Orange 
Factor to be <7.0 by the end of the month.

Thanks to everyone for their on-going efforts to fix frequent intermittent test 
failures; together we can make test results more reliable and less confusing 
for everyone.

Here is a blog post with more data and information about the project:
https://elvis314.wordpress.com/2017/03/07/project-stockwell-reduce-intermittents-march-2017/
_______________________________________________
dev-platform mailing list
dev-platform@lists.mozilla.org
https://lists.mozilla.org/listinfo/dev-platform