Erick Erickson created SOLR-12016:
-------------------------------------

             Summary: Reduce noise from flakey tests
                 Key: SOLR-12016
                 URL: https://issues.apache.org/jira/browse/SOLR-12016
             Project: Solr
          Issue Type: Improvement
      Security Level: Public (Default Security Level. Issues are Public)
          Components: Tests
    Affects Versions: 7.2, master (8.0)
            Reporter: Erick Erickson
            Assignee: Erick Erickson


We had a discussion of this topic on the dev list; look for the thread titled 
"Test failures are out of control.....". I'll try to summarize that discussion 
here so we can move this JIRA forward. This may become an umbrella issue.

Current situation concerns:

> There is so much noise from flakey tests (particularly Solr tests) that they 
> are difficult to use.
> The number of tests that regularly fail is increasing.
> Failures are being ignored.
> The number of failing tests makes releasing more difficult.
> The number of failing tests makes it harder to determine whether recent 
> changes actually caused problems. Rerunning the tests until they succeed 
> is common practice at present, which is not robust.
> E-mail notifications of failing tests are largely being ignored.

Proposal:

> Mark all currently "flakey" tests as BadApple or AwaitsFix
> Run Jenkins jobs with BadApple (and/or AwaitsFix) enabled and disabled. 
> Frequency TBD, depends partly on whether we can label emails from these runs 
> for easy filtering of the two flavors.
>> Label these runs with something suitable in the subject line (wish list)
> Weekly reports on the tests labeled BadApple or AwaitsFix
>> Perhaps this could be incorporated in the reports linked below (wish list)
> Committers should enable BadApple (or AwaitsFix) regularly as a sanity check. 
> Leave these as defaults.
> We start getting _much_ more aggressive about not allowing _new_ flakey tests.

NOTE: It's perfectly acceptable to have failing flakey tests as long as someone 
is actively working on _fixing_ them.
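
As a rough illustration of the weekly report on annotated tests proposed 
above, here is a minimal sketch that scans Java sources for BadApple and 
AwaitsFix annotations and groups them by the issue named in bugUrl. It assumes 
the annotations are written as @BadApple(bugUrl="...") or 
@LuceneTestCase.AwaitsFix(bugUrl="..."); the function names, the regular 
expressions, and the SOLR-12016 URL in the usage note are placeholders for 
illustration, not an existing tool.

```python
import re
from collections import defaultdict
from pathlib import Path

# Assumed annotation shape: @BadApple(bugUrl="...") or
# @LuceneTestCase.AwaitsFix(bugUrl = "..."). Other shapes are not matched.
ANNOTATION = re.compile(
    r'@(?:LuceneTestCase\.)?(BadApple|AwaitsFix)\s*\(\s*bugUrl\s*=\s*"([^"]*)"'
)
# JIRA-style key such as SOLR-12016 or LUCENE-3869.
ISSUE_KEY = re.compile(r'\b([A-Z]+-\d+)\b')


def find_annotations(source):
    """Return (annotation, issue) pairs found in one Java source string."""
    found = []
    for name, bug_url in ANNOTATION.findall(source):
        key = ISSUE_KEY.search(bug_url)
        # Fall back to the raw bugUrl when no JIRA key can be extracted.
        found.append((name, key.group(1) if key else bug_url))
    return found


def report(root):
    """Map (annotation, issue) to the list of files under root that use it."""
    by_issue = defaultdict(list)
    for path in sorted(Path(root).rglob("*.java")):
        text = path.read_text(encoding="utf-8", errors="replace")
        for name, issue in find_annotations(text):
            by_issue[(name, issue)].append(str(path))
    return dict(by_issue)
```

Calling report() on a checkout directory would return a dictionary such as 
{("BadApple", "SOLR-12016"): ["…/FooTest.java"]}, which could be formatted 
into a weekly e-mail or merged into an existing failure report.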

Concerns with solution:

> Decreases test coverage
> Decreases visibility of flakey tests, making fixing them less likely.
> Some tools (see below) that report on bad tests will not see tests that are 
> annotated with BadApple or AwaitsFix.
> Running unit tests and reporting errors are being conflated

To be decided:

> Can we label e-mails with failing tests with something in the subject line 
> identifying whether they were run with BadApple/AwaitsFix enabled or 
> disabled? Can someone volunteer?
> Is there any difference between BadApple and AwaitsFix? If not, should we 
> deprecate one? I propose we just use AwaitsFix and deprecate BadApple.
> Can the automated reports (see below) be enhanced to also report tests 
> labeled BadApple or AwaitsFix?
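
To make the subject-line question above concrete, here is one possible 
labeling scheme, sketched under assumptions: the tag format 
"[BadApple=on AwaitsFix=off]" and both function names are hypothetical, and 
nothing here reflects how the Jenkins jobs currently build their subjects.

```python
def subject_tag(badapple_enabled, awaitsfix_enabled):
    """Build a fixed-format tag that mail filters can match on exactly."""
    badapple = "on" if badapple_enabled else "off"
    awaitsfix = "on" if awaitsfix_enabled else "off"
    return f"[BadApple={badapple} AwaitsFix={awaitsfix}]"


def label_subject(subject, badapple_enabled, awaitsfix_enabled):
    """Prefix a notification subject with the run's annotation settings."""
    return f"{subject_tag(badapple_enabled, awaitsfix_enabled)} {subject}"
```

A fixed, always-present tag (rather than one that appears only when an 
annotation is enabled) lets a mail filter separate the two flavors of run 
with a single literal match.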

Useful tools:

> Steve Rowe's work on a Jenkins job to reproduce test failures (LUCENE-8106) 
> Hoss has worked on aggregating all test failures from the 3 Jenkins systems 
> (ASF, Policeman, and Steve's), downloading the test results & logs, and 
> running some reports/stats on failures.
>> http://fucit.org/solr-jenkins-reports/
>> https://github.com/hossman/jenkins-reports/
>> http://fucit.org/solr-jenkins-reports/failure-report.html

I've assigned this JIRA to myself, but all volunteers are welcome, especially 
for anything that changes the build system.

I've decided to make this a SOLR JIRA on the theory that most of the offending 
tests are in the Solr hive. Any sub-tasks for touching the build system can go 
under LUCENE if wanted.

Also, I expect to add the annotation to some more tests for a few days as 
infrequent failures occur. Once we have stability (defined by there being 
little noise) that'll stop.

3 BadApple and 23 AwaitsFix annotations are currently in the code, linked to 
these issues:
HADOOP-14044
HADOOP-9893
LUCENE-3869
LUCENE-5575
LUCENE-5595
LUCENE-5737
LUCENE-6709
LUCENE-7161
SOLR-2715
SOLR-6213
SOLR-6443
SOLR-6944
SOLR-7736
SOLR-9036
SOLR-10071
SOLR-10107
SOLR-10136
SOLR-10734
SOLR-10191
SOLR-11134
SOLR-11458
SOLR-11714
SOLR-11974

Solr JIRAs about bad tests:
SOLR-2175
SOLR-4147
SOLR-5880
SOLR-6423
SOLR-6944
SOLR-6961
SOLR-6974
SOLR-8122
SOLR-8182
SOLR-9869
SOLR-10053
SOLR-10070
SOLR-10071
SOLR-10139
SOLR-10287
SOLR-10815
SOLR-11911





