Erick Erickson created SOLR-12037:
-------------------------------------

             Summary: Reduce noise from flakey tests
                 Key: SOLR-12037
                 URL: https://issues.apache.org/jira/browse/SOLR-12037
             Project: Solr
          Issue Type: Improvement
      Security Level: Public (Default Security Level. Issues are Public)
          Components: Tests
    Affects Versions: 7.2
            Reporter: Erick Erickson
            Assignee: Erick Erickson


Recreating SOLR-12016. Please do NOT delete this without discussion. NOTE: 
Uwe's build system modifications originally on 12016 have been incorporated 
into SOLR-12028.

Current situation concerns:
> There is so much noise from flakey tests (particularly Solr tests) that they 
> are difficult to use.
> The number of tests that regularly fail is increasing
> Failures are being ignored
> The number of failing tests makes releasing more difficult.
> The number of failing tests make it harder to determine whether recent 
> changes actually caused problems. Running the tests again until they succeed 
> is used commonly at present, which is not robust.
> e-mail notifications of failing tests are largely being ignored.
Propsal:
> Mark all currently "flakey" tests as BadApple or AwaitsFix
> Run Jenkins jobs with BadApple (and/or AwaitsFix) enabled and disabled. 
> Frequency TBD, depends partly on whether we can label emails from these runs 
> for easy filtering of the two flavors.
>> Label these runs with something suitable in the subject line (wish list)
> Weekly reports on the tests labeled BadApple or AwaitsFix
>> Perhaps this could be incorporated in the reports linked below (wish list)
> Committers should enable BadApple (or AwaitsFix) regularly as a sanity check. 
> Leave these as defaults.
> We start getting much more aggressive about not allowing new flakey tests.
NOTE: It's perfectly acceptable to have failing flakey tests as long as someone 
is activey working on fixing them.
Concerns with solution
> Decreases test coverage
> Decreases visibility of flakey tests, making fixing them less likely.
> Some tools (see below) that report on bad tests will not see tests that are 
> annotated with BadApple or AwaitsFix.
> Running unit tests and reporting errors are being conflated
To be decided:
> Can we label e-mails with failing tests with something in the subject line 
> identifying whether they were run with BadApple/Awaits fix enabled or 
> disabled? Can someone volunteer?
> Is there any difference between BadApple and AwaitsFix? If not should we 
> deprecate one? I propose we just use AwaitsFix and deprecate BadApple.
> Can the automated reports (see below) be enhanced to also report tests 
> labeled BadApple or AwaitsFix?
Useful tools:
> Steve Rowe's work on a Jenkins job to reproduce test failures (LUCENE-8106) 
> Hoss has worked on aggregating all test failures from the 3 Jenkins systems 
> (ASF, Policeman, and Steve's), downloading the test results & logs, and 
> running some reports/stats on failures.
>> http://fucit.org/solr-jenkins-reports/
>> https://github.com/hossman/jenkins-reports/
>> http://fucit.org/solr-jenkins-reports/failure-report.html
I've assigned this JIRA to myslef, but all volunteers welcome, especially 
anything that changes the build system.....
I've decided to make this a SOLR jira on the theory that most of the offending 
tests are in the Solr hive, any sub-tasks for touching the build system can go 
under LUCENE if wanted.
Also, I expect to add the annotation to some more tests for a few days as 
infrequent failures occur. Once we have stability (defined by there being 
little noise) that'll stop.
3 BadApple 23 AwaitsFix annotations are currently in the code, linked to these 
issues:

HADOOP-9893
LUCENE-3869
LUCENE-5575")
LUCENE-5595
LUCENE-5737
LUCENE-6709
LUCENE-7161
SOLR-2715
SOLR-6213
SOLR-6443
SOLR-6944
SOLR-10071
SOLR-10136
SOLR-10734
SOLR-11974
Solr JIRAS about bad tests
SOLR-2175
SOLR-4147
SOLR-5880
SOLR-6423
SOLR-6944
SOLR-6961
SOLR-6974
SOLR-8122
SOLR-8182
SOLR-9869
SOLR-10053
SOLR-10070
SOLR-10071
SOLR-10139
SOLR-10287

SOLR-11911




--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

Reply via email to