Uwe Schindler updated SOLR-12016:
> Reduce noise from flakey tests
> Key: SOLR-12016
> URL: https://issues.apache.org/jira/browse/SOLR-12016
> Project: Solr
> Issue Type: Improvement
> Security Level: Public (Default Security Level. Issues are Public)
> Components: Tests
> Affects Versions: 7.2, master (8.0)
> Reporter: Erick Erickson
> Assignee: Erick Erickson
> Priority: Major
> Attachments: SOLR-12016-buildsystem.patch
> We had a discussion of this topic on the dev list, look for a thread titled:
> "Test failures are out of control.....". I'll try to summarize that
> discussion here and we can move this JIRA forward. This may become an
> umbrella issue.
> Current situation concerns:
> > There is so much noise from flakey tests (particularly Solr tests) that
> > the test results are difficult to use.
> > The number of tests that regularly fail is increasing
> > Failures are being ignored
> > The number of failing tests makes releasing more difficult.
> > The number of failing tests makes it harder to determine whether recent
> > changes actually caused problems. Rerunning the tests until they succeed
> > is common practice at present, which is not robust.
> > E-mail notifications of failing tests are largely being ignored.
> Proposed solution:
> > Mark all currently "flakey" tests as BadApple or AwaitsFix
> > Run Jenkins jobs with BadApple (and/or AwaitsFix) enabled and disabled.
> > Frequency TBD, depends partly on whether we can label emails from these
> > runs for easy filtering of the two flavors.
> >> Label these runs with something suitable in the subject line (wish list)
> > Weekly reports on the tests labeled BadApple or AwaitsFix
> >> Perhaps this could be incorporated in the reports linked below (wish list)
> > Committers should enable BadApple (or AwaitsFix) regularly as a sanity
> > check. Leave these as defaults.
> > We start getting _much_ more aggressive about not allowing _new_ flakey
> > tests.
> NOTE: It's perfectly acceptable to have failing flakey tests as long as
> someone is actively working on _fixing_ them.
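The proposal above amounts to tagging unreliable tests and letting each run decide whether tagged tests are included. A minimal, self-contained sketch of that idea follows. It does not use the Lucene/Solr test framework: the annotation types, the `SampleTests` class, the `selectTests` helper, and the two boolean switches are all hypothetical stand-ins defined here for illustration, and the `bugUrl` value is only a placeholder.

```java
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;
import java.lang.reflect.Method;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

public class FlakyTestFilter {

    // Hypothetical marker for a test that fails intermittently.
    @Retention(RetentionPolicy.RUNTIME)
    @Target(ElementType.METHOD)
    public @interface BadApple {
        String bugUrl();
    }

    // Hypothetical marker for a test that is known to be broken.
    @Retention(RetentionPolicy.RUNTIME)
    @Target(ElementType.METHOD)
    public @interface AwaitsFix {
        String bugUrl();
    }

    // Illustrative test class; the method bodies are intentionally empty.
    public static class SampleTests {
        public void testStable() {}

        @BadApple(bugUrl = "https://issues.apache.org/jira/browse/SOLR-12016")
        public void testSometimesFails() {}

        @AwaitsFix(bugUrl = "https://issues.apache.org/jira/browse/SOLR-12016")
        public void testKnownBroken() {}
    }

    // Returns the names of the test methods a run would execute, sorted by name.
    // Annotated tests are skipped unless the matching switch is enabled.
    public static List<String> selectTests(Class<?> testClass,
                                           boolean runBadApples,
                                           boolean runAwaitsFix) {
        List<String> selected = new ArrayList<>();
        for (Method m : testClass.getDeclaredMethods()) {
            if (!m.getName().startsWith("test")) {
                continue; // not a test method
            }
            if (m.isAnnotationPresent(BadApple.class) && !runBadApples) {
                continue;
            }
            if (m.isAnnotationPresent(AwaitsFix.class) && !runAwaitsFix) {
                continue;
            }
            selected.add(m.getName());
        }
        Collections.sort(selected); // reflection order is unspecified
        return selected;
    }

    public static void main(String[] args) {
        // Run with both switches off: only unannotated tests are selected.
        System.out.println(selectTests(SampleTests.class, false, false));
        // Sanity-check run with both switches on: every test is selected.
        System.out.println(selectTests(SampleTests.class, true, true));
    }
}
```

Running the two configurations side by side mirrors the suggestion of separate Jenkins jobs with the annotations enabled and disabled: the first gives a low-noise signal, the second keeps the tagged tests visible.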
> Concerns with solution:
> > Decreases test coverage
> > Decreases visibility of flakey tests, making fixing them less likely.
> > Some tools (see below) that report on bad tests will not see tests that are
> > annotated with BadApple or AwaitsFix.
> > Running unit tests and reporting errors are being conflated
> To be decided:
> > Can we label e-mails with failing tests with something in the subject line
> > identifying whether they were run with BadApple/AwaitsFix enabled or
> > disabled? Can someone volunteer?
> > Is there any difference between BadApple and AwaitsFix? If not should we
> > deprecate one? I propose we just use AwaitsFix and deprecate BadApple.
> > Can the automated reports (see below) be enhanced to also report tests
> > labeled BadApple or AwaitsFix?
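The subject-line question above could be approached by prefixing each notification with a tag derived from the run configuration, so that mail filters can separate the two flavors. The sketch below only illustrates that idea: the class name, the tag strings, and the example subject are assumptions, since the thread does not specify a format or which tool would apply it.

```java
public class RunSubjectLabel {

    // Hypothetical tag format; nothing in the discussion fixes these strings.
    public static String tagFor(boolean badApplesEnabled, boolean awaitsFixEnabled) {
        if (badApplesEnabled && awaitsFixEnabled) {
            return "[BadApple+AwaitsFix]";
        }
        if (badApplesEnabled) {
            return "[BadApple]";
        }
        if (awaitsFixEnabled) {
            return "[AwaitsFix]";
        }
        return "[stable-only]";
    }

    // Prepends the tag so filters can match on the start of the subject line.
    public static String label(String subject, boolean badApplesEnabled, boolean awaitsFixEnabled) {
        return tagFor(badApplesEnabled, awaitsFixEnabled) + " " + subject;
    }

    public static void main(String[] args) {
        System.out.println(label("Example build failure", false, false));
        System.out.println(label("Example build failure", true, true));
    }
}
```

A fixed prefix keeps filtering simple: a mail rule only needs to match the first bracketed token to route the noisy runs away from the default ones.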
> Useful tools:
> > Steve Rowe's work on a Jenkins job to reproduce test failures (LUCENE-8106)
> > Hoss has worked on aggregating all test failures from the 3 Jenkins systems
> > (ASF, Policeman, and Steve's), downloading the test results & logs, and
> > running some reports/stats on failures.
> >> http://fucit.org/solr-jenkins-reports/
> >> https://github.com/hossman/jenkins-reports/
> >> http://fucit.org/solr-jenkins-reports/failure-report.html
> I've assigned this JIRA to myself, but all volunteers are welcome,
> especially for anything that changes the build system.
> I've decided to make this a SOLR JIRA on the theory that most of the
> offending tests are in the Solr hive; any sub-tasks for touching the build
> system can go under LUCENE if wanted.
> Also, I expect to add the annotation to some more tests for a few days as
> infrequent failures occur. Once we have stability (defined by there being
> little noise) that'll stop.
> 3 BadApple and 23 AwaitsFix annotations are currently in the code, linked
> to these issues:
> Solr JIRAS about bad tests
This message was sent by Atlassian JIRA