Yonik:

Good discussion. I'm not wedded to a particular solution; it's just
that the current direction is not sustainable.

I'll back up a bit and see if I can state my goals more clearly; it
looks like we're arguing for much the same thing.

> I want e-mail messages with test failures to be worth looking at. When I see 
> a test fail, I don't want to waste time trying to figure out whether 
> it's something newly introduced or not. I also want a less painful way to 
> say "this change broke tests" rather than "this change may or may not have 
> broken tests. Could somebody beast the old and new versions 100 times and 
> hope that's enough to make a determination?". This looks like your (a).

> When I make a change, I want to be able to quickly determine whether my 
> changes are likely the cause of test failures or not. This looks like your (b). 
> If we annotate all flakey tests, that would be a significant help, since it 
> would be easy to glance at a test to see whether it's a known flakey test or 
> not. Armed with that knowledge, I can be more comfortable with having it 
> succeed a few times and chalking the failure up to flakiness.

> I want to stop the downward trend we've been experiencing lately with more 
> and more tests failing.

An annotation makes that possible, I think, although I'm not clear on
why @Flakey is superior to @BadApple. There are exactly three
BadApple annotations in the entire code base at present; is there
enough value in introducing another annotation to make it worthwhile?
Or could we just figure out whether any of the three tests that use
@BadApple should be changed to, say, @Ignore and then use @BadApple
for the rest? Perhaps we change the build system to enable BadApple by
default when running locally (or, conversely, enable BadApple on
Jenkins).
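
For concreteness, here's a minimal sketch of what marking a test with
the existing annotation looks like, going from memory: BadApple lives
in LuceneTestCase, takes a bugUrl, and is gated by a system property I
believe is tests.badapples. The test class and JIRA id below are
placeholders, so double-check the details before copying anything.

    // Hypothetical example: TestSomethingFlakey and SOLR-XXXXX are placeholders.
    // Assumes LuceneTestCase's BadApple annotation and the tests.badapples
    // system property behave the way I remember.
    import org.apache.lucene.util.LuceneTestCase;
    import org.apache.lucene.util.LuceneTestCase.BadApple;
    import org.junit.Test;

    public class TestSomethingFlakey extends LuceneTestCase {

      // Skipped unless the run sets -Dtests.badapples=true, so local runs
      // and Jenkins jobs can include or exclude it with a single property.
      @Test
      @BadApple(bugUrl = "https://issues.apache.org/jira/browse/SOLR-XXXXX")
      public void testOccasionallyFails() throws Exception {
        // the flakey assertions live here
      }
    }

If that's roughly right, "enable by default locally" vs. "enable on
Jenkins" comes down to which invocations pass -Dtests.badapples=true.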

Alternatively, would it be possible to turn off e-mail notifications of
failures for @Flakey (or @BadApple, whatever) tests? That would work
too, and it probably has the added advantage of letting some reporting
tools continue to function.
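
And if a separate @Flakey group really is wanted, my recollection is
that the test framework makes declaring one cheap; something like the
sketch below, which mirrors how I remember BadApple being declared.
The @TestGroup attribute names and the tests.flakey property are from
memory/made up, so verify before using any of it.

    // Hedged sketch of a dedicated Flakey test group, modeled on how I
    // remember BadApple being defined in LuceneTestCase. The attribute
    // names on @TestGroup are from memory; tests.flakey is invented.
    import java.lang.annotation.Documented;
    import java.lang.annotation.Inherited;
    import java.lang.annotation.Retention;
    import java.lang.annotation.RetentionPolicy;

    import com.carrotsearch.randomizedtesting.annotations.TestGroup;

    @Documented
    @Inherited
    @Retention(RetentionPolicy.RUNTIME)
    @TestGroup(enabled = false, sysProperty = "tests.flakey") // off unless -Dtests.flakey=true
    public @interface Flakey {
      String bugUrl(); // point at the JIRA tracking the flakiness
    }

That's the "don't run them on the mail-generating jobs" flavor rather
than literally suppressing the notifications, but the effect on the
noise should be the same: jobs that don't opt in never fail (and never
mail) on these tests.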

bq: And we can *always* decide to prevent new flakey tests, regardless
of what we do about the existing flakey tests...

We haven't been doing this, though; flakey tests have been
proliferating. Mark's tool hasn't been run since last August, unless
there's a newer URL than the one I'm looking at:
http://solr-tests.bitballoon.com/. I'm not so interested in what we
_could_ do as in what we _are_ doing. And even tools like this require
someone to monitor/complain/whimper, and I don't see volunteers
stepping forward. It's much easier to have a system where any failure
is unusual than to count on people to wade through voluminous output.

bq: Just because we are frustrated doesn't mean that *any* change is positive.

Of course not. But nobody else seems to be bringing the topic up, so I
thought I would.

On Wed, Feb 21, 2018 at 1:49 PM, Yonik Seeley <ysee...@gmail.com> wrote:
> On Wed, Feb 21, 2018 at 3:26 PM, Erick Erickson <erickerick...@gmail.com> 
> wrote:
>> Yonik:
>>
>> What I'm frustrated by now is that variations on these themes haven't
>> cured the problem, and it's spun out of control and is getting worse.
>
> I understand, but what problem(s) are you trying to solve?  Just
> because we are frustrated doesn't mean that *any* change is positive.
> Some changes can have a definite negative effect on software quality.
>
> You didn't respond to the main thrust of my message, so let me try to
> explain it again more succinctly:
>
> Flakey Test Problems:
> a) Flakey tests create so much noise that people no longer pay
> attention to the automated reporting via email.
> b) When running unit tests manually before a commit (i.e. "ant test")
> a flakey test can fail.
>
> Solutions:
> We could fix (a) by marking tests as flakey and having a new target
> "non-flakey" that is run by the jenkins jobs that are currently run
> continuously.
> For (b) "ant test" should still include the flakey tests since it's
> better to have to re-run a seemingly unrelated test to determine if
> one broke something rather than increase committed bugs due to loss of
> test coverage.  It's a pain, but perhaps it should be.  It's a real
> problem that needs fixing and @Ignoring it won't work as a better
> mechanism to get it fixed.  Sweeping it under the rug would seem to
> ensure that it gets less attention.
>
> And we can *always* decide to prevent new flakey tests, regardless of
> what we do about the existing flakey tests.  Mark's tool is a good way
> to see what the current list of flakey tests is.
>
> -Yonik

