It stands out to me that Google a) does pre-submit test runs, b) has a
dedicated team to identify and work with flaky tests, c) has some kind of
quarantining, and d) doesn't report a failure for a test marked as flaky
unless it fails 3x in a row. While we obviously can't pursue b since we're
volunteer OSS, and a isn't really tractable with the current build infra,
c and d seem like they could be within reach.
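
For d, here's a rough sketch of what that could look like on the JUnit
side. FlakyRetryRule is an invented name, nothing that exists in-tree; it's
just a plain JUnit 4 TestRule that re-runs a quarantined test and only
surfaces a failure after three consecutive failed attempts:

import org.junit.rules.TestRule;
import org.junit.runner.Description;
import org.junit.runners.model.Statement;

// Sketch only: re-run a quarantined test up to maxAttempts times and only
// surface the failure if every attempt fails.
public class FlakyRetryRule implements TestRule
{
    private final int maxAttempts;

    public FlakyRetryRule(int maxAttempts)
    {
        this.maxAttempts = maxAttempts;
    }

    @Override
    public Statement apply(final Statement base, Description description)
    {
        return new Statement()
        {
            @Override
            public void evaluate() throws Throwable
            {
                Throwable lastFailure = null;
                for (int attempt = 1; attempt <= maxAttempts; attempt++)
                {
                    try
                    {
                        base.evaluate();
                        return; // any single passing attempt counts as a pass
                    }
                    catch (Throwable t)
                    {
                        lastFailure = t; // remember the failure and retry
                    }
                }
                throw lastFailure; // only reported after maxAttempts consecutive failures
            }
        };
    }
}

A test we've marked as flaky would declare something like
"@Rule public final FlakyRetryRule retry = new FlakyRetryRule(3);" and
would only go red on the board if all three attempts fail.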

We've talked previously about quarantining flaky failures with an
annotation and then slowly re-integrating them as we fixed them. At the
time, the general concern was that we could mislabel a genuine defect as
flaky, and/or that quarantined tests would stay in that state in
perpetuity. In retrospect, I've come around to thinking it's worth the
risk of some test atrophy if we collectively fail to have discipline: a
flaky annotation for quarantining, with those failures treated as
low-hanging fruit, gives people an easy way to get involved in the
project and reduces the noise on new work.
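
To make the quarantine piece concrete, here's a strawman (FlakyTest,
CompactionRaceTest and StableSuite are invented names, purely for
illustration): a JUnit 4 category that the vote-blocking suite excludes,
while a separate non-blocking job keeps running the quarantined set and
feeds the low-hanging-fruit list.

// FlakyTest.java - marker category for quarantined tests; each usage
// should reference the JIRA that tracks de-flaking it.
public interface FlakyTest {}

// CompactionRaceTest.java - a hypothetical quarantined test.
import org.junit.Test;
import org.junit.experimental.categories.Category;

public class CompactionRaceTest
{
    @Test
    @Category(FlakyTest.class) // quarantined until the underlying race is fixed
    public void testRacyCompactionBehaviour()
    {
        // timing-sensitive assertions would live here
    }
}

// StableSuite.java - the vote-blocking run excludes anything marked flaky;
// a separate, non-blocking job can still run the full set.
import org.junit.experimental.categories.Categories;
import org.junit.runner.RunWith;
import org.junit.runners.Suite;

@RunWith(Categories.class)
@Categories.ExcludeCategory(FlakyTest.class)
@Suite.SuiteClasses({ CompactionRaceTest.class })
public class StableSuite {}

Actually wiring that into build.xml and the dtest runners would obviously
look different; this is just the shape of the idea.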

In general it seems like a multi-pronged approach is the only realistic way
to keep this problem under control.

On Fri, Feb 16, 2018 at 8:12 AM, Jason Brown <jasedbr...@gmail.com> wrote:

> Hi,
>
> I'm ecstatic others are now running the tests and, more importantly, that
> we're having the conversation.
>
> I've become convinced we cannot always have 100% green tests. I am reminded
> of this [1] blog post from Google when thinking about flaky tests.
> The TL;DR is "flakiness happens", to the tune of about 1.5% of all tests
> across Google.
>
> I am in no way advocating that we simply turn a blind eye to broken or
> flaky tests, or shrug our shoulders and rubber stamp a vote, but rather
> that we accept flakiness when it reasonably applies.
> To achieve this, we might need to have a discussion at vote/release time
> (if not sooner) to triage flaky tests, but I see that as a good thing.
>
> Thanks,
>
> -Jason
>
> [1] https://testing.googleblog.com/2016/05/flaky-tests-at-google-and-how-we.html
>
>
>
> On Fri, Feb 16, 2018 at 12:47 AM, Dinesh Joshi <dinesh.jo...@yahoo.com.invalid> wrote:
>
> > I'm new to this project and here are my two cents.
> > If there are tests that are constantly failing or flaky and you have had
> > releases despite their failures, then they're not useful and can be
> > disabled. They can always be reenabled if they are in fact valuable.
> > Having a 100% blue dashboard is not an unrealistic ideal, IMHO. Hardware
> > failures are harder, but they can be addressed too.
> > I could pitch in to fix the noisy tests or just help in other ways to get
> > the dashboard to blue.
> > Dinesh
> > On Thursday, February 15, 2018, 1:14:33 PM PST, Josh McKenzie <jmcken...@apache.org> wrote:
> >
> > > We've said in the past that we don't release without green tests. The PMC
> > > gets to vote and enforce it. If you don't vote yes without seeing the test
> > > results, that enforces it.
> >
> > I think this is noble and ideal in theory. In practice, the tests take long
> > enough, hardware infra has proven flaky enough, and the tests *themselves*
> > are flaky enough, that there's been a consistent low level of test failure
> > noise that makes separating signal from noise in this context very time
> > consuming. Reference 3.11-test-all for example re: noise:
> > https://builds.apache.org/view/A-D/view/Cassandra/job/Cassandra-3.11-test-all/test/?width=1024&height=768
> >
> > Having spearheaded burning test failures down to 0 multiple times and
> > watched them regress over time, my gut intuition is that we should have
> > one person as our Source of Truth, with a less flaky source for
> > release-vetting CI (dedicated hardware, circle account, etc.) we can use
> > as a reference to vote on release SHAs.
> >
> > We’ve declared this a requirement multiple times
> >
> > Declaring things != changed behavior, and thus != changed culture. The
> > culture on this project is one of having a constant low level of test
> > failure noise in our CI as a product of our working processes. Unless we
> > change those (actually block release w/out green board, actually
> > aggressively block merge w/any failing tests, aggressively retroactively
> > track down test failures on a daily basis and RCA), the situation won't
> > improve. Given that this is a volunteer organization / project, that kind
> > of daily time investment is a big ask.
> >
> > On Thu, Feb 15, 2018 at 1:10 PM, Jeff Jirsa <jji...@gmail.com> wrote:
> >
> > > Moving this to its own thread:
> > >
> > > We've declared this a requirement multiple times and then we occasionally
> > > get a critical issue and have to decide whether it's worth the delay. I
> > > assume Jason's earlier -1 on attempt 1 was an enforcement of that earlier
> > > stated goal.
> > >
> > > It's up to the PMC. We've said in the past that we don't release without
> > > green tests. The PMC gets to vote and enforce it. If you don't vote yes
> > > without seeing the test results, that enforces it.
> > >
> > > --
> > > Jeff Jirsa
> > >
> > >
> > > > On Feb 15, 2018, at 9:49 AM, Josh McKenzie <jmcken...@apache.org> wrote:
> > > >
> > > > What would it take for us to get green utest/dtests as a blocking part
> > > > of the release process? i.e. "for any given SHA, here's a link to the
> > > > tests that passed" in the release vote email?
> > > >
> > > > That being said, +1.
> > > >
> > > >> On Wed, Feb 14, 2018 at 4:33 PM, Nate McCall <zznat...@gmail.com> wrote:
> > > >>
> > > >> +1
> > > >>
> > > >> On Thu, Feb 15, 2018 at 9:40 AM, Michael Shuler <mich...@pbandjelly.org> wrote:
> > > >>> I propose the following artifacts for release as 3.0.16.
> > > >>>
> > > >>> sha1: 890f319142ddd3cf2692ff45ff28e71001365e96
> > > >>> Git:
> > > >>> http://git-wip-us.apache.org/repos/asf?p=cassandra.git;a=shortlog;h=refs/tags/3.0.16-tentative
> > > >>> Artifacts:
> > > >>> https://repository.apache.org/content/repositories/orgapachecassandra-1157/org/apache/cassandra/apache-cassandra/3.0.16/
> > > >>> Staging repository:
> > > >>> https://repository.apache.org/content/repositories/orgapachecassandra-1157/
> > > >>>
> > > >>> Debian and RPM packages are available here:
> > > >>> http://people.apache.org/~mshuler
> > > >>>
> > > >>> *** This release addresses an important fix for CASSANDRA-14092 ***
> > > >>>    "Max ttl of 20 years will overflow localDeletionTime"
> > > >>>    https://issues.apache.org/jira/browse/CASSANDRA-14092
> > > >>>
> > > >>> The vote will be open for 72 hours (longer if needed).
> > > >>>
> > > >>> [1]: (CHANGES.txt) https://goo.gl/rLj59Z
> > > >>> [2]: (NEWS.txt) https://goo.gl/EkrT4G
> > > >>>
