>
> we already have a way to confirm flakiness on circle by running the test
> repeatedly N times. Like 100 or 500. That has proven to work very well
> so far, at least for me. #collaborating #justfyi
>

It does not prove that it is test flakiness, though. It can still be a bug in
the code which occurs intermittently under some rare conditions.
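
To make the "run it N times" idea concrete, here is a minimal sketch (not the
project's actual CircleCI setup; the class name and repeat count are purely
illustrative) of a JUnit 4 rule that repeats a suspect test in-process:

    // Rough sketch: repeat each @Test method N times so that an intermittent
    // failure surfaces as a reproducible failure rate rather than a one-off.
    import org.junit.rules.TestRule;
    import org.junit.runner.Description;
    import org.junit.runners.model.Statement;

    public final class RepeatRule implements TestRule
    {
        private final int times;

        public RepeatRule(int times)
        {
            this.times = times;
        }

        @Override
        public Statement apply(Statement base, Description description)
        {
            return new Statement()
            {
                @Override
                public void evaluate() throws Throwable
                {
                    // The first failing iteration propagates and fails the test.
                    for (int i = 0; i < times; i++)
                        base.evaluate();
                }
            };
        }
    }

A suspect test class would then declare something like
"@Rule public final RepeatRule repeat = new RepeatRule(100);" so a single run
executes each test method 100 times. Of course, a product bug that only shows
up under rare conditions fails in exactly the same way, which is the concern
above.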


- - -- --- ----- -------- -------------
Jacek Lewandowski


On Tue, Nov 2, 2021 at 7:46 AM Berenguer Blasi <berenguerbl...@gmail.com>
wrote:

> Hi,
>
> we already have a way to confirm flakiness on circle by running the test
> repeatedly N times. Like 100 or 500. That has proven to work very well
> so far, at least for me. #collaborating #justfyi
>
> On the 60+ failures, it is not as bad as it looks. Let me explain. I have
> been tracking failures in 4.0 and trunk daily; it's grown into a habit of
> mine after the 4.0 push. Both 4.0 and trunk were hovering solidly around <10
> failures (you can check the Jenkins CI graphs). The occasional bisect or fix
> was needed, leaving behind 3 or 4 tests that have already defeated 2 or 3
> committers, so the really tough ones. I am reasonably convinced that once
> the fix for the 60+ failures merges, we'll be back to <10 failures with
> relatively little effort.
>
> So we're just in the middle of a 'fix', but overall we're not in as bad a
> shape as it looks right now, since we've been quite good at keeping CI
> green-ish imo.
>
> Also +1 to releasable branches: whatever definition we settle on, it should
> mean not having a wall of failures, for the reasons already explained, like
> the hidden costs etc.
>
> My 2cts.
>
> On 2/11/21 6:07, Jacek Lewandowski wrote:
> >> I don’t think this means guaranteeing there are no failing tests (though
> >> ideally this would also happen), but rather about ensuring our best
> >> practices are followed for every merge. 4.0 took so long to release because
> >> of the amount of hidden work that was created by merging work that didn’t
> >> meet the standard for release.
> >>
> > Tests are sometimes considered flaky because they fail intermittently, but
> > that may not be due to an insufficiently consistent test implementation; it
> > can also reveal a real problem in the production code. I have seen that in
> > various codebases, and I think it would be great if each such test (or test
> > group) was guaranteed to have a ticket, and some preliminary analysis was
> > done to confirm it is just a test problem, before releasing the new version.
> >
> >> Historically we have also had significant pressure to backport features to
> >> earlier versions due to the cost and risk of upgrading. If we maintain
> >> broader version compatibility for upgrade, and reduce the risk of adopting
> >> newer versions, then this pressure is also reduced significantly. Though
> >> perhaps we will stick to our guns here anyway, as there seems to be renewed
> >> pressure to limit work in GA releases to bug fixes exclusively. It remains
> >> to be seen if this holds.
> >
> > Are there any precise requirements for supported upgrade and downgrade
> > paths?
> >
> > Thanks
> > - - -- --- ----- -------- -------------
> > Jacek Lewandowski
> >
> >
> > On Sat, Oct 30, 2021 at 4:07 PM bened...@apache.org <bened...@apache.org>
> > wrote:
> >
> >>> How do we define what "releasable trunk" means?
> >> For me, the major criteria is ensuring that work is not merged that is
> >> known to require follow-up work, or could reasonably have been known to
> >> require follow-up work if better QA practices had been followed.
> >>
> >> So, a big part of this is ensuring we continue to exceed our targets for
> >> improved QA. For me this means trying to weave tools like Harry and the
> >> Simulator into our development workflow early on, but we’ll see how well
> >> these tools gain broader adoption. This also means a general focus on the
> >> possible negative effects of a change.
> >>
> >> I think we could do with producing guidance documentation for how to
> >> approach QA, where we can record our best practices and evolve them as we
> >> discover flaws or pitfalls, either for ergonomics or for bug discovery.
> >>
> >>> What are the benefits of having a releasable trunk as defined here?
> >> If we want to have any hope of meeting reasonable release cadences _and_
> >> the high project quality we expect today, then I think a ~shippable trunk
> >> policy is an absolute necessity.
> >>
> >> I don’t think this means guaranteeing there are no failing tests (though
> >> ideally this would also happen), but rather about ensuring our best
> >> practices are followed for every merge. 4.0 took so long to release because
> >> of the amount of hidden work that was created by merging work that didn’t
> >> meet the standard for release.
> >>
> >> Historically we have also had significant pressure to backport features to
> >> earlier versions due to the cost and risk of upgrading. If we maintain
> >> broader version compatibility for upgrade, and reduce the risk of adopting
> >> newer versions, then this pressure is also reduced significantly. Though
> >> perhaps we will stick to our guns here anyway, as there seems to be renewed
> >> pressure to limit work in GA releases to bug fixes exclusively. It remains
> >> to be seen if this holds.
> >>
> >>> What are the costs?
> >> I think the costs are quite low, perhaps even negative. Hidden work
> >> produced by merges that break things can be much more costly than getting
> >> the work right first time, as attribution is much more challenging.
> >>
> >> One cost that is created, however, is for version compatibility as we
> >> cannot say “well, this is a minor version bump so we don’t need to support
> >> downgrade”. But I think we should be investing in this anyway for operator
> >> simplicity and confidence, so I actually see this as a benefit as well.
> >>
> >>> Full disclosure: running face-first into 60+ failing tests on trunk
> >> I have to apologise here. CircleCI did not uncover these problems,
> >> apparently due to some way it resolves dependencies, and so I am
> >> responsible for a significant number of these and have been quite sick
> >> since.
> >>
> >> I think a push to eliminate flaky tests will probably help here in future,
> >> though, and perhaps the project needs to have some (low) threshold of flaky
> >> or failing tests at which point we block merges to force a correction.
> >>
> >>
> >> From: Joshua McKenzie <jmcken...@apache.org>
> >> Date: Saturday, 30 October 2021 at 14:00
> >> To: dev@cassandra.apache.org <dev@cassandra.apache.org>
> >> Subject: [DISCUSS] Releasable trunk and quality
> >> We as a project have gone back and forth on the topic of quality and the
> >> notion of a releasable trunk for quite a few years. If people are
> >> interested, I'd like to rekindle this discussion a bit and see if we're
> >> happy with where we are as a project or if we think there are steps we
> >> should take to change the quality bar going forward. The following
> >> questions have been rattling around for me for a while:
> >>
> >> 1. How do we define what "releasable trunk" means? All reviewed by M
> >> committers? Passing N% of tests? Passing all tests plus some other metrics
> >> (manual testing, raising the number of reviewers, test coverage, usage in
> >> dev or QA environments, etc)? Something else entirely?
> >>
> >> 2. With a definition settled upon in #1, what steps, if any, do we need to
> >> take to get from where we are to having *and keeping* that releasable
> >> trunk? Anything to codify there?
> >>
> >> 3. What are the benefits of having a releasable trunk as defined here? What
> >> are the costs? Is it worth pursuing? What are the alternatives (for
> >> instance: a freeze before a release + stabilization focus by the community
> >> i.e. 4.0 push or the tock in tick-tock)?
> >>
> >> Given the large volumes of work coming down the pike with CEP's, this seems
> >> like a good time to at least check in on this topic as a community.
> >>
> >> Full disclosure: running face-first into 60+ failing tests on trunk when
> >> going through the commit process for denylisting this week brought this
> >> topic back up for me (reminds me of when I went to merge CDC back in 3.6
> >> and those test failures riled me up... I sense a pattern ;))
> >>
> >> Looking forward to hearing what people think.
> >>
> >> ~Josh
> >>
>