> we already have a way to confirm flakiness on circle by running the test
> repeatedly N times. Like 100 or 500. That has proven to work very well
> so far, at least for me. #collaborating #justfyi

It does not prove that it is test flakiness. It can still be a bug in the
code which occurs intermittently under some rare conditions.

[A rough, illustrative sketch of this repeated-run approach is included
after the quoted thread below.]

- - -- --- ----- -------- -------------
Jacek Lewandowski


On Tue, Nov 2, 2021 at 7:46 AM Berenguer Blasi <berenguerbl...@gmail.com>
wrote:

> Hi,
>
> we already have a way to confirm flakiness on circle by running the test
> repeatedly N times. Like 100 or 500. That has proven to work very well
> so far, at least for me. #collaborating #justfyi
>
> On the 60+ failures it is not as bad as it looks. Let me explain. I have
> been tracking failures in 4.0 and trunk daily; it's grown into a habit
> of mine after the 4.0 push. And 4.0 and trunk were hovering solidly
> around <10 failures (you can check the jenkins ci graphs). The random
> bisect or fix was still needed, leaving behind the 3 or 4 tests that
> have already defeated 2 or 3 committers, so the really tough ones. I am
> reasonably convinced that once the 60+ failures fix merges we'll be back
> to <10 failures with relatively little effort.
>
> So we're just in the middle of a 'fix', but overall we shouldn't be as
> bad as it looks now, as we've been quite good at keeping CI green-ish
> imo.
>
> Also +1 to releasable branches, which, whatever we settle on, means it
> is not a wall of failures, because of the reasons explained, like the
> hidden costs etc.
>
> My 2cts.
>
> On 2/11/21 6:07, Jacek Lewandowski wrote:
> >> I don't think this means guaranteeing there are no failing tests
> >> (though ideally this would also happen), but about ensuring our best
> >> practices are followed for every merge. 4.0 took so long to release
> >> because of the amount of hidden work that was created by merging work
> >> that didn't meet the standard for release.
> >>
> > Tests are sometimes considered flaky because they fail intermittently,
> > but it may not be related to an insufficiently consistent test
> > implementation and can reveal some real problem in the production
> > code. I saw that in various codebases, and I think it would be great
> > if each such test (or test group) was guaranteed to have a ticket and
> > some preliminary analysis was done to confirm it is just a test
> > problem before releasing the new version.
> >
> >> Historically we have also had significant pressure to backport
> >> features to earlier versions due to the cost and risk of upgrading.
> >> If we maintain broader version compatibility for upgrade, and reduce
> >> the risk of adopting newer versions, then this pressure is also
> >> reduced significantly. Though perhaps we will stick to our guns here
> >> anyway, as there seems to be renewed pressure to limit work in GA
> >> releases to bug fixes exclusively. It remains to be seen if this
> >> holds.
> >
> > Are there any precise requirements for supported upgrade and downgrade
> > paths?
> >
> > Thanks
> > - - -- --- ----- -------- -------------
> > Jacek Lewandowski
> >
> >
> > On Sat, Oct 30, 2021 at 4:07 PM bened...@apache.org <bened...@apache.org>
> > wrote:
> >
> >>> How do we define what "releasable trunk" means?
> >> For me, the major criterion is ensuring that work is not merged that
> >> is known to require follow-up work, or could reasonably have been
> >> known to require follow-up work if better QA practices had been
> >> followed.
> >>
> >> So, a big part of this is ensuring we continue to exceed our targets
> >> for improved QA. For me this means trying to weave tools like Harry
> >> and the Simulator into our development workflow early on, but we'll
> >> see how well these tools gain broader adoption.
> >> This also means a focus in general on possible negative effects of a
> >> change.
> >>
> >> I think we could do with producing guidance documentation for how to
> >> approach QA, where we can record our best practices and evolve them
> >> as we discover flaws or pitfalls, either for ergonomics or for bug
> >> discovery.
> >>
> >>> What are the benefits of having a releasable trunk as defined here?
> >> If we want to have any hope of meeting reasonable release cadences
> >> _and_ the high project quality we expect today, then I think a
> >> ~shippable trunk policy is an absolute necessity.
> >>
> >> I don't think this means guaranteeing there are no failing tests
> >> (though ideally this would also happen), but about ensuring our best
> >> practices are followed for every merge. 4.0 took so long to release
> >> because of the amount of hidden work that was created by merging work
> >> that didn't meet the standard for release.
> >>
> >> Historically we have also had significant pressure to backport
> >> features to earlier versions due to the cost and risk of upgrading.
> >> If we maintain broader version compatibility for upgrade, and reduce
> >> the risk of adopting newer versions, then this pressure is also
> >> reduced significantly. Though perhaps we will stick to our guns here
> >> anyway, as there seems to be renewed pressure to limit work in GA
> >> releases to bug fixes exclusively. It remains to be seen if this
> >> holds.
> >>
> >>> What are the costs?
> >> I think the costs are quite low, perhaps even negative. Hidden work
> >> produced by merges that break things can be much more costly than
> >> getting the work right the first time, as attribution is much more
> >> challenging.
> >>
> >> One cost that is created, however, is for version compatibility, as
> >> we cannot say "well, this is a minor version bump so we don't need to
> >> support downgrade". But I think we should be investing in this anyway
> >> for operator simplicity and confidence, so I actually see this as a
> >> benefit as well.
> >>
> >>> Full disclosure: running face-first into 60+ failing tests on trunk
> >> I have to apologise here. CircleCI did not uncover these problems,
> >> apparently due to some way it resolves dependencies, and so I am
> >> responsible for a significant number of these and have been quite
> >> sick since.
> >>
> >> I think a push to eliminate flaky tests will probably help here in
> >> future, though, and perhaps the project needs to have some (low)
> >> threshold of flaky or failing tests at which point we block merges to
> >> force a correction.
> >>
> >>
> >> From: Joshua McKenzie <jmcken...@apache.org>
> >> Date: Saturday, 30 October 2021 at 14:00
> >> To: dev@cassandra.apache.org <dev@cassandra.apache.org>
> >> Subject: [DISCUSS] Releasable trunk and quality
> >> We as a project have gone back and forth on the topic of quality and
> >> the notion of a releasable trunk for quite a few years. If people are
> >> interested, I'd like to rekindle this discussion a bit and see if
> >> we're happy with where we are as a project or if we think there are
> >> steps we should take to change the quality bar going forward. The
> >> following questions have been rattling around for me for a while:
> >>
> >> 1. How do we define what "releasable trunk" means? All reviewed by M
> >> committers? Passing N% of tests? Passing all tests plus some other
> >> metrics (manual testing, raising the number of reviewers, test
> >> coverage, usage in dev or QA environments, etc)? Something else
> >> entirely?
> >>
> >> 2. With a definition settled upon in #1, what steps, if any, do we
> >> need to take to get from where we are to having *and keeping* that
> >> releasable trunk? Anything to codify there?
> >>
> >> 3. What are the benefits of having a releasable trunk as defined
> >> here? What are the costs? Is it worth pursuing? What are the
> >> alternatives (for instance: a freeze before a release + stabilization
> >> focus by the community, i.e. the 4.0 push or the tock in tick-tock)?
> >>
> >> Given the large volumes of work coming down the pike with CEPs, this
> >> seems like a good time to at least check in on this topic as a
> >> community.
> >>
> >> Full disclosure: running face-first into 60+ failing tests on trunk
> >> when going through the commit process for denylisting this week
> >> brought this topic back up for me (reminds me of when I went to merge
> >> CDC back in 3.6 and those test failures riled me up... I sense a
> >> pattern ;))
> >>
> >> Looking forward to hearing what people think.
> >>
> >> ~Josh
> >>
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: dev-unsubscr...@cassandra.apache.org
> For additional commands, e-mail: dev-h...@cassandra.apache.org
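
For reference, the "run it N times" approach Berenguer mentions at the top
of this thread can also be approximated locally with a small JUnit 4 rule
along the lines of the sketch below. This is only an illustration under
the assumption that the flaky candidate is an ordinary JUnit test; the
RepeatRule name and the repetition count are made up, and it is not meant
to describe how the CircleCI jobs themselves perform the repetition.

    // Illustrative only: a JUnit 4 rule that re-runs each wrapped test body
    // N times so an intermittent failure has many chances to surface.
    // The class and field names here are hypothetical.
    import org.junit.rules.TestRule;
    import org.junit.runner.Description;
    import org.junit.runners.model.Statement;

    public class RepeatRule implements TestRule
    {
        private final int times;

        public RepeatRule(int times)
        {
            this.times = times;
        }

        @Override
        public Statement apply(Statement base, Description description)
        {
            return new Statement()
            {
                @Override
                public void evaluate() throws Throwable
                {
                    // Re-run the wrapped test body; the first failure aborts the loop
                    for (int i = 0; i < times; i++)
                        base.evaluate();
                }
            };
        }
    }

A test class would then declare, for example,
`@Rule public RepeatRule repeat = new RepeatRule(500);` to run each of its
tests 500 times in a single invocation, which is roughly the local
equivalent of the 100 or 500 repeated runs described above.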