I think, given your revised statements around the bugs discovered with the 
flaky tests, and given that these don’t seem to have been serious bugs, I’m 
comfortable with a two week period post-RC2.


From: Benjamin Lerer <ble...@apache.org>
Date: Tuesday, 15 June 2021 at 12:41
To: dev@cassandra.apache.org <dev@cassandra.apache.org>
Subject: Re: Are we ready for 4.0.0 (GA) ?
>
> We do have to cut another RC given the seriousness of CASSANDRA-16735
> though, right?


I do not disagree with that. I would just like us to be more precise about
our expectations for releasing 4.0 GA, considering that we have already
deeply tested the code.

Would it make sense to say: "Let's give ourselves 1 or 2 weeks to test RC-2. If no
blocker shows up we can release 4.0 GA"?

On Tue, 15 Jun 2021 at 12:25, bened...@apache.org <bened...@apache.org>
wrote:

> That popularity line is a lot more stable than I would have expected,
> honestly, given the huge shifts in the database landscape in the
> intervening years. Though of course I’m sure we’d all rather it were
> trending upwards. I think the release of 4.0 is likely to have minimal
> impact on that, though – future project developments are going to determine
> the project’s success, I expect. Plus maybe a new logo 😊
>
> Still, not disputing the need to ship GA soon.  We do have to cut another
> RC given the seriousness of CASSANDRA-16735 though, right?
>
>
> From: Benjamin Lerer <b.le...@gmail.com>
> Date: Tuesday, 15 June 2021 at 11:14
> To: dev@cassandra.apache.org <dev@cassandra.apache.org>
> Subject: Re: Are we ready for 4.0.0 (GA) ?
> As the list of flaky tests was filtered out, I wanted to add some
> information about the tests that revealed real issues. First, there was a
> mistake: only 3 of the issues were revealed by flaky tests. The other one
> was a user report.
> Of the 3 remaining tickets, only 2 were 4.0 bugs: CASSANDRA-16238
> <https://issues.apache.org/jira/browse/CASSANDRA-16238> and
> CASSANDRA-16668
> <https://issues.apache.org/jira/browse/CASSANDRA-16668> (which was a pretty
> hard-to-hit bug).
> I totally agree that we found some real issues but the cost is pretty high:
> 2 months of work for two 4.0 issues.
>
> I had a look this morning at how many users reported bugs on the RC-2
> release. Outside of the people deeply involved in this project there were
> only 4 people reporting true issues and all of the issues were relatively
> minor.
>
> I totally understand that we want to deliver a high quality product. I just
> believe that we have to draw the line at some point.
> The popularity of Cassandra has been going down for years (
> https://db-engines.com/en/ranking_trend/system/Cassandra). The project
> might need that release more than any bug fix we can do.
>
> On Tue, 15 Jun 2021 at 07:00, Dinesh Joshi <djos...@icloud.com.invalid>
> wrote:
>
> > Based on the release lifecycle[1], we should keep cutting RCs until we
> > find no blocking issues.
> >
> > Dinesh
> >
> > [1]
> >
> https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=132320437
> >
> > >
> > > On Jun 14, 2021, at 9:05 PM, Scott Andreas <sc...@paradoxica.net>
> wrote:
> > >
> > > A second RC is appropriate given the revert of CASSANDRA-15899
> > necessitated by the discovery of CASSANDRA-16735: Adding columns via
> ALTER
> > TABLE can generate corrupt sstables.
> > >
> > > Ekaterina and Benedict's statement regarding the true positive rate of
> > flaky tests also shows the value of resolving these, and that it would be
> > good to pay this down as far as we can reasonably do so without
> > unnecessarily withholding the release.
> > >
> > > I do think it's possible that an RC2 build is a candidate for
> nomination
> > as our GA release. I don't think the RC2 phase needs to be drawn-out, but
> > believe it would build confidence for the project to have positive
> feedback
> > from a release containing the fix for C-16735. If work paying down the
> > remaining flaky tests surfaces a similar true positive rate, a third
> build
> > might be warranted, and it would be to the benefit of our users - but I
> > don't think we're far off.
> > >
> > > I hope others are working to deploy the beta/RC builds and integrate +
> > deploy changes from trunk into the releases they're deploying, as heavy
> > contributors doing so provides us the best opportunity to catch these
> > issues before our users do.
> > >
> > > We're getting close.
> > >
> > > ________________________________________
> > > From: bened...@apache.org <bened...@apache.org>
> > > Sent: Monday, June 14, 2021 3:03 PM
> > > To: dev@cassandra.apache.org
> > > Subject: Re: Are we ready for 4.0.0 (GA) ?
> > >
> > > A rate of 4/30 is a rate of 13% true bugs, which worries me with
> respect
> > to our promise of shipping a bug-free GA.  In past releases we have
> ensured
> > no flaky tests, I think.
> > >
> > > That said, I’ve not had the time to contribute to the fixing of flaky
> > tests, so I’ll leave the decision to those who have, or otherwise have a
> > strong opinion.
> > >
> > >
> > > From: Ekaterina Dimitrova <e.dimitr...@gmail.com>
> > > Date: Monday, 14 June 2021 at 20:51
> > > To: dev@cassandra.apache.org <dev@cassandra.apache.org>
> > > Subject: Re: Are we ready for 4.0.0 (GA) ?
> > > To give some context around the flaky tests, I pulled a quick report
> for
> > the fixed ones during the past two months. It is attached for your
> > reference.
> > >
> > > To summarize, in two months 30 tickets for flaky tests were closed and
> > only 4 of them were Cassandra bugs (marked in red in the report); the rest
> > of them were test fixes.
> > >
> > > I think Butler and running in a loop any new tests before adding them
> to
> > our test suite will help a lot. Also, Mick did a lot of work to stabilize
> > Jenkins. Timeouts and resource issues are less common than before, that
> is
> > a win! Thank you Mick!
> > >
> > > Best regards,
> > > Ekaterina
> > >
> > >
> > > On Mon, 14 Jun 2021 at 13:08, Adam Holmberg <
> adam.holmb...@datastax.com
> > <mailto:adam.holmb...@datastax.com>> wrote:
> > > To the point of "long-term observability over flakies":
> > >
> > > I will mention here that we intend to deploy a tool called Butler that
> we
> > > have developed and used internally for a while. It complements Jenkins
> to
> > > present different views of test results, allowing developers to better
> > > ascertain those tests that are flaky vs failing vs new regressions. We
> > > already have a server provisioned for public hosting. The application
> > > requires a bit of work to generalize for this project. We've been putting
> > > it off while focused on getting 4.0 over the line, but should be getting to
> > > it soon after.
> > >
> > >> On Mon, Jun 14, 2021 at 11:33 AM Mick Semb Wever <m...@apache.org
> > <mailto:m...@apache.org>> wrote:
> > >>
> > >> Are we ready to cut 4.0.0 (GA) once the following tickets land?
> > >>
> > >> CASSANDRA-16733 – Allow operators to disable 'ALTER ... DROP COMPACT
> > >> STORAGE' statements
> > >> CASSANDRA-16669 – Password obfuscation for DCL audit log statements
> > >> CASSANDRA-16735 – Adding columns via ALTER TABLE can generate corrupt
> > >> sstables
> > >>
> > >>
> > >> A bit more background.
> > >>
> > >> 1. On our 4.0 GA board there are a few other tickets, which have priority
> > >> but are not blockers for a GA release.
> > >>
> > >>
> >
> https://issues.apache.org/jira/secure/RapidBoard.jspa?rapidView=355&quickFilter=1661
> > >>
> > >> CASSANDRA-16715 – WEBSITE - June 2021 updates
> > >> CASSANDRA-12519 – dtest failure in
> > >> offline_tools_test.TestOfflineTools.sstableofflinerelevel_test
> > >> CASSANDRA-16681 –
> org.apache.cassandra.utils.memory.LongBufferPoolTest -
> > >> tests are flaky
> > >> CASSANDRA-16689 – Flaky LeaveAndBootstrapTest
> > >>
> > >>
> > >> 2. We also said we would get 5 green CI runs in a row. Progress on
> that
> > >> front
> > >> has been slow and risks delaying GA and our user base. It has had
> > priority
> > >> and there's been lots of momentum which is persisting: lots of flaky
> > fixes
> > >> committed; and the following are being discussed to keep pushing it in
> > the
> > >> right direction…
> > >> - Long-term observability over flakies
> > >> - Jenkins agent observability (infra stability)
> > >>
> > >> The past weeks have seen good progress on stability of ci-cassandra.a.o
> > with
> > >> the introduction of cpu docker limits imposed, and better monitoring
> of
> > the
> > >> agents so we can ensure we get the saturation and load we want.
> > Dockerising
> > >> the cqlshlib tests is also in progress.
> > >>
> > >> The alternative to a 4.0.0 GA release is a 4.0-rc2 release.
> > >> Should the next release be: 4.0.0 (GA) or 4.0-rc2 ?
> > >>
> > >
> > >
> > > --
> > > Adam Holmberg
> > > e. adam.holmb...@datastax.com<mailto:adam.holmb...@datastax.com>
> > > w. www.datastax.com<http://www.datastax.com>
> > >
> > > ---------------------------------------------------------------------
> > > To unsubscribe, e-mail: dev-unsubscr...@cassandra.apache.org
> > > For additional commands, e-mail: dev-h...@cassandra.apache.org
> > >
> >
>
