Four real bugs out of 30 flaky-test tickets is about 13%, which worries me 
given our promise of shipping a bug-free GA. I believe that for past releases 
we ensured there were no flaky tests.

That said, I’ve not had the time to contribute to the fixing of flaky tests, so 
I’ll leave the decision to those who have, or otherwise have a strong opinion.


From: Ekaterina Dimitrova <e.dimitr...@gmail.com>
Date: Monday, 14 June 2021 at 20:51
To: dev@cassandra.apache.org <dev@cassandra.apache.org>
Subject: Re: Are we ready for 4.0.0 (GA) ?

To give some context around the flaky tests, I pulled a quick report for the 
fixed ones during the past two months. It is attached for your reference.

To summarize, in two months 30 tickets for flaky tests were closed and only 4 
of them were Cassandra bugs (marked in red in the report); the rest were 
test fixes.

I think Butler, and running any new tests in a loop before adding them to our 
test suite, will help a lot. Also, Mick did a lot of work to stabilize Jenkins. 
Timeouts and resource issues are less common than before, which is a win! Thank 
you, Mick!

Best regards,
Ekaterina


On Mon, 14 Jun 2021 at 13:08, Adam Holmberg <adam.holmb...@datastax.com> wrote:
To the point of "long-term observability over flakies":

I will mention here that we intend to deploy a tool called Butler that we
have developed and used internally for a while. It complements Jenkins to
present different views of test results, allowing developers to better
ascertain those tests that are flaky vs failing vs new regressions. We
already have a server provisioned for public hosting. The application
requires a bit of work to generalize for this project. We've been putting
it off while focused on getting 4.0 over the line, but should be getting to
it soon after.

On Mon, Jun 14, 2021 at 11:33 AM Mick Semb Wever <m...@apache.org> wrote:

> Are we ready to cut 4.0.0 (GA) once the following tickets land?
>
>  CASSANDRA-16733 – Allow operators to disable 'ALTER ... DROP COMPACT
> STORAGE' statements
>  CASSANDRA-16669 – Password obfuscation for DCL audit log statements
>  CASSANDRA-16735 – Adding columns via ALTER TABLE can generate corrupt
> sstables
>
>
> A bit more background.
>
> 1. On our 4.0 GA board there are a few other tickets, which have priority
> but are not blockers for a GA release.
>
> https://issues.apache.org/jira/secure/RapidBoard.jspa?rapidView=355&quickFilter=1661
>
>  CASSANDRA-16715 – WEBSITE - June 2021 updates
>  CASSANDRA-12519 – dtest failure in
> offline_tools_test.TestOfflineTools.sstableofflinerelevel_test
>  CASSANDRA-16681 – org.apache.cassandra.utils.memory.LongBufferPoolTest -
> tests are flaky
>  CASSANDRA-16689 – Flaky LeaveAndBootstrapTest
>
>
> 2. We also said we would get 5 green CI runs in a row. Progress on that
> front has been slow and risks delaying GA for our user base. It has had
> priority and there's been lots of momentum, which is persisting: lots of
> flaky fixes committed, and the following are being discussed to keep pushing
> it in the right direction…
>  - Long-term observability over flakies
>  - Jenkins agent observability (infra stability)
>
> The past weeks have seen good progress on the stability of ci-cassandra.a.o
> with the introduction of Docker CPU limits, and better monitoring of the
> agents so we can ensure we get the saturation and load we want. Dockerising
> the cqlshlib tests is also in progress.
>
> The alternative to a 4.0.0 GA release is a 4.0-rc2 release.
> Should the next release be: 4.0.0 (GA) or 4.0-rc2 ?
>


--
Adam Holmberg
e. adam.holmb...@datastax.com
w. www.datastax.com
