Wanted to add some more that I remembered:
* https://issues.apache.org/jira/browse/CASSANDRA-12811 - data
resurrection; it was marked as Normal because it was discovered with a
test, but it should've been marked as Critical.
* https://issues.apache.org/jira/browse/CASSANDRA-12956 - data loss
(commit log isn't replayed on custom 2i exception)
* https://issues.apache.org/jira/browse/CASSANDRA-12144 -
undeletable/duplicate rows problem; can be considered data resurrection
and/or sstable corruption.
On Thu, May 7, 2020 at 6:55 PM Joshua McKenzie wrote:
> "ML is plaintext bro" - thanks Mick. ಠ_ಠ
>
> Since we're stuck in the late 90s, here are some links to a gsheet:
>
> Defects by month:
> https://docs.google.com/spreadsheets/d/1Qt8lLIiqVvK7mlSML7zsmXdAc-LsvktFW5RXJDRtN8k/edit#gid=1584867240
> Defects by component:
> https://docs.google.com/spreadsheets/d/1Qt8lLIiqVvK7mlSML7zsmXdAc-LsvktFW5RXJDRtN8k/edit#gid=1946109279
> Defects by type:
> https://docs.google.com/spreadsheets/d/1Qt8lLIiqVvK7mlSML7zsmXdAc-LsvktFW5RXJDRtN8k/edit#gid=385136105
>
> On Thu, May 7, 2020 at 12:31 PM Joshua McKenzie
> wrote:
>
>> Hearing the images got killed by the web server. Trying from gmail (sorry
>> for spam). Time to see if it's the apache smtp server or the list culling
>> images:
>>
>> ---
>> I did a little analysis on this data (any defect marked with fixversion
>> 4.0 that rose to the level of critical in terms of availability,
>> correctness, or corruption/loss) and charted some things the rest of the
>> project community might find interesting:
>>
>> 1: Critical (availability, correctness, corruption/loss) defects fixed
>> per month since about 6 months before 3.11.0:
>> [image: monthly.png]
>>
>> 2: Components in which critical defects arose (note: bright red bar ==
>> sum of 3 dark red):
>> [image: Total Defects by Component.png]
>>
>> 3: Type of defect found and fixed (bright red: cluster down or permaloss,
>> dark red: temp corrupt/loss, yellow: incorrect response):
>>
>> [image: Total Defects by Type.png]
>>
>> My personal takeaways from this: a ton of great defect-fixing work has
>> gone into 4.0. I'd love it if we had both code coverage analysis for
>> testing on the codebase and data to surface hotspots of defects in the
>> code that might need further testing (caveat: many in this project
>> community have voiced skepticism about the value of this type of data in
>> the past, so that's probably a conversation to have on another thread).
>>
>> Hope someone else finds the above interesting if not useful.
>>
>> --
>> Joshua McKenzie
>>
>>
>> On Wed, May 6, 2020 at 3:38 PM Dinesh Joshi wrote:
>>
Hi Sankalp,

Thanks for bringing this up. At the very minimum, I hope we have
regression tests for the specific issues we have fixed.

I personally think the project should focus on building a comprehensive
test suite. However, some of these issues can only be detected at scale.
We need users to test* C* in their environments for their use cases.
Ideally these folks stand up large clusters, tee their traffic to the new
cluster, and report issues.

If we had an automated test suite that everyone can run at a large scale,
that would be even better.

Thanks,

Dinesh

* test != starting C* on a few nodes and looking at logs.
> On May 6, 2020, at 10:11 AM, sankalp kohli wrote:
>
> Hi,
> I want to share some of the serious issues that were found and fixed in
> 3.0.x.