Reminder - 2020-05-19 Apache Cassandra Contributor Meeting

2020-05-18 Thread Patrick McFadin
Hi everyone,

Reminder that tomorrow at 1PM PST we'll be having a contributor meeting. I
gave Jitsi a try for the Kubernetes SIG but ran into a lot of trouble with
browser compatibility and recording. I'll just stick with using Zoom to
keep it working consistently.

https://datastax.zoom.us/j/390839037

https://cwiki.apache.org/confluence/display/CASSANDRA/2020-05-19+Apache+Cassandra+Contributor+Meeting

Add any agenda items here or email me directly and I can put them in.

Thanks,

Patrick


Re: Scope of CASSANDRA-14557 (Default and Minimum RF)

2020-05-18 Thread Sumanth Pasupuleti
I agree. I've updated the patch in 14557 to include a note about
application to system keyspaces in cassandra.yaml.

On Mon, May 18, 2020 at 1:59 AM Oleksandr Petrov 
wrote:

> I think it is reasonable that system keyspaces would get initialized with a
> default replication factor, assuming ones that were already
> initialized would remain intact (however, this should be the same for
> user-created keyspaces).
>
> Assuming it doesn't change the current behaviour, and default and min rf,
> when unset, act the same way the current version would, the only thing we
> should probably add is a line in cassandra.yaml that default and min
> replication factors will also apply to system keyspaces.
>
> On Wed, May 13, 2020 at 9:01 PM Sumanth Pasupuleti <
> sumanth.pasupuleti...@gmail.com> wrote:
>
> > Hi,
> >
> > Based on Alex's suggestion on the ticket, I wanted to reach out to clarify
> > the current scope of the default and minimum replication factors that
> > 14557 defines, and to gather thoughts and surface any dissent.
> >
> > Both the configurations (default and minimum) apply not just to user
> > keyspaces, but also to system keyspaces. For instance, this can be handy
> > in deployments that use authenticated C* clusters where operators have to
> > "remember" to set system_auth keyspace's RF to a value higher than 1. In
> > such cases, setting default_rf = 3 for example (which I suppose is common
> > in most deployments) would ensure all the system keyspaces (including
> > system_auth) come up with RF=3.
> >
> > Note that by default this patch does not change any replication factors:
> > the default values of these configurations are set to
> > [defaultRF=1, minimumRF=0] so as not to induce any changes that folks may
> > not expect. Instead, the patch offers knobs to define what a sane default
> > RF should be, and to gate any new keyspaces being created with an RF
> > lower than minimumRF.
> >
> > Curious to know your thoughts.
> >
> > Thanks,
> > Sumanth
> >
>
>
> --
> alex p
>


Re: List of serious issues fixed in 3.0.x

2020-05-18 Thread Oleksandr Petrov
Wanted to add some that I remembered:

  * https://issues.apache.org/jira/browse/CASSANDRA-12811 - data
resurrection; it was marked as normal because it was discovered with a test,
but it should have been marked as critical.
  * https://issues.apache.org/jira/browse/CASSANDRA-12956 - data loss
(commit log isn't replayed on custom 2i exception)
  * https://issues.apache.org/jira/browse/CASSANDRA-12144 -
undeletable/duplicate rows problem; can be considered data resurrection
and/or sstable corruption.



On Thu, May 7, 2020 at 6:55 PM Joshua McKenzie  wrote:

> "ML is plaintext bro" - thanks Mick. ಠ_ಠ
>
> Since we're stuck in the late 90's, here's some links to a gsheet:
>
> Defects by month:
> https://docs.google.com/spreadsheets/d/1Qt8lLIiqVvK7mlSML7zsmXdAc-LsvktFW5RXJDRtN8k/edit#gid=1584867240
> Defects by component:
> https://docs.google.com/spreadsheets/d/1Qt8lLIiqVvK7mlSML7zsmXdAc-LsvktFW5RXJDRtN8k/edit#gid=1946109279
> Defects by type:
> https://docs.google.com/spreadsheets/d/1Qt8lLIiqVvK7mlSML7zsmXdAc-LsvktFW5RXJDRtN8k/edit#gid=385136105
>
> On Thu, May 7, 2020 at 12:31 PM Joshua McKenzie 
> wrote:
>
>> Hearing the images got killed by the web server. Trying from gmail (sorry
>> for spam). Time to see if it's the apache smtp server or the list culling
>> images:
>>
>> ---
>> I did a little analysis on this data (any defect marked with fixversion
>> 4.0 that rose to the level of critical in terms of availability,
>> correctness, or corruption/loss) and charted some things the rest of the
>> project community might find interesting:
>>
>> 1: Critical (availability, correctness, corruption/loss) defects fixed
>> per month since about 6 months before 3.11.0:
>> [image: monthly.png]
>>
>> 2: Components in which critical defects arose (note: bright red bar ==
>> sum of 3 dark red):
>> [image: Total Defects by Component.png]
>>
>> 3: Type of defect found and fixed (bright red: cluster down or permaloss,
>> dark red: temp corrupt/loss, yellow: incorrect response):
>>
>> [image: Total Defects by Type.png]
>>
>> My personal takeaways from this: a ton of great defect fixing work has
>> gone into 4.0. I'd love it if we had both code coverage analysis for
>> testing on the codebase as well as data to surface where hotspots of
>> defects are in the code that might need further testing (caveat: many have
>> voiced their skepticism of the value of this type of data in the past in
>> this project community, so that's probably another conversation to have on
>> another thread)
>>
>> Hope someone else finds the above interesting if not useful.
>>
>> --
>> Joshua McKenzie
>>
>>> On Wed, May 6, 2020 at 3:38 PM Dinesh Joshi  wrote:
>>>
>>>> Hi Sankalp,
>>>>
>>>> Thanks for bringing this up. At the very minimum, I hope we have
>>>> regression tests for the specific issues we have fixed.
>>>>
>>>> I personally think, the project should focus on building a
>>>> comprehensive test suite. However, some of these issues can only be
>>>> detected at scale. We need users to test* C* in their environment for their
>>>> use-cases. Ideally these folks stand up large clusters and tee their
>>>> traffic to the new cluster and report issues.
>>>>
>>>> If we had an automated test suite that everyone can run at a large
>>>> scale that would be even better.
>>>>
>>>> Thanks,
>>>>
>>>> Dinesh
>>>>
>>>>
>>>> * test != starting C* in a few nodes and looking at logs.
>>>>
>>>> > On May 6, 2020, at 10:11 AM, sankalp kohli wrote:
>>>> >
>>>> > Hi,
>>>> >    I want to share some of the serious issues that were found and
>>>> > fixed in 3.0.x.

Re: Scope of CASSANDRA-14557 (Default and Minimum RF)

2020-05-18 Thread Oleksandr Petrov
I think it is reasonable that system keyspaces would get initialized with a
default replication factor, assuming ones that were already
initialized would remain intact (however, this should be the same for
user-created keyspaces).

Assuming it doesn't change the current behaviour, and default and min rf,
when unset, act the same way the current version would, the only thing we
should probably add is a line in cassandra.yaml that default and min
replication factors will also apply to system keyspaces.

On Wed, May 13, 2020 at 9:01 PM Sumanth Pasupuleti <
sumanth.pasupuleti...@gmail.com> wrote:

> Hi,
>
> Based on Alex's suggestion on the ticket, I wanted to reach out to clarify
> the current scope of the default and minimum replication factors that 14557
> defines, and to gather thoughts and surface any dissent.
>
> Both the configurations (default and minimum) apply not just to user
> keyspaces, but also to system keyspaces. For instance, this can be handy in
> deployments that use authenticated C* clusters where operators have to
> "remember" to set system_auth keyspace's RF to a value higher than 1. In
> such cases, setting default_rf = 3 for example (which I suppose is common
> in most deployments) would ensure all the system keyspaces (including
> system_auth) come up with RF=3.
>
> Note that by default this patch does not change any replication factors:
> the default values of these configurations are set to
> [defaultRF=1, minimumRF=0] so as not to induce any changes that folks may
> not expect. Instead, the patch offers knobs to define what a sane default
> RF should be, and to gate any new keyspaces being created with an RF
> lower than minimumRF.
>
> Curious to know your thoughts.
>
> Thanks,
> Sumanth
>


-- 
alex p
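
For readers following the thread, the behaviour described above (a default
RF used when none is given, plus a gate that rejects an RF below the
minimum) can be sketched roughly as follows. This is only an illustration in
Python, not the code from the 14557 patch: the names ReplicationSettings,
ReplicationFactorTooLow and resolve_rf are hypothetical, and applying the
gate to the default itself is an assumption. The default values mirror the
[defaultRF=1, minimumRF=0] mentioned in the thread.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class ReplicationSettings:
    """Hypothetical holder for the two knobs discussed in the thread."""
    default_rf: int = 1   # thread: defaultRF=1 when unset
    minimum_rf: int = 0   # thread: minimumRF=0 when unset (gate is a no-op)


class ReplicationFactorTooLow(ValueError):
    """Raised when a keyspace would be created below the minimum RF."""


def resolve_rf(settings: ReplicationSettings,
               requested_rf: Optional[int] = None) -> int:
    """Return the RF a new keyspace would get, or raise if it is too low.

    Assumption: the same rule applies to user and system keyspaces
    (for example system_auth), as described in the thread.
    """
    # Fall back to the configured default when no RF was requested.
    rf = settings.default_rf if requested_rf is None else requested_rf
    # Gate: refuse any new keyspace whose RF is below the minimum.
    if rf < settings.minimum_rf:
        raise ReplicationFactorTooLow(
            f"RF {rf} is lower than the configured minimum {settings.minimum_rf}"
        )
    return rf


if __name__ == "__main__":
    unset = ReplicationSettings()
    print(resolve_rf(unset))            # default applies, gate is a no-op

    tuned = ReplicationSettings(default_rf=3, minimum_rf=2)
    print(resolve_rf(tuned))            # keyspace without explicit RF
    try:
        resolve_rf(tuned, requested_rf=1)
    except ReplicationFactorTooLow as exc:
        print(f"rejected: {exc}")
```

With both knobs left unset, nothing changes relative to current behaviour,
which is the property the thread asks the patch to preserve.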