Re: Degradation of availability when using NTS and RF > number of racks

2023-03-07 Thread Miklosovic, Stefan
I forgot to remove the last paragraph. We really do run some queries with QUORUM on system_auth: https://github.com/apache/cassandra/blob/trunk/src/java/org/apache/cassandra/auth/CassandraRoleManager.java#L277-L291

Re: Degradation of availability when using NTS and RF > number of racks

2023-03-07 Thread Miklosovic, Stefan
I am forwarding the message Ben Slater wrote me personally and asked me to post, as he has some problems subscribing to this mailing list with his email. Very uncommon in my experience – my guess would be at most 2 to 3 clusters out of the few hundred that we manage. Also picking up

Re: Degradation of availability when using NTS and RF > number of racks

2023-03-07 Thread Paulo Motta
I'm not sure if this recommendation is still valid (or ever was) but it's not uncommon to have higher RF on system_auth keyspaces, where it would be quite dramatic to hit this bug on the loss of a properly configured rack for RF=3.

Re: Degradation of availability when using NTS and RF > number of racks

2023-03-07 Thread Jeff Jirsa
Anyone have stats on how many people use RF > 3 per dc? (I know what it looks like in my day job but I don’t want to pretend it’s representative of the larger community.) I’m a fan of fixing this but I do wonder how common this is in the wild.

Re: Degradation of availability when using NTS and RF > number of racks

2023-03-07 Thread Miklosovic, Stefan
I am glad more people joined and expressed their opinions after my last e-mail. It seems to me that there is a consensus on having it fixed directly in NTS and making it a little bit smarter about replica placement, but we should still have a way to do it "the old way". There is a lot of

Re: Degradation of availability when using NTS and RF > number of racks

2023-03-07 Thread Jeremiah D Jordan
Right, that is why I said we should make NTS do the right thing, rather than throwing a warning. Doing the right thing, and not getting a warning, is the best behavior.

Re: Degradation of availability when using NTS and RF > number of racks

2023-03-07 Thread Aaron Ploetz
> I think it would be a worse experience to not warn and let the user discover later when they can't write at QUORUM. Agree. Should we add a note in the cassandra.yaml comments as well? Maybe in the spot where default_keyspace_rf is defined? On the other hand, that section is pretty "wordy"

Re: Degradation of availability when using NTS and RF > number of racks

2023-03-07 Thread Derek Chen-Becker
I think that the warning would only be thrown in the case where a potentially QUORUM-busting configuration is used. I think it would be a worse experience to not warn and let the user discover later when they can't write at QUORUM. Cheers, Derek

Re: Degradation of availability when using NTS and RF > number of racks

2023-03-07 Thread Jeremiah D Jordan
I agree with Paulo, it would be nice if we could figure out some way to make new NTS work correctly, with a parameter to fall back to the “bad” behavior, so that people restoring backups to a new cluster can get the right behavior to match their backups. The problem with only fixing this in a

Re: Degradation of availability when using NTS and RF > number of racks

2023-03-07 Thread Benedict
My view is that this is a pretty serious bug. I wonder if transactional metadata will make it possible to safely fix this for users without rebuilding (only via opt-in, of course).

Re: Degradation of availability when using NTS and RF > number of racks

2023-03-07 Thread Miklosovic, Stefan
Thanks everybody for the feedback. I think that emitting a warning upon keyspace creation (and alteration) should be enough for starters. If somebody cannot live without a 100% bulletproof solution, over time we might choose one of the approaches offered. As the saying goes there is no

Re: Degradation of availability when using NTS and RF > number of racks

2023-03-06 Thread Paulo Motta
It's a bit unfortunate that NTS does not maintain the ability to lose a rack without loss of quorum for RF > #racks > 2, since this can be easily achieved by evenly placing replicas across all racks. Since RackAwareTopologyStrategy is a superset of NetworkTopologyStrategy, can't we just use the
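The even placement Paulo describes can be sketched as a round-robin over racks, so per-rack replica counts differ by at most one. This is a hypothetical illustration, not the actual RackAwareTopologyStrategy code; the `rack_aware` function, node names, and ring layout are all assumptions made for the demo.

```python
from collections import Counter

def rack_aware(ring, rf):
    """Toy sketch: round-robin replicas across racks so per-rack
    counts differ by at most one (not Cassandra's real code)."""
    assert rf <= len(ring), "cannot place more replicas than nodes"
    rack_order, by_rack = [], {}
    for node, rack in ring:            # preserve ring order within each rack
        if rack not in by_rack:
            by_rack[rack] = []
            rack_order.append(rack)
        by_rack[rack].append(node)
    chosen, i = [], 0
    while len(chosen) < rf:
        rack = rack_order[i % len(rack_order)]
        if by_rack[rack]:
            chosen.append((by_rack[rack].pop(0), rack))
        i += 1
    return chosen

# 7 nodes in 3 racks, RF=5: even split is 2+2+1.
ring = [("a", "r1"), ("b", "r1"), ("c", "r1"),
        ("d", "r2"), ("e", "r3"), ("f", "r2"), ("g", "r3")]
counts = Counter(rack for _, rack in rack_aware(ring, rf=5))
quorum = 5 // 2 + 1
# Losing the worst rack still leaves a quorum of replicas.
print(all(5 - n >= quorum for n in counts.values()))  # True
```

With the even 2+2+1 split, the loss of any single rack leaves at least 3 of 5 replicas, so QUORUM survives — exactly the property NTS fails to guarantee when RF > #racks.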

Re: Degradation of availability when using NTS and RF > number of racks

2023-03-06 Thread Jeff Jirsa
A huge number of people use this legal but unsafe combination - like anyone running RF=3 in AWS us-west-1 (or any other region with only 2 accessible AZs), and no patch is going to suddenly make that safe, and banning it hurts users a lot. If we're really going to ship a less-bad version of this,
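Jeff's RF=3-over-2-AZs point can be checked with a few lines of arithmetic: with only two racks, the best possible split is 2+1, so losing the 2-replica rack always drops below quorum regardless of how placement is fixed. A quick sketch (the numbers are illustrative, not from the thread):

```python
# RF=3 across 2 racks: the best possible split is 2 + 1.
rf, racks = 3, 2
quorum = rf // 2 + 1                 # majority of 3 is 2
replicas_per_rack = [2, 1]           # most even split achievable with 2 racks
worst_loss = max(replicas_per_rack)  # the 2-replica rack goes down
survivors = rf - worst_loss          # 1 replica left
print(survivors < quorum)            # True: QUORUM is lost either way
```

This is why no placement patch helps here: the failure is forced by RF > #racks with #racks below the quorum size, not by NTS choosing a bad split.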

Re: Degradation of availability when using NTS and RF > number of racks

2023-03-06 Thread Derek Chen-Becker
1) It does seem like a big footgun. I think it violates the principle of least surprise if someone has configured NTS thinking that they are improving availability 2) I don't know that we want to ban it outright, since maybe there's a case for someone to be using a different CL that would be OK

Re: Degradation of availability when using NTS and RF > number of racks

2023-03-06 Thread C. Scott Andreas
Modifying NTS in place would not be possible if it changes rack placement in a way that breaks existing clusters on upgrade. A strategy introducing a change to placement like this would need a new name. A new strategy would be fine in trunk. Logging a warning seems appropriate if RF > rack

Degradation of availability when using NTS and RF > number of racks

2023-03-06 Thread Miklosovic, Stefan
Hi all, some time ago we identified an issue with NetworkTopologyStrategy. The problem is that when RF > number of racks, it may happen that NTS places replicas in such a way that when a whole rack is lost, we lose QUORUM and data are no longer available if QUORUM CL is used. To illustrate
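The failure mode described above can be reproduced with a toy model of NTS's ring walk: take a node if its rack has no replica yet, skip it otherwise, and once every rack is represented fall back to accepting nodes in ring order. This is a simplified sketch, not Cassandra's actual placement code, and the ring layout and node names are invented for the demo.

```python
from collections import Counter

def nts_like(ring, rf):
    """Toy model of NTS placement (not the real implementation):
    walk the ring, taking a node only if its rack is unrepresented;
    once all racks have a replica, accept skipped/remaining nodes
    in ring order."""
    racks = {rack for _, rack in ring}
    chosen, seen, skipped = [], set(), []
    for node in ring:
        if len(chosen) >= rf:
            break
        _, rack = node
        if rack not in seen:
            seen.add(rack)
            chosen.append(node)
            if seen == racks:
                chosen.extend(skipped)  # flush earlier skipped nodes
                skipped = []
        elif seen == racks:
            chosen.append(node)
        else:
            skipped.append(node)
    return chosen[:rf]

# Ring order happens to put three r1 nodes before r2 and r3 appear.
ring = [("a", "r1"), ("b", "r1"), ("c", "r1"),
        ("d", "r2"), ("e", "r3"), ("f", "r2"), ("g", "r3")]
placement = nts_like(ring, rf=5)
by_rack = Counter(rack for _, rack in placement)
quorum = 5 // 2 + 1                 # 3
print(by_rack)                      # r1 holds 3 of the 5 replicas
print(5 - by_rack["r1"] >= quorum)  # False: losing r1 loses QUORUM
```

With this ring order the placement ends up 3+1+1 instead of 2+2+1, so the loss of the overloaded rack leaves only 2 of 5 replicas, below the quorum of 3 — even though an even split across the three racks would have survived any single-rack failure.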