I forgot to remove the last paragraph. We really do run some queries at QUORUM on
system_auth:
https://github.com/apache/cassandra/blob/trunk/src/java/org/apache/cassandra/auth/CassandraRoleManager.java#L277-L291
From: Miklosovic, Stefan
Sent: Tuesday, …
I am forwarding a message Ben Slater wrote to me personally and asked me to post.
He has been having problems subscribing to this mailing list with his email
address.
Very uncommon in my experience; my guess would be at most 2 to 3 clusters out
of the few hundred that we manage.
Also picking up …
I'm not sure if this recommendation is still valid (or ever was) but it's
not uncommon to have higher RF on system_auth keyspaces, where it would be
quite dramatic to hit this bug on the loss of a properly configured rack
for RF=3.
On Tue, Mar 7, 2023 at 2:40 PM Jeff Jirsa wrote:
> Anyone have …

Anyone have stats on how many people use RF > 3 per DC? (I know what it looks like in my day job, but I don’t want to pretend it’s representative of the larger community.) I’m a fan of fixing this, but I do wonder how common this is in the wild.

> On Mar 7, 2023, at 9:12 AM, Derek Chen-Becker wrote: I …
I am glad more people joined and expressed their opinions after my last e-mail.
It seems to me that there is a consensus on fixing this directly in NTS and
making it a little smarter about replica placement, but we should still have a
way to do it "the old way".
There is a lot of …
Right, that's why I said we should make NTS do the right thing rather than
throwing a warning. Doing the right thing, and not getting a warning, is the
best behavior.
> On Mar 7, 2023, at 11:12 AM, Derek Chen-Becker wrote:
>
> I think that the warning would only be thrown in the case where a
> potentially QUORUM-busting configuration is used. I think it would be a
> worse experience to not warn and let the user discover later when they
> can't write at QUORUM.
Agree.
Should we add a note in the cassandra.yaml comments as well? Maybe in the
spot where default_keyspace_rf is defined? On the other hand, that section
is already pretty "wordy".
I think that the warning would only be thrown in the case where a
potentially QUORUM-busting configuration is used. I think it would be a
worse experience to not warn and let the user discover later when they
can't write at QUORUM.
Cheers,
Derek
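A minimal sketch of the kind of check being described, where a warning fires only for a placement that actually lets a single rack failure break QUORUM (the function name and shape are hypothetical, not Cassandra's actual API):

```python
from collections import Counter

def rack_loss_breaks_quorum(replica_racks: list[str]) -> bool:
    """Return True if losing the most-loaded rack drops below QUORUM."""
    rf = len(replica_racks)
    quorum = rf // 2 + 1
    per_rack = Counter(replica_racks)
    # If the most-loaded rack holds so many replicas that losing it
    # leaves fewer than quorum live, QUORUM operations would fail.
    return rf - max(per_rack.values()) < quorum

# NTS today may place 2 of 3 replicas in one rack:
print(rack_loss_breaks_quorum(["rack1", "rack1", "rack2"]))  # True
# Even placement across 3 racks survives a rack loss:
print(rack_loss_breaks_quorum(["rack1", "rack2", "rack3"]))  # False
```

A check like this would stay silent for configurations that are fine (e.g. RF equal to the rack count with even placement) and warn only on the QUORUM-busting case.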
On Tue, Mar 7, 2023 at 9:32 AM Jeremiah D Jordan wrote:
I agree with Paulo, it would be nice if we could figure out some way to make
new NTS work correctly, with a parameter to fall back to the “bad” behavior, so
that people restoring backups to a new cluster can get the right behavior to
match their backups.
The problem with only fixing this in a …
My view is that this is a pretty serious bug. I wonder if transactional
metadata will make it possible to safely fix this for users without rebuilding
(only via opt-in, of course).
> On 7 Mar 2023, at 15:54, Miklosovic, Stefan
> wrote:
>
> Thanks everybody for the feedback.
>
> I think …
Thanks everybody for the feedback.
I think that emitting a warning upon keyspace creation (and alteration) should
be enough for starters. If somebody cannot live without a 100% bulletproof
solution, over time we might choose one of the approaches offered. As the
saying goes, there is no …
It's a bit unfortunate that NTS does not maintain the ability to lose a
rack without loss of quorum for RF > #racks > 2, since this can be easily
achieved by evenly placing replicas across all racks.
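The even-placement argument can be sketched as follows. This is a rough model assuming ideal round-robin spreading (so no rack holds more than ceil(RF / racks) replicas), not NTS's real token walk:

```python
import math

def max_replicas_per_rack(rf: int, racks: int) -> int:
    # Even round-robin placement caps any single rack at ceil(rf / racks).
    return math.ceil(rf / racks)

def survives_rack_loss(rf: int, racks: int) -> bool:
    """True if QUORUM still holds after losing the worst-case rack."""
    quorum = rf // 2 + 1
    return rf - max_replicas_per_rack(rf, racks) >= quorum

# RF=5 across 3 racks: a rack loss removes at most 2 replicas,
# leaving 3 >= quorum(3), so QUORUM survives.
print(survives_rack_loss(5, 3))  # True
# RF=3 across 2 racks can never survive a rack loss at QUORUM.
print(survives_rack_loss(3, 2))  # False
```

Note that even placement is necessary but not sufficient: for some combinations (e.g. RF=4 across 3 racks) the worst-case rack still holds 2 replicas and a rack loss leaves only 2 of 4, below quorum.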
Since RackAwareTopologyStrategy is a superset of NetworkTopologyStrategy,
can't we just use the …
A huge number of people use this legal but unsafe combination (for example,
anyone running RF=3 in AWS us-west-1, or any other region with only 2
accessible AZs); no patch is going to suddenly make that safe, and banning it
would hurt users a lot.
If we're really going to ship a less-bad version of this, …
1) It does seem like a big footgun. I think it violates the principle of
least surprise if someone has configured NTS thinking that they are
improving availability.
2) I don't know that we want to ban it outright, since maybe there's a case
for someone to be using a different CL that would be OK.
Modifying NTS in place would not be possible if it changes rack placement in a
way that breaks existing clusters on upgrade. A strategy introducing a change
to placement like this would need a new name. A new strategy would be fine in
trunk.
Logging a warning seems appropriate if RF > rack …
Hi all,
some time ago we identified an issue with NetworkTopologyStrategy. The problem
is that when RF > number of racks, NTS may place replicas in such a way that
when a whole rack is lost, we lose QUORUM and data are no longer available if
the QUORUM CL is used.
To illustrate …
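The original illustration is truncated in this archive; the arithmetic behind the failure mode can be sketched as follows (the rack layout here is a hypothetical example, not actual NTS output):

```python
# Hypothetical placement matching the reported NTS behavior: with RF=3
# and 2 racks, one rack can end up holding 2 of the 3 replicas.
RF = 3
placement = {"rack1": 2, "rack2": 1}  # replicas per rack

quorum = RF // 2 + 1  # QUORUM needs 2 of 3 replicas

# Losing rack1 leaves only 1 live replica, so QUORUM reads/writes fail.
live_after_rack1_loss = RF - placement["rack1"]
print(live_after_rack1_loss >= quorum)  # False: quorum is lost
```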