[
https://issues.apache.org/jira/browse/CASSANDRA-12525?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17644897#comment-17644897
]
Stefan Miklosovic commented on CASSANDRA-12525:
-----------------------------------------------
Hi again [~xgerman42],
I wanted to replicate your issue locally but was not able to. I used 3 nodes
per DC in two DCs (6 nodes in total).
I was also examining the role-creation logic in (1) and (2). It creates the
default local cassandra role only when that role cannot be found anywhere.
First it checks with consistency level ONE; if the role is not found, it checks
again with QUORUM. I added extra debug logging, and every time only the ONE
query was executed: it saw that the cassandra role was there and simply moved
on.
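The two-step check in (1) and (2) can be sketched roughly like this. This is a simplified, self-contained simulation, not the real code: the Query interface below is a hypothetical stand-in for the QueryProcessor call that CassandraRoleManager actually uses.

```java
import java.util.List;

// Simplified sketch of CassandraRoleManager's "does the default role exist?"
// decision, as described in (1) and (2). Query is a hypothetical stand-in
// for the real internal query execution API.
public class RoleCheckSketch {
    enum ConsistencyLevel { ONE, QUORUM }

    interface Query {
        // Rows found for the default 'cassandra' role at the given CL.
        List<String> selectDefaultRole(ConsistencyLevel cl);
    }

    // Create the default superuser only if it is nowhere to be found:
    // first a cheap check at ONE, then a stronger check at QUORUM.
    static boolean shouldCreateDefaultRole(Query query) {
        if (!query.selectDefaultRole(ConsistencyLevel.ONE).isEmpty())
            return false; // role seen at ONE -> nothing to do
        if (!query.selectDefaultRole(ConsistencyLevel.QUORUM).isEmpty())
            return false; // role seen at QUORUM -> nothing to do
        return true;      // role not visible anywhere -> create it
    }

    public static void main(String[] args) {
        // Node that sees the role at ONE (the common case described above).
        Query roleVisible = cl -> List.of("cassandra");
        // Freshly started node that sees no role at any CL.
        Query roleMissing = cl -> List.of();

        System.out.println(shouldCreateDefaultRole(roleVisible)); // false
        System.out.println(shouldCreateDefaultRole(roleMissing)); // true
    }
}
```

This matches the behavior I observed: whenever the ONE query already returns the role, the QUORUM query is never even issued.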
The only case in which the cassandra role was created was when I started the
very first node in dc1. The fact that you see the cassandra role being created
on another node in the second DC is very interesting; I am not completely sure
how you got that behavior.
There is also (3), which postpones the check in (1) by 10 seconds. What could
in theory happen is that the node did not see any peers within that 10-second
window, so it concluded it was alone in the cluster (all conditions in (1)
evaluated as true, or false where negated). This is the most plausible
explanation for why you see this from time to time.
An interesting consequence of the logic in (3) is that it cannot be blocking,
because the node about to start does not know in advance whether it will be
the only one in the cluster. The delay is controlled by a system property (4),
so you could proactively increase it to a higher value, such as 30 seconds,
to minimize the chance of this happening.
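If I read (4) correctly, the delay comes from the cassandra.superuser_setup_delay_ms system property (default 10000 ms), so raising it to 30 seconds would look something like this, e.g. in cassandra-env.sh (adjust the property name if your version differs):

```shell
# Increase the superuser setup delay from the default 10s to 30s.
# Property name taken from AuthKeyspace (4); this is a config sketch,
# verify against the version you are running.
JVM_OPTS="$JVM_OPTS -Dcassandra.superuser_setup_delay_ms=30000"
```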
To improve this, we might make the default waiting time bigger, but that does
not solve it entirely; we would just be kicking the can down the road. So if
we cannot completely prevent this, the next best option is to do what you
suggested.
Isn't there a third, better way? What about having that scheduled task
(postponed by 10 seconds by default) wait on _something_ that is not yet fully
initialized? The fact that these peers are not visible yet suggests that
Gossip has not had a chance to see the full topology.
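The "wait on something" idea might look roughly like this. This is a hypothetical sketch, not the actual Cassandra scheduling code; peersVisible is a stand-in for however the Gossip state would actually be queried, and the bounded wait keeps the non-blocking property of (3).

```java
import java.util.function.BooleanSupplier;

// Hypothetical sketch: instead of a fixed 10s delay, poll a readiness
// condition (e.g. "Gossip has seen the expected peers") with an upper bound,
// so the lone-node case still makes progress after the timeout.
public class SetupWaitSketch {
    // Polls the condition until it holds or maxWaitMillis elapses.
    // Returns true if the condition became true within the window.
    static boolean waitForCondition(BooleanSupplier condition,
                                    long maxWaitMillis,
                                    long pollMillis) throws InterruptedException {
        long deadline = System.currentTimeMillis() + maxWaitMillis;
        while (System.currentTimeMillis() < deadline) {
            if (condition.getAsBoolean())
                return true;
            Thread.sleep(pollMillis);
        }
        return condition.getAsBoolean();
    }

    public static void main(String[] args) throws InterruptedException {
        long start = System.currentTimeMillis();
        // Simulated condition: peers become visible ~200ms after startup.
        BooleanSupplier peersVisible = () -> System.currentTimeMillis() - start > 200;

        boolean ready = waitForCondition(peersVisible, 10_000, 50);
        System.out.println(ready); // true: setup proceeds once peers are seen
    }
}
```

A node that really is alone would simply hit the timeout and fall through to the existing creation path, so the worst case is no worse than today's fixed delay.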
(1)
https://github.com/apache/cassandra/blob/trunk/src/java/org/apache/cassandra/auth/CassandraRoleManager.java#L351-L356
(2)
https://github.com/apache/cassandra/blob/trunk/src/java/org/apache/cassandra/auth/CassandraRoleManager.java#L376-L384
(3)
https://github.com/apache/cassandra/blob/trunk/src/java/org/apache/cassandra/auth/CassandraRoleManager.java#L386-L405
(4)
https://github.com/apache/cassandra/blob/trunk/src/java/org/apache/cassandra/auth/AuthKeyspace.java#L58
> When adding new nodes to a cluster which has authentication enabled, we end
> up losing the cassandra user's current credentials and they get reverted back
> to the default cassandra/cassandra credentials
> -------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
>
> Key: CASSANDRA-12525
> URL: https://issues.apache.org/jira/browse/CASSANDRA-12525
> Project: Cassandra
> Issue Type: Bug
> Components: Cluster/Schema, Local/Config
> Reporter: Atin Sood
> Assignee: German Eichberger
> Priority: Normal
> Fix For: 3.0.x, 3.11.x, 4.0.x, 4.1.x, 4.x
>
> Time Spent: 2h 40m
> Remaining Estimate: 0h
>
> Made the following observation:
> When adding new nodes to an existing C* cluster with authentication enabled
> we end up losing password information about the `cassandra` user.
> Initial Setup
> - Create a 5 node cluster with system_auth having RF=5 and
> NetworkTopologyStrategy
> - Enable PasswordAuthenticator on this cluster and update the password for
> 'cassandra' user to say 'password' via the alter query
> - Make sure you run nodetool repair on all the nodes
> Test case
> - Now go ahead and add 5 more nodes to this cluster.
> - Run nodetool repair on all the 10 nodes now
> - Decommission the original 5 nodes such that only the new 5 nodes are in the
> cluster now
> - Run cqlsh and try to connect to this cluster using old user name and
> password, cassandra/password
> I was unable to connect to the nodes with the original credentials and was
> only able to connect using the default cassandra/cassandra credentials
> From the conversation over IRC
> `beobal: sood: that definitely shouldn't happen. The new nodes should only
> create the default superuser role if there are 0 roles currently defined
> (including that default one)`
--
This message was sent by Atlassian Jira
(v8.20.10#820010)