Thanks for the reply. My system_auth settings are shown below; what should I do with them? I'm also curious why the newly added node was responsible for user authentication.
CREATE KEYSPACE system_auth WITH replication = {'class': 'SimpleStrategy', 'replication_factor': '1'} AND durable_writes = true;

-Simon

From: Oleksandr Shulgin
Date: 2017-06-14 16:36
To: wxn...@zjqunshuo.com
CC: user
Subject: Re: Cannot achieve consistency level LOCAL_ONE

On Wed, Jun 14, 2017 at 9:11 AM, wxn...@zjqunshuo.com <wxn...@zjqunshuo.com> wrote:

> Hi,
>
> Cluster setup:
> 1 DC with 5 nodes (each node holding 700 GB of data)
> 1 keyspace with RF of 2
> write CL is LOCAL_ONE
> read CL is LOCAL_QUORUM
>
> One node was down for about 1 hour because of an OOM issue. During that
> time, all 4 other nodes reported "Cannot achieve consistency level
> LOCAL_ONE" constantly until I brought the dead node back up. My data seems
> to have been lost during that downtime. To my understanding this should not
> happen, because the write CL is LOCAL_ONE and only one node was down. I have
> had a node go down from an OOM issue before, and I believe I didn't lose
> data then thanks to the hinted handoff feature.

Hi,

The problem here is at a different level: not a single replica of the data
could be written, because no coordinator was available to serve the
(authentication, see below) request.

> One more thing: the dead node was added recently, and the only difference
> is that the other 4 nodes are behind an internal SLB (Server Load Balancer)
> with a VIP, while the new one is not. Our application accesses the
> Cassandra cluster via the SLB VIP. Any thoughts are appreciated.
> Best regards,
> -Simon
>
> System log:
> Caused by: com.google.common.util.concurrent.UncheckedExecutionException: java.lang.RuntimeException: org.apache.cassandra.exceptions.UnavailableException: Cannot achieve consistency level LOCAL_ONE
>         at com.google.common.cache.LocalCache$Segment.get(LocalCache.java:2201) ~[guava-16.0.jar:na]
>         at com.google.common.cache.LocalCache.get(LocalCache.java:3934) ~[guava-16.0.jar:na]
>         at com.google.common.cache.LocalCache.getOrLoad(LocalCache.java:3938) ~[guava-16.0.jar:na]
>         at com.google.common.cache.LocalCache$LocalLoadingCache.get(LocalCache.java:4821) ~[guava-16.0.jar:na]
>         at org.apache.cassandra.auth.RolesCache.getRoles(RolesCache.java:70) ~[apache-cassandra-2.2.8.jar:2.2.8]
>         at org.apache.cassandra.auth.Roles.hasSuperuserStatus(Roles.java:51) ~[apache-cassandra-2.2.8.jar:2.2.8]
>         at org.apache.cassandra.auth.AuthenticatedUser.isSuper(AuthenticatedUser.java:71) ~[apache-cassandra-2.2.8.jar:2.2.8]
>         at org.apache.cassandra.auth.CassandraAuthorizer.authorize(CassandraAuthorizer.java:76) ~[apache-cassandra-2.2.8.jar:2.2.8]

What are the replication settings of your system_auth keyspace? It looks
like the node that was down held the only replica of the user info needed to
check credentials/permissions.

Cheers,
--
Alex
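[Editor's note for future readers of this thread: with system_auth at a replication factor of 1, as in Simon's keyspace definition above, the auth data for any given role lives on exactly one node, so that node going down blocks authentication for every role whose single replica it held. A common remedy, sketched here under the assumption of a single datacenter (the name 'DC1' below is a placeholder; substitute the DC name shown by `nodetool status`), is to raise the replication factor of system_auth:]

```sql
-- Sketch, not a verbatim fix from this thread: raise system_auth replication
-- so role/credential data survives a single node failure.
-- 'DC1' is a placeholder datacenter name; use the one from `nodetool status`.
ALTER KEYSPACE system_auth
  WITH replication = {'class': 'NetworkTopologyStrategy', 'DC1': 3};
```

[After the ALTER, run `nodetool repair system_auth` on each node so the existing role data is streamed to the new replicas; the change alone does not move data. Note also that Cassandra reads credentials for the default `cassandra` superuser at QUORUM, while other roles are read at LOCAL_ONE, which is why a single lost replica at RF=1 can make logins fail cluster-wide.]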