Hi Cassandra experts,
  I am facing an issue: in a 6-node cluster, a node goes down once every day. 
The cluster is in a single DC.
  Every node has 4 cores and 16 GB of RAM, and the heap configuration is 
MAX_HEAP_SIZE=8192m HEAP_NEWSIZE=512m. Each node holds about 200 GB of data, 
and the RF for the business CF is 3. The system.log shows the following:
WARN  [Native-Transport-Requests-19] 2018-03-26 18:53:17,128 
CassandraAuthorizer.java:101 - CassandraAuthorizer failed to authorize #<User 
nev_tsp_sa> for <table nev_prod_tsp.latest_rt_alarm>
ERROR [Native-Transport-Requests-19] 2018-03-26 18:53:17,129 
QueryMessage.java:128 - Unexpected error during query
com.google.common.util.concurrent.UncheckedExecutionException: 
java.lang.RuntimeException: 
org.apache.cassandra.exceptions.ReadTimeoutException: Operation timed out - 
received only 0 responses.
        at com.google.common.cache.LocalCache$Segment.get(LocalCache.java:2203) 
~[guava-18.0.jar:na]
        at com.google.common.cache.LocalCache.get(LocalCache.java:3937) 
~[guava-18.0.jar:na]
        at com.google.common.cache.LocalCache.getOrLoad(LocalCache.java:3941) 
~[guava-18.0.jar:na]
        at 
com.google.common.cache.LocalCache$LocalLoadingCache.get(LocalCache.java:4824) 
~[guava-18.0.jar:na]
        at org.apache.cassandra.auth.AuthCache.get(AuthCache.java:108) 
~[apache-cassandra-3.9.jar:3.9]
        at 
org.apache.cassandra.auth.PermissionsCache.getPermissions(PermissionsCache.java:45)
 ~[apache-cassandra-3.9.jar:3.9]
        at 
org.apache.cassandra.auth.AuthenticatedUser.getPermissions(AuthenticatedUser.java:104)
 ~[apache-cassandra-3.9.jar:3.9]
        at 
org.apache.cassandra.service.ClientState.authorize(ClientState.java:419) 
~[apache-cassandra-3.9.jar:3.9]
        at 
org.apache.cassandra.service.ClientState.checkPermissionOnResourceChain(ClientState.java:352)
 ~[apache-cassandra-3.9.jar:3.9]
        at 
org.apache.cassandra.service.ClientState.ensureHasPermission(ClientState.java:329)
 ~[apache-cassandra-3.9.jar:3.9]
        at 
org.apache.cassandra.service.ClientState.hasAccess(ClientState.java:316) 
~[apache-cassandra-3.9.jar:3.9]
        at 
org.apache.cassandra.service.ClientState.hasColumnFamilyAccess(ClientState.java:300)
 ~[apache-cassandra-3.9.jar:3.9]
        at 
org.apache.cassandra.cql3.statements.ModificationStatement.checkAccess(ModificationStatement.java:211)
 ~[apache-cassandra-3.9.jar:3.9]
        at 
org.apache.cassandra.cql3.QueryProcessor.processStatement(QueryProcessor.java:185)
 ~[apache-cassandra-3.9.jar:3.9]
        at 
org.apache.cassandra.cql3.QueryProcessor.process(QueryProcessor.java:219) 
~[apache-cassandra-3.9.jar:3.9]
        at 
org.apache.cassandra.cql3.QueryProcessor.process(QueryProcessor.java:204) 
~[apache-cassandra-3.9.jar:3.9]
        at 
org.apache.cassandra.transport.messages.QueryMessage.execute(QueryMessage.java:115)
 ~[apache-cassandra-3.9.jar:3.9]
        at 
org.apache.cassandra.transport.Message$Dispatcher.channelRead0(Message.java:513)
 [apache-cassandra-3.9.jar:3.9]
        at 
org.apache.cassandra.transport.Message$Dispatcher.channelRead0(Message.java:407)
 [apache-cassandra-3.9.jar:3.9]
        at 
io.netty.channel.SimpleChannelInboundHandler.channelRead(SimpleChannelInboundHandler.java:105)
 [netty-all-4.0.39.Final.jar:4.0.39.Final]
        at 
io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:366)
 [netty-all-4.0.39.Final.jar:4.0.39.Final]
        at 
io.netty.channel.AbstractChannelHandlerContext.access$600(AbstractChannelHandlerContext.java:35)
 [netty-all-4.0.39.Final.jar:4.0.39.Final]
        at 
io.netty.channel.AbstractChannelHandlerContext$7.run(AbstractChannelHandlerContext.java:357)
 [netty-all-4.0.39.Final.jar:4.0.39.Final]
        at 
java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) 
[na:1.8.0_91]
        at 
org.apache.cassandra.concurrent.AbstractLocalAwareExecutorService$FutureTask.run(AbstractLocalAwareExecutorService.java:164)
 [apache-cassandra-3.9.jar:3.9]
        at org.apache.cassandra.concurrent.SEPWorker.run(SEPWorker.java:109) 
[apache-cassandra-3.9.jar:3.9]
        at java.lang.Thread.run(Thread.java:745) [na:1.8.0_91]
Caused by: java.lang.RuntimeException: 
org.apache.cassandra.exceptions.ReadTimeoutException: Operation timed out - 
received only 0 responses.
        at 
org.apache.cassandra.auth.CassandraAuthorizer.authorize(CassandraAuthorizer.java:102)
 ~[apache-cassandra-3.9.jar:3.9]
        at 
org.apache.cassandra.auth.PermissionsCache.lambda$new$0(PermissionsCache.java:37)
 ~[apache-cassandra-3.9.jar:3.9]
        at org.apache.cassandra.auth.AuthCache$1.load(AuthCache.java:183) 
~[apache-cassandra-3.9.jar:3.9]
        at 
com.google.common.cache.LocalCache$LoadingValueReference.loadFuture(LocalCache.java:3527)
 ~[guava-18.0.jar:na]
        at 
com.google.common.cache.LocalCache$Segment.loadSync(LocalCache.java:2319) 
~[guava-18.0.jar:na]
        at 
com.google.common.cache.LocalCache$Segment.lockedGetOrLoad(LocalCache.java:2282)
 ~[guava-18.0.jar:na]
        at com.google.common.cache.LocalCache$Segment.get(LocalCache.java:2197) 
~[guava-18.0.jar:na]
        ... 26 common frames omitted
Caused by: org.apache.cassandra.exceptions.ReadTimeoutException: Operation 
timed out - received only 0 responses.
        at 
org.apache.cassandra.service.ReadCallback.awaitResults(ReadCallback.java:132) 
~[apache-cassandra-3.9.jar:3.9]
        at org.apache.cassandra.service.ReadCallback.get(ReadCallback.java:137) 
~[apache-cassandra-3.9.jar:3.9]
        at 
org.apache.cassandra.service.AbstractReadExecutor.get(AbstractReadExecutor.java:145)
 ~[apache-cassandra-3.9.jar:3.9]
        at 
org.apache.cassandra.service.StorageProxy$SinglePartitionReadLifecycle.awaitResultsAndRetryOnDigestMismatch(StorageProxy.java:1718)
 ~[apache-cassandra-3.9.jar:3.9]
        at 
org.apache.cassandra.service.StorageProxy.fetchRows(StorageProxy.java:1667) 
~[apache-cassandra-3.9.jar:3.9]
        at 
org.apache.cassandra.service.StorageProxy.readRegular(StorageProxy.java:1608) 
~[apache-cassandra-3.9.jar:3.9]
        at 
org.apache.cassandra.service.StorageProxy.read(StorageProxy.java:1527) 
~[apache-cassandra-3.9.jar:3.9]
        at 
org.apache.cassandra.db.SinglePartitionReadCommand$Group.execute(SinglePartitionReadCommand.java:975)
 ~[apache-cassandra-3.9.jar:3.9]
        at 
org.apache.cassandra.cql3.statements.SelectStatement.execute(SelectStatement.java:271)
 ~[apache-cassandra-3.9.jar:3.9]
        at 
org.apache.cassandra.cql3.statements.SelectStatement.execute(SelectStatement.java:232)
 ~[apache-cassandra-3.9.jar:3.9]
        at 
org.apache.cassandra.auth.CassandraAuthorizer.addPermissionsForRole(CassandraAuthorizer.java:227)
 ~[apache-cassandra-3.9.jar:3.9]
        at 
org.apache.cassandra.auth.CassandraAuthorizer.authorize(CassandraAuthorizer.java:93)
 ~[apache-cassandra-3.9.jar:3.9]
        ... 32 common frames omitted
WARN  [Native-Transport-Requests-23] 2018-03-26 18:53:17,131 
CassandraAuthorizer.java:101 - CassandraAuthorizer failed to authorize #<User 
nev_tsp_sa> for <table nev_prod_tsp.rt_alarm_unite>
ERROR [Native-Transport-Requests-64] 2018-03-26 18:53:17,135 
QueryMessage.java:128 - Unexpected error during query
com.google.common.util.concurrent.UncheckedExecutionException: 
java.lang.RuntimeException: 
org.apache.cassandra.exceptions.ReadTimeoutException: Operation timed out - 
received only 0 responses.
        at com.google.common.cache.LocalCache$Segment.get(LocalCache.java:2203) 
~[guava-18.0.jar:na]

I have confirmed that nev_tsp_sa has all permissions on the nev_prod_tsp keyspace:
cassandra@cqlsh:system_auth> select * from role_permissions where role = 'nev_tsp_sa';

 role       | resource          | permissions
------------+-------------------+--------------------------------------------------------------
 nev_tsp_sa | data/nev_prod_tsp | {'ALTER', 'AUTHORIZE', 'CREATE', 'DROP', 'MODIFY', 'SELECT'}
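
As a rough model of why a keyspace-level grant is expected to cover these tables, here is a minimal sketch of a check that walks from the table resource up through its parents. The helper names (resource_chain, has_permission) and the in-memory grants dict are hypothetical and only for illustration; this is not Cassandra's actual code, only an approximation of what ClientState.checkPermissionOnResourceChain in the trace suggests.

```python
def resource_chain(resource):
    """Return the resource followed by each ancestor, most specific first."""
    parts = resource.split("/")
    return ["/".join(parts[:i]) for i in range(len(parts), 0, -1)]


def has_permission(grants, role, resource, permission):
    """True if the role holds the permission on the resource or any ancestor."""
    return any(
        permission in grants.get((role, r), set())
        for r in resource_chain(resource)
    )


# Hypothetical in-memory copy of the role_permissions row shown above.
grants = {
    ("nev_tsp_sa", "data/nev_prod_tsp"): {
        "ALTER", "AUTHORIZE", "CREATE", "DROP", "MODIFY", "SELECT",
    },
}

table = "data/nev_prod_tsp/latest_rt_alarm"
print(resource_chain(table))
print(has_permission(grants, "nev_tsp_sa", table, "MODIFY"))
```

Under this model the MODIFY check on latest_rt_alarm succeeds through the keyspace-level row. Note that in the trace above the failure is a ReadTimeoutException raised while the permissions were being read (CassandraAuthorizer.addPermissionsForRole), not a negative result from a check like this one.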

The cache disk can be read from and written to normally.

I would highly appreciate any help. Thanks very much!


Best Regards,

倪项菲/ David Ni
中移德电网络科技有限公司
Virtue Intelligent Network Ltd, co.
Add: 2003,20F No.35 Luojia creative city,Luoyu Road,Wuhan,HuBei
Mob: +86 13797007811|Tel: + 86 27 5024 2516
