[
https://issues.apache.org/jira/browse/CASSANDRA-12103?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15354693#comment-15354693
]
Sam Tunnicliffe commented on CASSANDRA-12103:
---------------------------------------------
This indicates a failure in the node attempting to read credentials for the
user your client is trying to authenticate as. The most likely cause is that
the replication factor for the {{system_auth}} keyspace is too low and the
replica holding those credentials is unreachable. I suspect you're using
{{SimpleStrategy}}, the replica responsible for those credentials is in DC2, and
there was some inter-DC connection problem.
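To illustrate the suspected failure mode, here is a minimal toy model in
Python, not Cassandra code. It assumes a simplified {{SimpleStrategy}} that
walks the ring clockwise from the key's token and takes the next RF nodes
without regard to datacenter. The ring layout, tokens, node names and both
function names are invented for the example; only the 3-node/9-node split
comes from the report.

```python
from bisect import bisect_left

# Hypothetical ring of (token, node, datacenter) entries, sorted by token.
# 3 nodes in DC1 and 9 in DC2 as in the report; the tokens are made up.
RING = sorted(
    [(i * 100, f"dc1-n{i}", "DC1") for i in (1, 5, 9)]
    + [(i * 100, f"dc2-n{i}", "DC2") for i in (0, 2, 3, 4, 6, 7, 8, 10, 11)]
)


def simple_strategy_replicas(key_token, rf):
    """Return the next rf nodes clockwise from key_token, ignoring DCs."""
    tokens = [token for token, _, _ in RING]
    # First node whose token is >= key_token, wrapping around the ring.
    start = bisect_left(tokens, key_token) % len(RING)
    return [RING[(start + i) % len(RING)][1:] for i in range(rf)]


def has_reachable_replica(replicas, reachable_dcs):
    """True if at least one replica sits in a datacenter we can reach."""
    return any(dc in reachable_dcs for _, dc in replicas)


# With RF=1 the only replica for this (invented) token lives in DC2, so a
# coordinator in DC1 that cannot reach DC2 has nobody to read from.
only_replica = simple_strategy_replicas(250, rf=1)
print(only_replica, has_reachable_replica(only_replica, {"DC1"}))
```

In this model, raising RF to 3 for the same token happens to include a DC1
node, but {{SimpleStrategy}} gives no guarantee of that for every key, which
is why the replication strategy is worth checking.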
Can you confirm:
1) What is the replication strategy config for {{system_auth}}?
2) Do/did you have any nodes down at the time?
3) Are your clients attempting to log in as the default superuser (username
{{cassandra}})? Credentials for this user are read at a higher consistency
level ({{QUORUM}}, rather than {{LOCAL_ONE}} as for all other users).
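As a rough sketch of why question 3 matters, the Python below models the
consistency level choice described above and how many replica responses each
level needs. It is based only on the description in this comment, not on the
Cassandra source; the function names are hypothetical.

```python
DEFAULT_SUPERUSER = "cassandra"


def auth_read_consistency(username):
    """Consistency level used to read credentials, per the comment above."""
    return "QUORUM" if username == DEFAULT_SUPERUSER else "LOCAL_ONE"


def required_responses(consistency, total_replication_factor):
    """Number of replica responses needed for the read to succeed."""
    if consistency == "QUORUM":
        # A majority of all replicas, counted across every datacenter.
        return total_replication_factor // 2 + 1
    if consistency == "LOCAL_ONE":
        # A single replica in the coordinator's own datacenter.
        return 1
    raise ValueError(f"unsupported consistency level: {consistency}")


for user in ("cassandra", "app_user"):  # "app_user" is an invented name
    level = auth_read_consistency(user)
    print(user, level, required_responses(level, total_replication_factor=3))
```

Note that with a replication factor of 1 both levels need exactly one
response, so a single unreachable replica is enough to fail every login for
the affected user.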
> Cassandra is hang and cqlsh was not able to login with OperationTimeout error
> -----------------------------------------------------------------------------
>
> Key: CASSANDRA-12103
> URL: https://issues.apache.org/jira/browse/CASSANDRA-12103
> Project: Cassandra
> Issue Type: Bug
> Components: Core, Local Write-Read Paths
> Environment: centos 6.5 cassandra 2.1.9
> Reporter: peng xiao
> Priority: Critical
> Attachments: system.log.2016-06-28_1257.gz
>
>
> Hi,
> We have two DCs (DC1 and DC2), with 3 nodes in DC1 and 9 nodes in DC2.
> We experienced a timeout error today: all applications connected to DC1
> hung with no response, and even cqlsh was not able to log in to any node in DC1.
> I restarted the 3 nodes in DC1, but the problem was not resolved.
> We then switched to DC2, and the applications returned to normal.
> Could you please take a look?
> Thanks
> There were many errors like the one below:
> ERROR [SharedPool-Worker-43] 2016-06-28 11:58:49,705 Message.java:538 - Unexpected exception during request; channel = [id: 0x87e315d6, /172.16.10.198:13604 => /172.16.11.13:9042]
> java.lang.RuntimeException: org.apache.cassandra.exceptions.ReadTimeoutException: Operation timed out - received only 0 responses.
> at org.apache.cassandra.auth.Auth.selectUser(Auth.java:276) ~[apache-cassandra-2.1.9.jar:2.1.9]
> at org.apache.cassandra.auth.Auth.isExistingUser(Auth.java:86) ~[apache-cassandra-2.1.9.jar:2.1.9]
> at org.apache.cassandra.service.ClientState.login(ClientState.java:206) ~[apache-cassandra-2.1.9.jar:2.1.9]
> at org.apache.cassandra.transport.messages.AuthResponse.execute(AuthResponse.java:82) ~[apache-cassandra-2.1.9.jar:2.1.9]
> at org.apache.cassandra.transport.Message$Dispatcher.channelRead0(Message.java:439) [apache-cassandra-2.1.9.jar:2.1.9]
> at org.apache.cassandra.transport.Message$Dispatcher.channelRead0(Message.java:335) [apache-cassandra-2.1.9.jar:2.1.9]
> at io.netty.channel.SimpleChannelInboundHandler.channelRead(SimpleChannelInboundHandler.java:105) [netty-all-4.0.23.Final.jar:4.0.23.Final]
> at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:333) [netty-all-4.0.23.Final.jar:4.0.23.Final]
> at io.netty.channel.AbstractChannelHandlerContext.access$700(AbstractChannelHandlerContext.java:32) [netty-all-4.0.23.Final.jar:4.0.23.Final]
> at io.netty.channel.AbstractChannelHandlerContext$8.run(AbstractChannelHandlerContext.java:324) [netty-all-4.0.23.Final.jar:4.0.23.Final]
> at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) [na:1.8.0]
> at org.apache.cassandra.concurrent.AbstractTracingAwareExecutorService$FutureTask.run(AbstractTracingAwareExecutorService.java:164) [apache-cassandra-2.1.9.jar:2.1.9]
> at org.apache.cassandra.concurrent.SEPWorker.run(SEPWorker.java:105) [apache-cassandra-2.1.9.jar:2.1.9]
> at java.lang.Thread.run(Thread.java:744) [na:1.8.0]
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)