[
https://issues.apache.org/jira/browse/CASSANDRA-13043?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16149262#comment-16149262
]
Stefano Ortolani edited comment on CASSANDRA-13043 at 9/4/17 10:44 AM:
-----------------------------------------------------------------------
Some updates: I added to ccm the ability to submit a Byteman rule when
restarting a node (https://github.com/ostefano/ccm/tree/startup_byteman).
Previously a rule could only be submitted _after_ the node had started, which
made reproducing this race nearly impossible.
With that I could finally reproduce the bug. Here you can find my branch with a
new dtest exercising the bug:
https://github.com/ostefano/cassandra-dtest/tree/CASSANDRA-13043
Attached you can find a patch that fixes the bug by excluding nodes that are
not RPC ready from leader election. I also fixed the corner case where the
required CL is local and the current DC has no available nodes.
Here you can find the commit:
https://github.com/ostefano/cassandra/commit/91cc9b4398009a3cee3004bc11a047c056fda6a6
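To illustrate the idea behind the patch, here is a minimal, self-contained
sketch. It is not the code from the commit above: the class
LeaderSelectionSketch, the Replica type, and the pickLeader method are all
hypothetical names introduced for this example. It also assumes that, when the
CL is local and the local DC has no usable node, the right behaviour is to
report "no leader" rather than fall back to a remote DC, and it picks the first
eligible node only to stay deterministic.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Optional;

// Hypothetical sketch; none of these names come from the Cassandra code base.
class LeaderSelectionSketch {

    // Simplified view of a replica as a coordinator might see it through gossip.
    static final class Replica {
        final String address;
        final String dc;
        final boolean alive;
        final boolean rpcReady;

        Replica(String address, String dc, boolean alive, boolean rpcReady) {
            this.address = address;
            this.dc = dc;
            this.alive = alive;
            this.rpcReady = rpcReady;
        }
    }

    // Picks a leader for a counter write, or returns empty if none is eligible.
    static Optional<Replica> pickLeader(List<Replica> replicas,
                                        String localDc,
                                        boolean localConsistency) {
        // Keep only nodes that are alive AND RPC ready: a node that is still
        // starting up may not have a complete view of the cluster yet.
        List<Replica> candidates = new ArrayList<>();
        for (Replica r : replicas) {
            if (r.alive && r.rpcReady) {
                candidates.add(r);
            }
        }

        // Prefer a leader in the local DC.
        for (Replica r : candidates) {
            if (r.dc.equals(localDc)) {
                return Optional.of(r);
            }
        }

        // Assumed handling of the corner case: with a local CL, a remote
        // leader cannot satisfy the request, so report that none is available.
        if (localConsistency || candidates.isEmpty()) {
            return Optional.empty();
        }

        // Otherwise fall back to the first eligible remote node.
        return Optional.of(candidates.get(0));
    }
}
```

With this sketch, a node that has just restarted (alive, but not yet RPC ready)
is never chosen as leader, so a counter write is not forwarded to a node that
might still consider most of its peers down.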
was (Author: ostefano):
Some updates: added to ccm the ability to send a byteman rule when restarting a
node (https://github.com/ostefano/ccm/tree/startup_byteman).
This allowed me to finally reproduce the bug:
https://github.com/ostefano/cassandra-dtest/tree/CASSANDRA-13043
Attached a patch that tries to fix it by removing nodes that are not RPC ready
from leader election.
Also fixed the corner case where the required CL is local.
> UnavailableException caused by counter writes forwarded to leaders without
> complete cluster view
> -----------------------------------------------------------------------------------------------
>
> Key: CASSANDRA-13043
> URL: https://issues.apache.org/jira/browse/CASSANDRA-13043
> Project: Cassandra
> Issue Type: Bug
> Components: Core
> Environment: Debian
> Reporter: Catalin Alexandru Zamfir
> Attachments: patch.diff
>
>
> In version 3.9 of Cassandra, we get the following exceptions in
> system.log whenever booting an agent. They seem to grow in number with each
> reboot. Any idea where they come from or what we can do about them? Note that
> the cluster is healthy (has sufficient live nodes).
> {noformat}
> 12/14/2016 12:39:47 PMINFO 10:39:47 Updating topology for /10.136.64.120
> 12/14/2016 12:39:47 PMINFO 10:39:47 Updating topology for /10.136.64.120
> 12/14/2016 12:39:47 PMWARN 10:39:47 Uncaught exception on thread
> Thread[CounterMutationStage-111,5,main]: {}
> 12/14/2016 12:39:47 PMorg.apache.cassandra.exceptions.UnavailableException:
> Cannot achieve consistency level LOCAL_QUORUM
> 12/14/2016 12:39:47 PM at
> org.apache.cassandra.db.ConsistencyLevel.assureSufficientLiveNodes(ConsistencyLevel.java:313)
> ~[apache-cassandra-3.9.jar:3.9]
> 12/14/2016 12:39:47 PM at
> org.apache.cassandra.service.AbstractWriteResponseHandler.assureSufficientLiveNodes(AbstractWriteResponseHandler.java:146)
> ~[apache-cassandra-3.9.jar:3.9]
> 12/14/2016 12:39:47 PM at
> org.apache.cassandra.service.StorageProxy.performWrite(StorageProxy.java:1054)
> ~[apache-cassandra-3.9.jar:3.9]
> 12/14/2016 12:39:47 PM at
> org.apache.cassandra.service.StorageProxy.applyCounterMutationOnLeader(StorageProxy.java:1450)
> ~[apache-cassandra-3.9.jar:3.9]
> 12/14/2016 12:39:47 PM at
> org.apache.cassandra.db.CounterMutationVerbHandler.doVerb(CounterMutationVerbHandler.java:48)
> ~[apache-cassandra-3.9.jar:3.9]
> 12/14/2016 12:39:47 PM at
> org.apache.cassandra.net.MessageDeliveryTask.run(MessageDeliveryTask.java:64)
> ~[apache-cassandra-3.9.jar:3.9]
> 12/14/2016 12:39:47 PM at
> java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
> ~[na:1.8.0_111]
> 12/14/2016 12:39:47 PM at
> org.apache.cassandra.concurrent.AbstractLocalAwareExecutorService$FutureTask.run(AbstractLocalAwareExecutorService.java:164)
> ~[apache-cassandra-3.9.jar:3.9]
> 12/14/2016 12:39:47 PM at
> org.apache.cassandra.concurrent.AbstractLocalAwareExecutorService$LocalSessionFutureTask.run(AbstractLocalAwareExecutorService.java:136)
> [apache-cassandra-3.9.jar:3.9]
> 12/14/2016 12:39:47 PM at
> org.apache.cassandra.concurrent.SEPWorker.run(SEPWorker.java:109)
> [apache-cassandra-3.9.jar:3.9]
> 12/14/2016 12:39:47 PM at java.lang.Thread.run(Thread.java:745)
> [na:1.8.0_111]
> 12/14/2016 12:39:47 PMWARN 10:39:47 Uncaught exception on thread
> Thread[CounterMutationStage-118,5,main]: {}
> 12/14/2016 12:39:47 PMorg.apache.cassandra.exceptions.UnavailableException:
> Cannot achieve consistency level LOCAL_QUORUM
> 12/14/2016 12:39:47 PM at
> org.apache.cassandra.db.ConsistencyLevel.assureSufficientLiveNodes(ConsistencyLevel.java:313)
> ~[apache-cassandra-3.9.jar:3.9]
> 12/14/2016 12:39:47 PM at
> org.apache.cassandra.service.AbstractWriteResponseHandler.assureSufficientLiveNodes(AbstractWriteResponseHandler.java:146)
> ~[apache-cassandra-3.9.jar:3.9]
> 12/14/2016 12:39:47 PM at
> org.apache.cassandra.service.StorageProxy.performWrite(StorageProxy.java:1054)
> ~[apache-cassandra-3.9.jar:3.9]
> 12/14/2016 12:39:47 PM at
> org.apache.cassandra.service.StorageProxy.applyCounterMutationOnLeader(StorageProxy.java:1450)
> ~[apache-cassandra-3.9.jar:3.9]
> 12/14/2016 12:39:47 PM at
> org.apache.cassandra.db.CounterMutationVerbHandler.doVerb(CounterMutationVerbHandler.java:48)
> ~[apache-cassandra-3.9.jar:3.9]
> 12/14/2016 12:39:47 PM at
> org.apache.cassandra.net.MessageDeliveryTask.run(MessageDeliveryTask.java:64)
> ~[apache-cassandra-3.9.jar:3.9]
> 12/14/2016 12:39:47 PM at
> java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
> ~[na:1.8.0_111]
> 12/14/2016 12:39:47 PM at
> org.apache.cassandra.concurrent.AbstractLocalAwareExecutorService$FutureTask.run(AbstractLocalAwareExecutorService.java:164)
> ~[apache-cassandra-3.9.jar:3.9]
> 12/14/2016 12:39:47 PM at
> org.apache.cassandra.concurrent.AbstractLocalAwareExecutorService$LocalSessionFutureTask.run(AbstractLocalAwareExecutorService.java:136)
> [apache-cassandra-3.9.jar:3.9]
> 12/14/2016 12:39:47 PM at
> org.apache.cassandra.concurrent.SEPWorker.run(SEPWorker.java:109)
> [apache-cassandra-3.9.jar:3.9]
> 12/14/2016 12:39:47 PM at java.lang.Thread.run(Thread.java:745)
> [na:1.8.0_111]
> 12/14/2016 12:39:47 PMWARN 10:39:47 Uncaught exception on thread
> Thread[CounterMutationStage-164,5,main]: {}
> 12/14/2016 12:39:47 PMorg.apache.cassandra.exceptions.UnavailableException:
> Cannot achieve consistency level LOCAL_QUORUM
> 12/14/2016 12:39:47 PM at
> org.apache.cassandra.db.ConsistencyLevel.assureSufficientLiveNodes(ConsistencyLevel.java:313)
> ~[apache-cassandra-3.9.jar:3.9]
> 12/14/2016 12:39:47 PM at
> org.apache.cassandra.service.AbstractWriteResponseHandler.assureSufficientLiveNodes(AbstractWriteResponseHandler.java:146)
> ~[apache-cassandra-3.9.jar:3.9]
> 12/14/2016 12:39:47 PM at
> org.apache.cassandra.service.StorageProxy.performWrite(StorageProxy.java:1054)
> ~[apache-cassandra-3.9.jar:3.9]
> 12/14/2016 12:39:47 PM at
> org.apache.cassandra.service.StorageProxy.applyCounterMutationOnLeader(StorageProxy.java:1450)
> ~[apache-cassandra-3.9.jar:3.9]
> 12/14/2016 12:39:47 PM at
> org.apache.cassandra.db.CounterMutationVerbHandler.doVerb(CounterMutationVerbHandler.java:48)
> ~[apache-cassandra-3.9.jar:3.9]
> 12/14/2016 12:39:47 PM at
> org.apache.cassandra.net.MessageDeliveryTask.run(MessageDeliveryTask.java:64)
> ~[apache-cassandra-3.9.jar:3.9]
> 12/14/2016 12:39:47 PM at
> java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
> ~[na:1.8.0_111]
> 12/14/2016 12:39:47 PM at
> org.apache.cassandra.concurrent.AbstractLocalAwareExecutorService$FutureTask.run(AbstractLocalAwareExecutorService.java:164)
> ~[apache-cassandra-3.9.jar:3.9]
> 12/14/2016 12:39:47 PM at
> org.apache.cassandra.concurrent.AbstractLocalAwareExecutorService$LocalSessionFutureTask.run(AbstractLocalAwareExecutorService.java:136)
> [apache-cassandra-3.9.jar:3.9]
> 12/14/2016 12:39:47 PM at
> org.apache.cassandra.concurrent.SEPWorker.run(SEPWorker.java:109)
> [apache-cassandra-3.9.jar:3.9]
> 12/14/2016 12:39:47 PM at java.lang.Thread.run(Thread.java:745)
> [na:1.8.0_111]
> 12/14/2016 12:39:47 PMWARN 10:39:47 Uncaught exception on thread
> Thread[CounterMutationStage-117,5,main]: {}
> 12/14/2016 12:39:47 PMorg.apache.cassandra.exceptions.UnavailableException:
> Cannot achieve consistency level LOCAL_QUORUM
> 12/14/2016 12:39:47 PM at
> org.apache.cassandra.db.ConsistencyLevel.assureSufficientLiveNodes(ConsistencyLevel.java:313)
> ~[apache-cassandra-3.9.jar:3.9]
> 12/14/2016 12:39:47 PM at
> org.apache.cassandra.service.AbstractWriteResponseHandler.assureSufficientLiveNodes(AbstractWriteResponseHandler.java:146)
> ~[apache-cassandra-3.9.jar:3.9]
> 12/14/2016 12:39:47 PM at
> org.apache.cassandra.service.StorageProxy.performWrite(StorageProxy.java:1054)
> ~[apache-cassandra-3.9.jar:3.9]
> 12/14/2016 12:39:47 PM at
> org.apache.cassandra.service.StorageProxy.applyCounterMutationOnLeader(StorageProxy.java:1450)
> ~[apache-cassandra-3.9.jar:3.9]
> 12/14/2016 12:39:47 PM at
> org.apache.cassandra.db.CounterMutationVerbHandler.doVerb(CounterMutationVerbHandler.java:48)
> ~[apache-cassandra-3.9.jar:3.9]
> 12/14/2016 12:39:47 PM at
> org.apache.cassandra.net.MessageDeliveryTask.run(MessageDeliveryTask.java:64)
> ~[apache-cassandra-3.9.jar:3.9]
> 12/14/2016 12:39:47 PM at
> java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
> ~[na:1.8.0_111]
> 12/14/2016 12:39:47 PM at
> org.apache.cassandra.concurrent.AbstractLocalAwareExecutorService$FutureTask.run(AbstractLocalAwareExecutorService.java:164)
> ~[apache-cassandra-3.9.jar:3.9]
> 12/14/2016 12:39:47 PM at
> org.apache.cassandra.concurrent.AbstractLocalAwareExecutorService$LocalSessionFutureTask.run(AbstractLocalAwareExecutorService.java:136)
> [apache-cassandra-3.9.jar:3.9]
> 12/14/2016 12:39:47 PM at
> org.apache.cassandra.concurrent.SEPWorker.run(SEPWorker.java:109)
> [apache-cassandra-3.9.jar:3.9]
> 12/14/2016 12:39:47 PM at java.lang.Thread.run(Thread.java:745)
> [na:1.8.0_111]
> {noformat}