[jira] [Comment Edited] (CASSANDRA-13043) UnavailabeException caused by counter writes forwarded to leaders without complete cluster view
[ https://issues.apache.org/jira/browse/CASSANDRA-13043?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16167923#comment-16167923 ]

Stefano Ortolani edited comment on CASSANDRA-13043 at 9/15/17 2:19 PM:
---

Thanks a lot for the feedback! I am submitting a new patch: [^13043-3.0.patch]

was (Author: ostefano):
Thanks a lot for the feedback! I am submitting a new patch.

> UnavailabeException caused by counter writes forwarded to leaders without
> complete cluster view
> ---
>
>                 Key: CASSANDRA-13043
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-13043
>             Project: Cassandra
>          Issue Type: Bug
>          Components: Coordination
>         Environment: Debian
>            Reporter: Catalin Alexandru Zamfir
>            Assignee: Stefano Ortolani
>            Priority: Minor
>             Fix For: 3.0.x, 3.11.x
>
>         Attachments: 13043-3.0.patch, patch.diff
>
>
> In version 3.9 of Cassandra, we get the following exceptions in the
> system.log whenever booting an agent. They seem to grow in number with each
> reboot. Any idea where they come from or what we can do about them? Note that
> the cluster is healthy (has sufficient live nodes).
> {noformat}
> 12/14/2016 12:39:47 PM INFO  10:39:47 Updating topology for /10.136.64.120
> 12/14/2016 12:39:47 PM INFO  10:39:47 Updating topology for /10.136.64.120
> 12/14/2016 12:39:47 PM WARN  10:39:47 Uncaught exception on thread Thread[CounterMutationStage-111,5,main]: {}
> 12/14/2016 12:39:47 PM org.apache.cassandra.exceptions.UnavailableException: Cannot achieve consistency level LOCAL_QUORUM
> 12/14/2016 12:39:47 PM     at org.apache.cassandra.db.ConsistencyLevel.assureSufficientLiveNodes(ConsistencyLevel.java:313) ~[apache-cassandra-3.9.jar:3.9]
> 12/14/2016 12:39:47 PM     at org.apache.cassandra.service.AbstractWriteResponseHandler.assureSufficientLiveNodes(AbstractWriteResponseHandler.java:146) ~[apache-cassandra-3.9.jar:3.9]
> 12/14/2016 12:39:47 PM     at org.apache.cassandra.service.StorageProxy.performWrite(StorageProxy.java:1054) ~[apache-cassandra-3.9.jar:3.9]
> 12/14/2016 12:39:47 PM     at org.apache.cassandra.service.StorageProxy.applyCounterMutationOnLeader(StorageProxy.java:1450) ~[apache-cassandra-3.9.jar:3.9]
> 12/14/2016 12:39:47 PM     at org.apache.cassandra.db.CounterMutationVerbHandler.doVerb(CounterMutationVerbHandler.java:48) ~[apache-cassandra-3.9.jar:3.9]
> 12/14/2016 12:39:47 PM     at org.apache.cassandra.net.MessageDeliveryTask.run(MessageDeliveryTask.java:64) ~[apache-cassandra-3.9.jar:3.9]
> 12/14/2016 12:39:47 PM     at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) ~[na:1.8.0_111]
> 12/14/2016 12:39:47 PM     at org.apache.cassandra.concurrent.AbstractLocalAwareExecutorService$FutureTask.run(AbstractLocalAwareExecutorService.java:164) ~[apache-cassandra-3.9.jar:3.9]
> 12/14/2016 12:39:47 PM     at org.apache.cassandra.concurrent.AbstractLocalAwareExecutorService$LocalSessionFutureTask.run(AbstractLocalAwareExecutorService.java:136) [apache-cassandra-3.9.jar:3.9]
> 12/14/2016 12:39:47 PM     at org.apache.cassandra.concurrent.SEPWorker.run(SEPWorker.java:109) [apache-cassandra-3.9.jar:3.9]
> 12/14/2016 12:39:47 PM     at java.lang.Thread.run(Thread.java:745) [na:1.8.0_111]
> 12/14/2016 12:39:47 PM WARN  10:39:47 Uncaught exception on thread Thread[CounterMutationStage-118,5,main]: {}
> 12/14/2016 12:39:47 PM org.apache.cassandra.exceptions.UnavailableException: Cannot achieve consistency level LOCAL_QUORUM
> 12/14/2016 12:39:47 PM     at org.apache.cassandra.db.ConsistencyLevel.assureSufficientLiveNodes(ConsistencyLevel.java:313) ~[apache-cassandra-3.9.jar:3.9]
> 12/14/2016 12:39:47 PM     at org.apache.cassandra.service.AbstractWriteResponseHandler.assureSufficientLiveNodes(AbstractWriteResponseHandler.java:146) ~[apache-cassandra-3.9.jar:3.9]
> 12/14/2016 12:39:47 PM     at org.apache.cassandra.service.StorageProxy.performWrite(StorageProxy.java:1054) ~[apache-cassandra-3.9.jar:3.9]
> 12/14/2016 12:39:47 PM     at org.apache.cassandra.service.StorageProxy.applyCounterMutationOnLeader(StorageProxy.java:1450) ~[apache-cassandra-3.9.jar:3.9]
> 12/14/2016 12:39:47 PM     at org.apache.cassandra.db.CounterMutationVerbHandler.doVerb(CounterMutationVerbHandler.java:48) ~[apache-cassandra-3.9.jar:3.9]
> 12/14/2016 12:39:47 PM     at org.apache.cassandra.net.MessageDeliveryTask.run(MessageDeliveryTask.java:64) ~[apache-cassandra-3.9.jar:3.9]
> 12/14/2016 12:39:47 PM     at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) ~[na:1.8.0_111]
> 12/14/2016 12:39:47 PM     at
[jira] [Comment Edited] (CASSANDRA-13043) UnavailabeException caused by counter writes forwarded to leaders without complete cluster view
[ https://issues.apache.org/jira/browse/CASSANDRA-13043?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16149262#comment-16149262 ]

Stefano Ortolani edited comment on CASSANDRA-13043 at 9/4/17 10:58 AM:
---

Some updates: I added to ccm the ability to send a byteman rule when restarting a node (https://github.com/ostefano/ccm/tree/startup_byteman). Originally it was only possible to submit a rule _after_ the node had started, which made reproducing this race nearly impossible. With that I could finally reproduce the bug. Here you can find my branch with a new DTEST exercising the bug: https://github.com/ostefano/cassandra-dtest/tree/CASSANDRA-13043

Attached you can find a patch fixing the bug by removing nodes that are not RPC ready from the leader election. I also fixed the corner case where the required CL is DC-local and the current DC has no available nodes. Here you can find the commit: https://github.com/ostefano/cassandra/commit/91cc9b4398009a3cee3004bc11a047c056fda6a6

Update: unit tests are passing.

was (Author: ostefano):
Some updates: I added to ccm the ability to send a byteman rule when restarting a node (https://github.com/ostefano/ccm/tree/startup_byteman). Originally it was only possible to submit a rule _after_ the node had started, which made reproducing this race nearly impossible. With that I could finally reproduce the bug. Here you can find my branch with a new DTEST exercising the bug: https://github.com/ostefano/cassandra-dtest/tree/CASSANDRA-13043

Attached you can find a patch fixing the bug by removing nodes that are not RPC ready from the leader election. I also fixed the corner case where the required CL is DC-local and the current DC has no available nodes.
Here you can find the commit: https://github.com/ostefano/cassandra/commit/91cc9b4398009a3cee3004bc11a047c056fda6a6
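The fix described in the comment above — excluding nodes that gossip considers alive but that are not yet RPC ready from counter-leader election, with a fallback when the DC-local pool is empty — can be illustrated with a simplified, self-contained model. `Replica` and `pick_counter_leader` below are hypothetical stand-ins for the purpose of this sketch, not Cassandra's actual classes or API:

```python
import random

class Replica:
    """Stand-in for a replica endpoint (hypothetical, not Cassandra's API)."""
    def __init__(self, address, is_alive, is_rpc_ready, dc):
        self.address = address
        self.is_alive = is_alive          # alive according to gossip
        self.is_rpc_ready = is_rpc_ready  # actually ready to serve requests
        self.dc = dc

def pick_counter_leader(replicas, local_dc):
    """Pick a leader for a counter write, skipping nodes that gossip marks
    alive but that are not yet RPC ready (the race behind this ticket).
    Prefer a local-DC leader when one is eligible."""
    candidates = [r for r in replicas if r.is_alive and r.is_rpc_ready]
    if not candidates:
        raise RuntimeError("UnavailableException: no eligible counter leader")
    local = [r for r in candidates if r.dc == local_dc]
    # Corner case from the patch: if the local DC has no eligible nodes,
    # fall back to the remaining candidates instead of electing a node
    # that cannot serve the write.
    pool = local if local else candidates
    return random.choice(pool)
```

A restarting node that is gossip-alive but not yet RPC ready is therefore never chosen as leader, so the forwarded counter write cannot land on a node without a complete cluster view.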
[jira] [Comment Edited] (CASSANDRA-13043) UnavailabeException caused by counter writes forwarded to leaders without complete cluster view
[ https://issues.apache.org/jira/browse/CASSANDRA-13043?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16149262#comment-16149262 ]

Stefano Ortolani edited comment on CASSANDRA-13043 at 9/4/17 10:44 AM:
---

Some updates: I added to ccm the ability to send a byteman rule when restarting a node (https://github.com/ostefano/ccm/tree/startup_byteman). Originally it was only possible to submit a rule _after_ the node had started, which made reproducing this race nearly impossible. With that I could finally reproduce the bug. Here you can find my branch with a new DTEST exercising the bug: https://github.com/ostefano/cassandra-dtest/tree/CASSANDRA-13043

Attached you can find a patch fixing the bug by removing nodes that are not RPC ready from the leader election. I also fixed the corner case where the required CL is DC-local and the current DC has no available nodes. Here you can find the commit: https://github.com/ostefano/cassandra/commit/91cc9b4398009a3cee3004bc11a047c056fda6a6

was (Author: ostefano):
Some updates: I added to ccm the ability to send a byteman rule when restarting a node (https://github.com/ostefano/ccm/tree/startup_byteman). This allowed me to finally reproduce the bug: https://github.com/ostefano/cassandra-dtest/tree/CASSANDRA-13043 Attached is a patch that tries to fix it by removing from the leader election nodes that are not RPC ready. Also fixed the corner case where the required CL is DC-local.
[jira] [Comment Edited] (CASSANDRA-13043) UnavailabeException caused by counter writes forwarded to leaders without complete cluster view
[ https://issues.apache.org/jira/browse/CASSANDRA-13043?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16149262#comment-16149262 ]

Stefano Ortolani edited comment on CASSANDRA-13043 at 9/3/17 3:55 PM:
---

Some updates: I added to ccm the ability to send a byteman rule when restarting a node (https://github.com/ostefano/ccm/tree/startup_byteman). This allowed me to finally reproduce the bug: https://github.com/ostefano/cassandra-dtest/tree/CASSANDRA-13043 Attached is a patch that tries to fix it by removing from the leader election nodes that are not RPC ready. Also fixed the corner case where the required CL is DC-local.

was (Author: ostefano):
Some updates:
* Added to ccm the ability to send a byteman rule when restarting a node (https://github.com/ostefano/ccm/tree/startup_byteman).
* Instead of trying to slow down the gossip, I now instruct the other two nodes to pick the restarting node as leader.

This allowed me to finally reproduce the bug: https://github.com/ostefano/cassandra-dtest/tree/CASSANDRA-13043

The way I plan to fix it is to make `assureSufficientLiveNodes` wait for the gossip to settle. What do you think, [~iamaleksey]? Would that be a sound approach to fix it?
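The alternative approach floated in the comment above — making `assureSufficientLiveNodes` wait for the gossip-derived view to settle instead of failing immediately — could look roughly like the retry loop below. This is an illustrative sketch only: `get_live_count`, the polling parameters, and the injectable `sleep` are assumptions, not the actual Cassandra code.

```python
import time

class UnavailableError(Exception):
    """Analogue of UnavailableException for this sketch."""

def assure_sufficient_live_nodes(get_live_count, required,
                                 timeout_s=5.0, poll_s=0.1, sleep=time.sleep):
    """Retry the liveness check until the current view satisfies the
    required consistency level, or give up after timeout_s.

    get_live_count: callable returning the number of live replicas seen
    right now (illustrative; the real check inspects per-replica state).
    """
    deadline = time.monotonic() + timeout_s
    while True:
        live = get_live_count()
        if live >= required:
            return live
        if time.monotonic() >= deadline:
            # View never settled in time: surface the unavailability.
            raise UnavailableError(
                f"Cannot achieve consistency level: {live}/{required} live")
        sleep(poll_s)
```

A drawback of this design, and plausibly why the shipped patch filters non-RPC-ready nodes out of leader election instead, is that blocking on the hot write path adds latency for every write issued during the window rather than avoiding the bad leader choice up front.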
[jira] [Comment Edited] (CASSANDRA-13043) UnavailabeException caused by counter writes forwarded to leaders without complete cluster view
[ https://issues.apache.org/jira/browse/CASSANDRA-13043?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16149262#comment-16149262 ]

Stefano Ortolani edited comment on CASSANDRA-13043 at 8/31/17 5:09 PM:
---

Some updates:
* Added to ccm the ability to send a byteman rule when restarting a node (https://github.com/ostefano/ccm/tree/startup_byteman).
* Instead of trying to slow down the gossip, I now instruct the other two nodes to pick the restarting node as leader.

This allowed me to finally reproduce the bug: https://github.com/ostefano/cassandra-dtest/tree/CASSANDRA-13043

The way I plan to fix it is to make `assureSufficientLiveNodes` wait for the gossip to settle. What do you think, [~iamaleksey]? Would that be a sound approach to fix it?

was (Author: ostefano):
Some updates:
* Added to ccm the ability to send a byteman rule when restarting a node (https://github.com/ostefano/ccm/tree/startup_byteman).
* Instead of trying to slow down the gossip, I now instruct the other two nodes to pick the restarting node as leader.

This allowed me to finally reproduce the bug: https://github.com/ostefano/cassandra-dtest/tree/CASSANDRA-13043

The way I plan to fix it is to make `assureSufficientLiveNodes` wait for the gossip to settle. What do you think, [~iamaleksey]? Would that work?
[jira] [Comment Edited] (CASSANDRA-13043) UnavailabeException caused by counter writes forwarded to leaders without complete cluster view
[ https://issues.apache.org/jira/browse/CASSANDRA-13043?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16141592#comment-16141592 ]

Stefano Ortolani edited comment on CASSANDRA-13043 at 8/25/17 1:16 PM:
---

Hi [~iamaleksey], I am having some difficulty reproducing it. I am bootstrapping 3 nodes via cassandra-dtest and updating the same counter three times, in three different phases. During each phase I pick a node, stop it, and start it without waiting ({{no_wait=True, wait_other_notice=False}}), all while I keep inserting. Unfortunately, no luck.

I have also been trying byteman, with the idea of slowing down how quickly the starting node becomes aware of the topology. Unfortunately, it seems I can either start the node without waiting or submit a rule with byteman, but not both: if I don't wait for the node to start, the rule submission fails, while if I wait, the rule doesn't trigger because the node is already up and running. Any suggestions on how to proceed?

was (Author: ostefano):
Hi [~iamaleksey], I am having some difficulty reproducing it. I am bootstrapping 3 nodes via cassandra-dtest and updating the same counter three times, in three different phases. During each phase I pick a node, stop it, and start it without waiting (`no_wait=True, wait_other_notice=False`), all while I keep inserting. Unfortunately, no luck.

I have also been trying byteman, with the idea of slowing down how quickly the starting node becomes aware of the topology. Unfortunately, it seems I can either start the node without waiting or submit a rule with byteman, but not both: if I don't wait for the node to start, the rule submission fails, while if I wait, the rule doesn't trigger because the node is already up and running. Any suggestions on how to proceed?
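The reproduction strategy described in the comment above can be outlined as a loop over bounce phases. The `StubNode` class below is a stand-in for a ccm node handle so the sketch stays self-contained; a real dtest would use the cluster's actual node objects with `node.stop()` and `node.start(no_wait=True, wait_other_notice=False)` while a driver session keeps issuing counter updates.

```python
class StubNode:
    """Stand-in for a ccm node handle (illustrative only)."""
    def __init__(self, name):
        self.name = name
        self.running = True

    def stop(self):
        self.running = False

    def start(self, no_wait=False, wait_other_notice=True):
        # With no_wait=True the call returns before the node has a
        # complete cluster view -- the window this race needs.
        self.running = True

def run_phases(nodes, update_counter, updates_per_phase=10):
    """One phase per node: bounce it without waiting, while the same
    counter keeps being updated from the rest of the cluster."""
    for node in nodes:
        node.stop()
        node.start(no_wait=True, wait_other_notice=False)
        for _ in range(updates_per_phase):
            update_counter()  # in a real dtest: an UPDATE on the counter
```

As the later comments note, this alone was not enough to hit the race; the missing ingredient was a byteman rule, injected at startup via the ccm branch above, steering the other nodes to elect the freshly restarted node as counter leader.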