[ 
https://issues.apache.org/jira/browse/CASSANDRA-15529?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ankitha updated CASSANDRA-15529:
--------------------------------
    Description: 
Hello Team, 

For background on this problem, kindly refer to this ticket: 
https://issues.apache.org/jira/browse/CASSANDRA-15263

It was proposed that after upgrading the cluster from 2.1.16 to 3.11.4, we 
should no longer see exceptions of any kind (WARN/ERROR). 

But even a month after the upgrade, we still see the exceptions below:
{code:java}
WARN [ReadStage-231] 2020-01-23 02:29:11,137 
AbstractLocalAwareExecutorService.java:167 - Uncaught exception on thread 
Thread[ReadStage-231,5,main]: {}
 at 
org.apache.cassandra.concurrent.AbstractLocalAwareExecutorService$FutureTask.run(AbstractLocalAwareExecutorService.java:162)
 ~[apache-cassandra-3.11.4.jar:3.11.4]
 at 
org.apache.cassandra.concurrent.AbstractLocalAwareExecutorService$LocalSessionFutureTask.run(AbstractLocalAwareExecutorService.java:134)
 [apache-cassandra-3.11.4.jar:3.11.4]
  
{code}
*and*
{code:java}
WARN [MutationStage-36] 2020-01-21 19:31:03,343 
AbstractLocalAwareExecutorService.java:167 - Uncaught exception on thread 
Thread[MutationStage-36,5,main]: {}
 at 
org.apache.cassandra.concurrent.AbstractLocalAwareExecutorService$FutureTask.run(AbstractLocalAwareExecutorService.java:162)
 ~[apache-cassandra-3.11.4.jar:3.11.4]
 at 
org.apache.cassandra.concurrent.AbstractLocalAwareExecutorService$LocalSessionFutureTask.run(AbstractLocalAwareExecutorService.java:134)
 [apache-cassandra-3.11.4.jar:3.11.4]
{code}
*Problem:*

If one of the nodes goes down and we try to bring it back up, the node hangs 
while reading the key cache. Sometimes the node comes up after a long pause, 
with the *Mutation-stage* warnings shown above. Other times, the node does not 
come up at all and we have to manually clear the key cache to start it.
 Also, once the node is up, we see a lot of the *Read-Stage* warnings shown above.

In some cases, such as the one below, gossip shuts down and the node gets stuck 
when we try to bring it back up:
{code:java}
[StorageServiceShutdownHook] 2020-01-24 02:19:30,586 HintsService.java:209 - 
Paused hints dispatch
 INFO [HintsDispatcher:14410] 2020-01-24 02:19:30,593 
HintsDispatchExecutor.java:289 - Finished hinted handoff of file 
594f065a-3134-4a39-b00d-b87b0e4625ff-1579831570632-1.hints to endpoint 
/10.177.56.125: 594f065a-3134-4a39-b00d-b87b0e4625ff, partially
 INFO [StorageServiceShutdownHook] 2020-01-24 02:19:30,623 Server.java:176 - 
Stop listening for CQL clients
 INFO [StorageServiceShutdownHook] 2020-01-24 02:19:30,624 Gossiper.java:1551 - 
Announcing shutdown
 INFO [StorageServiceShutdownHook] 2020-01-24 02:19:30,625 
StorageService.java:2327 - Node ont-dce-cass-sal05-priv/10.103.56.25 state jump 
to shutdown
 INFO [HintsDispatcher:14411] 2020-01-24 02:19:30,682 
HintsDispatchExecutor.java:289 - Finished hinted handoff of file 
b224b069-5b16-42f9-971b-6ae8f8bbcf23-1579829423673-1.hints to endpoint 
/10.177.56.116: b224b069-5b16-42f9-971b-6ae8f8bbcf23, partially
 INFO [StorageServiceShutdownHook] 2020-01-24 02:19:32,629 
MessagingService.java:981 - Waiting for messaging service to quiesce
 INFO [ACCEPT-ont-dce-cass-sal05-priv/10.103.56.25] 2020-01-24 02:19:32,631 
MessagingService.java:1336 - MessagingService has terminated the accept() thread
 INFO [main] 2020-01-24 02:20:06,214 YamlConfigurationLoader.java:89 - 
Configuration location: 
file:/opt/cass/apache-cassandra-3.11.4/conf/cassandra.yaml
 WARN [MutationStage-12] 2020-01-24 02:24:57,608 
AbstractLocalAwareExecutorService.java:167 - Uncaught exception on thread 
Thread[MutationStage-12,5,main]: {}
 java.lang.AssertionError: null
 at 
org.apache.cassandra.utils.memory.AbstractAllocator.clone(AbstractAllocator.java:35)
 ~[apache-cassandra-3.11.4.jar:3.11.4]
 at 
org.apache.cassandra.db.RangeTombstoneList.clone(RangeTombstoneList.java:130) 
~[apache-cassandra-3.11.4.jar:3.11.4]
 at 
org.apache.cassandra.db.RangeTombstoneList.copy(RangeTombstoneList.java:119) 
~[apache-cassandra-3.11.4.jar:3.11.4]
 at 
org.apache.cassandra.db.MutableDeletionInfo.copy(MutableDeletionInfo.java:90) 
~[apache-cassandra-3.11.4.jar:3.11.4]
 at 
org.apache.cassandra.db.MutableDeletionInfo.copy(MutableDeletionInfo.java:33) 
~[apache-cassandra-3.11.4.jar:3.11.4]
 at 
org.apache.cassandra.db.partitions.AtomicBTreePartition.addAllWithSizeDelta(AtomicBTreePartition.java:141)
 ~[apache-cassandra-3.11.4.jar:3.11.4]
 at org.apache.cassandra.db.Memtable.put(Memtable.java:282) 
~[apache-cassandra-3.11.4.jar:3.11.4]
 at 
org.apache.cassandra.db.ColumnFamilyStore.apply(ColumnFamilyStore.java:1352) 
~[apache-cassandra-3.11.4.jar:3.11.4]
 at org.apache.cassandra.db.Keyspace.applyInternal(Keyspace.java:626) 
~[apache-cassandra-3.11.4.jar:3.11.4]
 at org.apache.cassandra.db.Keyspace.apply(Keyspace.java:470) 
~[apache-cassandra-3.11.4.jar:3.11.4]
 at 
org.apache.cassandra.db.commitlog.CommitLogReplayer$MutationInitiator$1.runMayThrow(CommitLogReplayer.java:224)
 ~[apache-cassandra-3.11.4.jar:3.11.4]
 at org.apache.cassandra.utils.WrappedRunnable.run(WrappedRunnable.java:28) 
~[apache-cassandra-3.11.4.jar:3.11.4]
 at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) 
~[na:1.8.0-internal]
 at 
org.apache.cassandra.concurrent.AbstractLocalAwareExecutorService$FutureTask.run(AbstractLocalAwareExecutorService.java:162)
 ~[apache-cassandra-3.11.4.jar:3.11.4]
 at org.apache.cassandra.concurrent.SEPWorker.run(SEPWorker.java:114) 
[apache-cassandra-3.11.4.jar:3.11.4]
 at java.lang.Thread.run(Thread.java:748) [na:1.8.0-internal]
 ERROR [main] 2020-01-24 02:24:57,611 CassandraDaemon.java:749 - Exception 
encountered during startup
 java.lang.RuntimeException: java.util.concurrent.ExecutionException: 
java.lang.AssertionError{code}
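
To put a number on how often these warnings occur per stage, a small log scan can help. The following is only a rough sketch and not part of Cassandra: the class name `UncaughtExceptionCounter` and the regular expression are assumptions based on the log layout in the excerpts above (a `WARN [<Stage>-<n>]` header, a date and time, then `AbstractLocalAwareExecutorService.java:<line> - Uncaught exception`). It treats any whitespace, including line breaks, as a separator, so wrapped log lines still match.

```java
import java.util.Map;
import java.util.TreeMap;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class UncaughtExceptionCounter {
    // Assumed layout: WARN [Stage-N] <date> <time> AbstractLocalAwareExecutorService.java:<line> - Uncaught exception
    // \s+ also matches newlines, so entries wrapped over several lines are still found.
    private static final Pattern UNCAUGHT = Pattern.compile(
            "WARN\\s+\\[([A-Za-z]+)-\\d+\\]\\s+\\S+\\s+\\S+\\s+"
            + "AbstractLocalAwareExecutorService\\.java:\\d+\\s+-\\s+Uncaught\\s+exception");

    // Returns the number of matching warnings per stage name (e.g. ReadStage, MutationStage).
    public static Map<String, Integer> countByStage(String logText) {
        Map<String, Integer> counts = new TreeMap<>();
        Matcher m = UNCAUGHT.matcher(logText);
        while (m.find()) {
            counts.merge(m.group(1), 1, Integer::sum);
        }
        return counts;
    }

    public static void main(String[] args) {
        // Sample text copied from the excerpts in this ticket.
        String sample =
                "WARN [ReadStage-231] 2020-01-23 02:29:11,137 \n"
                + "AbstractLocalAwareExecutorService.java:167 - Uncaught exception on thread \n"
                + "Thread[ReadStage-231,5,main]: {}\n"
                + "WARN [MutationStage-36] 2020-01-21 19:31:03,343 \n"
                + "AbstractLocalAwareExecutorService.java:167 - Uncaught exception on thread \n"
                + "Thread[MutationStage-36,5,main]: {}\n";
        System.out.println(countByStage(sample));
    }
}
```

In practice the input would be the contents of the node's system.log read into a string; the per-stage totals could then be compared across nodes or before and after clearing the key cache.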


> AbstractLocalAwareExecutorService.java exceptions after upgrade from 2.1.16 
> to 3.11.4
> -------------------------------------------------------------------------------------
>
>                 Key: CASSANDRA-15529
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-15529
>             Project: Cassandra
>          Issue Type: Bug
>          Components: Cluster/Schema
>            Reporter: Pooja Nair
>            Priority: Urgent
>              Labels: 2.1.16, 3.11.4
>



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
