[jira] [Updated] (CASSANDRA-13020) Stuck in LEAVING state (Transferring all hints to null)

2017-03-10 Thread Stefan Podkowinski (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-13020?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Stefan Podkowinski updated CASSANDRA-13020:
---
Assignee: Stefan Podkowinski
  Status: Patch Available  (was: Open)

||3.0||3.11||trunk||
|[branch|https://github.com/spodkowinski/cassandra/tree/CASSANDRA-13020-3.0]|[branch|https://github.com/spodkowinski/cassandra/tree/CASSANDRA-13020-3.11]|[branch|https://github.com/spodkowinski/cassandra/tree/CASSANDRA-13020-trunk]|
|[dtest|http://cassci.datastax.com/view/Dev/view/spodkowinski/job/spodkowinski-CASSANDRA-13020-3.0-dtest/]|[dtest|http://cassci.datastax.com/view/Dev/view/spodkowinski/job/spodkowinski-CASSANDRA-13020-3.11-dtest/]|[dtest|http://cassci.datastax.com/view/Dev/view/spodkowinski/job/spodkowinski-CASSANDRA-13020-trunk-dtest/]|
|[testall|http://cassci.datastax.com/view/Dev/view/spodkowinski/job/spodkowinski-CASSANDRA-13020-3.0-testall/]|[testall|http://cassci.datastax.com/view/Dev/view/spodkowinski/job/spodkowinski-CASSANDRA-13020-3.11-testall/]|[testall|http://cassci.datastax.com/view/Dev/view/spodkowinski/job/spodkowinski-CASSANDRA-13020-trunk-testall/]|

Anyone up for review? 

Assumptions:
* tokenMetadata will only contain public broadcast addresses as keys, so we must 
not use the internal IP for retrieving the nodeID
* Hints will only be streamed to public addresses in the end anyway
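
To illustrate the first assumption, here is a minimal, self-contained sketch (not the patch itself) of how a lookup with the wrong address can end in the NullPointerException from the stack trace quoted below. The class name, the two maps, and the address 203.0.113.17 are hypothetical stand-ins, and treating 10.10.10.17 as the internal address is an assumption made for illustration. The only library behaviour relied on is that ConcurrentHashMap returns null for an absent key and rejects null keys in remove().

```java
import java.net.InetAddress;
import java.net.UnknownHostException;
import java.util.Map;
import java.util.UUID;
import java.util.concurrent.ConcurrentHashMap;

public class HostIdLookupSketch {
    // Hypothetical stand-in for the endpoint -> host ID mapping:
    // only public broadcast addresses are ever used as keys.
    static final Map<InetAddress, UUID> hostIdByEndpoint = new ConcurrentHashMap<>();

    // Hypothetical stand-in for per-host dispatch bookkeeping, keyed by host ID.
    static final Map<UUID, String> scheduledDispatches = new ConcurrentHashMap<>();

    static InetAddress addr(String ip) {
        try {
            return InetAddress.getByName(ip); // IP literal, so no DNS lookup happens
        } catch (UnknownHostException e) {
            throw new IllegalArgumentException(e);
        }
    }

    // Registers a node under its broadcast address and schedules a dispatch for it.
    static UUID register(String broadcastIp) {
        UUID hostId = UUID.randomUUID();
        hostIdByEndpoint.put(addr(broadcastIp), hostId);
        scheduledDispatches.put(hostId, "pending");
        return hostId;
    }

    // Returns null when the address is not a key, e.g. when it is an internal IP.
    static UUID hostIdFor(String ip) {
        return hostIdByEndpoint.get(addr(ip));
    }

    // Returns false if removal failed because the host ID was null.
    static boolean finishDispatch(UUID hostId) {
        try {
            scheduledDispatches.remove(hostId); // ConcurrentHashMap rejects null keys
            return true;
        } catch (NullPointerException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        UUID id = register("203.0.113.17");           // hypothetical broadcast address
        UUID viaInternal = hostIdFor("10.10.10.17");  // assumed internal address
        System.out.println("lookup via broadcast address matches: " + id.equals(hostIdFor("203.0.113.17")));
        System.out.println("lookup via internal address: " + viaInternal);
        System.out.println("dispatch cleanup with that result succeeded: " + finishDispatch(viaInternal));
    }
}
```

Resolving the host ID through the broadcast address avoids the null in this sketch; whether that mirrors every code path touched by the branches above is for the review to confirm.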

> Stuck in LEAVING state (Transferring all hints to null)
> ---
>
> Key: CASSANDRA-13020
> URL: https://issues.apache.org/jira/browse/CASSANDRA-13020
> Project: Cassandra
>  Issue Type: Bug
>  Components: Streaming and Messaging
> Environment: v3.0.9
>Reporter: Aleksandr Ivanov
>Assignee: Stefan Podkowinski
>  Labels: decommission, hints
>
> I tried to decommission one node.
> Node sent all data to another node and got stuck in LEAVING state.
> Log message shows Exception in HintsDispatcher thread.
> Could this be the reason the node is stuck in LEAVING state?
> command output:
> {noformat}
> root@cas-node6:~# time nodetool decommission
> error: null
> -- StackTrace --
> java.lang.NullPointerException
> at java.util.concurrent.ConcurrentHashMap.replaceNode(ConcurrentHashMap.java:1106)
> at java.util.concurrent.ConcurrentHashMap.remove(ConcurrentHashMap.java:1097)
> at org.apache.cassandra.hints.HintsDispatchExecutor$DispatchHintsTask.run(HintsDispatchExecutor.java:203)
> at java.util.stream.ForEachOps$ForEachOp$OfRef.accept(ForEachOps.java:184)
> at java.util.stream.ReferencePipeline$3$1.accept(ReferencePipeline.java:193)
> at java.util.concurrent.ConcurrentHashMap$ValueSpliterator.forEachRemaining(ConcurrentHashMap.java:3566)
> at java.util.stream.AbstractPipeline.copyInto(AbstractPipeline.java:481)
> at java.util.stream.AbstractPipeline.wrapAndCopyInto(AbstractPipeline.java:471)
> at java.util.stream.ForEachOps$ForEachOp.evaluateSequential(ForEachOps.java:151)
> at java.util.stream.ForEachOps$ForEachOp$OfRef.evaluateSequential(ForEachOps.java:174)
> at java.util.stream.AbstractPipeline.evaluate(AbstractPipeline.java:234)
> at java.util.stream.ReferencePipeline.forEach(ReferencePipeline.java:418)
> at org.apache.cassandra.hints.HintsDispatchExecutor$TransferHintsTask.transfer(HintsDispatchExecutor.java:168)
> at org.apache.cassandra.hints.HintsDispatchExecutor$TransferHintsTask.run(HintsDispatchExecutor.java:141)
> at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
> at java.util.concurrent.FutureTask.run(FutureTask.java:266)
> at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
> at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
> at java.lang.Thread.run(Thread.java:745)
> real 147m7.483s
> user 0m17.388s
> sys 0m1.968s
> {noformat}
> nodetool netstats:
> {noformat}
> root@cas-node6:~# nodetool netstats
> Mode: LEAVING
> Not sending any streams.
> Read Repair Statistics:
> Attempted: 35082
> Mismatch (Blocking): 18
> Mismatch (Background): 0
> Pool Name                    Active   Pending      Completed   Dropped
> Large messages                  n/a         1              0         0
> Small messages                  n/a         0       16109860       112
> Gossip messages                 n/a         0         287074         0
> {noformat}
> Log:
> {noformat}
> INFO  [RMI TCP Connection(58)-127.0.0.1] 2016-12-07 12:52:59,467 StorageService.java:1170 - LEAVING: sleeping 3 ms for batch processing and pending range setup
> INFO  [RMI TCP Connection(58)-127.0.0.1] 2016-12-07 12:53:39,455 StorageService.java:1170 - LEAVING: replaying batch log 

[jira] [Updated] (CASSANDRA-13020) Stuck in LEAVING state (Transferring all hints to null)

2016-12-08 Thread Aleksandr Ivanov (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-13020?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Aleksandr Ivanov updated CASSANDRA-13020:
-
Summary: Stuck in LEAVING state (Transferring all hints to null)  (was: 
Transferring all hints to null)

> Stuck in LEAVING state (Transferring all hints to null)
> ---
>
> Key: CASSANDRA-13020
> URL: https://issues.apache.org/jira/browse/CASSANDRA-13020
> Project: Cassandra
>  Issue Type: Bug
>  Components: Streaming and Messaging
> Environment: v3.0.9
>Reporter: Aleksandr Ivanov
>  Labels: decommission, hints
>
> I tried to decommission one node.
> Node sent all data to another node and got stuck in LEAVING state.
> Log message shows Exception in HintsDispatcher thread.
> Could this be the reason the node is stuck in LEAVING state?
> command output:
> {noformat}
> root@cas-node6:~# time nodetool decommission
> error: null
> -- StackTrace --
> java.lang.NullPointerException
> at java.util.concurrent.ConcurrentHashMap.replaceNode(ConcurrentHashMap.java:1106)
> at java.util.concurrent.ConcurrentHashMap.remove(ConcurrentHashMap.java:1097)
> at org.apache.cassandra.hints.HintsDispatchExecutor$DispatchHintsTask.run(HintsDispatchExecutor.java:203)
> at java.util.stream.ForEachOps$ForEachOp$OfRef.accept(ForEachOps.java:184)
> at java.util.stream.ReferencePipeline$3$1.accept(ReferencePipeline.java:193)
> at java.util.concurrent.ConcurrentHashMap$ValueSpliterator.forEachRemaining(ConcurrentHashMap.java:3566)
> at java.util.stream.AbstractPipeline.copyInto(AbstractPipeline.java:481)
> at java.util.stream.AbstractPipeline.wrapAndCopyInto(AbstractPipeline.java:471)
> at java.util.stream.ForEachOps$ForEachOp.evaluateSequential(ForEachOps.java:151)
> at java.util.stream.ForEachOps$ForEachOp$OfRef.evaluateSequential(ForEachOps.java:174)
> at java.util.stream.AbstractPipeline.evaluate(AbstractPipeline.java:234)
> at java.util.stream.ReferencePipeline.forEach(ReferencePipeline.java:418)
> at org.apache.cassandra.hints.HintsDispatchExecutor$TransferHintsTask.transfer(HintsDispatchExecutor.java:168)
> at org.apache.cassandra.hints.HintsDispatchExecutor$TransferHintsTask.run(HintsDispatchExecutor.java:141)
> at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
> at java.util.concurrent.FutureTask.run(FutureTask.java:266)
> at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
> at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
> at java.lang.Thread.run(Thread.java:745)
> real 147m7.483s
> user 0m17.388s
> sys 0m1.968s
> {noformat}
> nodetool netstats:
> {noformat}
> root@cas-node6:~# nodetool netstats
> Mode: LEAVING
> Not sending any streams.
> Read Repair Statistics:
> Attempted: 35082
> Mismatch (Blocking): 18
> Mismatch (Background): 0
> Pool Name                    Active   Pending      Completed   Dropped
> Large messages                  n/a         1              0         0
> Small messages                  n/a         0       16109860       112
> Gossip messages                 n/a         0         287074         0
> {noformat}
> Log:
> {noformat}
> INFO  [RMI TCP Connection(58)-127.0.0.1] 2016-12-07 12:52:59,467 StorageService.java:1170 - LEAVING: sleeping 3 ms for batch processing and pending range setup
> INFO  [RMI TCP Connection(58)-127.0.0.1] 2016-12-07 12:53:39,455 StorageService.java:1170 - LEAVING: replaying batch log and streaming data to other nodes
> INFO  [RMI TCP Connection(58)-127.0.0.1] 2016-12-07 12:53:39,910 StreamResultFuture.java:87 - [Stream #2cc874c0-bc7c-11e6-b0df-e7f1ecd3dcfb] Executing streaming plan for Unbootstrap
> INFO  [StreamConnectionEstablisher:1] 2016-12-07 12:53:39,911 StreamSession.java:239 - [Stream #2cc874c0-bc7c-11e6-b0df-e7f1ecd3dcfb] Starting streaming to /10.10.10.17
> INFO  [StreamConnectionEstablisher:2] 2016-12-07 12:53:39,911 StreamSession.java:232 - [Stream #2cc874c0-bc7c-11e6-b0df-e7f1ecd3dcfb] Session does not have any tasks.
> INFO  [StreamConnectionEstablisher:3] 2016-12-07 12:53:39,912 StreamSession.java:232 - [Stream #2cc874c0-bc7c-11e6-b0df-e7f1ecd3dcfb] Session does not have any tasks.
> INFO  [StreamConnectionEstablisher:4] 2016-12-07 12:53:39,912 StreamSession.java:232 - [Stream #2cc874c0-bc7c-11e6-b0df-e7f1ecd3dcfb] Session does not have any tasks.
> INFO  [RMI TCP Connection(58)-127.0.0.1] 2016-12-07 12:53:39,912 StorageService.java:1170 - LEAVING: streaming hints to other nodes
> INFO  [StreamConnectionEstablisher:2] 2016-12-07 12:53:39,912 StreamResultFuture.java:183