[ 
https://issues.apache.org/jira/browse/CASSANDRA-13020?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Stefan Podkowinski updated CASSANDRA-13020:
-------------------------------------------
    Assignee: Stefan Podkowinski
      Status: Patch Available  (was: Open)

||3.0||3.11||trunk||
|[branch|https://github.com/spodkowinski/cassandra/tree/CASSANDRA-13020-3.0]|[branch|https://github.com/spodkowinski/cassandra/tree/CASSANDRA-13020-3.11]|[branch|https://github.com/spodkowinski/cassandra/tree/CASSANDRA-13020-trunk]|
|[dtest|http://cassci.datastax.com/view/Dev/view/spodkowinski/job/spodkowinski-CASSANDRA-13020-3.0-dtest/]|[dtest|http://cassci.datastax.com/view/Dev/view/spodkowinski/job/spodkowinski-CASSANDRA-13020-3.11-dtest/]|[dtest|http://cassci.datastax.com/view/Dev/view/spodkowinski/job/spodkowinski-CASSANDRA-13020-trunk-dtest/]|
|[testall|http://cassci.datastax.com/view/Dev/view/spodkowinski/job/spodkowinski-CASSANDRA-13020-3.0-testall/]|[testall|http://cassci.datastax.com/view/Dev/view/spodkowinski/job/spodkowinski-CASSANDRA-13020-3.11-testall/]|[testall|http://cassci.datastax.com/view/Dev/view/spodkowinski/job/spodkowinski-CASSANDRA-13020-trunk-testall/]|

Anyone up for review? 

Assumptions:
tokenMetadata will only contain public broadcast addresses as keys, so we must 
not use the internal IP for retrieving the nodeID
Hints will only be streamed to public addresses in the end anyways

> Stuck in LEAVING state (Transferring all hints to null)
> -------------------------------------------------------
>
>                 Key: CASSANDRA-13020
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-13020
>             Project: Cassandra
>          Issue Type: Bug
>          Components: Streaming and Messaging
>         Environment: v3.0.9
>            Reporter: Aleksandr Ivanov
>            Assignee: Stefan Podkowinski
>              Labels: decommission, hints
>
> I tried to decommission one node.
> Node sent all data to another node and got stuck in LEAVING state.
> Log message shows Exception in HintsDispatcher thread.
> Could it be reason of stuck in LEAVING state?
> command output:
> {noformat}
> root@cas-node6:~# time nodetool decommission
> error: null
> -- StackTrace --
> java.lang.NullPointerException
>         at 
> java.util.concurrent.ConcurrentHashMap.replaceNode(ConcurrentHashMap.java:1106)
>         at 
> java.util.concurrent.ConcurrentHashMap.remove(ConcurrentHashMap.java:1097)
>         at 
> org.apache.cassandra.hints.HintsDispatchExecutor$DispatchHintsTask.run(HintsDispatchExecutor.java:203)
>         at 
> java.util.stream.ForEachOps$ForEachOp$OfRef.accept(ForEachOps.java:184)
>         at 
> java.util.stream.ReferencePipeline$3$1.accept(ReferencePipeline.java:193)
>         at 
> java.util.concurrent.ConcurrentHashMap$ValueSpliterator.forEachRemaining(ConcurrentHashMap.java:3566)
>         at 
> java.util.stream.AbstractPipeline.copyInto(AbstractPipeline.java:481)
>         at 
> java.util.stream.AbstractPipeline.wrapAndCopyInto(AbstractPipeline.java:471)
>         at 
> java.util.stream.ForEachOps$ForEachOp.evaluateSequential(ForEachOps.java:151)
>         at 
> java.util.stream.ForEachOps$ForEachOp$OfRef.evaluateSequential(ForEachOps.java:174)
>         at 
> java.util.stream.AbstractPipeline.evaluate(AbstractPipeline.java:234)
>         at 
> java.util.stream.ReferencePipeline.forEach(ReferencePipeline.java:418)
>         at 
> org.apache.cassandra.hints.HintsDispatchExecutor$TransferHintsTask.transfer(HintsDispatchExecutor.java:168)
>         at 
> org.apache.cassandra.hints.HintsDispatchExecutor$TransferHintsTask.run(HintsDispatchExecutor.java:141)
>         at 
> java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
>         at java.util.concurrent.FutureTask.run(FutureTask.java:266)
>         at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
>         at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
>         at java.lang.Thread.run(Thread.java:745)
> real    147m7.483s
> user    0m17.388s
> sys     0m1.968s
> {noformat}
> nodetool netstats:
> {noformat}
> root@cas-node6:~# nodetool netstats
> Mode: LEAVING
> Not sending any streams.
> Read Repair Statistics:
> Attempted: 35082
> Mismatch (Blocking): 18
> Mismatch (Background): 0
> Pool Name                    Active   Pending      Completed   Dropped
> Large messages                  n/a         1              0         0
> Small messages                  n/a         0       16109860       112
> Gossip messages                 n/a         0         287074         0
> {noformat}
> Log:
> {noformat}
> INFO  [RMI TCP Connection(58)-127.0.0.1] 2016-12-07 12:52:59,467 
> StorageService.java:1170 - LEAVING: sleeping 30000 ms for batch processing 
> and pending range setup
> INFO  [RMI TCP Connection(58)-127.0.0.1] 2016-12-07 12:53:39,455 
> StorageService.java:1170 - LEAVING: replaying batch log and streaming data to 
> other nodes
> INFO  [RMI TCP Connection(58)-127.0.0.1] 2016-12-07 12:53:39,910 
> StreamResultFuture.java:87 - [Stream #2cc874c0-bc7c-11e6-b0df-e7f1ecd3dcfb] 
> Executing streaming plan for Unbootstrap
> INFO  [StreamConnectionEstablisher:1] 2016-12-07 12:53:39,911 
> StreamSession.java:239 - [Stream #2cc874c0-bc7c-11e6-b0df-e7f1ecd3dcfb] 
> Starting streaming to /10.10.10.17
> INFO  [StreamConnectionEstablisher:2] 2016-12-07 12:53:39,911 
> StreamSession.java:232 - [Stream #2cc874c0-bc7c-11e6-b0df-e7f1ecd3dcfb] 
> Session does not have any tasks.
> INFO  [StreamConnectionEstablisher:3] 2016-12-07 12:53:39,912 
> StreamSession.java:232 - [Stream #2cc874c0-bc7c-11e6-b0df-e7f1ecd3dcfb] 
> Session does not have any tasks.
> INFO  [StreamConnectionEstablisher:4] 2016-12-07 12:53:39,912 
> StreamSession.java:232 - [Stream #2cc874c0-bc7c-11e6-b0df-e7f1ecd3dcfb] 
> Session does not have any tasks.
> INFO  [RMI TCP Connection(58)-127.0.0.1] 2016-12-07 12:53:39,912 
> StorageService.java:1170 - LEAVING: streaming hints to other nodes
> INFO  [StreamConnectionEstablisher:2] 2016-12-07 12:53:39,912 
> StreamResultFuture.java:183 - [Stream #2cc874c0-bc7c-11e6-b0df-e7f1ecd3dcfb] 
> Session with /10.10.10.10 is complete
> INFO  [StreamConnectionEstablisher:4] 2016-12-07 12:53:39,912 
> StreamResultFuture.java:183 - [Stream #2cc874c0-bc7c-11e6-b0df-e7f1ecd3dcfb] 
> Session with /10.10.10.11 is complete
> INFO  [StreamConnectionEstablisher:3] 2016-12-07 12:53:39,912 
> StreamResultFuture.java:183 - [Stream #2cc874c0-bc7c-11e6-b0df-e7f1ecd3dcfb] 
> Session with /10.10.10.12 is complete
> INFO  [StreamConnectionEstablisher:5] 2016-12-07 12:53:39,912 
> StreamSession.java:239 - [Stream #2cc874c0-bc7c-11e6-b0df-e7f1ecd3dcfb] 
> Starting streaming to /10.10.10.13
> INFO  [StreamConnectionEstablisher:6] 2016-12-07 12:53:39,912 
> StreamSession.java:232 - [Stream #2cc874c0-bc7c-11e6-b0df-e7f1ecd3dcfb] 
> Session does not have any tasks.
> INFO  [StreamConnectionEstablisher:8] 2016-12-07 12:53:39,912 
> StreamSession.java:239 - [Stream #2cc874c0-bc7c-11e6-b0df-e7f1ecd3dcfb] 
> Starting streaming to /10.10.10.14
> INFO  [RMI TCP Connection(58)-127.0.0.1] 2016-12-07 12:53:39,913 
> HintsService.java:218 - Resumed hints dispatch
> INFO  [StreamConnectionEstablisher:7] 2016-12-07 12:53:39,913 
> StreamSession.java:239 - [Stream #2cc874c0-bc7c-11e6-b0df-e7f1ecd3dcfb] 
> Starting streaming to /10.10.10.15
> INFO  [StreamConnectionEstablisher:6] 2016-12-07 12:53:39,914 
> StreamResultFuture.java:183 - [Stream #2cc874c0-bc7c-11e6-b0df-e7f1ecd3dcfb] 
> Session with /10.10.10.16 is complete
> INFO  [StreamConnectionEstablisher:2] 2016-12-07 12:53:39,914 
> StreamCoordinator.java:213 - [Stream #2cc874c0-bc7c-11e6-b0df-e7f1ecd3dcfb, 
> ID#0] Beginning stream session with /10.10.10.10
> INFO  [StreamConnectionEstablisher:4] 2016-12-07 12:53:39,914 
> StreamCoordinator.java:213 - [Stream #2cc874c0-bc7c-11e6-b0df-e7f1ecd3dcfb, 
> ID#0] Beginning stream session with /10.10.10.11
> INFO  [StreamConnectionEstablisher:3] 2016-12-07 12:53:39,914 
> StreamCoordinator.java:213 - [Stream #2cc874c0-bc7c-11e6-b0df-e7f1ecd3dcfb, 
> ID#0] Beginning stream session with /10.10.10.12
> INFO  [StreamConnectionEstablisher:6] 2016-12-07 12:53:39,917 
> StreamCoordinator.java:213 - [Stream #2cc874c0-bc7c-11e6-b0df-e7f1ecd3dcfb, 
> ID#0] Beginning stream session with /10.10.10.16
> INFO  [StreamConnectionEstablisher:2] 2016-12-07 12:53:39,917 
> StreamSession.java:232 - [Stream #2cc874c0-bc7c-11e6-b0df-e7f1ecd3dcfb] 
> Session does not have any tasks.
> INFO  [StreamConnectionEstablisher:4] 2016-12-07 12:53:39,917 
> StreamSession.java:232 - [Stream #2cc874c0-bc7c-11e6-b0df-e7f1ecd3dcfb] 
> Session does not have any tasks.
> INFO  [StreamConnectionEstablisher:6] 2016-12-07 12:53:39,917 
> StreamSession.java:239 - [Stream #2cc874c0-bc7c-11e6-b0df-e7f1ecd3dcfb] 
> Starting streaming to /10.10.10.18
> INFO  [StreamConnectionEstablisher:3] 2016-12-07 12:53:39,917 
> StreamSession.java:239 - [Stream #2cc874c0-bc7c-11e6-b0df-e7f1ecd3dcfb] 
> Starting streaming to /10.10.10.19
> INFO  [StreamConnectionEstablisher:2] 2016-12-07 12:53:39,917 
> StreamResultFuture.java:183 - [Stream #2cc874c0-bc7c-11e6-b0df-e7f1ecd3dcfb] 
> Session with /10.10.10.20 is complete
> INFO  [StreamConnectionEstablisher:4] 2016-12-07 12:53:39,917 
> StreamResultFuture.java:183 - [Stream #2cc874c0-bc7c-11e6-b0df-e7f1ecd3dcfb] 
> Session with /10.10.10.21 is complete
> INFO  [StreamConnectionEstablisher:2] 2016-12-07 12:53:39,920 
> StreamCoordinator.java:213 - [Stream #2cc874c0-bc7c-11e6-b0df-e7f1ecd3dcfb, 
> ID#0] Beginning stream session with /10.10.10.20
> INFO  [StreamConnectionEstablisher:4] 2016-12-07 12:53:39,920 
> StreamCoordinator.java:213 - [Stream #2cc874c0-bc7c-11e6-b0df-e7f1ecd3dcfb, 
> ID#0] Beginning stream session with /10.10.10.21
> INFO  [StreamConnectionEstablisher:2] 2016-12-07 12:53:39,920 
> StreamSession.java:232 - [Stream #2cc874c0-bc7c-11e6-b0df-e7f1ecd3dcfb] 
> Session does not have any tasks.
> INFO  [StreamConnectionEstablisher:4] 2016-12-07 12:53:39,920 
> StreamSession.java:239 - [Stream #2cc874c0-bc7c-11e6-b0df-e7f1ecd3dcfb] 
> Starting streaming to /10.10.10.22
> INFO  [StreamConnectionEstablisher:2] 2016-12-07 12:53:39,921 
> StreamResultFuture.java:183 - [Stream #2cc874c0-bc7c-11e6-b0df-e7f1ecd3dcfb] 
> Session with /10.10.10.23 is complete
> INFO  [StreamConnectionEstablisher:2] 2016-12-07 12:53:39,921 
> StreamCoordinator.java:213 - [Stream #2cc874c0-bc7c-11e6-b0df-e7f1ecd3dcfb, 
> ID#0] Beginning stream session with /10.10.10.23
> INFO  [StreamConnectionEstablisher:2] 2016-12-07 12:53:39,921 
> StreamSession.java:232 - [Stream #2cc874c0-bc7c-11e6-b0df-e7f1ecd3dcfb] 
> Session does not have any tasks.
> INFO  [StreamConnectionEstablisher:2] 2016-12-07 12:53:39,921 
> StreamResultFuture.java:183 - [Stream #2cc874c0-bc7c-11e6-b0df-e7f1ecd3dcfb] 
> Session with /10.10.10.24 is complete
> INFO  [StreamConnectionEstablisher:2] 2016-12-07 12:53:39,921 
> StreamCoordinator.java:213 - [Stream #2cc874c0-bc7c-11e6-b0df-e7f1ecd3dcfb, 
> ID#0] Beginning stream session with /10.10.10.24
> INFO  [StreamConnectionEstablisher:2] 2016-12-07 12:53:39,921 
> StreamSession.java:232 - [Stream #2cc874c0-bc7c-11e6-b0df-e7f1ecd3dcfb] 
> Session does not have any tasks.
> INFO  [StreamConnectionEstablisher:2] 2016-12-07 12:53:39,922 
> StreamResultFuture.java:183 - [Stream #2cc874c0-bc7c-11e6-b0df-e7f1ecd3dcfb] 
> Session with /10.10.10.25 is complete
> INFO  [StreamConnectionEstablisher:2] 2016-12-07 12:53:39,922 
> StreamCoordinator.java:213 - [Stream #2cc874c0-bc7c-11e6-b0df-e7f1ecd3dcfb, 
> ID#0] Beginning stream session with /10.10.10.25
> INFO  [StreamConnectionEstablisher:2] 2016-12-07 12:53:39,922 
> StreamSession.java:239 - [Stream #2cc874c0-bc7c-11e6-b0df-e7f1ecd3dcfb] 
> Starting streaming to /10.10.10.26 through /192.168.101.70
> INFO  [HintsDispatcher:1] 2016-12-07 12:53:39,926 
> HintsDispatchExecutor.java:140 - Transferring all hints to null
> ERROR [HintsDispatcher:1] 2016-12-07 12:53:39,928 CassandraDaemon.java:205 - 
> Exception in thread Thread[HintsDispatcher:1,1,main]
> java.lang.NullPointerException: null
>         at 
> java.util.concurrent.ConcurrentHashMap.replaceNode(ConcurrentHashMap.java:1106)
>  ~[na:1.8.0_92]
>         at 
> java.util.concurrent.ConcurrentHashMap.remove(ConcurrentHashMap.java:1097) 
> ~[na:1.8.0_92]
>         at 
> org.apache.cassandra.hints.HintsDispatchExecutor$DispatchHintsTask.run(HintsDispatchExecutor.java:203)
>  ~[apache-cassandra-3.0.9.jar:3.0.9]
>         at 
> java.util.stream.ForEachOps$ForEachOp$OfRef.accept(ForEachOps.java:184) 
> ~[na:1.8.0_92]
>         at 
> java.util.stream.ReferencePipeline$3$1.accept(ReferencePipeline.java:193) 
> ~[na:1.8.0_92]
>         at 
> java.util.concurrent.ConcurrentHashMap$ValueSpliterator.forEachRemaining(ConcurrentHashMap.java:3566)
>  ~[na:1.8.0_92]
>         at 
> java.util.stream.AbstractPipeline.copyInto(AbstractPipeline.java:481) 
> ~[na:1.8.0_92]
>         at 
> java.util.stream.AbstractPipeline.wrapAndCopyInto(AbstractPipeline.java:471) 
> ~[na:1.8.0_92]
>         at 
> java.util.stream.ForEachOps$ForEachOp.evaluateSequential(ForEachOps.java:151) 
> ~[na:1.8.0_92]
>         at 
> java.util.stream.ForEachOps$ForEachOp$OfRef.evaluateSequential(ForEachOps.java:174)
>  ~[na:1.8.0_92]
>         at 
> java.util.stream.AbstractPipeline.evaluate(AbstractPipeline.java:234) 
> ~[na:1.8.0_92]
>         at 
> java.util.stream.ReferencePipeline.forEach(ReferencePipeline.java:418) 
> ~[na:1.8.0_92]
>         at 
> org.apache.cassandra.hints.HintsDispatchExecutor$TransferHintsTask.transfer(HintsDispatchExecutor.java:168)
>  ~[apache-cassandra-3.0.9.jar:3.0.9]
>         at 
> org.apache.cassandra.hints.HintsDispatchExecutor$TransferHintsTask.run(HintsDispatchExecutor.java:141)
>  ~[apache-cassandra-3.0.9.jar:3.0.9]
>         at 
> java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) 
> ~[na:1.8.0_92]
>         at java.util.concurrent.FutureTask.run(FutureTask.java:266) 
> ~[na:1.8.0_92]
>         at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
>  ~[na:1.8.0_92]
>         at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
>  [na:1.8.0_92]
>         at java.lang.Thread.run(Thread.java:745) [na:1.8.0_92]
> INFO  [StreamConnectionEstablisher:2] 2016-12-07 12:53:40,093 
> StreamResultFuture.java:169 - [Stream #2cc874c0-bc7c-11e6-b0df-e7f1ecd3dcfb 
> ID#0] Prepare completed. Receiving 0 files(0 bytes), sending 2 files(
> 133006 bytes)
> INFO  [StreamConnectionEstablisher:2] 2016-12-07 12:53:40,097 
> StreamCoordinator.java:213 - [Stream #2cc874c0-bc7c-11e6-b0df-e7f1ecd3dcfb, 
> ID#0] Beginning stream session with /10.10.10.26
> INFO  [StreamConnectionEstablisher:2] 2016-12-07 12:53:40,097 
> StreamSession.java:232 - [Stream #2cc874c0-bc7c-11e6-b0df-e7f1ecd3dcfb] 
> Session does not have any tasks.
> ...
> INFO  [IndexSummaryManager:1] 2016-12-07 14:51:44,761 
> IndexSummaryRedistribution.java:74 - Redistributing index summaries
> INFO  [STREAM-IN-/10.10.10.27] 2016-12-07 15:20:06,006 
> StreamResultFuture.java:183 - [Stream #2cc874c0-bc7c-11e6-b0df-e7f1ecd3dcfb] 
> Session with /10.10.10.27 is complete
> INFO  [STREAM-IN-/10.10.10.27] 2016-12-07 15:20:06,007 
> StreamResultFuture.java:215 - [Stream #2cc874c0-bc7c-11e6-b0df-e7f1ecd3dcfb] 
> All sessions completed
> {noformat}
> all real IPs replaced with IPs from 10.10.10.x network in log



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

Reply via email to