[ https://issues.apache.org/jira/browse/CASSANDRA-3730?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13194630#comment-13194630 ]
Vitalii Tymchyshyn commented on CASSANDRA-3730:
-----------------------------------------------
I've introduced simple handling that should at least abort a decommission or
move command when one of its streaming sessions fails:
https://github.com/apache/cassandra/pull/6
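As a rough illustration of the idea (a sketch only, not the code in the pull request), the waiting side has to be notified on failure as well as on success, otherwise it blocks forever. The class and method names below (StreamFailureSketch, Session, awaitSessions) are hypothetical stand-ins and do not mirror Cassandra's StreamOutSession API:

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.atomic.AtomicBoolean;

public class StreamFailureSketch {
    /** Hypothetical stand-in for a streaming session; not Cassandra's StreamOutSession. */
    static final class Session {
        private final Runnable onSuccess;
        private final Runnable onFailure;

        Session(Runnable onSuccess, Runnable onFailure) {
            this.onSuccess = onSuccess;
            this.onFailure = onFailure;
        }

        /** Always notifies the caller, instead of staying silent on failure. */
        void close(boolean success) {
            Runnable callback = success ? onSuccess : onFailure;
            if (callback != null)
                callback.run();
        }
    }

    /** Returns true if every session succeeded, false if the operation should abort. */
    static boolean awaitSessions(boolean[] outcomes) {
        CountDownLatch latch = new CountDownLatch(outcomes.length);
        AtomicBoolean failed = new AtomicBoolean(false);
        for (boolean outcome : outcomes) {
            Session session = new Session(
                latch::countDown,
                () -> { failed.set(true); latch.countDown(); });
            // Assumption: in a real system this would be called from a streaming thread.
            session.close(outcome);
        }
        try {
            latch.await(); // returns because failed sessions count down too
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        }
        return !failed.get();
    }

    public static void main(String[] args) {
        System.out.println(awaitSessions(new boolean[] { true, true }));  // prints true
        System.out.println(awaitSessions(new boolean[] { true, false })); // prints false
    }
}
```

If the failure callback did not count the latch down, the second call would never return, which is the hang described in the issue.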
> If some streaming sessions fail on decommission, decommission hangs
> -------------------------------------------------------------------
>
> Key: CASSANDRA-3730
> URL: https://issues.apache.org/jira/browse/CASSANDRA-3730
> Project: Cassandra
> Issue Type: Bug
> Components: Core
> Affects Versions: 1.1
> Environment: FreeBSD
> Reporter: Vitalii Tymchyshyn
>
> Currently Cassandra does not handle StreamOutSession failures, e.g.:
> // Instead of just not calling the callback on failure, we could have
> // allow to register a specific callback for failures, but we leave
> // that to a future ticket (likely CASSANDRA-3112)
> if (callback != null && success)
>     callback.run();
> This means that if, during decommission, a node receiving the decommissioned
> data fails or (my case) the decommissioning node becomes overloaded, the
> streaming session fails and the decommission is never notified.
> This makes it hard to decommission overloaded nodes, because I have to restart
> the node in order to restart the decommission.
> I also see the following errors, because file-streaming tasks try to get a
> streaming session that has already been closed by gossip:
> ERROR [Streaming to /10.112.0.216:1] 2012-01-11 15:57:28,882 AbstractCassandraDaemon.java (line 138) Fatal exception in thread Thread[Streaming to /10.112.0.216:1,5,main]
> java.lang.NullPointerException
>     at org.apache.cassandra.streaming.FileStreamTask.runMayThrow(FileStreamTask.java:97)
>     at org.apache.cassandra.utils.WrappedRunnable.run(WrappedRunnable.java:30)
>     at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1110)
>     at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:603)
>     at java.lang.Thread.run(Thread.java:679)
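The NullPointerException above is consistent with a task looking up a session that another thread has already removed. A minimal sketch of guarding such a lookup follows; it assumes a shared registry of open sessions, and the names (SessionLookupSketch, sessions, streamFile) are hypothetical and are not taken from FileStreamTask:

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

public class SessionLookupSketch {
    // Hypothetical registry of open sessions, keyed by session id.
    static final Map<Long, String> sessions = new ConcurrentHashMap<>();

    /** Returns false instead of throwing when the session has already been closed. */
    static boolean streamFile(long sessionId, String file) {
        String session = sessions.get(sessionId); // null once another thread closed it
        if (session == null)
            return false; // session is gone: skip this file rather than dereference null
        // ... stream the file using the session ...
        return true;
    }

    public static void main(String[] args) {
        sessions.put(1L, "peer");
        System.out.println(streamFile(1L, "a.db")); // prints true
        sessions.remove(1L);                        // e.g. closed by another thread
        System.out.println(streamFile(1L, "a.db")); // prints false
    }
}
```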