[ https://issues.apache.org/jira/browse/CASSANDRA-3730?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13194630#comment-13194630 ]
Vitalii Tymchyshyn commented on CASSANDRA-3730:
-----------------------------------------------

I've introduced simplistic handling that should at least abort a decommission or move command whose streaming sessions run into problems: https://github.com/apache/cassandra/pull/6

> If some streaming sessions fail on decommission, decommission hangs
> -------------------------------------------------------------------
>
>                 Key: CASSANDRA-3730
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-3730
>             Project: Cassandra
>          Issue Type: Bug
>          Components: Core
>    Affects Versions: 1.1
>         Environment: FreeBSD
>            Reporter: Vitalii Tymchyshyn
>
> Currently Cassandra does not handle StreamOutSession failures, e.g.:
>         // Instead of just not calling the callback on failure, we could have
>         // allow to register a specific callback for failures, but we leave
>         // that to a future ticket (likely CASSANDRA-3112)
>         if (callback != null && success)
>             callback.run();
> This means that if, during decommission, a node that receives decommission data fails or (my case) the node that tries to decommission becomes overloaded, the streaming session fails and decommission doesn't know anything about it.
> This makes it hard to decommission overloaded nodes, because I need to restart the node in order to restart decommission.
> I also see the following errors, because the file streaming tasks try to get a streaming session that has already been closed by gossip:
> ERROR [Streaming to /10.112.0.216:1] 2012-01-11 15:57:28,882 AbstractCassandraDaemon.java (line 138) Fatal exception in thread Thread[Streaming to /10.112.0.216:1,5,main]
> java.lang.NullPointerException
>         at org.apache.cassandra.streaming.FileStreamTask.runMayThrow(FileStreamTask.java:97)
>         at org.apache.cassandra.utils.WrappedRunnable.run(WrappedRunnable.java:30)
>         at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1110)
>         at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:603)
>         at java.lang.Thread.run(Thread.java:679)
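
For illustration only, here is a minimal standalone sketch of the idea hinted at in the quoted source comment: notify the waiting operation on failure as well as on success, so that it can abort instead of hanging. This is not the code from the pull request above and not Cassandra's API. `DecommissionSketch`, `StreamCallback`, `Session`, `Tracker`, and `streamAll` are all hypothetical names made up for the example, and the "sessions" are simulated with plain booleans.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicBoolean;

// Hypothetical sketch: none of these types exist in Cassandra.
class DecommissionSketch {

    /** Hypothetical callback that is told about both outcomes. */
    interface StreamCallback {
        void onSuccess();
        void onFailure();
    }

    /** Stand-in for a stream-out session; close() always notifies the callback. */
    static final class Session {
        private final StreamCallback callback;

        Session(StreamCallback callback) {
            this.callback = callback;
        }

        void close(boolean success) {
            if (callback == null)
                return;
            if (success)
                callback.onSuccess();
            else
                callback.onFailure(); // the quoted code silently skips this case
        }
    }

    /** Waits for a fixed number of sessions and remembers whether any failed. */
    static final class Tracker implements StreamCallback {
        private final CountDownLatch pending;
        private final AtomicBoolean failed = new AtomicBoolean(false);

        Tracker(int sessions) {
            pending = new CountDownLatch(sessions);
        }

        @Override
        public void onSuccess() {
            pending.countDown();
        }

        @Override
        public void onFailure() {
            failed.set(true);
            // Release the waiter right away so the operation can abort.
            while (pending.getCount() > 0)
                pending.countDown();
        }

        /** Returns true only if every session reported success within the timeout. */
        boolean awaitAll(long timeout, TimeUnit unit) {
            try {
                return pending.await(timeout, unit) && !failed.get();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                return false;
            }
        }
    }

    /** Simulates one session per entry; true means the operation may proceed. */
    static boolean streamAll(boolean[] outcomes) {
        Tracker tracker = new Tracker(outcomes.length);
        for (boolean ok : outcomes)
            new Session(tracker).close(ok);
        return tracker.awaitAll(1, TimeUnit.SECONDS);
    }

    public static void main(String[] args) {
        System.out.println(streamAll(new boolean[] {true, true}));  // prints true
        System.out.println(streamAll(new boolean[] {true, false})); // prints false
    }
}
```

With a callback of this shape the caller can tell "all sessions finished" apart from "a session failed" and abort the decommission or move, rather than waiting forever for a success notification that never arrives.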