[jira] [Commented] (CASSANDRA-3730) If some streaming sessions fail on decommission, decommission hangs

Vitalii Tymchyshyn (Commented) (JIRA) Fri, 27 Jan 2012 04:34:04 -0800

    [ https://issues.apache.org/jira/browse/CASSANDRA-3730?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13194630#comment-13194630 ]


Vitalii Tymchyshyn commented on CASSANDRA-3730:
-----------------------------------------------

I've introduced simple failure handling that should at least abort a 
decommission or move command whose streaming sessions fail: 
https://github.com/apache/cassandra/pull/6
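
To illustrate the general idea only (this is not the code from the pull 
request), here is a minimal sketch of a completion callback that also has a 
failure path, so that the waiting command can abort instead of hanging. All 
names here (StreamCallback, closeSession, runDecommission) are hypothetical 
and are not Cassandra's API.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.atomic.AtomicBoolean;

public class StreamFailureSketch {
    /** Hypothetical callback with an explicit failure path; not Cassandra's API. */
    interface StreamCallback {
        void onSuccess();
        void onFailure();
    }

    /** Stand-in for session close logic: always notifies, on success or failure. */
    static void closeSession(boolean success, StreamCallback callback) {
        if (callback == null)
            return;
        if (success)
            callback.onSuccess();
        else
            callback.onFailure();
    }

    /** Waits for every session to finish; returns true only if all succeeded. */
    static boolean runDecommission(boolean[] sessionResults) {
        final CountDownLatch pending = new CountDownLatch(sessionResults.length);
        final AtomicBoolean failed = new AtomicBoolean(false);
        StreamCallback callback = new StreamCallback() {
            public void onSuccess() { pending.countDown(); }
            // A failure still releases the latch, so the waiter cannot hang.
            public void onFailure() { failed.set(true); pending.countDown(); }
        };
        for (boolean result : sessionResults)
            closeSession(result, callback);
        try {
            pending.await();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        }
        return !failed.get();
    }

    public static void main(String[] args) {
        System.out.println(runDecommission(new boolean[] { true, true }));
        System.out.println(runDecommission(new boolean[] { true, false }));
    }
}
```

The point of the sketch is that the latch is counted down on both paths; with 
a success-only callback, one failed session would leave the waiter blocked 
forever.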
                
> If some streaming sessions fail on decommission, decommission hangs
> -------------------------------------------------------------------
>
>                 Key: CASSANDRA-3730
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-3730
>             Project: Cassandra
>          Issue Type: Bug
>          Components: Core
>    Affects Versions: 1.1
>         Environment: FreeBSD
>            Reporter: Vitalii Tymchyshyn
>
> Currently Cassandra does not handle StreamOutSession failures, e.g.:
>         // Instead of just not calling the callback on failure, we could have
>         // allow to register a specific callback for failures, but we leave
>         // that to a future ticket (likely CASSANDRA-3112)
>         if (callback != null && success)
>             callback.run();
> This means that if, during decommission, a node receiving the decommissioned data 
> fails or (my case) the decommissioning node itself becomes overloaded, 
> the streaming session fails and the decommission never learns about it, so it hangs. 
> This makes it hard to decommission overloaded nodes, because I have to restart 
> the node in order to restart the decommission.
> I also see the following errors, because file streaming tasks try to get a 
> streaming session that has already been closed by gossip:
> ERROR [Streaming to /10.112.0.216:1] 2012-01-11 15:57:28,882 AbstractCassandraDaemon.java (line 138) Fatal exception in thread Thread[Streaming to /10.112.0.216:1,5,main]
> java.lang.NullPointerException
>         at org.apache.cassandra.streaming.FileStreamTask.runMayThrow(FileStreamTask.java:97)
>         at org.apache.cassandra.utils.WrappedRunnable.run(WrappedRunnable.java:30)
>         at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1110)
>         at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:603)
>         at java.lang.Thread.run(Thread.java:679)
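
For illustration, a NullPointerException of this kind typically appears when a 
task looks up a session that another thread has already removed and then uses 
the result without checking it. The sketch below shows a guarded lookup under 
that assumption. The registry, the method name and the string used as a session 
are all hypothetical; this is not the code in FileStreamTask.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

public class SessionLookupSketch {
    /** Hypothetical registry of open sessions keyed by id; not Cassandra's API. */
    static final ConcurrentMap<Long, String> sessions =
            new ConcurrentHashMap<Long, String>();

    /** Returns false instead of throwing when the session is already closed. */
    static boolean streamFile(long sessionId, String file) {
        String session = sessions.get(sessionId);
        if (session == null) {
            // The session was closed concurrently (in the report above, by
            // gossip), so there is nothing to stream to; skip the file.
            return false;
        }
        // Real code would stream the file over the session here.
        return true;
    }

    public static void main(String[] args) {
        sessions.put(1L, "peer");
        System.out.println(streamFile(1L, "data.db"));
        sessions.remove(1L);
        System.out.println(streamFile(1L, "data.db"));
    }
}
```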

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira
