[jira] [Commented] (CASSANDRA-3730) If some streaming sessions fail on decommission, decommission hangs

2012-08-20 Thread Vitalii Tymchyshyn (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-3730?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13438100#comment-13438100
 ] 

Vitalii Tymchyshyn commented on CASSANDRA-3730:
---

In my opinion, it can behave exactly as if the decommissioning node had been restarted.

 If some streaming sessions fail on decommission, decommission hangs
 ---

 Key: CASSANDRA-3730
 URL: https://issues.apache.org/jira/browse/CASSANDRA-3730
 Project: Cassandra
  Issue Type: Bug
  Components: Core
Affects Versions: 1.1.0
 Environment: FreeBSD
Reporter: Vitalii Tymchyshyn
  Labels: streaming

 Currently Cassandra does not handle StreamOutSession failures, e.g.:
 // Instead of just not calling the callback on failure, we could have
 // allow to register a specific callback for failures, but we leave
 // that to a future ticket (likely CASSANDRA-3112)
 if (callback != null && success)
     callback.run();
 This means that if, during decommission, a node that receives decommission data 
 fails or (my case) the node that tries to decommission becomes overloaded, 
 the streaming session fails and the decommission never learns about it. 
 This makes it hard to decommission overloaded nodes because I need to restart 
 the node to restart the decommission.
 I also see the following errors, because file-streaming tasks try to get a 
 streaming session that has already been closed by gossip:
 ERROR [Streaming to /10.112.0.216:1] 2012-01-11 15:57:28,882 AbstractCassandraDaemon.java (line 138) Fatal exception in thread Thread[Streaming to /10.112.0.216:1,5,main]
 java.lang.NullPointerException
     at org.apache.cassandra.streaming.FileStreamTask.runMayThrow(FileStreamTask.java:97)
     at org.apache.cassandra.utils.WrappedRunnable.run(WrappedRunnable.java:30)
     at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1110)
     at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:603)
     at java.lang.Thread.run(Thread.java:679)
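
 The code comment quoted above mentions registering a specific callback for
 failures. A minimal sketch of that idea might look like the following. It is
 an illustration only: SessionCallback, close() and demo() are hypothetical
 names, not the Cassandra API, and the real StreamOutSession is not modelled.

```java
import java.util.ArrayList;
import java.util.List;

public class FailureAwareCallbackSketch {
    /** Hypothetical callback with separate success and failure paths. */
    interface SessionCallback {
        void onSuccess();
        void onFailure();
    }

    /** Hypothetical stand-in for the session close logic quoted in the report. */
    static void close(boolean success, SessionCallback callback) {
        if (callback == null)
            return; // nobody is waiting on this session
        if (success)
            callback.onSuccess();
        else
            callback.onFailure(); // the waiter is notified instead of hanging
    }

    /** Closes one successful and one failed session and records what the callback saw. */
    static List<String> demo() {
        final List<String> events = new ArrayList<>();
        SessionCallback cb = new SessionCallback() {
            public void onSuccess() { events.add("success"); }
            public void onFailure() { events.add("failure"); }
        };
        close(true, cb);
        close(false, cb);
        close(false, null); // no callback registered: nothing happens
        return events;
    }

    public static void main(String[] args) {
        System.out.println(demo());
    }
}
```

 With a failure path like this, whatever waits for the streaming sessions
 (decommission, in this report) could react to a failed session rather than
 wait for a callback that is never run.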

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (CASSANDRA-3730) If some streaming sessions fail on decommission, decommission hangs

2012-01-27 Thread Vitalii Tymchyshyn (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-3730?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13194630#comment-13194630
 ] 

Vitalii Tymchyshyn commented on CASSANDRA-3730:
---

I've introduced simplistic handling that should at least abort a decommission 
or move command when one of its streaming sessions has problems: 
https://github.com/apache/cassandra/pull/6
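
For illustration, aborting on a failed session could be sketched roughly as 
below. This is not the code from the pull request; SessionTracker, 
sessionFinished(), awaitAll() and run() are hypothetical names, and the sketch 
only assumes that every session reports its outcome exactly once.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.atomic.AtomicBoolean;

public class AbortOnStreamFailureSketch {
    /** Hypothetical tracker for the streaming sessions of one decommission or move. */
    static final class SessionTracker {
        private final CountDownLatch pending;
        private final AtomicBoolean failed = new AtomicBoolean(false);

        SessionTracker(int sessions) {
            pending = new CountDownLatch(sessions);
        }

        /** Called once per session, on success and on failure alike. */
        void sessionFinished(boolean success) {
            if (!success)
                failed.set(true);
            pending.countDown(); // always count down, so the waiter cannot hang
        }

        /** Blocks until every session has reported, then aborts if any failed. */
        void awaitAll() throws InterruptedException {
            pending.await();
            if (failed.get())
                throw new IllegalStateException("streaming session failed; aborting");
        }
    }

    /** Returns true if the operation would proceed, false if it would be aborted. */
    static boolean run(boolean... outcomes) {
        SessionTracker tracker = new SessionTracker(outcomes.length);
        for (boolean ok : outcomes)
            tracker.sessionFinished(ok);
        try {
            tracker.awaitAll();
            return true;
        } catch (IllegalStateException e) {
            return false;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        }
    }

    public static void main(String[] args) {
        System.out.println(run(true, true));
        System.out.println(run(true, false));
    }
}
```

The point of the sketch is that a failed session still counts down the latch, 
so the waiting command ends with an error instead of hanging.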

 If some streaming sessions fail on decommission, decommission hangs
 ---

 Key: CASSANDRA-3730
 URL: https://issues.apache.org/jira/browse/CASSANDRA-3730
 Project: Cassandra
  Issue Type: Bug
  Components: Core
Affects Versions: 1.1
 Environment: FreeBSD
Reporter: Vitalii Tymchyshyn

