[jira] [Updated] (CASSANDRA-8208) Inconsistent failure handling with repair
     [ https://issues.apache.org/jira/browse/CASSANDRA-8208?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Yuki Morishita updated CASSANDRA-8208:
--------------------------------------
    Component/s: Streaming and Messaging

> Inconsistent failure handling with repair
> -----------------------------------------
>
>                 Key: CASSANDRA-8208
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-8208
>             Project: Cassandra
>          Issue Type: Bug
>          Components: Streaming and Messaging
>            Reporter: Marcus Eriksson
>            Assignee: Yuki Morishita
>              Labels: repair
>             Fix For: 2.2.0 beta 1
>
>         Attachments: 8208.txt
>
>
> I think we introduced this with CASSANDRA-6455. The problem is that we now
> treat all repair futures as a single unit (Futures.allAsList(..)), which makes
> the whole thing fail if one sub-future fails. Also, when one of those fails, we
> notify nodetool that we failed and we stop the executor with shutdownNow(),
> which throws out any pending RepairJobs.
> [~yukim] I think we used to be able to proceed with the other RepairSessions
> even if one fails, right? If not, we should probably call cancel on the
> RepairJob runnables which are in the queue for the executor after calling
> shutdownNow() in repairComplete() in StorageService.

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
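To illustrate the second point in the description, here is a minimal, self-contained sketch of what shutdownNow() does to queued work and how the drained tasks could be cancelled afterwards. It uses only java.util.concurrent. The class name ShutdownNowSketch and the method drainAndCancel are hypothetical; nothing here is taken from StorageService or from the attached patch.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;

public class ShutdownNowSketch {

    /**
     * Queues `jobs` no-op tasks behind one blocking task on a single-threaded
     * executor, calls shutdownNow(), cancels every task it drained, and
     * returns how many of the queued futures ended up cancelled.
     */
    static int drainAndCancel(int jobs) {
        ExecutorService executor = Executors.newSingleThreadExecutor();
        CountDownLatch started = new CountDownLatch(1);
        try {
            // Occupy the only worker thread so later tasks stay in the queue.
            executor.submit(() -> {
                started.countDown();
                try {
                    Thread.sleep(60_000);
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
            });
            started.await();

            // Stand-ins for the pending repair jobs (hypothetical no-op tasks).
            List<Future<?>> queued = new ArrayList<>();
            for (int i = 0; i < jobs; i++) {
                queued.add(executor.submit(() -> { }));
            }

            // shutdownNow() interrupts the running task and returns the tasks
            // that never started. They are not run and not completed.
            List<Runnable> pending = executor.shutdownNow();
            for (Runnable r : pending) {
                if (r instanceof Future) {
                    ((Future<?>) r).cancel(false);
                }
            }
            executor.awaitTermination(5, TimeUnit.SECONDS);

            int cancelled = 0;
            for (Future<?> f : queued) {
                if (f.isCancelled()) {
                    cancelled++;
                }
            }
            return cancelled;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new RuntimeException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println("cancelled queued tasks: " + drainAndCancel(3));
    }
}
```

Without the cancel loop, anything waiting on the futures of the queued tasks would block indefinitely, because a task drained by shutdownNow() is never run and so never completes.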
[jira] [Updated] (CASSANDRA-8208) Inconsistent failure handling with repair
     [ https://issues.apache.org/jira/browse/CASSANDRA-8208?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Yuki Morishita updated CASSANDRA-8208:
--------------------------------------
    Labels: repair  (was: )

> Inconsistent failure handling with repair
> -----------------------------------------
>
>                 Key: CASSANDRA-8208
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-8208
>             Project: Cassandra
>          Issue Type: Bug
>          Components: Streaming and Messaging
>            Reporter: Marcus Eriksson
>            Assignee: Yuki Morishita
>              Labels: repair
>             Fix For: 2.2.0 beta 1
>
>         Attachments: 8208.txt
[jira] [Updated] (CASSANDRA-8208) Inconsistent failure handling with repair
     [ https://issues.apache.org/jira/browse/CASSANDRA-8208?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Marcus Eriksson updated CASSANDRA-8208:
---------------------------------------
    Issue Type: Improvement  (was: Bug)

Inconsistent failure handling with repair
-----------------------------------------

                Key: CASSANDRA-8208
                URL: https://issues.apache.org/jira/browse/CASSANDRA-8208
            Project: Cassandra
         Issue Type: Improvement
           Reporter: Marcus Eriksson
           Assignee: Yuki Morishita
            Fix For: 3.0

        Attachments: 8208.txt
[jira] [Updated] (CASSANDRA-8208) Inconsistent failure handling with repair
     [ https://issues.apache.org/jira/browse/CASSANDRA-8208?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Yuki Morishita updated CASSANDRA-8208:
--------------------------------------
    Attachment: 8208.txt

Also https://github.com/yukim/cassandra/tree/8208

Changed from allAsList to successfulAsList so that repair does not fail immediately.
Also switched to addListener since successfulAsList does not fail.

Inconsistent failure handling with repair
-----------------------------------------

                Key: CASSANDRA-8208
                URL: https://issues.apache.org/jira/browse/CASSANDRA-8208
            Project: Cassandra
         Issue Type: Bug
           Reporter: Marcus Eriksson
           Assignee: Yuki Morishita
            Fix For: 3.0

        Attachments: 8208.txt
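For readers unfamiliar with the two aggregation modes mentioned in the comment, the sketch below mimics their semantics with java.util.concurrent.CompletableFuture so that it runs without Guava. It is an analogue, not the patch: the class RepairAggregationSketch, its methods, and the session names are hypothetical. One known difference is that Guava's allAsList fails as soon as any input fails, whereas CompletableFuture.allOf waits for every input before reporting the failure.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.CompletableFuture;

public class RepairAggregationSketch {

    /** All-or-nothing: the combined future fails if any input fails. */
    @SuppressWarnings("rawtypes")
    static <T> CompletableFuture<List<T>> allAsList(List<CompletableFuture<T>> futures) {
        CompletableFuture[] array = futures.toArray(new CompletableFuture[0]);
        return CompletableFuture.allOf(array).thenApply(ignored -> {
            List<T> results = new ArrayList<>();
            for (CompletableFuture<T> f : futures) {
                results.add(f.join()); // every input has completed normally here
            }
            return results;
        });
    }

    /** Never fails: a failed input is replaced by null in the result list. */
    static <T> CompletableFuture<List<T>> successfulAsList(List<CompletableFuture<T>> futures) {
        List<CompletableFuture<T>> guarded = new ArrayList<>();
        for (CompletableFuture<T> f : futures) {
            guarded.add(f.exceptionally(error -> null));
        }
        return allAsList(guarded);
    }

    /** Three stand-in repair sessions; the second one has failed. */
    static List<CompletableFuture<String>> sampleSessions() {
        CompletableFuture<String> failed = new CompletableFuture<>();
        failed.completeExceptionally(new RuntimeException("session-2 failed"));
        return Arrays.asList(
                CompletableFuture.completedFuture("session-1"),
                failed,
                CompletableFuture.completedFuture("session-3"));
    }

    public static void main(String[] args) {
        List<CompletableFuture<String>> sessions = sampleSessions();

        // One failed session makes the whole aggregate fail.
        System.out.println("allAsList failed: "
                + allAsList(sessions).isCompletedExceptionally());

        // The aggregate always succeeds, so the caller has to look for
        // failures itself, here by counting null entries.
        List<String> results = successfulAsList(sessions).join();
        long failures = results.stream().filter(r -> r == null).count();
        System.out.println("results: " + results + ", failed sessions: " + failures);
    }
}
```

Because the second form never reports a failure on the combined future, any failure handling has to inspect the individual results, which is consistent with the comment's note about switching to addListener.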