[jira] [Commented] (CASSANDRA-10446) Run repair with down replicas
[ https://issues.apache.org/jira/browse/CASSANDRA-10446?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16767466#comment-16767466 ]

Blake Eggleston commented on CASSANDRA-10446:

[~laxmikant99] because it's a new feature. 3.11.x is for bugfixes only.

> Run repair with down replicas
>
> Key: CASSANDRA-10446
> URL: https://issues.apache.org/jira/browse/CASSANDRA-10446
> Project: Cassandra
> Issue Type: Improvement
> Reporter: sankalp kohli
> Assignee: Blake Eggleston
> Priority: Minor
> Fix For: 4.0
>
> We should have an option of running repair when replicas are down. We can
> call it -force.

--
This message was sent by Atlassian JIRA (v7.6.3#76005)

To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org
[jira] [Commented] (CASSANDRA-10446) Run repair with down replicas
[ https://issues.apache.org/jira/browse/CASSANDRA-10446?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16765765#comment-16765765 ]

Laxmikant Upadhyay commented on CASSANDRA-10446:

Is there a specific reason for not introducing this change in 3.11.x?
[jira] [Commented] (CASSANDRA-10446) Run repair with down replicas
[ https://issues.apache.org/jira/browse/CASSANDRA-10446?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16034729#comment-16034729 ]

Marcus Eriksson commented on CASSANDRA-10446:

LGTM, just missing a few brace-on-newline fixes in the RepairSession changes; feel free to fix that on commit.
[jira] [Commented] (CASSANDRA-10446) Run repair with down replicas
[ https://issues.apache.org/jira/browse/CASSANDRA-10446?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15723734#comment-15723734 ]

Paulo Motta commented on CASSANDRA-10446:

Sounds good!
[jira] [Commented] (CASSANDRA-10446) Run repair with down replicas
[ https://issues.apache.org/jira/browse/CASSANDRA-10446?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15723526#comment-15723526 ]

Blake Eggleston commented on CASSANDRA-10446:

bq. skipping anti-compaction on the coordinator is not sufficient, since anti-compaction is what cleans repair state on the replicas

Good catch. Post CASSANDRA-9143, this will most likely no longer be the case. Why don't we wait until that gets committed before continuing with this one?
[jira] [Commented] (CASSANDRA-10446) Run repair with down replicas
[ https://issues.apache.org/jira/browse/CASSANDRA-10446?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15716721#comment-15716721 ]

Paulo Motta commented on CASSANDRA-10446:

Sorry for the delay Blake, got sucked into other things... will try to reply more promptly on the next round.

Skipping anti-compaction on the coordinator is not sufficient, since anti-compaction is what cleans repair state on the replicas. A simpler approach here is to set the parent repair session as {{!isGlobal}} when the force flag is set; this will already skip anti-compaction and set {{repairedAt}} to {{ActiveRepairService.UNREPAIRED_SSTABLE}}.

By not needing to set {{repairedAt}} dynamically per repair session, we can probably simplify this a bit and move the force flag enforcement from {{RepairSession}}'s constructor to the alive check in {{RepairSession.start}}. What do you think?

If you agree with the suggestions, after you submit a new patch, could you please rebase, prepare for commit and resubmit tests? Thanks!
[jira] [Commented] (CASSANDRA-10446) Run repair with down replicas
[ https://issues.apache.org/jira/browse/CASSANDRA-10446?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15663973#comment-15663973 ]

Sylvain Lebresne commented on CASSANDRA-10446:

bq. Likewise, when you trigger a repair {{--force}}, only a subset of the child repair sessions may have down nodes

Makes sense, that's the part I missed. Thanks.
[jira] [Commented] (CASSANDRA-10446) Run repair with down replicas
[ https://issues.apache.org/jira/browse/CASSANDRA-10446?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15663961#comment-15663961 ]

Paulo Motta commented on CASSANDRA-10446:

bq. Which means we obviously cannot mark anything "repaired" if some node was down. This seems to be what the last patch is doing, but some of the discussions above seem to suggest this could be done differently in the future, after CASSANDRA-9143 in particular. Did I misread those discussions or did I miss something more fundamental?

When you trigger a repair command (parent repair session), it will trigger many (child) repair sessions, typically one for each vnode subrange. At the end of the parent repair session, it will anti-compact only the ranges of successful child repair sessions, since a subset of the child repair sessions may have failed due to node failures or whatever, and so their ranges cannot be marked as repaired.

Likewise, when you trigger a repair {{--force}}, only a subset of the child repair sessions may have down nodes, so we can still mark ranges of successful child repair sessions as repaired (the ones where all nodes were up). This is what the patch is currently doing, and it will be kept after CASSANDRA-9143.

What was brought up here and might have confused things a bit is that in both cases (with and without {{--force}}), streamed sstables are always marked as repaired, which may cause problems in some edge failure scenarios (if a repair session fails after part of the syncs are completed). This limitation in particular will be addressed in CASSANDRA-9143.

Does this clarify your concerns or is there something else we may be missing?
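As a rough illustration of the parent/child bookkeeping described in the comment above, here is a minimal sketch. It is not the Cassandra implementation: ChildSession, ranges_to_mark_repaired and the node names are hypothetical, and token ranges are simplified to integer pairs. It only shows the rule that a range is eligible to be marked repaired when its child session succeeded and every replica of that range was up.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ChildSession:
    """Hypothetical stand-in for one child repair session."""
    token_range: tuple   # (start, end), simplified to integers
    replicas: frozenset  # every replica that owns this range
    succeeded: bool      # validation and sync completed without error


def ranges_to_mark_repaired(sessions, live_nodes):
    """Return the ranges that may be anti-compacted as repaired.

    A range qualifies only if its child session succeeded AND all of
    its replicas were up, so no replica was skipped by --force.
    """
    repaired = []
    for session in sessions:
        all_replicas_up = session.replicas <= live_nodes
        if session.succeeded and all_replicas_up:
            repaired.append(session.token_range)
    return repaired


sessions = [
    ChildSession((0, 100), frozenset({"A", "B", "C"}), True),
    ChildSession((100, 200), frozenset({"B", "C", "D"}), True),   # D is down
    ChildSession((200, 300), frozenset({"C", "D", "E"}), False),  # failed
]
live = frozenset({"A", "B", "C", "E"})

print(ranges_to_mark_repaired(sessions, live))  # [(0, 100)]
```

In this toy ring only the first range is marked repaired: the second was repaired among the live replicas but skipped D, and the third session failed outright.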
[jira] [Commented] (CASSANDRA-10446) Run repair with down replicas
[ https://issues.apache.org/jira/browse/CASSANDRA-10446?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15663766#comment-15663766 ]

Sylvain Lebresne commented on CASSANDRA-10446:

I wouldn't mind some clarification on this and incremental repair. My understanding of incremental repair, as it's currently implemented, is that having an sstable marked "repaired" is a global property (it means "all the replicas have the data in that sstable"). Which means we obviously cannot mark anything "repaired" if some node was down. This seems to be what the last patch is doing, but some of the discussions above seem to suggest this could be done differently in the future, after CASSANDRA-9143 in particular. Did I misread those discussions or did I miss something more fundamental?
[jira] [Commented] (CASSANDRA-10446) Run repair with down replicas
[ https://issues.apache.org/jira/browse/CASSANDRA-10446?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15648625#comment-15648625 ]

Paulo Motta commented on CASSANDRA-10446:

bq. doesn't CASSANDRA-6503 handle this issue?

Good point! I didn't recall this, so this is not as bad as I thought initially, but there is still at least one hairy scenario where things could go wrong:

{noformat}
A: unrepaired={1} repaired={}
B: unrepaired={2} repaired={}
C: unrepaired={3} repaired={}
{noformat}

During incremental repair, A sends key 1 to B and dies. B and C stream successfully. At the end of the failed repair session, things will look like:

{noformat}
A: unrepaired={1} repaired={}
B: unrepaired={2} repaired={1, 2, 3}
C: unrepaired={3} repaired={2, 3}
{noformat}

If A dies permanently before the next repair, key 1 will never be incrementally repaired between B and C. Likewise, if C dies, A will never get key 3 from B via incremental repair.

Maybe this is such an edge case that it wouldn't justify a change per se, but if we defer setting repairedAt of streamed sstables to the anti-compaction phase, then we could make this slightly more correct while supporting session-based --force repair without adding a new repairedAt field to {{SyncRequest}}.
[jira] [Commented] (CASSANDRA-10446) Run repair with down replicas
[ https://issues.apache.org/jira/browse/CASSANDRA-10446?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15648584#comment-15648584 ]

Blake Eggleston commented on CASSANDRA-10446:

In the {{--force}} case it doesn't, because {{RepairMessageVerbHandler}} will apply the repairedAt value computed at the beginning of the parent session, even if some nodes are being left out of the repair.

In the normal case, CASSANDRA-6503 helps, but the inconsistency is still possible because {{OnCompletionRunnable}} is run once a node has received all the files _it's_ expecting, but not necessarily before other nodes involved in the repair have received all their data, and there could still be a failure in that time.
[jira] [Commented] (CASSANDRA-10446) Run repair with down replicas
[ https://issues.apache.org/jira/browse/CASSANDRA-10446?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15648431#comment-15648431 ]

Marcus Eriksson commented on CASSANDRA-10446:

doesn't CASSANDRA-6503 handle this issue?
[jira] [Commented] (CASSANDRA-10446) Run repair with down replicas
[ https://issues.apache.org/jira/browse/CASSANDRA-10446?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15645856#comment-15645856 ]

Blake Eggleston commented on CASSANDRA-10446:

Oh wow, good catch, that's not good. So that issue, and several others, will be addressed in CASSANDRA-9143. I'm hoping to post a patch for it by the end of this week.

Since that should address the fundamental issue of data being misclassified as repaired, I've just pushed a commit up to my branch that doesn't set repairedAt or run anti-compaction when the force flag is set.
[jira] [Commented] (CASSANDRA-10446) Run repair with down replicas
[ https://issues.apache.org/jira/browse/CASSANDRA-10446?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15645492#comment-15645492 ]

Paulo Motta commented on CASSANDRA-10446:

It seems the repairedAt field is not being set on remote sync tasks, which will cause streamed data to be marked as repaired (see SYNC_REQUEST handling in RepairMessageVerbHandler).

We could add a repairedAt field to the SyncRequest message (which would break minor compatibility, so it could only go in 4.0), but this shows a more fundamental problem with repair failure handling: if a repair session fails in the middle of sync, streamed sstables will be marked as repaired even if not all nodes got the data.

In order to solve this, we could stream sstables with repairedAt=0 and add them to the pool of sstables to be anti-compacted, so they will only be marked as repaired at the end of the parent repair session.

If we want to add support for -force without fixing the more fundamental problem with repair sync failure handling, we could mark a forced ParentRepairSession as !isGlobal, which would mark all streamed sstables as *not* repaired as well as skip anti-compaction for the whole parent repair session.
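The last option in the comment above can be sketched roughly as follows. This is a simplified model and not Cassandra code: ParentRepairSession here is a two-field stand-in, the helper function names are hypothetical, and it assumes no other option (such as -hosts) already makes the session non-global. UNREPAIRED_SSTABLE mirrors the repairedAt=0 marker mentioned in the thread.

```python
from dataclasses import dataclass

# Marker for "not repaired"; the thread refers to this as repairedAt=0
# (ActiveRepairService.UNREPAIRED_SSTABLE).
UNREPAIRED_SSTABLE = 0


@dataclass(frozen=True)
class ParentRepairSession:
    """Simplified, hypothetical model of a parent repair session."""
    started_at_millis: int
    is_global: bool


def new_parent_session(started_at_millis, force):
    # Proposal from the comment: a forced session is treated as not global.
    return ParentRepairSession(started_at_millis, is_global=not force)


def repaired_at_for_streams(session):
    """repairedAt value applied to sstables streamed during the session."""
    if session.is_global:
        return session.started_at_millis
    return UNREPAIRED_SSTABLE


def should_anticompact(session):
    """Anti-compaction runs only for global sessions in this model."""
    return session.is_global


forced = new_parent_session(1_478_000_000_000, force=True)
normal = new_parent_session(1_478_000_000_000, force=False)

print(repaired_at_for_streams(forced), should_anticompact(forced))
print(repaired_at_for_streams(normal), should_anticompact(normal))
```

The point of routing both decisions through a single flag is that a forced repair can never leave streamed data marked as repaired while a replica was skipped.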
[jira] [Commented] (CASSANDRA-10446) Run repair with down replicas
[ https://issues.apache.org/jira/browse/CASSANDRA-10446?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15569398#comment-15569398 ]

Blake Eggleston commented on CASSANDRA-10446:

| [trunk|https://github.com/bdeggleston/cassandra/commits/10446-trunk] | [dtest|http://cassci.datastax.com/view/Dev/view/bdeggleston/job/bdeggleston-10446-trunk-dtest/] | [testall|http://cassci.datastax.com/view/Dev/view/bdeggleston/job/bdeggleston-10446-trunk-testall/] |

[associated dtest|https://github.com/bdeggleston/cassandra-dtest/commits/10446]
[jira] [Commented] (CASSANDRA-10446) Run repair with down replicas
[ https://issues.apache.org/jira/browse/CASSANDRA-10446?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15109045#comment-15109045 ]

Anuj Wadehra commented on CASSANDRA-10446:

I think this option won't do the job. Referring to the scenario where a node failed in a 20 node cluster, which nodes would you set in -hosts, and how would you ensure that the entire ring is repaired?

Suppose host20 failed: you would run a full repair with the "-hosts host1,host2...host19" option on all 19 healthy nodes. This option is unrealistic. Clusters generally use the repair -pr option to repair the cluster. With RF=5, repair time would be 5 times longer for the 19 nodes. Moreover, it requires special planning and manual intervention for just one node failure, which should be undesirable in a distributed fault tolerant system.

Another option would be to run repair -pr on the 19 nodes and run repair separately on the ranges for which the failed node was responsible. But that won't work, because the -pr and -hosts options don't work together.

Can you provide a better way to use the -hosts option to address the issue?
[jira] [Commented] (CASSANDRA-10446) Run repair with down replicas
[ https://issues.apache.org/jira/browse/CASSANDRA-10446?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15108701#comment-15108701 ]

Yuki Morishita commented on CASSANDRA-10446:

You can still use the '-hosts' repair option to specify which hosts to repair. You can just give live nodes, like 'nodetool repair -hosts node1 -hosts node2 -hosts node3', and cassandra will repair among those nodes.
[jira] [Commented] (CASSANDRA-10446) Run repair with down replicas
[ https://issues.apache.org/jira/browse/CASSANDRA-10446?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15108146#comment-15108146 ]

Anuj Wadehra commented on CASSANDRA-10446:

Whether it's a bug or an improvement is debatable. The intent of the suggestion to increase the priority and change the type was to ensure that it gets due attention. I think that by giving a detailed scenario, I have tried to explain the criticality of the issue. No, I was not interested in working on this.
[jira] [Commented] (CASSANDRA-10446) Run repair with down replicas
[ https://issues.apache.org/jira/browse/CASSANDRA-10446?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15108124#comment-15108124 ]

sankalp kohli commented on CASSANDRA-10446:

This is an improvement and not a bug. Seems like you are interested in working on it... Should I assign it to you?
[jira] [Commented] (CASSANDRA-10446) Run repair with down replicas
[ https://issues.apache.org/jira/browse/CASSANDRA-10446?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15107876#comment-15107876 ]

Anuj Wadehra commented on CASSANDRA-10446:

I think this is an issue with the way we handle the "downed replica" scenario in repairs. We should increase the priority and change the type from Improvement to Bug. Consider the following scenario and flow of events, which demonstrate the importance of this issue:

Scenario: I have a 20 node cluster, RF=5, Read/Write Quorum, gc grace period=20 days. My cluster is fault tolerant and can afford 2 node failures. Suddenly, one node goes down due to some hardware issue. The failed node would prevent repair on many nodes in the cluster, as it holds approximately a 5/20 share of the total data: 1/20 which it owns and 4/20 which it stores as a replica of data owned by other nodes.

Now it's been 10 days since the node went down, most of the nodes are not being repaired, and it's decision time. I am not sure how soon the issue will be fixed, maybe in the next 2 days, i.e. 8 days before gc grace, so I shouldn't remove the node early and add it back, as that would cause significant and unnecessary streaming due to token re-arrangement. At the same time, if I don't remove the failed node at this point, i.e. at 10 days (well before gc grace), my entire system health would be in question and it would be a panic situation, as most of the data didn't get repaired in the last 10 days and gc grace is approaching. I need sufficient time to repair all nodes.

What looked like a fault tolerant Cassandra cluster that can easily afford 2 node failures required urgent attention and manual decision making when a single node went down. If some replicas are down, we should allow repair to proceed with the remaining replicas. If the failed nodes come up before the gc grace period ends, we would run repair to fix inconsistencies; otherwise we would discard their data and bootstrap. I think that would be a really robust fault tolerant system.
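The numbers in the scenario above can be checked with a small sketch. It rests on assumptions that the comment does not state: one token per node, replicas placed on the next RF-1 nodes clockwise in the ring, and a repair of a range being blocked whenever any replica of that range is down. The function name blocked_ranges is hypothetical.

```python
def blocked_ranges(num_nodes, rf, down):
    """Primary ranges (named by their owner's index) that cannot be
    repaired because the down node is one of their replicas.

    Assumes one token per node and replicas on the next rf - 1 nodes.
    """
    blocked = []
    for owner in range(num_nodes):
        replicas = {(owner + i) % num_nodes for i in range(rf)}
        if down in replicas:
            blocked.append(owner)
    return blocked


NUM_NODES, RF = 20, 5
blocked = blocked_ranges(NUM_NODES, RF, down=19)
share = len(blocked) / NUM_NODES
print(blocked, share)  # [15, 16, 17, 18, 19] 0.25

# Time budget from the scenario, in days.
gc_grace_days = 20
days_down = 10
expected_fix_days = 2
days_left = gc_grace_days - days_down - expected_fix_days
print(days_left)  # 8
```

Under these assumptions a single down node blocks 5 of the 20 primary ranges, which matches the 5/20 share in the comment, and leaves 8 days to repair the whole ring once the node is back.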