[ 
https://issues.apache.org/jira/browse/CASSANDRA-10070?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15133253#comment-15133253
 ] 

Paulo Motta commented on CASSANDRA-10070:
-----------------------------------------

Nice work [~molsson]. Overall the design doc looks great and addresses most of 
the issues raised previously, just a few minor comments/questions:
* I second [~yukim]'s first question above: we need to better specify 
how cluster-wide repair parallelism is handled. Is it fixed or configurable? 
Can a node run repair for multiple ranges in parallel? Perhaps we should have 
global {{node_repair_parallelism}} (default 1) and {{dc_repair_parallelism}} 
(default 1) settings and reject starting repairs above those thresholds.
* For subrange repair, we could maybe have something similar to 
[reaper|https://github.com/spotify/cassandra-reaper]'s {{segmentCount}} option, 
but since this would add more complexity we could leave it for a separate ticket.
* While pausing repair is a nice feature for user-based interruptions, we could 
probably embed system-known interruptions (such as when a bootstrap or upgrade 
is going on) in the default rejection logic.
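
To make the threshold and rejection ideas above a bit more concrete, here is a 
rough sketch of what such an admission check could look like. Everything in it 
is hypothetical: the {{RepairAdmission}} class, its methods and the 
{{systemBusy}} flag are made up for illustration and are not part of Cassandra 
or the design doc. It also assumes that {{dc_repair_parallelism}} means "max 
concurrent repairs per datacenter", which is only one possible reading.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical admission check for scheduled repairs; not an existing Cassandra API. */
final class RepairAdmission {
    private final int nodeRepairParallelism; // assumed meaning: max concurrent repairs per node
    private final int dcRepairParallelism;   // assumed meaning: max concurrent repairs per DC
    private final Map<String, Integer> runningPerNode = new HashMap<>();
    private final Map<String, Integer> runningPerDc = new HashMap<>();

    RepairAdmission(int nodeRepairParallelism, int dcRepairParallelism) {
        this.nodeRepairParallelism = nodeRepairParallelism;
        this.dcRepairParallelism = dcRepairParallelism;
    }

    /**
     * Records the repair and returns true if it may start, otherwise returns false.
     * systemBusy stands in for system-known interruptions such as an ongoing
     * bootstrap or upgrade; how that would be detected is left open here.
     */
    synchronized boolean tryStart(String node, String dc, boolean systemBusy) {
        if (systemBusy) {
            return false;
        }
        if (runningPerNode.getOrDefault(node, 0) >= nodeRepairParallelism) {
            return false;
        }
        if (runningPerDc.getOrDefault(dc, 0) >= dcRepairParallelism) {
            return false;
        }
        runningPerNode.merge(node, 1, Integer::sum);
        runningPerDc.merge(dc, 1, Integer::sum);
        return true;
    }

    /** Releases the slots held by a finished (or aborted) repair. */
    synchronized void finish(String node, String dc) {
        // Returning null from the remapping function removes the entry.
        runningPerNode.computeIfPresent(node, (k, v) -> v > 1 ? v - 1 : null);
        runningPerDc.computeIfPresent(dc, (k, v) -> v > 1 ? v - 1 : null);
    }
}
```

With both limits at the suggested default of 1, a second repair in the same 
datacenter would be rejected until the first one calls {{finish}}, while a 
repair in another datacenter could still start. In a real cluster the counters 
would have to be shared state rather than in-memory maps, which this sketch 
does not attempt to address.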

Maybe the Spotify reaper folks have something to add based on their experience 
with automatic repair scheduling (cc [~Bj0rn], [~zvo]).

> Automatic repair scheduling
> ---------------------------
>
>                 Key: CASSANDRA-10070
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-10070
>             Project: Cassandra
>          Issue Type: Improvement
>            Reporter: Marcus Olsson
>            Assignee: Marcus Olsson
>            Priority: Minor
>             Fix For: 3.x
>
>         Attachments: Distributed Repair Scheduling.doc
>
>
> Scheduling and running repairs in a Cassandra cluster is most often a 
> required task, but it can be hard for new users and it also requires a 
> bit of manual configuration. There are good tools out there that can be used 
> to simplify things, but wouldn't this be a good feature to have inside of 
> Cassandra? It would automatically schedule and run repairs, so that when you 
> start up your cluster it basically maintains itself in terms of normal 
> anti-entropy, with the possibility for manual configuration.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
