[ 
https://issues.apache.org/jira/browse/CASSANDRA-10070?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15147579#comment-15147579
 ] 

Marcus Olsson commented on CASSANDRA-10070:
-------------------------------------------

{quote}
All data centers involved in a repair must be available for a repair to 
start/succeed, so if we make the lock resource dc-aware and try to create the 
lock by contacting a node in each involved data center with LOCAL_SERIAL 
consistency, that should be sufficient to ensure correctness without the need 
for a global lock. This will also play well with both the dc_parallelism 
global option and the --local or --dcs table repair options.
{quote}

{quote}
The second alternative is probably the most desirable. Actually, dc_parallelism 
by itself might cause problems, since we can have a situation where all repairs 
run on a single node or range, overloading those nodes. If we are to support 
concurrent repairs in the first pass, I think we need both the dc_parallelism 
and node_parallelism options together.
{quote}

{quote}
This is becoming a bit complex and there are probably some edge cases and/or 
starvation scenarios that we should think carefully about before jumping into 
implementation. What do you think about this approach? Should we stick to a 
simpler non-parallel version in the first pass, or think this through and 
already support parallelism in the first version?
{quote}

I like the approach of using LOCAL_SERIAL for each dc and having specialized 
keys. I think we could include the dc parallelism lock as 
"RepairResource-\{dc}-\{i}" but only allow one repair per data center by 
hardcoding "i" to 1 in the first pass. This should make upgrades easier 
when we do allow parallel repairs. I like the node locks approach as well, but 
as you say there are probably some edge cases, so we could wait to add them 
until we allow parallel repairs; I don't think introducing them later would 
break upgrades.
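
To make this a bit more concrete, here is a rough sketch of how the per-dc 
lock could be taken with an LWT at LOCAL_SERIAL. The keyspace/table name 
("repair_scheduling.lock"), the TTL value and the client-side perspective with 
the DataStax Java driver are just assumptions for illustration, not an 
existing API; the real implementation would live inside Cassandra and route 
the LWT to a node in each involved data center.

{code:java}
// Rough sketch only. Assumes a hypothetical lock table:
//   CREATE TABLE repair_scheduling.lock (
//       resource text PRIMARY KEY,
//       parent_repair_session uuid
//   );
import java.util.UUID;

import com.datastax.driver.core.Cluster;
import com.datastax.driver.core.ConsistencyLevel;
import com.datastax.driver.core.ResultSet;
import com.datastax.driver.core.Session;
import com.datastax.driver.core.SimpleStatement;

public class DcRepairLockSketch
{
    /**
     * Try to take the per-dc lock "RepairResource-{dc}-1" with an LWT at
     * LOCAL_SERIAL. The TTL makes the lock expire if the holder stops
     * renewing it (e.g. the node dies mid-repair).
     */
    public static boolean tryLockDc(Session session, String dc, UUID parentRepairSession)
    {
        String resource = "RepairResource-" + dc + "-1"; // "i" hardcoded to 1 in the first pass
        SimpleStatement insert = new SimpleStatement(
                "INSERT INTO repair_scheduling.lock (resource, parent_repair_session) " +
                "VALUES (?, ?) IF NOT EXISTS USING TTL 30",
                resource, parentRepairSession);
        insert.setSerialConsistencyLevel(ConsistencyLevel.LOCAL_SERIAL);
        insert.setConsistencyLevel(ConsistencyLevel.LOCAL_QUORUM);
        ResultSet rs = session.execute(insert);
        return rs.wasApplied(); // false if another repair already holds this dc's lock
    }

    public static void main(String[] args)
    {
        try (Cluster cluster = Cluster.builder().addContactPoint("127.0.0.1").build();
             Session session = cluster.connect())
        {
            UUID parentRepairSession = UUID.randomUUID();
            // A repair involving dc1 and dc2 must hold the lock in both before starting.
            boolean locked = tryLockDc(session, "dc1", parentRepairSession)
                          && tryLockDc(session, "dc2", parentRepairSession);
            System.out.println("Acquired all dc locks: " + locked);
        }
    }
}
{code}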

{quote}
We should also think more about possible failure scenarios and network 
partitions. What happens if the node cannot renew locks in a remote DC due to a 
temporary network partition but the repair is still running? We should 
probably cancel a repair if we are not able to renew the lock, and also have 
some kind of garbage collector to kill ongoing repair sessions without 
associated locks, to avoid violating the configured dc_parallelism and 
node_parallelism.
{quote}
I agree, and we could probably store the parent repair session id in an extra 
column of the lock table and have a thread wake up periodically to check for 
repair sessions without locks. But then we must somehow be able to 
differentiate between user-defined and automatically scheduled repair sessions. 
It could be done by having all repairs go through this scheduling interface, 
which would also reduce user mistakes such as running multiple repairs in 
parallel. Another alternative is to have a custom flag in the parent repair 
that makes the garbage collector ignore it if it's user-defined. I think that 
the garbage collector and cancelling repairs when unable to renew the lock are 
features that should be included in the first pass.
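
To illustrate the garbage collector idea, I'm picturing something along these 
lines; all type and method names below (RepairView, scheduledParentSessions, 
cancel, ...) are made-up placeholders, not existing Cassandra APIs.

{code:java}
import java.util.Set;
import java.util.UUID;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;

// Hypothetical view of the scheduler's state; not an existing Cassandra API.
interface RepairView
{
    Set<UUID> scheduledParentSessions();   // parent sessions started by the scheduler
    Set<UUID> sessionsReferencedByLocks(); // parent_repair_session values read from the lock table
    void cancel(UUID parentSession);       // abort an ongoing parent repair session
}

public class RepairLockGcSketch
{
    private final ScheduledExecutorService executor = Executors.newSingleThreadScheduledExecutor();

    public void start(RepairView view)
    {
        // Wake up periodically and kill scheduled repairs that no longer hold a lock.
        // User-defined repairs never appear in scheduledParentSessions(), so they are ignored.
        executor.scheduleWithFixedDelay(() -> {
            Set<UUID> locked = view.sessionsReferencedByLocks();
            for (UUID session : view.scheduledParentSessions())
            {
                if (!locked.contains(session))
                    view.cancel(session); // lock expired or could not be renewed
            }
        }, 1, 1, TimeUnit.MINUTES);
    }
}
{code}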

The most basic failure scenarios should be covered by retrying a repair if it 
fails and logging a warning/error based on how many times it has failed. Could 
the retry behaviour cause any unexpected consequences?
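
For the retry part I'm thinking of something simple like the sketch below; the 
attempt limit and the warn/error escalation are just placeholder choices.

{code:java}
import java.util.concurrent.Callable;

import org.slf4j.Logger;
import org.slf4j.LoggerFactory;

public class RepairRetrySketch
{
    private static final Logger logger = LoggerFactory.getLogger(RepairRetrySketch.class);

    /**
     * Run a repair task, retrying on failure. Logs a warning for intermediate
     * failures and an error once the (illustrative) attempt limit is reached.
     */
    public static boolean runWithRetry(Callable<Boolean> repairTask, int maxAttempts)
    {
        for (int attempt = 1; attempt <= maxAttempts; attempt++)
        {
            try
            {
                if (repairTask.call())
                    return true;
            }
            catch (Exception e)
            {
                // fall through and retry
            }

            if (attempt < maxAttempts)
                logger.warn("Repair attempt {}/{} failed, retrying", attempt, maxAttempts);
            else
                logger.error("Repair failed after {} attempts, giving up", maxAttempts);
        }
        return false;
    }
}
{code}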

> Automatic repair scheduling
> ---------------------------
>
>                 Key: CASSANDRA-10070
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-10070
>             Project: Cassandra
>          Issue Type: Improvement
>            Reporter: Marcus Olsson
>            Assignee: Marcus Olsson
>            Priority: Minor
>             Fix For: 3.x
>
>         Attachments: Distributed Repair Scheduling.doc
>
>
> Scheduling and running repairs in a Cassandra cluster is most often a 
> required task, but it can be hard for new users and it also requires a bit 
> of manual configuration. There are good tools out there that can be used 
> to simplify things, but wouldn't this be a good feature to have inside of 
> Cassandra? To automatically schedule and run repairs, so that when you start 
> up your cluster it basically maintains itself in terms of normal 
> anti-entropy, with the possibility for manual configuration.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
