[jira] [Commented] (CASSANDRA-11461) Failed incremental repairs never cleared from pending list

Nick Bailey (JIRA) Wed, 30 Mar 2016 16:13:53 -0700

    [ 
https://issues.apache.org/jira/browse/CASSANDRA-11461?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15219041#comment-15219041
 ]


Nick Bailey commented on CASSANDRA-11461:
-----------------------------------------

Yeah. So OpsCenter lets you configure some tables for incremental repair and 
some for normal subrange repair, which is what was happening in this case. So 
OpsCenter is doing:

* Break up the ring into small chunks for subrange repair
* Visit a node and repair a small range for all tables that are using subrange 
repair
* If any tables are configured for incremental repair, run an incremental 
repair on those tables
** By default this would do a full incremental repair on those tables, which is 
what was in use when this bug was hit
* Jump across the ring to a different node and repeat the above process.
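
To make the ordering concrete, here is a rough Python sketch of the loop described above. It is not OpsCenter's actual code: the function names, the `node_order` argument (how the next node is picked is not spelled out here), and the Murmur3Partitioner token bounds are all assumptions made for illustration. It only plans the steps; it does not run any repair.

```python
from itertools import cycle
from typing import List, Optional, Tuple

# Assumption: Murmur3Partitioner token bounds.
MIN_TOKEN = -2**63
MAX_TOKEN = 2**63 - 1

TokenRange = Tuple[int, int]
Step = Tuple[str, str, Tuple[str, ...], Optional[TokenRange]]


def split_ring(chunks: int) -> List[TokenRange]:
    """Split the whole token ring into `chunks` contiguous (start, end) pieces."""
    span = MAX_TOKEN - MIN_TOKEN
    bounds = [MIN_TOKEN + span * i // chunks for i in range(chunks + 1)]
    return list(zip(bounds[:-1], bounds[1:]))


def plan_repairs(node_order: List[str],
                 subrange_tables: List[str],
                 incremental_tables: List[str],
                 chunks: int) -> List[Step]:
    """Return the repair steps in the order described in the list above.

    `node_order` is a hypothetical visit order that already "jumps across
    the ring"; it is cycled if there are more chunks than nodes.
    """
    steps: List[Step] = []
    nodes = cycle(node_order)
    for token_range in split_ring(chunks):
        node = next(nodes)
        # Repair one small range for every table using subrange repair.
        if subrange_tables:
            steps.append((node, "subrange", tuple(subrange_tables), token_range))
        # Then run an incremental repair on the incremental tables
        # (not restricted to the small range).
        if incremental_tables:
            steps.append((node, "incremental", tuple(incremental_tables), None))
    return steps
```

For example, `plan_repairs(["n1", "n3", "n2"], ["ks.a"], ["ks.b"], chunks=4)` yields a subrange step followed by an incremental step on n1, then the same pair on n3, and so on. The point is that incremental repairs on the same tables get kicked off repeatedly as the cycle moves around the ring.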

It does all this in a single datacenter, since OpsCenter does cross-DC repair.

That's at least the very high level overview.

> Failed incremental repairs never cleared from pending list
> ----------------------------------------------------------
>
>                 Key: CASSANDRA-11461
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-11461
>             Project: Cassandra
>          Issue Type: Bug
>          Components: Core
>            Reporter: Adam Hattrell
>
> Set up a test cluster with 2 DCs, heavy use of LCS (not sure if that's 
> relevant).
> Kick off cassandra-stress against it.
> Kick off an automated incremental repair cycle.
> After a bit a node starts flapping which causes a few repairs to fail.  This 
> is never cleared out of pending repairs - given the keyspace is replicated to 
> all nodes it means they all have pending repairs that will never complete.  
> Repairs are basically blocked at this point.
> Given we're using incremental repairs you're now spammed with:
> "Cannot start multiple repair sessions over the same sstables"
> Cluster and logs are still available for review - message me for details.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
