[ 
https://issues.apache.org/jira/browse/CASSANDRA-193?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12723639#action_12723639
 ] 

Stu Hood commented on CASSANDRA-193:
------------------------------------

> ...what is the behavior after a fresh start if a Repair phase begins?
Well, you would have a single large invalid range (or some number of smaller 
ranges summing to the full range), which could be fetched sequentially from 
disk.
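
To make the fresh-start case concrete, here is a minimal sketch of the 
bookkeeping described above. The class name, the half-open range convention, 
and the 0..2**127 token space are all assumptions made for illustration, not 
anything taken from the actual patch.

```python
from dataclasses import dataclass, field

# Assumed token space, expressed as a half-open range [start, end).
FULL_RANGE = (0, 2**127)


@dataclass
class InvalidRanges:
    """Hypothetical tracker for token ranges whose hashes are not yet computed."""

    # After a fresh start nothing has been hashed, so the whole range is invalid.
    ranges: list = field(default_factory=lambda: [FULL_RANGE])

    def mark_valid(self, start, end):
        """Remove [start, end) from the invalid set, splitting ranges as needed."""
        remaining = []
        for lo, hi in self.ranges:
            if end <= lo or hi <= start:
                # No overlap with the newly valid span: keep the range as-is.
                remaining.append((lo, hi))
                continue
            if lo < start:
                remaining.append((lo, start))
            if end < hi:
                remaining.append((end, hi))
        self.ranges = remaining

    def total_invalid(self):
        """Total width of the token space that still needs hashing."""
        return sum(hi - lo for lo, hi in self.ranges)
```

A repair that starts right after boot would see the single full range and 
could walk it in token order; as pieces are hashed, mark_valid() splits it 
into the "smaller ranges" case.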

Rather than being completely lazy, we could (depending on how much extra load 
it would cause) hook the AEService into compactions that are happening for 
other reasons: before a compaction begins, the compactor would fetch the 
current list of invalid ranges and fill them in from the merged data. I'm 
not sure how much of a win this would be, since we probably don't want to slow 
down compactions, but if they aren't CPU bound, it shouldn't hurt.
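
A rough sketch of what that compaction hook might look like. The function 
name, the (token, bytes) row shape, and the use of MD5 are assumptions for 
illustration only. It also assumes the merged stream contains every row in 
each invalid range, which would only hold when the compaction covers all 
SSTables holding data for that range.

```python
import hashlib


def fill_invalid_ranges(merged_rows, invalid_ranges):
    """Hash rows from a compaction's merged output into the invalid ranges.

    merged_rows: iterable of (token, serialized_bytes), sorted by token.
    invalid_ranges: non-overlapping half-open (start, end) token ranges,
        fetched once before the compaction begins.
    Returns a dict mapping each range to the hex digest of its rows.
    """
    hashers = {r: hashlib.md5() for r in invalid_ranges}
    for token, data in merged_rows:
        # Linear scan keeps the sketch short; rows outside every
        # invalid range are skipped and add no hashing cost.
        for (lo, hi), hasher in hashers.items():
            if lo <= token < hi:
                hasher.update(data)
                break
    return {r: h.hexdigest() for r, h in hashers.items()}
```

The extra work per row is one hash update, and only for rows that land in an 
invalid range, so the cost shrinks as more of the range becomes valid.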

> Proactive repair
> ----------------
>
>                 Key: CASSANDRA-193
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-193
>             Project: Cassandra
>          Issue Type: New Feature
>            Reporter: Jonathan Ellis
>            Assignee: Stu Hood
>             Fix For: 0.5
>
>
> Currently cassandra supports "read repair," i.e., lazy repair when a read is 
> done.  This is better than nothing but is not sufficient for some cases (e.g. 
> catastrophic node failure where you need to rebuild all of a node's data on a 
> new machine).
> Dynamo uses merkle trees here.  This is harder for Cassandra given the CF 
> data model but I suppose we could just hash the serialized CF value.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
