[ 
https://issues.apache.org/jira/browse/CASSANDRA-193?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12759990#action_12759990
 ] 

Stu Hood commented on CASSANDRA-193:
------------------------------------

I've been working on this ticket a bit more in the past few days:
 * Added o.a.c.service.AntiEntropyService - Maintains trees for each CF, and 
accepts invalidations when values change.
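
To illustrate the "accepts invalidations" idea, here is a minimal sketch in
Python. It is not the code in the attached patch: the class name
RangeHashTree, the integer token space, the fixed depth and the use of MD5
are all assumptions made for the example. A write invalidates the leaf
covering its token plus every ancestor, so a later validation pass only has
to rehash the ranges that actually changed.

```python
import hashlib


class RangeHashTree:
    """Hypothetical per-CF hash tree over the token space [0, token_space)."""

    def __init__(self, depth, token_space):
        self.leaf_count = 2 ** depth
        self.token_space = token_space
        # Heap layout: node 1 is the root, leaves sit at
        # [leaf_count, 2 * leaf_count). None means "invalid".
        self.hashes = [None] * (2 * self.leaf_count)

    def leaf_index(self, token):
        """Heap index of the leaf whose range contains `token`."""
        return self.leaf_count + token * self.leaf_count // self.token_space

    def invalidate(self, token):
        """Mark the leaf covering `token` and all of its ancestors invalid."""
        node = self.leaf_index(token)
        while node >= 1:
            self.hashes[node] = None
            node //= 2

    def validate(self, rows):
        """Rehash only the invalid leaves, then the invalid inner nodes.

        `rows` maps token -> serialized value (bytes). Returns the number
        of leaves that had to be rehashed.
        """
        rehashed = 0
        for leaf in range(self.leaf_count, 2 * self.leaf_count):
            if self.hashes[leaf] is None:
                digest = hashlib.md5()
                for token in sorted(rows):
                    if self.leaf_index(token) == leaf:
                        digest.update(str(token).encode())
                        digest.update(rows[token])
                self.hashes[leaf] = digest.digest()
                rehashed += 1
        # Children have larger indices than their parent, so walking
        # downwards guarantees both children are valid when we get there.
        for node in range(self.leaf_count - 1, 0, -1):
            if self.hashes[node] is None:
                self.hashes[node] = hashlib.md5(
                    self.hashes[2 * node] + self.hashes[2 * node + 1]
                ).digest()
        return rehashed
```

In this sketch the first validate() call hashes every leaf, while a
validate() after a single invalidate() rehashes one leaf and the inner
nodes on its path to the root.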

Still TODO:
 * Implement TreeRequestVerbHandler/TreeResponseVerbHandler - The AEService on 
a first endpoint will periodically wake up and send a TreeRequest to a replica. 
The replica endpoint will handle the TreeRequest by validating one or all of 
its MerkleTrees, and responding with a TreeResponse. Handling the TreeResponse 
on the first endpoint will involve validating the local tree, and then 
comparing the two trees.
   * Validation is the only part that is fuzzy here: we need to iterate over 
keys in each CF (essentially, a major compaction, except that we can skip 
processing for anything that is still valid in the tree).
 * Begin implementing the actual repair step - There isn't a design for this 
part yet: any thoughts would be appreciated. The output of the 
TreeRequest/TreeResponse conversation will be a list of ranges in a given CF 
that disagree between the two endpoints.
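
The comparison at the end of the TreeRequest/TreeResponse conversation could
look roughly like the following Python sketch. It is an illustration only:
build_tree and differing_ranges are hypothetical names, and the integer
token space (assumed to divide evenly into 2**depth leaves), the fixed depth
and MD5 are assumptions, not part of the patch. Both endpoints are assumed
to build trees of the same shape, so the walk can descend only into
subtrees whose hashes differ.

```python
import hashlib


def build_tree(rows, depth, token_space):
    """Build a heap-ordered hash list from `rows` (token -> bytes).

    Node 1 is the root; leaves start at index 2**depth. Index 0 is unused.
    """
    leaves = 2 ** depth
    buckets = [hashlib.md5() for _ in range(leaves)]
    for token in sorted(rows):
        buckets[token * leaves // token_space].update(rows[token])
    tree = [b""] * leaves + [bucket.digest() for bucket in buckets]
    for node in range(leaves - 1, 0, -1):
        tree[node] = hashlib.md5(tree[2 * node] + tree[2 * node + 1]).digest()
    return tree


def differing_ranges(local, remote, depth, token_space):
    """Return the [start, end) token ranges on which the two trees disagree."""
    leaves = 2 ** depth
    width = token_space // leaves
    ranges = []
    stack = [1]
    while stack:
        node = stack.pop()
        if local[node] == remote[node]:
            continue  # identical subtree: nothing below it needs repair
        if node >= leaves:
            start = (node - leaves) * width
            ranges.append((start, start + width))
        else:
            # Push right first so the left child is visited first.
            stack.extend((2 * node + 1, 2 * node))
    return ranges
```

For example, if the replica holds a stale value for token 5 and is missing
token 13 entirely, with depth 2 and a token space of 16 the result is
[(4, 8), (12, 16)]: the list of ranges the repair step would have to act on.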

> Proactive repair
> ----------------
>
>                 Key: CASSANDRA-193
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-193
>             Project: Cassandra
>          Issue Type: New Feature
>          Components: Core
>            Reporter: Jonathan Ellis
>            Assignee: Stu Hood
>             Fix For: 0.5
>
>         Attachments: CASSANDRA-193.diff
>
>
> Currently cassandra supports "read repair," i.e., lazy repair when a read is 
> done.  This is better than nothing but is not sufficient for some cases (e.g. 
> catastrophic node failure where you need to rebuild all of a node's data on a 
> new machine).
> Dynamo uses merkle trees here.  This is harder for Cassandra given the CF 
> data model but I suppose we could just hash the serialized CF value.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
