[ https://issues.apache.org/jira/browse/CASSANDRA-193?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12777171#action_12777171 ]

Stu Hood commented on CASSANDRA-193:
------------------------------------

> Another complexity arises because you assume that the ranges in 2 different 
> Merkle trees are different.
This is another decision that arose from the idea of maintaining the tree over 
a longer period by invalidating ranges and only compacting the ranges that had 
changed recently; the replicas would then end up with divergent splits in their 
trees. If we go back to assuming that we never want to maintain a tree between 
compactions, then compact() and invalidate() could be removed, and 
differences() could be simplified.

But given that we are hesitant to trigger a major compaction for every repair, 
maintaining the tree between repairs becomes a more interesting option.
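To make the invalidate/compact/differences lifecycle concrete, here is a minimal, hypothetical sketch. It is not Cassandra's MerkleTree implementation: the class and method names merely mirror the ones discussed above, the toy tree uses a fixed integer token space with identical splits on both replicas (so it sidesteps the divergent-splits problem entirely), and the row store is just a dict. The point it illustrates is that invalidate() marks only the ranges touched by a write as dirty, so compact() can rehash those ranges while reusing cached hashes for clean subtrees, letting the tree be maintained between repairs instead of rebuilt from scratch.

```python
import hashlib

def h(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()

class Node:
    def __init__(self, lo, hi):
        self.lo, self.hi = lo, hi          # token range [lo, hi)
        self.hash = None                   # None => invalidated (dirty)
        self.left = self.right = None

class MerkleTree:
    """Toy hash tree over an integer token space (hypothetical, not Cassandra's)."""
    def __init__(self, lo, hi, depth):
        self.root = self._build(lo, hi, depth)

    def _build(self, lo, hi, depth):
        node = Node(lo, hi)
        if depth > 0:
            mid = (lo + hi) // 2
            node.left = self._build(lo, mid, depth - 1)
            node.right = self._build(mid, hi, depth - 1)
        return node

    def invalidate(self, token, node=None):
        """Mark every range containing `token` dirty (root down to a leaf)."""
        node = node or self.root
        node.hash = None
        if node.left:
            child = node.left if token < node.left.hi else node.right
            self.invalidate(token, child)

    def compact(self, rows):
        """Recompute hashes, but only for dirty ranges; clean subtrees are reused."""
        def rehash(node):
            if node.hash is not None:
                return node.hash           # clean subtree: cached hash reused
            if node.left:
                node.hash = h(rehash(node.left) + rehash(node.right))
            else:
                data = [v for k, v in sorted(rows.items())
                        if node.lo <= k < node.hi]
                node.hash = h(repr(data).encode())
            return node.hash
        return rehash(self.root)

def differences(a, b):
    """Leaf ranges where two trees with identical splits disagree."""
    out = []
    def walk(x, y):
        if x.hash == y.hash:
            return                         # matching subtree: prune descent
        if x.left:
            walk(x.left, y.left)
            walk(x.right, y.right)
        else:
            out.append((x.lo, x.hi))
    walk(a.root, b.root)
    return out
```

With identical splits, differences() only has to compare hashes pairwise and descend where they disagree; supporting divergent splits, as in the maintained-tree scenario above, would force it to align overlapping ranges from the two trees first, which is exactly the complexity being weighed in this comment.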

> Proactive repair
> ----------------
>
>                 Key: CASSANDRA-193
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-193
>             Project: Cassandra
>          Issue Type: New Feature
>          Components: Core
>            Reporter: Jonathan Ellis
>            Assignee: Stu Hood
>             Fix For: 0.5
>
>         Attachments: 193-1-tree-preparation.diff, 193-2-tree.diff, 
> 193-3-aes-preparation.diff, 193-4-aes.diff, mktree-and-binary-tree.png
>
>
> Currently cassandra supports "read repair," i.e., lazy repair when a read is 
> done.  This is better than nothing but is not sufficient for some cases (e.g. 
> catastrophic node failure where you need to rebuild all of a node's data on a 
> new machine).
> Dynamo uses merkle trees here.  This is harder for Cassandra given the CF 
> data model but I suppose we could just hash the serialized CF value.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
