[ 
https://issues.apache.org/jira/browse/CASSANDRA-193?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12723089#action_12723089
 ] 

Jonathan Ellis commented on CASSANDRA-193:
------------------------------------------

The more I think about this the less convinced I am that the 
partially-invalidated live tree is going to be worth the overhead of 
maintaining it (and initializing it on startup).

If you instead just create a mini-merkle tree from the first N keys and 
exchange that with the replica nodes, then repeat for the next N, you still get 
a big win on network traffic (which is the main concern here), but you have no 
startup overhead and no complicated extra maintenance to perform on insert. 
You also get better performance in the worst case and (probably) in the average 
case, since you are avoiding random reads in favor of (a potentially greater 
number of) streaming reads. Assuming a constant workload profile (i.e. the same 
proportion of keys being overwritten), that is always going to be a win for the 
streaming case.
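
A minimal Python sketch of the mini-merkle tree idea, for illustration only. 
The names build_mini_tree and differing_leaves are hypothetical, SHA-256 is an 
arbitrary choice of hash, and the comparison assumes both replicas hold the 
same keys in the batch (missing keys would shift leaf positions and need extra 
handling not shown here).

```python
import hashlib


def _h(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()


def build_mini_tree(rows):
    """Build a small Merkle tree over one batch of rows.

    rows: list of (key, value) byte pairs, already in key order.
    Returns a list of levels: levels[0] holds the leaf hashes and
    levels[-1] holds the single root hash.
    """
    # Length-prefix the key so the key/value boundary is unambiguous.
    level = [_h(len(k).to_bytes(4, "big") + k + v) for k, v in rows] or [_h(b"")]
    levels = [level]
    while len(level) > 1:
        if len(level) % 2:
            # Odd node count: pair the last node with itself (on a copy).
            level = level + [level[-1]]
        level = [_h(level[i] + level[i + 1]) for i in range(0, len(level), 2)]
        levels.append(level)
    return levels


def differing_leaves(a, b):
    """Walk two trees from the root down; return indices of leaves that differ.

    Assumption: both trees were built over the same number of rows.
    """
    if len(a[0]) != len(b[0]):
        raise ValueError("trees cover a different number of rows")
    top = len(a) - 1
    suspects = [0] if a[top][0] != b[top][0] else []
    for depth in range(top - 1, -1, -1):
        narrowed = []
        for i in suspects:
            for child in (2 * i, 2 * i + 1):
                if child < len(a[depth]) and a[depth][child] != b[depth][child]:
                    narrowed.append(child)
        suspects = narrowed
    return suspects


local_rows = [(b"k1", b"v1"), (b"k2", b"v2"), (b"k3", b"v3")]
replica_rows = [(b"k1", b"v1"), (b"k2", b"stale"), (b"k3", b"v3")]
local_tree = build_mini_tree(local_rows)
replica_tree = build_mini_tree(replica_rows)
mismatched = differing_leaves(local_tree, replica_tree)
```

When the two roots match, nothing beyond the root needs to be compared for 
that batch, which is where the network saving would come from.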

Implementation detail: you'd want to add an internal message [merkle startkey] 
where startkey is initially "" and after each iteration you update it to the 
N'th key _after_ merging any missing ones.
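
The startkey iteration could look roughly like the following Python sketch. 
Everything here is an assumption made for illustration: the replicas are plain 
in-memory dicts with non-empty string keys, the [merkle startkey] message is 
modeled as a direct call to next_batch on the replica's store, and only 
missing keys are merged (resolving conflicting values for the same key is out 
of scope).

```python
import hashlib


def next_batch(store, start_key, n):
    """Return up to n (key, value) pairs with key > start_key, in key order."""
    keys = sorted(k for k in store if k > start_key)[:n]
    return [(k, store[k]) for k in keys]


def batch_root(batch):
    """Merkle root over one batch; an empty batch hashes to a fixed value."""
    level = [hashlib.sha256(f"{len(k)}:{k}{v}".encode()).digest() for k, v in batch]
    if not level:
        return hashlib.sha256(b"").digest()
    while len(level) > 1:
        if len(level) % 2:
            level.append(level[-1])  # pair an odd last node with itself
        level = [
            hashlib.sha256(level[i] + level[i + 1]).digest()
            for i in range(0, len(level), 2)
        ]
    return level[0]


def repair(local, remote, n):
    """Walk both key spaces n keys at a time; return the number of rounds."""
    start_key = ""  # sorts before every non-empty key
    rounds = 0
    while True:
        mine = next_batch(local, start_key, n)
        theirs = next_batch(remote, start_key, n)  # stands in for [merkle start_key]
        if not mine and not theirs:
            return rounds
        rounds += 1
        if batch_root(mine) != batch_root(theirs):
            # Roots differ: copy over keys that one side is missing.
            for k, v in theirs:
                local.setdefault(k, v)
            for k, v in mine:
                remote.setdefault(k, v)
        # Advance to the N'th key of the merged batch (or the last key if the
        # merged batch has fewer than n keys).
        merged = sorted({k for k, _ in mine} | {k for k, _ in theirs})
        start_key = merged[:n][-1]


local = {"a": "1", "c": "3", "e": "5"}
remote = {"b": "2", "c": "3", "d": "4", "f": "6"}
rounds = repair(local, remote, 2)
```

Advancing to the N'th key of the merged batch, not of either side's own 
batch, means keys that were just copied across but sort after that point are 
re-checked in the next round instead of being skipped.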

> Proactive repair
> ----------------
>
>                 Key: CASSANDRA-193
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-193
>             Project: Cassandra
>          Issue Type: New Feature
>            Reporter: Jonathan Ellis
>            Assignee: Stu Hood
>             Fix For: 0.5
>
>
> Currently cassandra supports "read repair," i.e., lazy repair when a read is 
> done.  This is better than nothing but is not sufficient for some cases (e.g. 
> catastrophic node failure where you need to rebuild all of a node's data on a 
> new machine).
> Dynamo uses merkle trees here.  This is harder for Cassandra given the CF 
> data model but I suppose we could just hash the serialized CF value.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
