[ https://issues.apache.org/jira/browse/CASSANDRA-193?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12723696#action_12723696 ]

Stu Hood commented on CASSANDRA-193:
------------------------------------

> Then the question becomes, is it worth trying to keep (partial) results of 
> that scan in memory to avoid re-doing the work next time around. If writes 
> are randomly distributed across the range then ISTM the answer is a clear No, 
> but I'm not sure how close real-world workloads would come to that. 
You're right that the tree is basically a 'range hash cache', but I don't think 
that writes will be randomly distributed. Especially since we allow complex 
values, I think people are more likely to have 'hot' keys. Adding in the 
OrderPreservingPartitioner makes it even more likely to have hot ranges.
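
To illustrate the skew argument, here's a toy back-of-the-envelope (plain Java, made-up numbers, nothing from an actual patch): model the tree as 1024 cached leaf ranges, where a write only invalidates the leaf covering its token, and count how many leaves a workload dirties. A handful of hot keys only ever invalidates a handful of leaves, while uniformly random writes eventually invalidate nearly all of them.

import java.util.BitSet;
import java.util.Random;

public class RangeHashCacheSketch {
    static int staleLeaves(int leaves, int writes, int distinctKeys, long seed) {
        BitSet stale = new BitSet(leaves);
        Random rnd = new Random(seed);
        for (int i = 0; i < writes; i++) {
            int key = rnd.nextInt(distinctKeys);
            // Map the key to the leaf range covering its token (toy hash).
            int leaf = Math.floorMod(Integer.toString(key).hashCode(), leaves);
            stale.set(leaf);
        }
        return stale.cardinality();
    }

    public static void main(String[] args) {
        int leaves = 1024, writes = 100_000;
        System.out.println("hot keys (16 distinct):    "
                + staleLeaves(leaves, writes, 16, 1) + " / " + leaves + " leaves stale");
        System.out.println("random keys (1M distinct): "
                + staleLeaves(leaves, writes, 1_000_000, 1) + " / " + leaves + " leaves stale");
    }
}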

> Does it make sense to start with a non-caching version like I describe?
Perhaps: we could initialize a new MerkleTree at repair time, use the range 
hashing API I've described, and throw it away at the end of the repair. Next, 
we could implement maintaining/invalidating the tree between repairs. I'm not 
sure how much simpler this is (since the invalidation of ranges is probably the 
simplest part of the whole deal).
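
To make the throwaway option concrete, a minimal sketch (all of the names here are placeholders, since the MerkleTree / range hashing API is still being designed in this ticket): build the tree once per repair by hashing each range, diff it against the peer's tree to find the ranges that need streaming, and then drop it.

import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class ThrowawayRepairSketch {
    /** Hypothetical leaf: the hash of all rows whose tokens fall in (left, right]. */
    static class RangeHash {
        final long left, right;
        final byte[] hash;
        RangeHash(long left, long right, byte[] hash) {
            this.left = left; this.right = right; this.hash = hash;
        }
    }

    /** Hypothetical stand-in for the range hashing API discussed above. */
    interface RangeHasher {
        byte[] hashRange(long left, long right);
    }

    /** Build the tree from scratch with one scan; it is discarded after the repair. */
    static List<RangeHash> buildTree(RangeHasher hasher, long[] tokenBoundaries) {
        List<RangeHash> tree = new ArrayList<RangeHash>();
        for (int i = 0; i + 1 < tokenBoundaries.length; i++) {
            long l = tokenBoundaries[i], r = tokenBoundaries[i + 1];
            tree.add(new RangeHash(l, r, hasher.hashRange(l, r)));
        }
        return tree;
    }

    /** Ranges whose hashes disagree are the ones that need streaming.
     *  Assumes both trees cover the same token ranges in the same order. */
    static List<RangeHash> diff(List<RangeHash> mine, List<RangeHash> theirs) {
        List<RangeHash> mismatched = new ArrayList<RangeHash>();
        for (int i = 0; i < mine.size(); i++) {
            if (!Arrays.equals(mine.get(i).hash, theirs.get(i).hash)) {
                mismatched.add(mine.get(i));
            }
        }
        return mismatched;
    }
}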

> Incidentally, I don't think we should worry about the kind of locking you 
> mentioned.
You're right: since a separate thread/agent/executor is maintaining the tree, 
the locking should be completely unnecessary. Whenever we're performing a 
repair, we're not accepting invalidations, so we're looking at a snapshot of 
the tree.
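
Roughly what I have in mind (hypothetical class, not a patch): everything that touches the tree goes through a single-threaded executor, so no locks are needed, and a repair task running on that executor naturally sees a snapshot because invalidations just queue up behind it.

import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

public class TreeMaintainerSketch {
    // One thread owns the tree, so none of the tree operations need locks.
    private final ExecutorService maintainer = Executors.newSingleThreadExecutor();

    /** Write path: schedule the leaf range covering this token to be marked stale. */
    public void invalidate(final long token) {
        maintainer.execute(new Runnable() {
            public void run() {
                // ... mark the leaf covering 'token' as needing a re-hash ...
            }
        });
    }

    /**
     * Repair: runs on the same thread, so no invalidation can interleave with it
     * and the comparison sees a consistent snapshot of the tree; invalidations
     * submitted in the meantime simply wait in the executor's queue.
     */
    public void runRepair(Runnable hashAndCompareRanges) {
        maintainer.execute(hashAndCompareRanges);
    }
}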

> Proactive repair
> ----------------
>
>                 Key: CASSANDRA-193
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-193
>             Project: Cassandra
>          Issue Type: New Feature
>            Reporter: Jonathan Ellis
>            Assignee: Stu Hood
>             Fix For: 0.5
>
>
> Currently Cassandra supports "read repair," i.e., lazy repair when a read is 
> done.  This is better than nothing but is not sufficient for some cases (e.g. 
> catastrophic node failure where you need to rebuild all of a node's data on a 
> new machine).
> Dynamo uses Merkle trees here.  This is harder for Cassandra given the CF 
> data model but I suppose we could just hash the serialized CF value.
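
To sketch what "hash the serialized CF value" could look like (helper names are made up; the real bytes would come from Cassandra's ColumnFamily serialization): digest each row's serialized bytes and fold the per-row digests, in key order, into one hash per range, so two replicas only need to exchange the ranges whose hashes differ.

import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.Map;
import java.util.TreeMap;

public class CfRangeHashSketch {
    /**
     * Hash one token range by folding per-row digests in key order.  The map
     * values stand in for whatever bytes ColumnFamily serialization produces.
     */
    static byte[] hashRange(TreeMap<String, byte[]> serializedRowsInRange) throws NoSuchAlgorithmException {
        MessageDigest rangeDigest = MessageDigest.getInstance("MD5");
        for (Map.Entry<String, byte[]> row : serializedRowsInRange.entrySet()) {
            MessageDigest rowDigest = MessageDigest.getInstance("MD5");
            rowDigest.update(row.getKey().getBytes(StandardCharsets.UTF_8));
            rowDigest.update(row.getValue()); // serialized CF bytes for this key
            rangeDigest.update(rowDigest.digest());
        }
        return rangeDigest.digest();
    }
}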

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
