[ 
https://issues.apache.org/jira/browse/CASSANDRA-193?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12780912#action_12780912
 ] 

Stu Hood edited comment on CASSANDRA-193 at 11/22/09 3:07 AM:
--------------------------------------------------------------

EDIT: After a little more consideration, the caching bug had nothing to do with 
our hash function. It was a mismatch between the binary tree and the b-tree 
we are using to store it. To be honest, I don't want to merge something so 
complex that even the person who created it still has trouble reasoning about 
it.

I'm going to refactor the b-tree into a binary tree tomorrow.

      was (Author: stuhood):
    The code comment explains it better, but to cache partially computed 
values you need something like XOR, which is associative: 
(1 ^ 2) ^ 3 == 1 ^ (2 ^ 3). MD5 has to be computed sequentially over all of 
its inputs, so it can be used for leaves, but not for inner nodes.
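
To make the associativity argument concrete, here is a minimal sketch in 
Python. It is not the code from the attached patches: the names leaf_hash and 
xor_combine and the sample row values are hypothetical, chosen only for 
illustration. It assumes leaves are hashed with MD5 and inner nodes combine 
their children with XOR, as described above.

```python
import hashlib
from functools import reduce


def leaf_hash(value: bytes) -> int:
    """Hash one leaf's serialized value with MD5, returned as an integer."""
    return int.from_bytes(hashlib.md5(value).digest(), "big")


def xor_combine(hashes) -> int:
    """Combine child hashes with XOR; grouping does not affect the result."""
    return reduce(lambda a, b: a ^ b, hashes, 0)


# Hypothetical leaf values standing in for serialized rows.
rows = [b"row-a", b"row-b", b"row-c", b"row-d"]
leaves = [leaf_hash(r) for r in rows]

# Inner nodes can be cached independently and combined later.
left = xor_combine(leaves[:2])
right = xor_combine(leaves[2:])
root = left ^ right
assert root == xor_combine(leaves)

# When one leaf changes, only its own subtree needs recomputing:
# XOR the old leaf hash out and the new one in, then reuse the cached sibling.
new_leaf = leaf_hash(b"row-b-v2")
new_left = left ^ leaves[1] ^ new_leaf
new_root = new_left ^ right
assert new_root == xor_combine([leaves[0], new_leaf, leaves[2], leaves[3]])

# MD5 over the whole input cannot be rebuilt from an MD5 of a prefix:
# hashing a cached partial digest plus the rest gives a different value.
sequential = hashlib.md5(b"row-a" + b"row-b" + b"row-c").digest()
from_parts = hashlib.md5(hashlib.md5(b"row-a" + b"row-b").digest() + b"row-c").digest()
assert sequential != from_parts
```

The sketch only illustrates why an associative combiner allows cached inner 
nodes to be reused. It says nothing about whether XOR is the right choice for 
the real tree.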
  
> Proactive repair
> ----------------
>
>                 Key: CASSANDRA-193
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-193
>             Project: Cassandra
>          Issue Type: New Feature
>          Components: Core
>            Reporter: Jonathan Ellis
>            Assignee: Stu Hood
>             Fix For: 0.5
>
>         Attachments: 193-1-tree-preparation.diff, 
> 193-1-tree-preparation.diff, 193-2-tree.diff, 193-2-tree.diff, 
> 193-3-aes-preparation.diff, 193-3-aes-preparation.diff, 193-4-aes.diff, 
> 193-4-aes.diff, 193-5-manual-repair.diff, 193-6-inverted-filter.diff, 
> 193-6-inverted-filter.diff, 193-7-disable-caching-and-fix-minimum-token.diff, 
> 193-breakdown.txt, mktree-and-binary-tree.png
>
>
> Currently Cassandra supports "read repair," i.e., lazy repair when a read is 
> done.  This is better than nothing but is not sufficient for some cases (e.g. 
> catastrophic node failure where you need to rebuild all of a node's data on a 
> new machine).
> Dynamo uses Merkle trees here.  This is harder for Cassandra given the CF 
> data model, but I suppose we could just hash the serialized CF value.

