[ https://issues.apache.org/jira/browse/CASSANDRA-193?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12775105#action_12775105 ]
Jun Rao commented on CASSANDRA-193:
-----------------------------------
Started reviewing this patch. Here is a high-level question. From the code, the
process roughly works as follows (sketched in code after the list):
1. Each node N periodically computes a Merkle tree for rows in each new SSTable
generated through compaction.
2. The Merkle tree is sent to and registered with other nodes that share key
ranges with N.
3. The locally computed Merkle tree is compared with the registered remote
Merkle trees; any difference triggers a repair.
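For concreteness, here is a minimal sketch of steps 1 and 3 under assumed
names (hash, leaves, root are illustrative, not Cassandra's actual MerkleTree
API): each leaf hashes one serialized row, each parent hashes its two
children, and two key ranges agree exactly when their root hashes agree.

    import java.security.MessageDigest;
    import java.security.NoSuchAlgorithmException;
    import java.util.Arrays;
    import java.util.List;

    public class MerkleSketch {
        // Hash helper: concatenates the parts and digests them with SHA-1.
        static byte[] hash(byte[]... parts) {
            try {
                MessageDigest md = MessageDigest.getInstance("SHA-1");
                for (byte[] p : parts) md.update(p);
                return md.digest();
            } catch (NoSuchAlgorithmException e) {
                throw new RuntimeException(e);
            }
        }

        // Step 1: one leaf per row, hashing the serialized row value.
        static byte[][] leaves(List<byte[]> serializedRows) {
            byte[][] out = new byte[serializedRows.size()][];
            for (int i = 0; i < out.length; i++) out[i] = hash(serializedRows.get(i));
            return out;
        }

        // Collapse levels pairwise until a single root hash remains
        // (an odd node out is paired with itself).
        static byte[] root(byte[][] level) {
            while (level.length > 1) {
                byte[][] next = new byte[(level.length + 1) / 2][];
                for (int i = 0; i < next.length; i++) {
                    byte[] left = level[2 * i];
                    byte[] right = (2 * i + 1 < level.length) ? level[2 * i + 1] : left;
                    next[i] = hash(left, right);
                }
                level = next;
            }
            return level[0];
        }

        // Step 3: equal roots mean the two ranges agree; on a mismatch the
        // real comparison would recurse to locate the differing subrange.
        public static void main(String[] args) {
            List<byte[]> local  = List.of("row1".getBytes(), "row2".getBytes());
            List<byte[]> remote = List.of("row1".getBytes(), "row2-stale".getBytes());
            boolean match = Arrays.equals(root(leaves(local)), root(leaves(remote)));
            System.out.println(match ? "ranges agree" : "difference found -> repair");
        }
    }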
Since compaction is triggered independently on each node, the set of rows in a
compacted SSTable on one node is unlikely to match the set of rows in any
single SSTable on a neighboring node. Won't the above approach trigger too
many unnecessary repairs? The toy example below illustrates the concern.
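Reusing the helpers from the MerkleSketch sketch above (again, assumed names;
this fragment presumes that class and java.util.List are available):

    // Both replicas hold identical rows k1..k3, but node A compacted them
    // into one SSTable while node B still has them split across two.
    List<byte[]> rows = List.of("k1".getBytes(), "k2".getBytes(), "k3".getBytes());
    byte[] treeA  = MerkleSketch.root(MerkleSketch.leaves(rows));               // node A: one SSTable
    byte[] treeB1 = MerkleSketch.root(MerkleSketch.leaves(rows.subList(0, 2))); // node B: SSTable 1
    byte[] treeB2 = MerkleSketch.root(MerkleSketch.leaves(rows.subList(2, 3))); // node B: SSTable 2
    // treeA equals neither treeB1 nor treeB2 even though the data is identical,
    // so comparing per-SSTable trees flags a spurious difference and would
    // schedule a repair for rows that already match.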
> Proactive repair
> ----------------
>
> Key: CASSANDRA-193
> URL: https://issues.apache.org/jira/browse/CASSANDRA-193
> Project: Cassandra
> Issue Type: New Feature
> Components: Core
> Reporter: Jonathan Ellis
> Assignee: Stu Hood
> Fix For: 0.5
>
> Attachments: 193-1-tree-preparation.diff, 193-2-tree.diff,
> 193-3-aes-preparation.diff, 193-4-aes.diff
>
>
> Currently Cassandra supports "read repair," i.e., lazy repair when a read is
> done. This is better than nothing but is not sufficient for some cases (e.g.
> catastrophic node failure where you need to rebuild all of a node's data on a
> new machine).
> Dynamo uses Merkle trees here. This is harder for Cassandra given the CF
> data model, but I suppose we could just hash the serialized CF value.