[
https://issues.apache.org/jira/browse/CASSANDRA-6758?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13910212#comment-13910212
]
Benedict commented on CASSANDRA-6758:
-------------------------------------
This doesn't seem like a bad idea at all. The only problem that I can see is
that the "first half" of the repair process is actually one of the more
expensive actions a cluster can perform, as every node needs to walk all of
its data to compute its Merkle trees. I wonder if it would be possible to
calculate and save an abbreviated Merkle tree when writing each sstable that
could be combined cheaply to give this answer.
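To make the "combined cheaply" part concrete, here is a minimal sketch of what such an abbreviated per-sstable structure might look like. None of this is taken from Cassandra's actual MerkleTree class: the tree is flattened to a fixed array of leaf buckets for brevity, the XOR combine is an assumption (chosen because it is order-independent, which is what makes merging saved trees cheap), and all names are hypothetical.

{code:java}
/**
 * Hypothetical "abbreviated Merkle tree": a fixed number of token-range
 * buckets, each holding an order-independent hash of the rows that fall
 * into it. Because XOR is commutative and associative, a tree saved
 * alongside each sstable can be merged with others without re-reading
 * any data. Illustrative sketch only, not Cassandra's MerkleTree.
 */
final class AbbreviatedTree
{
    private final long[] buckets; // one combined hash per token range

    AbbreviatedTree(int depth)
    {
        this.buckets = new long[1 << depth];
    }

    /** Fold a row's hash into the bucket owning its token. */
    void addRow(long token, long rowHash)
    {
        int bucket = (int) Long.remainderUnsigned(token, buckets.length);
        buckets[bucket] ^= rowHash;
    }

    /** Cheap merge: XOR the saved tree of another sstable, bucket by bucket. */
    AbbreviatedTree mergeWith(AbbreviatedTree other)
    {
        AbbreviatedTree merged = new AbbreviatedTree(Integer.numberOfTrailingZeros(buckets.length));
        for (int i = 0; i < buckets.length; i++)
            merged.buckets[i] = this.buckets[i] ^ other.buckets[i];
        return merged;
    }

    /** Count buckets that differ between two replicas' merged trees (assumes equal depth). */
    static int differingBuckets(AbbreviatedTree a, AbbreviatedTree b)
    {
        int diff = 0;
        for (int i = 0; i < a.buckets.length; i++)
            if (a.buckets[i] != b.buckets[i])
                diff++;
        return diff;
    }
}
{code}

A real implementation would need to handle overwrites and tombstones across sstables; a pure XOR of row hashes cannot distinguish a row written twice from one never written. The point here is only that an order-independent combine makes the per-sstable trees mergeable in O(buckets) rather than requiring a full data walk.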
> Measure data consistency in the cluster
> ---------------------------------------
>
> Key: CASSANDRA-6758
> URL: https://issues.apache.org/jira/browse/CASSANDRA-6758
> Project: Cassandra
> Issue Type: New Feature
> Reporter: Jimmy Mårdell
> Priority: Minor
>
> Running multi-DC Cassandra can be a challenge, as the cluster easily tends
> to get out of sync. We have been thinking it would be nice to measure how
> out of sync a cluster is and expose those metrics somehow.
> One idea would be to just run the first half of the repair process and output
> the result of the differencer. If you use the Random or Murmur3 partitioner,
> it should be enough to calculate the Merkle tree over a small subset of the
> ring, as the result can be extrapolated (see the sketch after the quoted
> description).
> This could be exposed in nodetool, either as a separate command or perhaps
> as a dry-run flag to repair?
> Not sure about the output format. I think it would be nice to have one value
> ("% consistent"?) within a DC, and also one value for every pair of DCs,
> perhaps?
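On the extrapolation point, the arithmetic itself is simple. The sketch below shows one hypothetical way the validation output could be turned into the proposed "% consistent" figure. It assumes the pass can report how many partitions it scanned and how many mismatched; those are made-up quantities (the real differencer reports mismatching token ranges, not counts), so treat this as illustration only.

{code:java}
/**
 * Illustrative arithmetic for the proposed "% consistent" metric,
 * extrapolated from a validation pass over a sampled fraction of the ring.
 * Hypothetical inputs; not the actual differencer output format.
 */
final class ConsistencyEstimate
{
    static double percentConsistent(long partitionsScanned, long partitionsMismatched)
    {
        if (partitionsScanned == 0)
            return 100.0;
        // With the Random or Murmur3 partitioner, tokens are uniformly
        // distributed, so the mismatch rate in the sampled subrange
        // estimates the rate over the whole ring.
        return 100.0 * (1.0 - (double) partitionsMismatched / partitionsScanned);
    }

    public static void main(String[] args)
    {
        // e.g. validated a 1% slice of the ring: 2,000,000 partitions
        // scanned, 15,000 mismatched between replicas
        System.out.printf("%.2f%% consistent%n", percentConsistent(2_000_000, 15_000));
    }
}
{code}

The per-DC and per-DC-pair values suggested above would fall out of the same calculation, restricted to differences observed within one DC or between replicas in a given pair of DCs.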