[ 
https://issues.apache.org/jira/browse/CASSANDRA-6758?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13910830#comment-13910830
 ] 

Jimmy Mårdell commented on CASSANDRA-6758:
------------------------------------------

A range is not the same as a leaf, is it? If two leaves with the same parent 
both mismatch, that is still reported as only one range (I think?). So it's 
hard to tell from the logs how much data was out of sync.
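
To illustrate the distinction, here is a minimal sketch. It is not Cassandra's
actual differencer: the function name diff_ranges and the flat list-of-leaf-hashes
representation are hypothetical, and the merge rule (two fully mismatching
siblings are reported as their parent's range) is an assumption taken from the
description above.

```python
def diff_ranges(a, b, lo=0, hi=None):
    """Return half-open (lo, hi) leaf-index ranges where a and b differ.

    a and b are equal-length lists of leaf hashes whose length is a power
    of two (a toy stand-in for two merkle trees of the same depth).
    """
    if hi is None:
        hi = len(a)
    if a[lo:hi] == b[lo:hi]:
        return []                      # subtree is consistent
    if hi - lo == 1:
        return [(lo, hi)]              # a single mismatching leaf
    mid = (lo + hi) // 2
    left = diff_ranges(a, b, lo, mid)
    right = diff_ranges(a, b, mid, hi)
    # Assumption: if both children are entirely inconsistent, report the
    # parent as one range instead of two separate ones.
    if left == [(lo, mid)] and right == [(mid, hi)]:
        return [(lo, hi)]
    return left + right


tree_a = [1, 2, 3, 4, 5, 6, 7, 8]
tree_b = [1, 2, 9, 9, 5, 6, 7, 0]      # leaves 2, 3 and 7 differ

ranges = diff_ranges(tree_a, tree_b)
mismatching_leaves = sum(hi - lo for lo, hi in ranges)
# ranges == [(2, 4), (7, 8)]: two ranges, but three mismatching leaves
```

Under these assumptions, a count of ranges alone understates how many leaves
differ, which is the point made above.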

We've had problems in the past with overstreaming causing serious performance 
problems. Had we known the cluster was that far out of sync, we might have 
taken extra measures before running the repair. With subrange repairs and 
CASSANDRA-6713, perhaps this will no longer be an issue.


> Measure data consistency in the cluster
> ---------------------------------------
>
>                 Key: CASSANDRA-6758
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-6758
>             Project: Cassandra
>          Issue Type: New Feature
>            Reporter: Jimmy Mårdell
>            Priority: Minor
>
> Running multi-DC Cassandra can be a challenge, as the cluster tends to get 
> out of sync easily. We have been thinking it would be nice to measure how out 
> of sync a cluster is and expose those metrics somehow.
> One idea would be to run just the first half of the repair process and output 
> the result of the differencer. If you use the Random or Murmur3 partitioner, 
> it should be enough to calculate the merkle tree over a small subset of the 
> ring, as the result can be extrapolated.
> This could be exposed in nodetool. Either a separate command or perhaps a 
> dry-run flag to repair?
> Not sure about the output format. I think it would be nice to have one value 
> ("% consistent"?) within a DC, and perhaps also one value for every pair of 
> DCs?
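
A rough sketch of what such a "% consistent" value could look like, assuming the
metric is simply the share of sampled merkle-tree leaves that match, and that a
uniform sample of the ring is representative (which is what the Random and
Murmur3 partitioners would be relied on for). The function names, the DC names,
and the counts below are all made up for illustration; nothing here is an
existing nodetool or Cassandra API.

```python
def percent_consistent(mismatching_leaves, sampled_leaves):
    """Share of sampled leaves that matched, as a percentage.

    Assumption: the sampled subset of the ring is representative, so this
    value is used directly as the estimate for the whole ring.
    """
    if sampled_leaves <= 0:
        raise ValueError("sampled_leaves must be positive")
    if not 0 <= mismatching_leaves <= sampled_leaves:
        raise ValueError("mismatching_leaves must be within the sample size")
    return 100.0 * (sampled_leaves - mismatching_leaves) / sampled_leaves


def pairwise_report(mismatches_by_pair, sampled_leaves):
    """Map each (dc_a, dc_b) pair to its estimated % consistent."""
    return {
        pair: percent_consistent(count, sampled_leaves)
        for pair, count in mismatches_by_pair.items()
    }


# Hypothetical differencer output over a sample of 8 leaves per DC pair.
report = pairwise_report(
    {("dc1", "dc2"): 3, ("dc1", "dc3"): 0, ("dc2", "dc3"): 1},
    sampled_leaves=8,
)
# report[("dc1", "dc2")] == 62.5
```

Counting leaves rather than ranges keeps the value independent of how adjacent
mismatches happen to be merged.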



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)
