[ 
https://issues.apache.org/jira/browse/CASSANDRA-2698?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13609625#comment-13609625
 ] 

Benedict commented on CASSANDRA-2698:
-------------------------------------

Hi,

I've uploaded a patch for this issue (patch.diff - apologies for the
potentially future-clashing name). Logging is performed in two places, one on
each side of a comparison:

1) On the requesting node in AntiEntropyService.Difference.run(), after the 
MerkleTree difference is calculated and before the StreamingRepairTask is 
created
2) On the source node, on which StreamingRepairTask is run, in 
StreamOut.createPendingFiles()

In both cases we log, at debug level, a sample of the largest ranges followed
by a histogram of the range size distribution. The sample is produced by
inserting each range directly into an EstimatedHistogram and calling the new
logSummary() method on it; the distribution is produced by calling the new
groupByFrequency() method on that same histogram, which yields a second
histogram of how frequently each size occurs in the original (on which we
simply call log()).
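
To make the two outputs concrete, here is a rough, self-contained sketch of the
idea. It is not the code in the patch and does not use EstimatedHistogram,
logSummary() or groupByFrequency(); RangeSizeSummary, largest(), distribution()
and the power-of-two bucketing are all hypothetical names and choices made up
for illustration only.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

/** Hypothetical illustration only; not the EstimatedHistogram changes in the patch. */
final class RangeSizeSummary {

    /** Returns up to n of the largest range sizes, biggest first. */
    static List<Long> largest(List<Long> sizes, int n) {
        List<Long> sorted = new ArrayList<>(sizes);
        sorted.sort(Collections.reverseOrder());
        return new ArrayList<>(sorted.subList(0, Math.min(n, sorted.size())));
    }

    /**
     * Counts how many ranges fall into each size bucket.
     * Key = inclusive upper bound of the bucket (a power of two), value = range count.
     */
    static Map<Long, Integer> distribution(List<Long> sizes) {
        Map<Long, Integer> buckets = new TreeMap<>();
        for (long size : sizes) {
            buckets.merge(bucketBound(size), 1, Integer::sum);
        }
        return buckets;
    }

    /** Smallest power of two that is >= size (assumes size is well below 2^62). */
    static long bucketBound(long size) {
        if (size <= 1) {
            return 1;
        }
        return Long.highestOneBit(size - 1) << 1;
    }

    public static void main(String[] args) {
        // Made-up range sizes in bytes, purely for demonstration.
        List<Long> sizes = Arrays.asList(100L, 120L, 900L, 5000L, 64L);
        System.out.println("largest ranges: " + largest(sizes, 2));
        System.out.println("size distribution: " + distribution(sizes));
    }
}
```

The sample of largest ranges shows whether a handful of ranges dominate the
transfer, while the bucketed counts show the overall shape of the distribution.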

In case 1, we construct the MerkleTree to include a size taken from the 
AbstractCompactedRow we compute the hash from, and use this in 
MerkleTree.difference to estimate the size of mismatching ranges. This estimate
tends to come out around 15% lower than the size reported by StreamOut. One design
decision of note here: instead of modifying AbstractCompactedRow to return a 
size (which would be invasive and in some cases incur an unnecessary penalty) 
we use a custom implementation of MessageDigest that counts the number of bytes 
provided to it.
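
For anyone unfamiliar with the technique, a byte-counting digest can be
sketched roughly as below. This is a minimal illustration rather than the class
in the patch: the name CountingDigest, the md5() factory and bytesSeen() are
hypothetical, and MD5 is used only because every Java platform is required to
provide it.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

/** Hypothetical sketch: wraps a real digest and counts every byte fed to it. */
final class CountingDigest extends MessageDigest {
    private final MessageDigest delegate;
    private long count;

    CountingDigest(MessageDigest delegate) {
        super(delegate.getAlgorithm());
        this.delegate = delegate;
    }

    /** Convenience factory (an assumption of this sketch). */
    static CountingDigest md5() {
        try {
            return new CountingDigest(MessageDigest.getInstance("MD5"));
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException(e);
        }
    }

    @Override
    protected void engineUpdate(byte input) {
        delegate.update(input);
        count++;
    }

    @Override
    protected void engineUpdate(byte[] input, int offset, int len) {
        delegate.update(input, offset, len);
        count += len;
    }

    @Override
    protected byte[] engineDigest() {
        // The count deliberately survives digest() so it can be read afterwards.
        return delegate.digest();
    }

    @Override
    protected void engineReset() {
        delegate.reset();
        count = 0;
    }

    /** Number of bytes passed to update() since construction or the last reset(). */
    long bytesSeen() {
        return count;
    }

    public static void main(String[] args) {
        CountingDigest digest = CountingDigest.md5();
        digest.update("row-key".getBytes(StandardCharsets.UTF_8));
        digest.update(new byte[1024]);
        byte[] hash = digest.digest();
        System.out.println("hashed " + digest.bytesSeen() + " bytes into a "
                + hash.length + "-byte digest");
    }
}
```

Because the wrapper is itself a MessageDigest, code that already feeds row data
into a digest needs no changes to produce a size as a side effect.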

Case 2 is much simpler, as we already have the ranges and their sizes available 
to us.

There are some other changes, particularly in MerkleTree, with some 
refactoring/renames/new subclasses as part of updating MerkleTree.difference(). 
In particular, TreeDifference is returned instead of TreeRange (to accommodate
the extra size information), and TreeDifference generally replaces TreeRange
within this method's call tree where applicable; hash() and hashHelper() have also been
renamed to find() and findHelper(), with a new hash() implementation depending 
on find(). I'm sure there are other minutiae, but hopefully nothing too opaque. 
If you need any clarification, feel free to ask.
                
> Instrument repair to be able to assess its efficiency (precision)
> -----------------------------------------------------------------
>
>                 Key: CASSANDRA-2698
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-2698
>             Project: Cassandra
>          Issue Type: Improvement
>            Reporter: Sylvain Lebresne
>            Priority: Minor
>              Labels: lhf
>         Attachments: nodetool_repair_and_cfhistogram.tar.gz, 
> patch_2698_v1.txt, patch.diff
>
>
> Some reports indicate that repair sometimes transfers huge amounts of data. One
> hypothesis is that the merkle tree precision may deteriorate too much at some
> data size. To check this hypothesis, it would be reasonable to gather
> statistics during the merkle tree building on how many rows each merkle tree
> range accounts for (and the size that this represents). It is probably an
> interesting statistic to have anyway.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira
