[ https://issues.apache.org/jira/browse/CASSANDRA-2841?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Sylvain Lebresne updated CASSANDRA-2841:
----------------------------------------
    Attachment: 2841.patch

Patch is against 0.7.

> Always use even distribution for merkle tree with RandomPartitionner
> --------------------------------------------------------------------
>
>                 Key: CASSANDRA-2841
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-2841
>             Project: Cassandra
>          Issue Type: Improvement
>          Components: Core
>    Affects Versions: 0.7.0
>            Reporter: Sylvain Lebresne
>            Assignee: Sylvain Lebresne
>            Priority: Trivial
>              Labels: repair
>             Fix For: 0.7.7, 0.8.2
>
>         Attachments: 2841.patch
>
>
> When creating the initial merkle tree, repair tries to be (too) smart and uses
> the key samples to "guide" the tree splitting. While this is a good idea for
> OPP, where there is a good chance the data distribution is uneven, you can't
> beat an even distribution for the RandomPartitioner. A quick experiment
> even shows that the sample-guided method is significantly less efficient than
> an even distribution for the ranges of the merkle tree (that is, an even
> distribution gives a much better distribution of the number of keys per range
> of the tree).
> Thus let's switch to an even distribution for RandomPartitioner. That
> three-line change alone accounts for a significant improvement in repair's
> precision.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira
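
As a rough illustration of the even splitting described in the issue, the sketch below cuts a token space into 2**depth equal-width leaf ranges and counts how many hashed keys fall into each. It is not the code from 2841.patch: the names token, even_boundaries and keys_per_leaf are hypothetical, and the token function (MD5 folded into [0, 2**127)) is only an assumed approximation of how RandomPartitioner derives tokens.

```python
import hashlib
from bisect import bisect_left

# Assumed size of the token space; used here only for illustration.
TOKEN_SPACE = 2 ** 127


def token(key):
    """Map a key (bytes) to a token: MD5 digest folded into [0, 2**127).

    This is a simplification, not Cassandra's exact token computation.
    """
    return int.from_bytes(hashlib.md5(key).digest(), "big") % TOKEN_SPACE


def even_boundaries(depth):
    """Return the right-hand boundaries of 2**depth equal-width leaf ranges."""
    leaves = 2 ** depth
    return [TOKEN_SPACE * (i + 1) // leaves for i in range(leaves)]


def keys_per_leaf(tokens, boundaries):
    """Count tokens per leaf; leaf i covers (boundaries[i-1], boundaries[i]]."""
    counts = [0] * len(boundaries)
    for t in tokens:
        # First leaf whose right-hand boundary is >= t.
        counts[bisect_left(boundaries, t)] += 1
    return counts


if __name__ == "__main__":
    tokens = [token(("key%d" % i).encode()) for i in range(10000)]
    counts = keys_per_leaf(tokens, even_boundaries(4))
    # With hashed keys, the per-leaf counts should be close to one another.
    print("min keys per leaf:", min(counts), "max keys per leaf:", max(counts))
```

Because the keys are hashed before placement, tokens are expected to be spread roughly uniformly, which is why equal-width ranges should hold similar numbers of keys without consulting any key samples.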