[
https://issues.apache.org/jira/browse/CASSANDRA-12580?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15452820#comment-15452820
]
Paulo Motta commented on CASSANDRA-12580:
-
Attaching patch to fix the calculation formula to:
{{int depth = numPartitions > 0 ? (int)
Math.min(Math.ceil(Math.log(numPartitions) / Math.log(2)), 20) : 0;}}
Besides fixing from {{ln}} to {{lg}}, this also changes the rounding formula
from {{floor}} to {{ceil}}, so we overestimate the depth rather than
underestimate.
I added a new test on {{ValidationTest}} that runs a validation compaction with
N=128 and N=1500 keys and expect the merkle tree depth to be {{ceil(lg(N))}}. I
also modified the other tests on this class to use a {{ListenableFuture}}
({{CompletableFuture}} on 3.0+) instead of {{SimpleCondition}}, since the JUnit
assertions are not enforced in other threads.
Patch and tests available below:
||2.1||2.2||3.0||trunk||
|[branch|https://github.com/apache/cassandra/compare/cassandra-2.1...pauloricardomg:2.1-CASSANDRA-12580]|[branch|https://github.com/apache/cassandra/compare/cassandra-2.2...pauloricardomg:2.2-CASSANDRA-12580]|[branch|https://github.com/apache/cassandra/compare/cassandra-3.0...pauloricardomg:3.0-CASSANDRA-12580]|[branch|https://github.com/apache/cassandra/compare/trunk...pauloricardomg:trunk-CASSANDRA-12580]|
|[testall|http://cassci.datastax.com/view/Dev/view/paulomotta/job/pauloricardomg-2.1-CASSANDRA-12580-testall/lastCompletedBuild/testReport/]|[testall|http://cassci.datastax.com/view/Dev/view/paulomotta/job/pauloricardomg-2.2-CASSANDRA-12580-testall/lastCompletedBuild/testReport/]|[testall|http://cassci.datastax.com/view/Dev/view/paulomotta/job/pauloricardomg-3.0-CASSANDRA-12580-testall/lastCompletedBuild/testReport/]|[testall|http://cassci.datastax.com/view/Dev/view/paulomotta/job/pauloricardomg-trunk-CASSANDRA-12580-testall/lastCompletedBuild/testReport/]|
|[dtest|http://cassci.datastax.com/view/Dev/view/paulomotta/job/pauloricardomg-2.1-CASSANDRA-12580-dtest/lastCompletedBuild/testReport/]|[dtest|http://cassci.datastax.com/view/Dev/view/paulomotta/job/pauloricardomg-2.2-CASSANDRA-12580-dtest/lastCompletedBuild/testReport/]|[dtest|http://cassci.datastax.com/view/Dev/view/paulomotta/job/pauloricardomg-3.0-CASSANDRA-12580-dtest/lastCompletedBuild/testReport/]|[dtest|http://cassci.datastax.com/view/Dev/view/paulomotta/job/pauloricardomg-trunk-CASSANDRA-12580-dtest/lastCompletedBuild/testReport/]|
> Fix merkle tree size calculation
>
>
> Key: CASSANDRA-12580
> URL: https://issues.apache.org/jira/browse/CASSANDRA-12580
> Project: Cassandra
> Issue Type: Bug
>Reporter: Paulo Motta
>Assignee: Paulo Motta
>
> On CASSANDRA-5263 it was introduced dynamic merkle tree sizing based on
> estimated number of partitions as {{estimatedDepth = lg(numPartitions)}}, but
> on
> [CompactionManager.doValidationCompaction|https://github.com/apache/cassandra/blob/cassandra-2.1/src/java/org/apache/cassandra/db/compaction/CompactionManager.java#L1052]
> this is being calculated as:
> {{int depth = numPartitions > 0 ? (int)
> Math.min(Math.floor(Math.log(numPartitions)), 20) : 0;}}
> This is actually calculating {{ln(numPartitions)}} (base-e) instead of
> {{lg(numPartitions)}} (base-2), which causes merkle trees to lose resolution,
> what may result in overstreaming.
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)