[jira] [Commented] (CASSANDRA-12580) Fix merkle tree size calculation

2016-09-27 Thread Yuki Morishita (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-12580?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15528174#comment-15528174
 ] 

Yuki Morishita commented on CASSANDRA-12580:


Nice catch. Patch looks good to me.


> Fix merkle tree size calculation
> 
>
> Key: CASSANDRA-12580
> URL: https://issues.apache.org/jira/browse/CASSANDRA-12580
> Project: Cassandra
>  Issue Type: Bug
>Reporter: Paulo Motta
>Assignee: Paulo Motta
> Fix For: 2.1.x, 2.2.x, 3.0.x, 3.x
>
>
> On CASSANDRA-5263 it was introduced dynamic merkle tree sizing based on 
> estimated number of partitions as {{estimatedDepth = lg(numPartitions)}}, but 
> on 
> [CompactionManager.doValidationCompaction|https://github.com/apache/cassandra/blob/cassandra-2.1/src/java/org/apache/cassandra/db/compaction/CompactionManager.java#L1052]
>  this is being calculated as:
> {{int depth = numPartitions > 0 ? (int) 
> Math.min(Math.floor(Math.log(numPartitions)), 20) : 0;}}
> This is actually calculating {{ln(numPartitions)}} (base-e) instead of 
> {{lg(numPartitions)}} (base-2), which causes merkle trees to lose resolution, 
> what may result in overstreaming.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (CASSANDRA-12580) Fix merkle tree size calculation

2016-08-31 Thread Paulo Motta (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-12580?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15452820#comment-15452820
 ] 

Paulo Motta commented on CASSANDRA-12580:
-

Attaching patch to fix the calculation formula to:

{{int depth = numPartitions > 0 ? (int) 
Math.min(Math.ceil(Math.log(numPartitions) / Math.log(2)), 20) : 0;}}

Besides fixing from {{ln}} to {{lg}}, this also changes the rounding formula 
from {{floor}} to  {{ceil}}, so we overestimate the depth rather than 
underestimate.

I added a new test on {{ValidationTest}} that runs a validation compaction with 
N=128 and N=1500 keys and expect the merkle tree depth to be {{ceil(lg(N))}}. I 
also modified the other tests on this class to use a {{ListenableFuture}} 
({{CompletableFuture}} on 3.0+) instead of {{SimpleCondition}}, since the JUnit 
assertions are not enforced in other threads.


Patch and tests available below:
||2.1||2.2||3.0||trunk||
|[branch|https://github.com/apache/cassandra/compare/cassandra-2.1...pauloricardomg:2.1-CASSANDRA-12580]|[branch|https://github.com/apache/cassandra/compare/cassandra-2.2...pauloricardomg:2.2-CASSANDRA-12580]|[branch|https://github.com/apache/cassandra/compare/cassandra-3.0...pauloricardomg:3.0-CASSANDRA-12580]|[branch|https://github.com/apache/cassandra/compare/trunk...pauloricardomg:trunk-CASSANDRA-12580]|
|[testall|http://cassci.datastax.com/view/Dev/view/paulomotta/job/pauloricardomg-2.1-CASSANDRA-12580-testall/lastCompletedBuild/testReport/]|[testall|http://cassci.datastax.com/view/Dev/view/paulomotta/job/pauloricardomg-2.2-CASSANDRA-12580-testall/lastCompletedBuild/testReport/]|[testall|http://cassci.datastax.com/view/Dev/view/paulomotta/job/pauloricardomg-3.0-CASSANDRA-12580-testall/lastCompletedBuild/testReport/]|[testall|http://cassci.datastax.com/view/Dev/view/paulomotta/job/pauloricardomg-trunk-CASSANDRA-12580-testall/lastCompletedBuild/testReport/]|
|[dtest|http://cassci.datastax.com/view/Dev/view/paulomotta/job/pauloricardomg-2.1-CASSANDRA-12580-dtest/lastCompletedBuild/testReport/]|[dtest|http://cassci.datastax.com/view/Dev/view/paulomotta/job/pauloricardomg-2.2-CASSANDRA-12580-dtest/lastCompletedBuild/testReport/]|[dtest|http://cassci.datastax.com/view/Dev/view/paulomotta/job/pauloricardomg-3.0-CASSANDRA-12580-dtest/lastCompletedBuild/testReport/]|[dtest|http://cassci.datastax.com/view/Dev/view/paulomotta/job/pauloricardomg-trunk-CASSANDRA-12580-dtest/lastCompletedBuild/testReport/]|

> Fix merkle tree size calculation
> 
>
> Key: CASSANDRA-12580
> URL: https://issues.apache.org/jira/browse/CASSANDRA-12580
> Project: Cassandra
>  Issue Type: Bug
>Reporter: Paulo Motta
>Assignee: Paulo Motta
>
> On CASSANDRA-5263 it was introduced dynamic merkle tree sizing based on 
> estimated number of partitions as {{estimatedDepth = lg(numPartitions)}}, but 
> on 
> [CompactionManager.doValidationCompaction|https://github.com/apache/cassandra/blob/cassandra-2.1/src/java/org/apache/cassandra/db/compaction/CompactionManager.java#L1052]
>  this is being calculated as:
> {{int depth = numPartitions > 0 ? (int) 
> Math.min(Math.floor(Math.log(numPartitions)), 20) : 0;}}
> This is actually calculating {{ln(numPartitions)}} (base-e) instead of 
> {{lg(numPartitions)}} (base-2), which causes merkle trees to lose resolution, 
> what may result in overstreaming.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)