[ 
https://issues.apache.org/jira/browse/CASSANDRA-11390?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15206315#comment-15206315
 ] 

Marcus Olsson commented on CASSANDRA-11390:
-------------------------------------------

Good point, the % responsibility of each range should probably be part of the 
calculation. So instead it could be:
{noformat}
2^d < percent * 2^20
<=>
d < log2(percent * 2^20)
<=>
d < log2(percent) + 20
{noformat}
Where {{log2(percent)}} would be negative for every value such that 
{{0.0 < percent < 1.0}}.

Or in Java:
{code}
// log2(percent) = ln(percent) / ln(2); percent is the range's share, 0.0 < percent <= 1.0
int maxDepth = (int) Math.floor(20 + Math.log(percent) / Math.log(2));
{code}

Using this, I'd say there are two options: we base the percentage either on the 
range sizes or on the estimated partition count for each range. Both would 
require us to iterate through all ranges once before estimating the depth for 
each tree, but using the estimated partition count would probably be more 
effective.
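
A minimal sketch of the partition-count option, to make the two passes concrete. The class and method names ({{MerkleDepthSketch}}, {{depthFor}}, {{estimateDepths}}) are hypothetical and not part of Cassandra. The clamping to {{[0, 20]}} and the handling of empty ranges are assumptions added for illustration, not something decided in this ticket.

```java
import java.util.Arrays;

// Hypothetical helper, not Cassandra code: derives a per-range merkle tree depth
// from each range's share of the node's estimated partitions.
final class MerkleDepthSketch {
    // Total budget from the formula above: 2^20 leaves shared by all ranges.
    static final int MAX_TOTAL_DEPTH = 20;

    // maxDepth = floor(20 + log2(percent)), clamped to [0, 20] (clamping is an assumption).
    static int depthFor(double percent) {
        if (!(percent > 0.0)) {
            return 0; // empty range or NaN: log2 is undefined, fall back to a single leaf
        }
        int depth = (int) Math.floor(MAX_TOTAL_DEPTH + Math.log(percent) / Math.log(2));
        return Math.max(0, Math.min(MAX_TOTAL_DEPTH, depth));
    }

    // First pass totals the estimates, second pass computes a depth for each range.
    static int[] estimateDepths(long[] estimatedPartitions) {
        long total = Arrays.stream(estimatedPartitions).sum();
        int[] depths = new int[estimatedPartitions.length];
        for (int i = 0; i < depths.length; i++) {
            double percent = total == 0 ? 0.0 : (double) estimatedPartitions[i] / total;
            depths[i] = depthFor(percent);
        }
        return depths;
    }
}
```

Basing the percentage on range sizes instead would only change what is passed in as the per-range weight; the two-pass shape stays the same.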

---

Another thing we should consider is whether the total limit should be {{2^20}}. 
Before vnodes this was used for a single token range, and after vnodes it was 
instead {{vnodes * 2^20}} per node. This gave us a much higher resolution in 
the merkle trees with vnodes. If we divide {{2^20}} between all ranges we go 
back to the pre-vnode merkle tree resolution for each node (possibly lower).
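
To put rough numbers on that difference, here is a small sketch of the arithmetic. The class and method names are hypothetical, and the even split between ranges is a simplifying assumption (the formula above would weight each range by its percentage instead).

```java
// Hypothetical arithmetic helper, not Cassandra code.
final class MerkleResolutionSketch {
    // One full 2^20-leaf tree per vnode range: vnodes * 2^20 leaves per node.
    static long leavesPerNodeWithFullTrees(int vnodes) {
        return (long) vnodes << 20;
    }

    // A single 2^20 budget split evenly across the vnode ranges (assumed even split).
    static long leavesPerRangeWithSharedLimit(int vnodes) {
        return (1L << 20) / vnodes;
    }
}
```

With 256 vnodes as an example value, the first method gives {{2^28}} (268435456) leaves per node, while the second gives {{2^12}} (4096) leaves per range and {{2^20}} in total.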

> Too big MerkleTrees allocated during repair
> -------------------------------------------
>
>                 Key: CASSANDRA-11390
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-11390
>             Project: Cassandra
>          Issue Type: Bug
>            Reporter: Marcus Eriksson
>            Assignee: Marcus Eriksson
>             Fix For: 3.0.x, 3.x
>
>
> Since CASSANDRA-5220 we create one merkle tree per range, but each of those 
> trees is allocated to hold all the keys on the node, taking up too much memory



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
