[ https://issues.apache.org/jira/browse/CASSANDRA-11390?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15206714#comment-15206714 ]

Marcus Olsson commented on CASSANDRA-11390:
-------------------------------------------

bq. I imagine this is what was always intended - perhaps we should open a new 
ticket to investigate if we should increase it
It would probably be good to test whether it's a reasonable limit, but it might 
not be a high priority unless we see a lot of over-streaming with the current 
one.

bq. Note that we don't care about the ranges when we calculate this, so we have 
to assume that gain within a range is the same as the total gain. Biggest 
problem is how to test this, will try to figure something out.
If it gets too complex to test, it might not be worth having the compaction 
gain as part of the calculation. It would most likely reduce the MerkleTrees 
sizes, which is good unless the compaction gain comes from data that is not 
part of the repair. Capping the total MerkleTrees size might be good enough on 
its own, since the only thing the duplicate partitions should cause is 
unnecessarily high resolution, not memory problems. Investigating whether there 
would be a gain from using the compaction gain in the calculation could be a 
separate ticket.
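To make the capping idea concrete, here is a minimal sketch (the class and method names are illustrative, not the actual Cassandra code): the per-range tree depth is chosen so there is roughly one leaf per estimated partition in the range, capped at a maximum depth so the combined trees stay bounded.

```java
// Sketch only -- assumes a standalone helper, not Cassandra's real API.
public final class MerkleTreeDepth
{
    // Cassandra caps individual tree depth at 20 (2^20 leaves per tree).
    private static final int MAX_DEPTH = 20;

    // Smallest depth whose leaf count (2^depth) covers the estimated
    // partitions in the range, capped at MAX_DEPTH.
    public static int depthFor(long estimatedPartitionsInRange)
    {
        int depth = 0;
        while ((1L << depth) < estimatedPartitionsInRange && depth < MAX_DEPTH)
            depth++;
        return depth;
    }
}
```

With this shape, a range holding ~1000 partitions gets a depth-10 tree (1024 leaves), while a range with millions of partitions is capped at depth 20 instead of being sized for all keys on the node.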

---

For:
{code}
logger.trace("Created {} merkle trees, {} partitions, {} bytes", tree.size(), 
allPartitions, MerkleTrees.serializer.serializedSize(tree, 0));
{code}
The {{MerkleTrees.size()}} method returns the combined value from calling 
{{MerkleTree.size()}} on all of the trees, which is {{2^d}} for each tree. To 
get the number of merkle trees we could either add a new method to 
{{MerkleTrees}} ({{treeCount()}}?) or use {{MerkleTrees.ranges().size()}}. It 
would probably be good to have both the number of trees and the output of 
{{MerkleTrees.size()}} in the log output.
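A small sketch of the distinction (a simplified stand-in for {{MerkleTrees}}, not the real class): {{size()}} sums {{2^depth}} over all trees, so logging it as a tree count would be misleading, and a separate {{treeCount()}} accessor (or {{ranges().size()}}) is needed.

```java
import java.util.ArrayList;
import java.util.List;

// Sketch only -- illustrates why size() is not the number of trees.
public final class MerkleTreesSketch
{
    private final List<Integer> depths = new ArrayList<>(); // one entry per range

    public void addTree(int depth)
    {
        depths.add(depth);
    }

    // Combined leaf capacity: sum of 2^depth over all trees,
    // mirroring what MerkleTrees.size() reports.
    public long size()
    {
        long total = 0;
        for (int d : depths)
            total += 1L << d;
        return total;
    }

    // The proposed treeCount(): number of trees, i.e. number of ranges.
    public int treeCount()
    {
        return depths.size();
    }
}
```

Two trees of depths 10 and 12 would give {{size() == 5120}} but {{treeCount() == 2}}, which is why both values are worth logging.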

Other than that LGTM. :)

> Too big MerkleTrees allocated during repair
> -------------------------------------------
>
>                 Key: CASSANDRA-11390
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-11390
>             Project: Cassandra
>          Issue Type: Bug
>            Reporter: Marcus Eriksson
>            Assignee: Marcus Eriksson
>             Fix For: 3.0.x, 3.x
>
>
> Since CASSANDRA-5220 we create one merkle tree per range, but each of those 
> trees is allocated to hold all the keys on the node, taking up too much memory



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
