[
https://issues.apache.org/jira/browse/CASSANDRA-11390?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15206714#comment-15206714
]
Marcus Olsson commented on CASSANDRA-11390:
-------------------------------------------
bq. I imagine this is what was always intended - perhaps we should open a new
ticket to investigate if we should increase it
It would probably be good to test whether it is a reasonable limit, but it might not
be high priority unless we see lots of over-streaming from the current
one.
bq. Note that we don't care about the ranges when we calculate this, so we have
to assume that gain within a range is the same as the total gain. Biggest
problem is how to test this, will try to figure something out.
If it gets too complex to test, it might not be worth having the compaction
gain as part of the calculation. It would most probably reduce the MerkleTree
sizes, which is good unless the compaction gain comes from data that is not
part of the repair. Capping the total MerkleTrees size might be good enough
on its own, since the only thing duplicate partitions should cause is
unnecessarily high resolution, not memory problems. Investigating whether
there would be a gain from using the compaction gain in the calculation could
possibly be a separate ticket.
---
For:
{code}
logger.trace("Created {} merkle trees, {} partitions, {} bytes", tree.size(),
allPartitions, MerkleTrees.serializer.serializedSize(tree, 0));
{code}
The {{MerkleTrees.size()}} method returns the combined value from calling
{{MerkleTree.size()}} on all MerkleTrees, which returns {{2^d}} per tree, so it
is the total leaf count rather than the number of trees. To get the
number of merkle trees we could either create a new method in {{MerkleTrees}}
(treeCount()?) or use {{MerkleTrees.ranges().size()}}. It would probably be
good to have both the number of trees and the output from
{{MerkleTrees.size()}} in the log output.
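To illustrate the distinction (this is a standalone sketch, not the actual Cassandra classes; {{treeCount()}} is just the name suggested above and {{MerkleTreesSketch}} is a made-up stand-in where each tree is modeled only by its depth {{d}}):
{code}
import java.util.ArrayList;
import java.util.List;

// Sketch of why MerkleTrees.size() cannot be used to count trees:
// it sums the leaf capacity (2^d) of every per-range tree, while the
// proposed treeCount() would report how many trees exist.
public class MerkleTreesSketch
{
    // Each entry models one per-range MerkleTree by its depth d.
    private final List<Integer> depths = new ArrayList<>();

    void addTree(int depth)
    {
        depths.add(depth);
    }

    // Mirrors MerkleTrees.size(): total leaf capacity across all trees.
    long size()
    {
        long total = 0;
        for (int d : depths)
            total += 1L << d; // MerkleTree.size() == 2^d
        return total;
    }

    // The suggested treeCount(): the number of per-range trees.
    int treeCount()
    {
        return depths.size();
    }

    public static void main(String[] args)
    {
        MerkleTreesSketch trees = new MerkleTreesSketch();
        trees.addTree(10); // 2^10 = 1024 leaves
        trees.addTree(12); // 2^12 = 4096 leaves
        trees.addTree(10); // 2^10 = 1024 leaves
        // prints "3 trees, 6144 leaves"
        System.out.println(trees.treeCount() + " trees, " + trees.size() + " leaves");
    }
}
{code}
Logging both numbers would make it clear whether a large total came from many ranges or from deep trees.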
Other than that LGTM. :)
> Too big MerkleTrees allocated during repair
> -------------------------------------------
>
> Key: CASSANDRA-11390
> URL: https://issues.apache.org/jira/browse/CASSANDRA-11390
> Project: Cassandra
> Issue Type: Bug
> Reporter: Marcus Eriksson
> Assignee: Marcus Eriksson
> Fix For: 3.0.x, 3.x
>
>
> Since CASSANDRA-5220 we create one merkle tree per range, but each of those
> trees is allocated to hold all the keys on the node, taking up too much memory
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)