[
https://issues.apache.org/jira/browse/CASSANDRA-5263?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13869384#comment-13869384
]
Minh Do commented on CASSANDRA-5263:
------------------------------------
I also don't see how we could use sstable stats to adjust the MerkleTree depth
automatically. We can estimate the number of rows in each sstable, but we don't
know how many rows fall in a given range (unless we assume the input is always
a full range).
In terms of memory usage, a MerkleTree with depth 20 uses around 100 MB and
a MerkleTree with depth 17 uses around 15 MB. Does the extra ~85 MB hurt
Cassandra performance on some nodes in some cases if we go to this extreme?
Also, if we use depth 20 and the multithreaded version to build the MerkleTree,
it is going to impact response latency.
Some thoughts?
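The memory figures above can be sanity-checked with a back-of-the-envelope calculation. The sketch below is hypothetical, not Cassandra's actual MerkleTree implementation; `BYTES_PER_NODE` is an assumed per-node overhead chosen so that depth 20 lands near the reported ~100 MB:

```java
// Back-of-the-envelope estimate of Merkle tree memory vs. depth.
// Hypothetical sketch: BYTES_PER_NODE is an assumed figure (hash bytes plus
// object overhead), tuned so depth 20 roughly matches the reported ~100 MB.
public class MerkleTreeSizeEstimate {
    static final long BYTES_PER_NODE = 50; // assumed, for illustration only

    // A full binary tree of the given depth has 2^(depth+1) - 1 nodes.
    static long nodeCount(int depth) {
        return (1L << (depth + 1)) - 1;
    }

    static long estimatedBytes(int depth) {
        return nodeCount(depth) * BYTES_PER_NODE;
    }

    public static void main(String[] args) {
        for (int depth : new int[] {15, 17, 20}) {
            System.out.printf("depth %d: ~%d nodes, ~%d MB%n",
                    depth, nodeCount(depth), estimatedBytes(depth) >> 20);
        }
    }
}
```

Because node count doubles per level, going from depth 17 to depth 20 multiplies the tree's footprint by 8, which is why the jump looks so large relative to the streaming savings.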
> Allow Merkle tree maximum depth to be configurable
> --------------------------------------------------
>
> Key: CASSANDRA-5263
> URL: https://issues.apache.org/jira/browse/CASSANDRA-5263
> Project: Cassandra
> Issue Type: Improvement
> Components: Config
> Affects Versions: 1.1.9
> Reporter: Ahmed Bashir
> Assignee: Minh Do
>
> Currently, the maximum depth allowed for Merkle trees is hardcoded as 15.
> This value should be configurable, just like phi_convict_threshold and other
> properties.
> Given a cluster with nodes responsible for a large number of row keys, Merkle
> tree comparisons can result in a large number of unnecessary row keys being
> streamed.
> Empirical testing indicates that reasonable changes to this depth (18, 20,
> etc) don't affect the Merkle tree generation and differencing timings all
> that much, and they can significantly reduce the amount of data being
> streamed during repair.
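The over-streaming claim in the quoted description can be illustrated with simple arithmetic: a mismatch in one leaf forces streaming of that leaf's entire range, and each extra level of depth halves the rows per leaf. The figure of 1 billion rows per node below is a made-up assumption for illustration:

```java
// Rough illustration of repair over-streaming vs. Merkle tree depth.
// Assumes a node owning 1,000,000,000 rows spread evenly over its range
// (a made-up figure); one mismatched row streams its whole leaf range.
public class RepairOverstream {
    // A tree of the given depth partitions the range into 2^depth leaves.
    static long rowsPerLeaf(long totalRows, int depth) {
        return totalRows / (1L << depth);
    }

    public static void main(String[] args) {
        long rows = 1_000_000_000L;
        // Depth 15 (the hardcoded maximum): ~30,517 rows streamed per mismatch.
        System.out.println(rowsPerLeaf(rows, 15));
        // Depth 20: ~953 rows streamed per mismatch, a ~32x reduction.
        System.out.println(rowsPerLeaf(rows, 20));
    }
}
```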
--
This message was sent by Atlassian JIRA
(v6.1.5#6160)