[
https://issues.apache.org/jira/browse/CASSANDRA-19596?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17845106#comment-17845106
]
Ariel Weisberg commented on CASSANDRA-19596:
--------------------------------------------
This is a quick and dirty improvement that removes the redundant sorting and
replaces it with re-use of the existing sorted data.
So instead of repeating the O(n log n) sort to construct every node, we
only have to do linear scans of the already sorted data that is in that node's
subtree.
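
To illustrate the idea, here is a minimal sketch in Python. It is not the Cassandra patch (which is Java) and makes several assumptions: intervals are closed integer pairs, the names `Node`, `build`, `_build` and `stab` are hypothetical, and the center of each node is simply taken from the middle interval of the min-sorted list rather than from a true median of endpoints. The point it shows is that the two sorts happen once at the top, and every node below is built by order-preserving linear filters over its parent's already sorted lists.

```python
from typing import List, Optional, Tuple

Interval = Tuple[int, int]  # closed interval (lo, hi) with lo <= hi


class Node:
    def __init__(self, center, by_min, by_max, left, right):
        self.center = center  # point that every interval stored here contains
        self.by_min = by_min  # intervals containing center, ascending by lo
        self.by_max = by_max  # the same intervals, ascending by hi
        self.left = left      # subtree of intervals entirely below center
        self.right = right    # subtree of intervals entirely above center


def build(intervals: List[Interval]) -> Optional[Node]:
    # The only two sorts in the whole build happen here.
    by_min = sorted(intervals, key=lambda iv: iv[0])
    by_max = sorted(intervals, key=lambda iv: iv[1])
    return _build(by_min, by_max)


def _build(by_min: List[Interval], by_max: List[Interval]) -> Optional[Node]:
    if not by_min:
        return None
    # Assumption: use the lo of the middle interval as the center. The middle
    # interval always contains it, so every node stores at least one interval.
    center = by_min[len(by_min) // 2][0]

    def here(iv):
        return iv[0] <= center <= iv[1]

    def below(iv):
        return iv[1] < center

    def above(iv):
        return iv[0] > center

    # Filtering a sorted list keeps it sorted, so no child needs a re-sort.
    return Node(
        center,
        [iv for iv in by_min if here(iv)],
        [iv for iv in by_max if here(iv)],
        _build([iv for iv in by_min if below(iv)],
               [iv for iv in by_max if below(iv)]),
        _build([iv for iv in by_min if above(iv)],
               [iv for iv in by_max if above(iv)]),
    )


def stab(node: Optional[Node], point: int) -> List[Interval]:
    """Return every interval that contains point."""
    out: List[Interval] = []
    while node is not None:
        if point < node.center:
            for iv in node.by_min:            # stop at the first lo > point
                if iv[0] > point:
                    break
                out.append(iv)
            node = node.left
        elif point > node.center:
            for iv in reversed(node.by_max):  # stop at the first hi < point
                if iv[1] < point:
                    break
                out.append(iv)
            node = node.right
        else:
            out.extend(node.by_min)           # all of them contain center
            node = None
    return out
```

With this shape, each level of the tree costs a handful of linear passes over the intervals in that subtree, rather than a fresh sort per node.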
> IntervalTree build throughput is low enough to be a bottleneck
> --------------------------------------------------------------
>
> Key: CASSANDRA-19596
> URL: https://issues.apache.org/jira/browse/CASSANDRA-19596
> Project: Cassandra
> Issue Type: Improvement
> Components: Local/Compaction, Local/SSTable
> Reporter: Ariel Weisberg
> Assignee: Ariel Weisberg
> Priority: Normal
> Fix For: 5.x
>
>
> With several terabytes of data and 8 compactors it’s possible for the
> compactors to spend a lot of time blocked waiting on IntervalTrees to be
> built.
> There is also a lot of wasted CPU because it’s updated optimistically, so most
> of the trees that get built end up being thrown away.
> This can end up being quite painful because it can block memtable flushing as
> well, and then a single slow CFS can block unrelated CFSs, because the memtable
> post-flush executor is single-threaded and shared across all CFSs.
--
This message was sent by Atlassian Jira
(v8.20.10#820010)