[ 
https://issues.apache.org/jira/browse/CASSANDRA-19596?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17845106#comment-17845106
 ] 

Ariel Weisberg commented on CASSANDRA-19596:
--------------------------------------------

This is a quick and dirty improvement that removes the redundant sorting and 
replaces it with re-use of the existing sorted data.

So instead of repeating the O(n log n) sort to construct every node, we only 
have to do linear scans of the already sorted data that is in that node's 
subtree.
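
To illustrate the idea, here is a minimal sketch of a centered interval tree 
that sorts its input once and then builds each node by scanning the sorted 
lists for that node's subtree. It is not the code from the patch: the class 
SortedIntervalTreeSketch, its Interval and Node types, and the choice of the 
median start point as the node center are all assumptions made for this 
example.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

// Hypothetical sketch; names and layout are NOT Cassandra's IntervalTree.
final class SortedIntervalTreeSketch {
    static final class Interval {
        final int min, max; // assumes min <= max
        Interval(int min, int max) { this.min = min; this.max = max; }
    }

    static final class Node {
        final int center;
        final List<Interval> byMin; // intervals containing center, ascending min
        final List<Interval> byMax; // the same intervals, ascending max
        final Node left, right;
        Node(int center, List<Interval> byMin, List<Interval> byMax, Node left, Node right) {
            this.center = center; this.byMin = byMin; this.byMax = byMax;
            this.left = left; this.right = right;
        }
    }

    // Sort once up front; every node below reuses these two orderings.
    static Node build(List<Interval> intervals) {
        List<Interval> byMin = new ArrayList<>(intervals);
        List<Interval> byMax = new ArrayList<>(intervals);
        byMin.sort(Comparator.comparingInt((Interval i) -> i.min));
        byMax.sort(Comparator.comparingInt((Interval i) -> i.max));
        return buildNode(byMin, byMax);
    }

    private static Node buildNode(List<Interval> byMin, List<Interval> byMax) {
        if (byMin.isEmpty()) return null;
        // Median start point: that interval always contains the center,
        // so both child subtrees are strictly smaller than this one.
        int center = byMin.get(byMin.size() / 2).min;
        List<List<Interval>> min = partition(byMin, center);
        List<List<Interval>> max = partition(byMax, center);
        return new Node(center, min.get(1), max.get(1),
                        buildNode(min.get(0), max.get(0)),
                        buildNode(min.get(2), max.get(2)));
    }

    // One linear scan; appending in input order keeps every output list sorted.
    private static List<List<Interval>> partition(List<Interval> sorted, int center) {
        List<Interval> left = new ArrayList<>();
        List<Interval> cross = new ArrayList<>();
        List<Interval> right = new ArrayList<>();
        for (Interval i : sorted) {
            if (i.max < center) left.add(i);
            else if (i.min > center) right.add(i);
            else cross.add(i);
        }
        return Arrays.asList(left, cross, right);
    }

    // Returns all intervals that contain the given point.
    static List<Interval> search(Node node, int point) {
        List<Interval> out = new ArrayList<>();
        while (node != null) {
            if (point < node.center) {
                // Every interval here ends at or after center, so only min matters.
                for (Interval i : node.byMin) {
                    if (i.min > point) break;
                    out.add(i);
                }
                node = node.left;
            } else if (point > node.center) {
                // Every interval here starts at or before center, so only max matters.
                for (int k = node.byMax.size() - 1; k >= 0 && node.byMax.get(k).max >= point; k--)
                    out.add(node.byMax.get(k));
                node = node.right;
            } else {
                out.addAll(node.byMin); // point == center: all of them match
                node = null;
            }
        }
        return out;
    }
}
```

In this sketch the only sorts happen in build(); buildNode() never sorts, it 
only splits the two pre-sorted lists in order, which is the "linear scans" 
part of the description above.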

> IntervalTree build throughput is low enough to be a bottleneck
> --------------------------------------------------------------
>
>                 Key: CASSANDRA-19596
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-19596
>             Project: Cassandra
>          Issue Type: Improvement
>          Components: Local/Compaction, Local/SSTable
>            Reporter: Ariel Weisberg
>            Assignee: Ariel Weisberg
>            Priority: Normal
>             Fix For: 5.x
>
>
> With several terabytes of data and 8 compactors, it’s possible for the 
> compactors to spend a lot of time blocked waiting on IntervalTrees to be 
> built.
> There is also a lot of wasted CPU because the tree is updated 
> optimistically, so most of the trees that get built end up being thrown away.
> This can be quite painful because it can also block memtable flushing, and 
> then a single slow CFS can block unrelated CFS because the memtable 
> post-flush executor is single-threaded and shared across all CFS.



