[
https://issues.apache.org/jira/browse/CASSANDRA-8731?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14347056#comment-14347056
]
Jonathan Ellis commented on CASSANDRA-8731:
-------------------------------------------
Sylvain, can you link your ticket to this one?
> Optimise merges involving multiple clustering columns
> -----------------------------------------------------
>
> Key: CASSANDRA-8731
> URL: https://issues.apache.org/jira/browse/CASSANDRA-8731
> Project: Cassandra
> Issue Type: Improvement
> Components: Core
> Reporter: Benedict
> Labels: performance
> Fix For: 3.0
>
>
> Since the new storage format is dead in the water for the moment, we should
> do our best to optimise current behaviour. When merging data from multiple
> sstables with multiple clustering columns, currently we must incur the full
> costs of comparison for the entire matching prefix, and must heapify every
> cell in our PriorityQueue, incurring lg(N) of these costlier comparisons for
> every cell we merge, where N is the number of sources we're merging.
> Essentially I'm proposing a trie-based merge approach as a replacement for
> the ManyToOne MergeIterator, wherein we treat each clustering component as a
> tree underwhich all Cells with a common prefix occur. We then perform a tree
> merge, rather than a flat merge. For byte-order fields this trie can even be
> a full binary-trie (although built on the fly). The advantage here is that we
> rapidly prune merges involving disjoint ranges, so that instead of always
> incurring lg(N) costs on each new record, we may often incur O(1) costs. For
> timeseries data, for instance, we could merge dozens of files and so long as
> they were non-overlapping our CPU burden would be little more than reading
> from a single file.
> On top of this, we no longer incur any of the shared prefix repetition costs,
> since we compare each prefix piece-wise, and only once.
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)