[ https://issues.apache.org/jira/browse/CASSANDRA-4011?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13563179#comment-13563179 ]
Jonathan Ellis commented on CASSANDRA-4011:
-------------------------------------------
DataTracker.intervalTree is used on the read path regardless of
compaction strategy.
> range-based log(n) elimination of sstables in read path
> -------------------------------------------------------
>
> Key: CASSANDRA-4011
> URL: https://issues.apache.org/jira/browse/CASSANDRA-4011
> Project: Cassandra
> Issue Type: Improvement
> Components: Core
> Reporter: Peter Schuller
>
> If the read path were able to eliminate sstables based on token ranges, we
> would avoid {{O(n)}} bloom filter checks ({{n}} being the number of sstables).
> Contributing motivation:
> * For maximally efficient bulk-import, you tend to want a lot of small
> sstables to avoid having to build up huge ones during the bulk creation
> process.
> * To avoid having to keep duplicate data when switching a data set (in a
> periodic bulk replace import process), keeping sstables partitioned on token
> range (similarly to leveled compaction) allows in-place replacement of
> sstables one sstable at a time.
> Those two in combination would mean that you can run a bulk-import based
> total-dataset-replacement cluster with zero compaction and with no disk
> space held in reserve for compaction.
> In addition:
> * For e.g. leveled compaction where we have range based partitioning anyway,
> {{log(n)}} is preferable to {{O(n)}}; especially if it would allow us to have
> more than 10 "partitions" per level. I'm not sure yet whether there are other
> reasons to have "only" 10, but if we can make them smaller by eliminating the
> {{O(n)}} behavior in the read path, individual compactions can be even
> smaller with leveled and you would scale even more easily with large data
> sets while avoiding build-up in L0.
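
A minimal sketch of the range-based elimination described above, for the simple case where sstables cover non-overlapping token ranges (as within one level of leveled compaction). It is not Cassandra code: TokenRangeIndex, Range, candidateFor and the use of long tokens are hypothetical names and assumptions made for illustration. Overlapping ranges would need an interval tree rather than a plain binary search.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.Optional;

/** Hypothetical sketch; none of these names are Cassandra classes. */
final class TokenRangeIndex {

    /** An sstable reduced to a name and the inclusive token range it covers. */
    static final class Range {
        final String sstable;
        final long first;
        final long last;

        Range(String sstable, long first, long last) {
            this.sstable = sstable;
            this.first = first;
            this.last = last;
        }
    }

    private final List<Range> sorted;

    /** Assumes the given ranges do not overlap each other. */
    TokenRangeIndex(List<Range> ranges) {
        List<Range> copy = new ArrayList<>(ranges);
        copy.sort(Comparator.comparingLong((Range r) -> r.first));
        this.sorted = copy;
    }

    /**
     * Binary search over the sorted ranges: O(log n) comparisons to find the
     * only sstable that can hold the token, instead of checking n bloom filters.
     */
    Optional<Range> candidateFor(long token) {
        int lo = 0;
        int hi = sorted.size() - 1;
        while (lo <= hi) {
            int mid = (lo + hi) >>> 1;
            Range r = sorted.get(mid);
            if (token < r.first) {
                hi = mid - 1;       // token lies left of this range
            } else if (token > r.last) {
                lo = mid + 1;       // token lies right of this range
            } else {
                return Optional.of(r);
            }
        }
        return Optional.empty();    // no sstable covers this token
    }
}
```

A read would call candidateFor(token) and consult the bloom filter of the returned sstable only, skipping every other sstable in the set. An empty result means no sstable in the set can contain the key.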