[ https://issues.apache.org/jira/browse/HBASE-6371?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13439025#comment-13439025 ]
Nicolas Spiegelberg commented on HBASE-6371:
--------------------------------------------
@Lars: I think we want to put level-based & tiered compactions in the core
rather than implement them as coprocessors, because these are generic
strategies, not app-specific logic.
@Akashnil: the algorithm you describe is technically called "tiered
compaction". DataStax has a nice writeup on tiered versus level-based
compactions:
http://www.datastax.com/dev/blog/leveled-compaction-in-apache-cassandra
> Level based compaction
> ----------------------
>
> Key: HBASE-6371
> URL: https://issues.apache.org/jira/browse/HBASE-6371
> Project: HBase
> Issue Type: Improvement
> Reporter: Akashnil
> Assignee: Akashnil
>
> Currently, compaction selection is not very flexible and is not sensitive
> to how hot the data is. Very old data is likely to be accessed less often,
> and very recent data is likely to be in the block cache. Both considerations
> make it inefficient to compact these files as aggressively as other files.
> In some use cases the access pattern is particularly obvious, yet there is
> no way to tune the compaction algorithm to match it.
> In the new compaction selection algorithm, we plan to divide the candidate
> files into levels according to the age of the data in those files. Each
> level may have its own parameters, such as the compaction ratio and the
> minimum number of store files per compaction. The number of levels, their
> time ranges, and the parameters for each level will be configurable online
> on a per-column-family basis.
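
To make the proposal above concrete, here is a minimal Python sketch of age-based tiering with per-tier parameters. It is not HBase code: the names `Tier`, `StoreFile`, `tier_for`, `select_in_tier`, and `select_per_tier` are hypothetical, and the ratio rule used inside each tier (keep a file only if its size is at most `ratio` times the combined size of the newer files in the same tier) is an assumed simplification, not the selection logic proposed in the issue.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Tier:
    # Hypothetical per-tier settings; the issue only says that each level
    # may have its own ratio and minimum file count.
    max_age_hours: float  # files at most this old fall into the tier
    ratio: float          # compaction ratio applied inside the tier
    min_files: int        # minimum number of files per compaction


@dataclass(frozen=True)
class StoreFile:
    name: str
    age_hours: float
    size: int


def tier_for(f: StoreFile, tiers: List[Tier]) -> int:
    """Index of the first tier whose age bound covers the file.

    Tiers are assumed to be ordered from youngest to oldest; anything older
    than every bound lands in the last tier.
    """
    for i, tier in enumerate(tiers):
        if f.age_hours <= tier.max_age_hours:
            return i
    return len(tiers) - 1


def select_in_tier(bucket: List[StoreFile], tier: Tier) -> List[StoreFile]:
    """Assumed ratio rule: walking from the oldest file, skip files that are
    larger than ratio * (total size of the newer files in the tier)."""
    ordered = sorted(bucket, key=lambda f: f.age_hours, reverse=True)
    start = 0
    while start < len(ordered):
        newer_total = sum(f.size for f in ordered[start + 1:])
        if ordered[start].size <= tier.ratio * newer_total:
            break
        start += 1
    chosen = ordered[start:]
    # Too few candidates: this tier is not compacted in this round.
    return chosen if len(chosen) >= tier.min_files else []


def select_per_tier(files: List[StoreFile],
                    tiers: List[Tier]) -> List[List[StoreFile]]:
    """Bucket files by age, then run the selection separately per tier."""
    buckets: List[List[StoreFile]] = [[] for _ in tiers]
    for f in files:
        buckets[tier_for(f, tiers)].append(f)
    return [select_in_tier(b, t) for b, t in zip(buckets, tiers)]
```

With tiers such as `Tier(1, 0.5, 4)`, `Tier(168, 1.2, 3)`, and `Tier(float("inf"), 0.25, 6)`, the middle tier is compacted most readily, while very recent and very old files need more candidates and a stricter ratio before they are selected, which matches the motivation in the description.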