[ 
https://issues.apache.org/jira/browse/HBASE-6371?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13478543#comment-13478543
 ] 

Lars Hofhansl commented on HBASE-6371:
--------------------------------------

Specifically, a scenario I'd be interested in is keeping a day's (or two) worth 
of changes in a live HBase cluster. In extreme cases this might lead to 
1000s of versions, and scan performance for the latest version suffers 
significantly, *especially* after a major compaction, which causes all 
versions of the KVs to be jumbled together in the same file.
                
> [89-fb] Tier based compaction
> -----------------------------
>
>                 Key: HBASE-6371
>                 URL: https://issues.apache.org/jira/browse/HBASE-6371
>             Project: HBase
>          Issue Type: Improvement
>            Reporter: Akashnil
>            Assignee: Liyin Tang
>              Labels: noob
>
> Currently, the compaction selection is not very flexible and is not sensitive 
> to the hotness of the data. Very old data is likely to be accessed less, and 
> very recent data is likely to be in the block cache. Both of these 
> considerations make it inefficient to compact these files as aggressively as 
> other files. In some use cases the access pattern is particularly obvious, 
> yet there is no way to control the compaction algorithm in those cases.
> In the new compaction selection algorithm, we plan to divide the candidate 
> files into different levels according to the age of the data present in 
> those files. For each level, parameters such as the compaction ratio and the 
> minimum number of store files per compaction may differ. The number of 
> levels, the time ranges, and the parameters for each level will be 
> configurable online on a per-column-family basis.
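
To make the description above concrete, here is a minimal sketch of how tier-based selection might look. It is not the HBASE-6371 patch: the class, field, and method names (TierCompactionSketch, Tier, StoreFile, tierOf, select) are hypothetical, and the ratio rule is a simplified version assumed for illustration. Each file is assigned to a tier by the age of its data, and each tier applies its own compaction ratio and minimum file count.

```java
import java.util.ArrayList;
import java.util.List;

public class TierCompactionSketch {

    /** Hypothetical per-tier settings; these are not real HBase config keys. */
    static final class Tier {
        final long maxAgeMs;  // files whose data is at most this old fall in this tier
        final double ratio;   // compaction ratio applied inside this tier
        final int minFiles;   // minimum number of files in one compaction
        Tier(long maxAgeMs, double ratio, int minFiles) {
            this.maxAgeMs = maxAgeMs;
            this.ratio = ratio;
            this.minFiles = minFiles;
        }
    }

    /** Hypothetical stand-in for a store file: a name, a size, and the age of its data. */
    static final class StoreFile {
        final String name;
        final long sizeBytes;
        final long ageMs;
        StoreFile(String name, long sizeBytes, long ageMs) {
            this.name = name;
            this.sizeBytes = sizeBytes;
            this.ageMs = ageMs;
        }
    }

    /** Returns the first tier whose age bound covers the file; the last tier is a catch-all. */
    static int tierOf(StoreFile f, List<Tier> tiers) {
        for (int i = 0; i < tiers.size() - 1; i++) {
            if (f.ageMs <= tiers.get(i).maxAgeMs) {
                return i;
            }
        }
        return tiers.size() - 1;
    }

    /**
     * Assumed, simplified ratio rule: within one tier (files ordered oldest first),
     * skip leading files that are larger than ratio * (total size of the newer files),
     * then compact the rest if at least minFiles remain.
     */
    static List<StoreFile> select(List<StoreFile> filesOldestFirst, List<Tier> tiers, int tierIndex) {
        Tier tier = tiers.get(tierIndex);
        List<StoreFile> inTier = new ArrayList<>();
        for (StoreFile f : filesOldestFirst) {
            if (tierOf(f, tiers) == tierIndex) {
                inTier.add(f);
            }
        }
        int start = 0;
        while (start < inTier.size()) {
            long sumNewer = 0;
            for (int j = start + 1; j < inTier.size(); j++) {
                sumNewer += inTier.get(j).sizeBytes;
            }
            if (inTier.get(start).sizeBytes <= tier.ratio * sumNewer) {
                break;
            }
            start++;
        }
        List<StoreFile> picked = new ArrayList<>(inTier.subList(start, inTier.size()));
        return picked.size() >= tier.minFiles ? picked : new ArrayList<StoreFile>();
    }

    public static void main(String[] args) {
        List<Tier> tiers = new ArrayList<>();
        tiers.add(new Tier(3_600_000L, 1.2, 3));     // data up to one hour old
        tiers.add(new Tier(Long.MAX_VALUE, 0.5, 2)); // everything older
        List<StoreFile> files = new ArrayList<>();
        files.add(new StoreFile("old1", 1000, 36_000_000L));
        files.add(new StoreFile("a", 500, 3_000_000L));
        files.add(new StoreFile("b", 50, 1_800_000L));
        files.add(new StoreFile("c", 40, 1_200_000L));
        files.add(new StoreFile("d", 30, 600_000L));
        for (StoreFile f : select(files, tiers, 0)) {
            System.out.println(f.name);
        }
    }
}
```

Giving the older tier a lower ratio and its own file minimum means old, rarely read files are compacted less aggressively than recent ones, which is the behaviour the description asks for.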

