[ https://issues.apache.org/jira/browse/HBASE-3745?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Andrew Kyle Purtell closed HBASE-3745.
--------------------------------------
> Add the ability to restrict major-compactible files by timestamp
> ----------------------------------------------------------------
>
> Key: HBASE-3745
> URL: https://issues.apache.org/jira/browse/HBASE-3745
> Project: HBase
> Issue Type: Improvement
> Affects Versions: 0.92.0
> Reporter: Todd Lipcon
> Priority: Major
>
> In some applications, a common access pattern is to frequently scan tables
> with a time range predicate restricted to a fairly recent time window. For
> example, you may want to do an incremental aggregation or indexing step only
> on rows that have changed in the last hour. We do this efficiently by
> tracking min and max timestamp on an HFile level, so that old HFiles don't
> have to be read.
> After a major compaction, however, all of a store's data sits in a single
> HFile whose timestamp range covers the recent window, so the entire dataset
> must be read, which can hurt performance of this access pattern.
> We should add a column family attribute that can specify a policy like: When
> major compacting, never include an HFile that contains data with a timestamp
> in the last 4 hours. Thus, recently flushed HFiles will always remain
> uncompacted and provide the good scan performance required for these
> applications.
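
A minimal sketch of the selection rule described above, written in plain Java
rather than against HBase's actual compaction code. The class, its FileInfo
type, and the excludeWindowMillis parameter are hypothetical names chosen for
illustration; the issue does not specify an attribute name or an
implementation. The sketch assumes each file exposes the min and max timestamp
that the description says is tracked per HFile.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical illustration only; none of these names come from HBase.
final class TimestampRestrictedCompaction {

    /** Stand-in for the per-HFile min/max timestamp metadata. */
    static final class FileInfo {
        final String name;
        final long minTimestamp;
        final long maxTimestamp;

        FileInfo(String name, long minTimestamp, long maxTimestamp) {
            this.name = name;
            this.minTimestamp = minTimestamp;
            this.maxTimestamp = maxTimestamp;
        }
    }

    /**
     * Returns the files eligible for major compaction: those whose newest
     * cell is older than (now - excludeWindowMillis). A file holding any
     * data inside the window is left out, so it stays a separate file.
     */
    static List<FileInfo> selectForMajorCompaction(
            List<FileInfo> files, long nowMillis, long excludeWindowMillis) {
        long cutoff = nowMillis - excludeWindowMillis;
        List<FileInfo> selected = new ArrayList<>();
        for (FileInfo f : files) {
            if (f.maxTimestamp < cutoff) {
                selected.add(f);
            }
        }
        return selected;
    }

    /**
     * True if the file may hold cells in the scan range [start, end).
     * Files for which this is false can be skipped by a time-range scan.
     */
    static boolean mayContain(FileInfo f, long start, long end) {
        return f.maxTimestamp >= start && f.minTimestamp < end;
    }

    public static void main(String[] args) {
        long hour = 3_600_000L;
        long now = System.currentTimeMillis();
        List<FileInfo> files = Arrays.asList(
                new FileInfo("old", now - 20 * hour, now - 10 * hour),
                new FileInfo("mid", now - 9 * hour, now - 5 * hour),
                new FileInfo("recent", now - 2 * hour, now - hour / 2));

        // With a 4-hour window, only "old" and "mid" are selected.
        for (FileInfo f : selectForMajorCompaction(files, now, 4 * hour)) {
            System.out.println("compact: " + f.name);
        }
    }
}
```

With this rule, a scan restricted to the last hour still skips the compacted
output, because that file's max timestamp lies before the scan's start, and
only the small, recently flushed files are read.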
--
This message was sent by Atlassian Jira
(v8.20.7#820007)