[ https://issues.apache.org/jira/browse/HBASE-15339?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15179113#comment-15179113 ]
Enis Soztutar commented on HBASE-15339:
---------------------------------------

Delayed compaction for the incoming window is also relevant: see HBASE-14496. [~vrodionov] can comment more on the reasoning, but my understanding is that if the query load on very recent data is high, delaying the compaction of those files for some time can help smooth out latencies.

> Improve DateTieredCompactionPolicy
> ----------------------------------
>
>                 Key: HBASE-15339
>                 URL: https://issues.apache.org/jira/browse/HBASE-15339
>             Project: HBase
>          Issue Type: Improvement
>          Components: Compaction
>            Reporter: Duo Zhang
>
> For our MiCloud service, the old data is rarely touched but we still need to
> keep it, so we want to put the data on inexpensive devices and reduce
> redundancy using EC (erasure coding) to cut down the cost.
> With the date-based tiered compaction introduced in HBASE-15181, new data and
> old data can be placed in different tiers. But the tier boundaries move as
> time passes, so it is still possible that we compact files in an old tier,
> which breaks our block moving and EC work.
> So here we want to introduce an "archive tier" to better fit our scenario.
> Add a configuration option called "archive unit", for example, year. That
> means, if we find that the tier boundary is already in a previous year, then
> we reset the boundaries to the start of that year and the end of that year,
> and if we want to do a compaction in this tier, we just compact all files
> into one file. The file will never be changed unless we force a major
> compaction, so it is safe to apply EC and other cost-reducing approaches to
> the file. And we make more tiers before this tier, year by year.

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
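
The "archive tier" rule in the quoted description (snap a tier that already lies in a previous year to that year's start and end) can be illustrated with a minimal sketch. This is not code from HBase or from any patch on this issue: the class ArchiveTierSketch, the Window type, and the method names are hypothetical, the archive unit is fixed to one calendar year in UTC, and a tier is assumed to be identified by the year of its upper boundary.

```java
import java.time.Instant;
import java.time.ZoneOffset;
import java.time.ZonedDateTime;

// Hypothetical illustration only; none of these names exist in HBase.
public class ArchiveTierSketch {

    /** Half-open time window [startMillis, endMillis) in epoch milliseconds. */
    static final class Window {
        final long startMillis;
        final long endMillis;

        Window(long startMillis, long endMillis) {
            this.startMillis = startMillis;
            this.endMillis = endMillis;
        }
    }

    /** Epoch milliseconds of January 1st, 00:00 UTC, of the given year. */
    static long startOfYearMillis(int year) {
        return ZonedDateTime.of(year, 1, 1, 0, 0, 0, 0, ZoneOffset.UTC)
                .toInstant()
                .toEpochMilli();
    }

    /** Calendar year (UTC) that contains the given timestamp. */
    static int yearOf(long millis) {
        return Instant.ofEpochMilli(millis).atZone(ZoneOffset.UTC).getYear();
    }

    /**
     * If the tier's upper boundary already lies in a year before the current
     * one, return a window covering that whole year, so that all files in it
     * can be compacted into one file. Otherwise return the tier unchanged.
     */
    static Window toArchiveWindow(Window tier, long nowMillis) {
        int currentYear = yearOf(nowMillis);
        // The window is half-open, so its last covered instant is endMillis - 1.
        int tierYear = yearOf(tier.endMillis - 1);
        if (tierYear >= currentYear) {
            return tier; // still a regular date tier
        }
        return new Window(startOfYearMillis(tierYear), startOfYearMillis(tierYear + 1));
    }
}
```

Under these assumptions, a one-week tier from June 2015, evaluated in March 2016, is widened to cover all of 2015, while a tier in the current year is returned as is. How a real policy would treat tiers that straddle a year boundary, or other archive units such as month, is left open here.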