[ https://issues.apache.org/jira/browse/HBASE-16981?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15669559#comment-15669559 ]
Jingcheng Du commented on HBASE-16981:
--------------------------------------
This can reduce the IO, but it cannot help reduce the number of files.
If we want to keep the number of files small, we have to set this merge
threshold to a large value, which might introduce IO amplification.
Maybe we can add a threshold for the number of files? Files that are
larger than the merge threshold won't be touched until the number of files
exceeds the new threshold. In the compaction, the files that are smaller
than the merge threshold should be selected first.
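
A rough sketch of the selection rule proposed above, in Java. This is only one
possible reading of the idea, not code from the attached patches: the class
MobFileSelectionSketch, the MobFile type, and the parameter names
mergeThreshold and maxFileCount are all hypothetical. Small files are always
selected; large files are added, smallest first, only when the file count
exceeds maxFileCount.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

public class MobFileSelectionSketch {

    /** Hypothetical stand-in for a mob file; only the size matters here. */
    static final class MobFile {
        final String name;
        final long sizeBytes;

        MobFile(String name, long sizeBytes) {
            this.name = name;
            this.sizeBytes = sizeBytes;
        }
    }

    /**
     * Selects files for one compaction (assumed rule, not HBase's actual one).
     * Files smaller than mergeThreshold are always selected. Larger files are
     * only added when the total file count exceeds maxFileCount.
     */
    static List<MobFile> select(List<MobFile> files, long mergeThreshold, int maxFileCount) {
        List<MobFile> sorted = new ArrayList<>(files);
        sorted.sort(Comparator.comparingLong((MobFile f) -> f.sizeBytes));

        // Small files come first in the sorted list, so they form a prefix.
        List<MobFile> selected = new ArrayList<>();
        for (MobFile f : sorted) {
            if (f.sizeBytes < mergeThreshold) {
                selected.add(f);
            }
        }

        // Files left after compaction = untouched files + 1 merged output.
        int next = selected.size();
        while (sorted.size() > maxFileCount
                && next < sorted.size()
                && sorted.size() - selected.size() + 1 > maxFileCount) {
            selected.add(sorted.get(next++));
        }
        return selected;
    }
}
```

With this rule the large files are rewritten only when the file-count limit
forces it, which is where the IO amplification would be paid.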
> Expand Mob Compaction Partition policy from daily to weekly, monthly and
> beyond
> -------------------------------------------------------------------------------
>
> Key: HBASE-16981
> URL: https://issues.apache.org/jira/browse/HBASE-16981
> Project: HBase
> Issue Type: New Feature
> Components: mob
> Affects Versions: 2.0.0
> Reporter: huaxiang sun
> Assignee: huaxiang sun
> Attachments: HBASE-16981.master.001.patch,
> HBASE-16981.master.002.patch,
> Supportingweeklyandmonthlymobcompactionpartitionpolicyinhbase.pdf
>
>
> Today the mob region holds all mob files for all regions. With the daily
> partition mob compaction policy, after major mob compaction, there is still
> one file per region per day. Given that there are 365 days in a year, that
> is at least 365 files per region per year. Since HDFS has a limit on the
> number of files under one folder, this is not going to scale if there are
> lots of regions. To reduce the number of mob files, we want to introduce
> other partition policies, such as weekly and monthly, to compact mob files
> within one week or month into one file.
> This jira is created to track this effort.
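
To put numbers on the file counts in the quoted description, here is a small
Java sketch that counts how many partitions (and so how many files per region,
at one file per partition after major compaction) a year produces under each
policy. The class MobPartitionCountSketch, the Policy enum, and the key format
are hypothetical, and ISO weeks are assumed for the weekly policy; the actual
partition boundaries in the patch may differ.

```java
import java.time.LocalDate;
import java.time.temporal.IsoFields;
import java.util.HashSet;
import java.util.Set;

public class MobPartitionCountSketch {

    /** Hypothetical policy names, mirroring daily, weekly and monthly. */
    enum Policy { DAILY, WEEKLY, MONTHLY }

    /** Maps a date to the partition it would fall into (assumed key format). */
    static String partitionKey(LocalDate date, Policy policy) {
        switch (policy) {
            case WEEKLY:
                // ISO weeks are an assumption; a week can span two calendar years.
                return date.get(IsoFields.WEEK_BASED_YEAR) + "-W"
                        + date.get(IsoFields.WEEK_OF_WEEK_BASED_YEAR);
            case MONTHLY:
                return date.getYear() + "-M" + date.getMonthValue();
            default:
                return date.toString();
        }
    }

    /** Counts the distinct partitions touched by the days of one calendar year. */
    static int partitionsInYear(int year, Policy policy) {
        Set<String> keys = new HashSet<>();
        for (LocalDate d = LocalDate.of(year, 1, 1); d.getYear() == year; d = d.plusDays(1)) {
            keys.add(partitionKey(d, policy));
        }
        return keys.size();
    }

    public static void main(String[] args) {
        for (Policy p : Policy.values()) {
            System.out.println(p + ": " + partitionsInYear(2017, p) + " partitions");
        }
    }
}
```

The weekly count comes out slightly above 52 because weeks that straddle the
year boundary are counted in both years.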