[ 
https://issues.apache.org/jira/browse/HBASE-16981?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15673271#comment-15673271
 ] 

Jingcheng Du commented on HBASE-16981:
--------------------------------------

Hi [~huaxiang]. I discussed this with Anoop offline, and I think his proposal
is a good idea.
We can do the compaction in different stages/ways according to the compaction 
policy and interval.
For instance, if we use a monthly policy and a weekly interval, we should run
the compaction in one of two ways (group files in the same week, or group them
in the same month):
1. If now - lastMonthCompaction >= 1 month, run the monthly policy. At this
time, use a larger mergeable threshold (maybe 4*7*mergeableThreshold).
2. If now - lastWeekCompaction >= 1 week, run the weekly policy. At this time,
use 7*mergeableThreshold as the mergeable threshold.
For a weekly policy and a daily compaction interval, the ways should be:
1. If now - lastWeekCompaction >= 1 week, run the weekly policy. At this time,
use 7*mergeableThreshold as the mergeable threshold.
2. If now - lastDailyCompaction >= 1 day, run the daily policy, and directly
use mergeableThreshold.
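
The staged selection above could be sketched roughly as follows. This is only
an illustration of the idea, not code from the patch: the class
StagedMobCompaction, the Stage type, and the methods selectStage and
effectiveThreshold are hypothetical names, a "month" is approximated as 28
days to match the 4*7 multiplier, and checking the coarser stage first is an
assumption based on the order of the numbered steps.

```java
import java.time.Duration;

/** Hypothetical sketch of staged MOB compaction selection; not HBase API. */
public class StagedMobCompaction {

    /** One stage: grouping granularity, how often it runs, threshold scale. */
    public static final class Stage {
        public final String name;
        public final Duration period;
        public final long thresholdMultiplier;

        public Stage(String name, Duration period, long thresholdMultiplier) {
            this.name = name;
            this.period = period;
            this.thresholdMultiplier = thresholdMultiplier;
        }
    }

    // Multipliers follow the comment: 1x daily, 7x weekly, 4*7x monthly.
    public static final Stage DAILY = new Stage("daily", Duration.ofDays(1), 1);
    public static final Stage WEEKLY = new Stage("weekly", Duration.ofDays(7), 7);
    // Assumption: a month is treated as 4 weeks (28 days).
    public static final Stage MONTHLY =
        new Stage("monthly", Duration.ofDays(28), 4 * 7);

    /**
     * Returns the first stage whose period has elapsed since its last run,
     * or null if none is due. Both arrays are parallel and ordered from the
     * coarsest stage to the finest.
     */
    public static Stage selectStage(long nowMillis, Stage[] stages,
                                    long[] lastRunMillis) {
        for (int i = 0; i < stages.length; i++) {
            if (nowMillis - lastRunMillis[i] >= stages[i].period.toMillis()) {
                return stages[i];
            }
        }
        return null;
    }

    /** Scales the configured mergeable threshold for the chosen stage. */
    public static long effectiveThreshold(Stage stage,
                                          long baseMergeableThreshold) {
        return stage.thresholdMultiplier * baseMergeableThreshold;
    }

    public static void main(String[] args) {
        long day = Duration.ofDays(1).toMillis();
        long now = System.currentTimeMillis();
        // Weekly policy with a daily interval: weekly ran 3 days ago,
        // daily ran 2 days ago.
        Stage[] stages = {WEEKLY, DAILY};
        long[] lastRun = {now - 3 * day, now - 2 * day};
        long base = 1024L; // illustrative base threshold, not a real default
        Stage due = selectStage(now, stages, lastRun);
        System.out.println(due == null
            ? "nothing due"
            : due.name + " threshold=" + effectiveThreshold(due, base));
    }
}
```

For the monthly policy with a weekly interval, the stage array would be
{MONTHLY, WEEKLY} instead. A real implementation would also need to persist
the last-run timestamps somewhere, which this sketch leaves out.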

This can reduce the number of files and write amplification. What do you think? 
Thanks.

> Expand Mob Compaction Partition policy from daily to weekly, monthly and 
> beyond
> -------------------------------------------------------------------------------
>
>                 Key: HBASE-16981
>                 URL: https://issues.apache.org/jira/browse/HBASE-16981
>             Project: HBase
>          Issue Type: New Feature
>          Components: mob
>    Affects Versions: 2.0.0
>            Reporter: huaxiang sun
>            Assignee: huaxiang sun
>         Attachments: HBASE-16981.master.001.patch, 
> HBASE-16981.master.002.patch, 
> Supportingweeklyandmonthlymobcompactionpartitionpolicyinhbase.pdf
>
>
> Today the mob region holds all mob files for all regions. With the daily
> partition mob compaction policy, after a major mob compaction, there is still
> one file per region per day. Given there are 365 days in a year, that means
> at least 365 files per region. Since HDFS has a limit on the number of files
> under one folder, this is not going to scale if there are lots of regions. To
> reduce the number of mob files, we want to introduce other partition policies
> such as weekly and monthly, which compact mob files within one week or month
> into one file. This jira is created to track this effort.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
