[ 
https://issues.apache.org/jira/browse/HBASE-16981?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15673291#comment-15673291
 ] 

Anoop Sam John commented on HBASE-16981:
----------------------------------------

Thanks for the summary. Yes, we had a long discussion on this :-)
Let us detail everything in a PDF or a shared Google doc. The latter is better, 
since we can add comments/questions directly. Consider all possible cases. We 
also need to take care of possible TTLs on the MOB data and how that data gets 
cleared. We have a TTLCleaner chore which directly removes whole MOB files. I 
believe MOB-type data will mostly have TTLs, so keep this in mind when 
considering the strategy. We should also allow TTL-expired data to be removed 
easily. Just giving direction.
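
To make the TTL concern concrete, here is a minimal sketch, not actual HBase code. It assumes a cleaner that can only delete a MOB file as a whole, works at day granularity, and treats a file as removable only once its newest data has expired. The class and method names are hypothetical.

```java
import java.time.LocalDate;

/** Hypothetical helper for illustration only; not part of HBase. */
class MobTtlCheck {

  /**
   * A file can be dropped as a whole only when even its newest data is past
   * the TTL. lastDateInFile is the latest date covered by the file
   * (assumption: the file's partition is known at day granularity).
   */
  static boolean wholeFileExpired(LocalDate lastDateInFile, long ttlDays, LocalDate today) {
    // Data written on lastDateInFile stays live for ttlDays after that date.
    return lastDateInFile.plusDays(ttlDays).isBefore(today);
  }
}
```

With a 30-day TTL and today being 2016-11-17, a daily file for 2016-10-01 is removable, but a monthly file covering all of October is not, because its newest data is from 2016-10-31. Coarser grouping therefore delays whole-file removal, which is why the TTL case has to be considered in the strategy.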

So here is the change from the originally proposed approach. In both, the aim 
is to reduce the number of files under the region dir.
The original proposal aimed to reduce the number of files in every compaction: 
instead of strict daily grouping, it would do monthly/weekly grouping.
The staged one will not always do this. By default, compactions will try to 
do the lowest possible grouping, i.e. daily. Once a week has passed we might do 
a weekly 2nd stage, and once a month has passed we might do a monthly 3rd stage.
Regarding size restrictions on the selection of files, we can do all the math 
as Jingcheng mentions above.
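
The staged grouping described above could look roughly like the sketch below. It is only an illustration under stated assumptions, not the patch: weeks start on Monday, a weekly partition is clamped so it never spans two months, size restrictions are ignored, and all names are hypothetical.

```java
import java.time.DayOfWeek;
import java.time.LocalDate;
import java.time.temporal.TemporalAdjusters;

/** Hypothetical sketch of staged partitioning; not the HBase implementation. */
class StagedMobPartition {

  /** Returns the first day of the partition that a file dated fileDate falls into. */
  static LocalDate partitionStart(LocalDate fileDate, LocalDate today) {
    LocalDate thisWeek = today.with(TemporalAdjusters.previousOrSame(DayOfWeek.MONDAY));
    LocalDate thisMonth = today.withDayOfMonth(1);

    if (fileDate.isBefore(thisMonth)) {
      // 3rd stage: the month has passed, group by month.
      return fileDate.withDayOfMonth(1);
    }
    if (fileDate.isBefore(thisWeek)) {
      // 2nd stage: the week has passed but the month has not, group by week.
      LocalDate weekStart =
          fileDate.with(TemporalAdjusters.previousOrSame(DayOfWeek.MONDAY));
      // Assumption: a week is cut at the month boundary.
      return weekStart.isBefore(thisMonth) ? thisMonth : weekStart;
    }
    // 1st stage (default): lowest possible grouping, i.e. daily.
    return fileDate;
  }
}
```

For example, with today being Thursday 2016-11-17, a file from 2016-11-16 stays in its daily partition, a file from 2016-11-09 falls into the week starting 2016-11-07, and a file from 2016-10-20 falls into the October partition.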

Thanks for taking up this important work [~huaxiang].


> Expand Mob Compaction Partition policy from daily to weekly, monthly and 
> beyond
> -------------------------------------------------------------------------------
>
>                 Key: HBASE-16981
>                 URL: https://issues.apache.org/jira/browse/HBASE-16981
>             Project: HBase
>          Issue Type: New Feature
>          Components: mob
>    Affects Versions: 2.0.0
>            Reporter: huaxiang sun
>            Assignee: huaxiang sun
>         Attachments: HBASE-16981.master.001.patch, 
> HBASE-16981.master.002.patch, 
> Supportingweeklyandmonthlymobcompactionpartitionpolicyinhbase.pdf
>
>
> Today the mob region holds all mob files for all regions. With the daily 
> partition mob compaction policy, after a major mob compaction there is still 
> one file per region per day. Given there are 365 days in a year, that is at 
> least 365 files per region. Since HDFS has a limit on the number of files 
> under one folder, this is not going to scale if there are lots of regions. To 
> reduce the number of mob files, we want to introduce other partition policies 
> such as weekly and monthly, which compact the mob files within one week or 
> month into one file. This jira is created to track this effort.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
