[
https://issues.apache.org/jira/browse/HBASE-7842?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Elliott Clark updated HBASE-7842:
---------------------------------
Resolution: Fixed
Fix Version/s: 0.98.0
0.95.1
Release Note: The default compaction policy has been changed to a new policy
that explores more groups of files and is stricter about enforcing the size
ratio requirements.
Hadoop Flags: Reviewed
Status: Resolved (was: Patch Available)
> Add compaction policy that explores more storefile groups
> ---------------------------------------------------------
>
> Key: HBASE-7842
> URL: https://issues.apache.org/jira/browse/HBASE-7842
> Project: HBase
> Issue Type: New Feature
> Components: Compaction
> Reporter: Elliott Clark
> Assignee: Elliott Clark
> Fix For: 0.95.1, 0.98.0
>
> Attachments: HBASE-7842-0.patch, HBASE-7842-2.patch,
> HBASE-7842-3.patch, HBASE-7842-4.patch, HBASE-7842-5.patch,
> HBASE-7842-6.patch, HBASE-7842-7.patch
>
>
> Workloads that are less stable can end up with compactions that are too large
> or too small under the current storefile selection algorithm.
> Currently (sketched below):
> * Find the first file fi such that FileSize(fi) <= Sum(0, i-1, FileSize(fx))
> * Ensure that at least the minimum number of files is selected (if there
> aren't enough, bail out)
> * If there are too many files, keep the larger ones.
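> A minimal sketch of that current selection, under some simplifying assumptions
> (candidate files are already sorted and represented only by their sizes; the
> real policy works on StoreFile objects and also applies a configurable ratio).
> The class, method, and parameter names are illustrative, not the actual HBase
> API:
> {code:java}
> import java.util.ArrayList;
> import java.util.Collections;
> import java.util.List;
>
> public class CurrentSelectionSketch {
>   /**
>    * Pick files the "current" way: scan for the first file whose size is no
>    * larger than the sum of the files before it, take that file and everything
>    * after it, bail out if fewer than minFiles remain, and if more than
>    * maxFiles remain keep only the larger ones.
>    */
>   static List<Long> select(List<Long> sizes, int minFiles, int maxFiles) {
>     long sumSoFar = 0;
>     int start = -1;
>     for (int i = 0; i < sizes.size(); i++) {
>       if (i > 0 && sizes.get(i) <= sumSoFar) {  // FileSize(fi) <= Sum(0, i-1, FileSize(fx))
>         start = i;
>         break;
>       }
>       sumSoFar += sizes.get(i);
>     }
>     if (start < 0 || sizes.size() - start < minFiles) {
>       return Collections.emptyList();           // not enough files: bail out
>     }
>     List<Long> picked = new ArrayList<>(sizes.subList(start, sizes.size()));
>     if (picked.size() > maxFiles) {
>       picked.sort(Collections.reverseOrder());  // too many files: keep the larger ones
>       picked = new ArrayList<>(picked.subList(0, maxFiles));
>     }
>     return picked;
>   }
> }
> {code}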
> I would propose something like:
> * Find all sets of storefiles where every file satisfies
> ** FileSize(fi) <= Sum(0, i-1, FileSize(fx))
> ** Num files in set <= max
> ** Num files in set >= min
> * Then pick the set of files that maximizes ((# storefiles in set) /
> Sum(FileSize(fx)))
> The thinking is that the above algorithm is pretty easy to reason about: every
> file in the chosen set satisfies the ratio, and it should rewrite the least
> amount of data for the biggest impact on seeks. A rough sketch of this
> selection is below.
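> The sketch uses the same simplifications as above, with contiguous windows of
> the sorted candidate list standing in for the "sets" of storefiles; again the
> names and signatures are illustrative only:
> {code:java}
> import java.util.ArrayList;
> import java.util.Collections;
> import java.util.List;
>
> public class ExploringSelectionSketch {
>   /**
>    * Enumerate every contiguous window of the sorted candidate list with
>    * between minFiles and maxFiles members in which every file (after the
>    * first) is no larger than the sum of the files before it, then return the
>    * window that maximizes (number of files) / (total size).
>    */
>   static List<Long> select(List<Long> sizes, int minFiles, int maxFiles) {
>     List<Long> best = Collections.emptyList();
>     double bestScore = -1.0;
>     for (int start = 0; start < sizes.size(); start++) {
>       int maxEnd = Math.min(sizes.size(), start + maxFiles);
>       for (int end = start + minFiles; end <= maxEnd; end++) {
>         List<Long> window = sizes.subList(start, end);
>         if (!satisfiesRatio(window)) {
>           continue;                             // some file breaks the size ratio requirement
>         }
>         long totalSize = 0;
>         for (long s : window) {
>           totalSize += s;
>         }
>         double score = (double) window.size() / totalSize;  // files compacted per byte rewritten
>         if (score > bestScore) {
>           bestScore = score;
>           best = new ArrayList<>(window);
>         }
>       }
>     }
>     return best;
>   }
>
>   /** Every file after the first must be no larger than the sum of the files before it. */
>   private static boolean satisfiesRatio(List<Long> window) {
>     if (window.isEmpty()) {
>       return false;                             // degenerate window: nothing to compact
>     }
>     long sumSoFar = window.get(0);
>     for (int i = 1; i < window.size(); i++) {
>       if (window.get(i) > sumSoFar) {
>         return false;
>       }
>       sumSoFar += window.get(i);
>     }
>     return true;
>   }
> }
> {code}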