[ 
https://issues.apache.org/jira/browse/HBASE-7763?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13572650#comment-13572650
 ] 

Elliott Clark commented on HBASE-7763:
--------------------------------------

* Default: What we have now.  
** Class: {{DefaultCompactionPolicy}}
* Sort: Default + Sort by size. 
** Class: {{DefaultCompactionPolicySort}}
* SortSmall: Currently, after finding a set of files that satisfies some
definition of the ratio, compaction selection keeps only the max number of
files.  To do that it drops the smallest.  The hope was to get the bigger files
together so that later they wouldn't be part of a smaller compaction.  From
testing, this doesn't seem to work.  So after sorting I take the smaller files
to compact instead.  E.g. if I have [101, 100, 99] and my max is 2, the old
method would take 101 + 100.  Now I would take 99 + 100.
** Class: {{DefaultCompactionPolicySortSmall}}
* Ratio: Default + the guarantee that all files satisfy Fi < sum(F0..Fi-1) *
ratio.  Right now only the left-most file in the compaction has to follow this,
which means you can compact wildly different files together if large files
dominate the sum.  E.g. [1000000000, 999, 998, 2, 1] will compact 999 + 998 +
2 + 1.  1 and 2 seem like very strange candidates to compact with such unlike
files.
** Class: {{DefaultCompactionPolicyRatioRight}}
* SortSmallRatio: Everything discussed, in one class.  Sort by size, take the
smallest files when truncating the list to max files, and ensure all files
satisfy the ratio.
** Class: {{DefaultCompactionPolicySortSmallRatioRight}}
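
To make the two numeric examples above concrete, here is a rough Python sketch. It is not the HBase code: the function names are hypothetical, the ratio value of 1.2 is an assumption, and the left-most check is modelled as "the first file must be smaller than the sum of the files after it times the ratio", which is only one reading of the behaviour described above.

```python
def truncate_to_max(sizes, max_files, keep_smallest):
    """Sort candidate file sizes and keep at most max_files of them.

    keep_smallest=False models the old behaviour (drop the smallest files);
    keep_smallest=True models SortSmall (drop the largest files instead).
    """
    ordered = sorted(sizes)
    if len(ordered) <= max_files:
        return ordered
    if keep_smallest:
        return ordered[:max_files]
    return ordered[-max_files:]


def select_leftmost_ratio(sizes, ratio):
    """Return the first run of files whose left-most file passes the ratio.

    Only the left-most file is checked against the sum of the files after
    it, so the remaining files can be of very different sizes.
    """
    for start in range(len(sizes)):
        rest = sizes[start + 1:]
        if rest and sizes[start] < sum(rest) * ratio:
            return sizes[start:]
    return []


# SortSmall example: [101, 100, 99] with max = 2.
old_pick = truncate_to_max([101, 100, 99], 2, keep_smallest=False)
new_pick = truncate_to_max([101, 100, 99], 2, keep_smallest=True)

# Ratio example: the huge file is skipped, then 999 passes and pulls in
# 998, 2 and 1 without any of them being checked.
ratio_pick = select_leftmost_ratio([1000000000, 999, 998, 2, 1], 1.2)
```

Under these assumptions, old_pick holds 100 and 101, new_pick holds 99 and 100, and ratio_pick holds 999, 998, 2 and 1, which matches the examples in the list above.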
                
> Compactions not sorting based on size anymore.
> ----------------------------------------------
>
>                 Key: HBASE-7763
>                 URL: https://issues.apache.org/jira/browse/HBASE-7763
>             Project: HBase
>          Issue Type: Bug
>          Components: Compaction
>    Affects Versions: 0.96.0, 0.94.4
>            Reporter: Elliott Clark
>            Assignee: Elliott Clark
>            Priority: Critical
>             Fix For: 0.96.0, 0.94.6
>
>         Attachments: HBASE-7763-trunk-TESTING.patch, 
> HBASE-7763-trunk-TESTING.patch, HBASE-7763-trunk-TESTING.patch
>
>
> Currently compaction selection is not sorting based on size.  This causes 
> selection to choose larger files to re-write than are needed when bulk loads 
> are involved.

