[ 
https://issues.apache.org/jira/browse/HBASE-8450?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13662290#comment-13662290
 ] 

Lars Hofhansl commented on HBASE-8450:
--------------------------------------

This is a bit of a fundamental flaw in HBase.

True, if you never delete stuff, you don't need major compactions.
But if you do, especially when you place a lot of delete markers, you 
definitely want major compactions. Sergey's striped compaction will make this a 
moot point, but until then it seems that we need to be able to clean up.

Major compactions also help to eventually regain data locality if regions were 
moved.

Whether it's a day or a week almost does not matter. A week seems fine to me as 
the default. We can always document how to disable major compactions for users 
who do not delete a lot.
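
For reference, here is a minimal sketch of what the two settings look like, 
assuming the interval in question is the hbase.hregion.majorcompaction property 
(a period in milliseconds, where 0 turns off the periodic major compaction). 
The class and method names are hypothetical helpers for illustration only, not 
HBase API.

```java
import java.time.Duration;

public class MajorCompactionInterval {

    // Assumption: this is the property the default under discussion maps to.
    static final String PROPERTY = "hbase.hregion.majorcompaction";

    /** Converts a period in days to the millisecond value the property expects. */
    static long periodMillis(long days) {
        return Duration.ofDays(days).toMillis();
    }

    /** Renders a <property> element in the Hadoop-style layout used by hbase-site.xml. */
    static String toPropertyXml(long millis) {
        return "<property>\n"
             + "  <name>" + PROPERTY + "</name>\n"
             + "  <value>" + millis + "</value>\n"
             + "</property>";
    }

    public static void main(String[] args) {
        // One week between periodic major compactions.
        System.out.println(toPropertyXml(periodMillis(7)));
        // 0 disables periodic major compactions (for users who rarely delete).
        System.out.println(toPropertyXml(0));
    }
}
```

Either element would go into hbase-site.xml to override whatever 
hbase-default.xml ships with.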

Otherwise we need to introduce a "Routine Maintenance" section to the book and 
instruct users to run major compactions "when it makes sense" (although they 
really have no metric to go by).

Maybe a way to tackle this is to have a "remove delete marker" compaction, which 
would still read all files but only rewrite an HFile when a delete marker was 
found.
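
A rough sketch of that idea, under heavy assumptions: this is a toy in-memory 
model, not HBase code. The Cell class, the compact method, and the rule that a 
marker masks every cell in the same row with an equal or older timestamp are 
all hypothetical simplifications. It reads every file, does nothing if no 
delete marker is seen, and otherwise produces a rewritten cell list without the 
markers and the cells they mask.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Optional;

public class DeleteMarkerCompactionSketch {

    /** Hypothetical simplified cell: row key, timestamp, and a delete-marker flag. */
    static final class Cell {
        final String row;
        final long timestamp;
        final boolean deleteMarker;

        Cell(String row, long timestamp, boolean deleteMarker) {
            this.row = row;
            this.timestamp = timestamp;
            this.deleteMarker = deleteMarker;
        }
    }

    /**
     * Reads all files. Returns Optional.empty() when no delete marker was found
     * (nothing needs rewriting); otherwise returns the surviving cells.
     */
    static Optional<List<Cell>> compact(List<List<Cell>> files) {
        // Pass 1: scan every file and remember the newest delete marker per row.
        Map<String, Long> newestDelete = new HashMap<>();
        for (List<Cell> file : files) {
            for (Cell c : file) {
                if (c.deleteMarker) {
                    newestDelete.merge(c.row, c.timestamp, Math::max);
                }
            }
        }
        if (newestDelete.isEmpty()) {
            return Optional.empty(); // no markers seen: skip the rewrite entirely
        }

        // Pass 2: keep only cells that are neither markers nor masked by one.
        List<Cell> survivors = new ArrayList<>();
        for (List<Cell> file : files) {
            for (Cell c : file) {
                Long deletedUpTo = newestDelete.get(c.row);
                boolean masked = deletedUpTo != null && c.timestamp <= deletedUpTo;
                if (!c.deleteMarker && !masked) {
                    survivors.add(c);
                }
            }
        }
        return Optional.of(survivors);
    }

    public static void main(String[] args) {
        List<List<Cell>> files = List.of(
            List.of(new Cell("r1", 3, false), new Cell("r2", 1, false)),
            List.of(new Cell("r1", 5, true), new Cell("r1", 7, false)));
        compact(files).ifPresent(cells ->
            System.out.println("cells after rewrite: " + cells.size()));
    }
}
```

Note that the sketch rewrites everything once any marker is found. Dropping a 
marker from one file while leaving the cells it masks in another file would 
bring deleted data back, so a real per-file variant would need more care.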

                
> Update hbase-default.xml and general recommendations to better suit current 
> hw, h2, experience, etc.
> ----------------------------------------------------------------------------------------------------
>
>                 Key: HBASE-8450
>                 URL: https://issues.apache.org/jira/browse/HBASE-8450
>             Project: HBase
>          Issue Type: Task
>          Components: Usability
>            Reporter: stack
>            Assignee: stack
>            Priority: Critical
>             Fix For: 0.95.1
>
>         Attachments: 8450.txt, 8450v2.txt, 8450v3.txt, 8450v5.txt
>
>
> This is a critical task we need to do before we release; review our defaults.
> On cursory review, there are configs in hbase-default.xml that no longer have 
> matching code; there are some that should be changed because we know better 
> now and others that should be amended because hardware and deploys are bigger 
> than they used to be.
> We could also move stuff around so that the must-edit stuff is near the top 
> (zk quorum config. is mid-way down the page) and beef up the descriptions -- 
> especially since these descriptions shine through in the refguide.
> Lastly, I notice that our tgz does not "include" an hbase-default.xml other 
> than the one bundled up in the jar.  Maybe we should make it more accessible.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira
