[jira] [Commented] (HBASE-26229) Make L0 files compaction in StripeCompactionPolicy faster and size under control

Xiaolin Ha (Jira) Fri, 27 Aug 2021 19:46:06 -0700


    [ 
https://issues.apache.org/jira/browse/HBASE-26229?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17406096#comment-17406096
 ]


Xiaolin Ha commented on HBASE-26229:
------------------------------------

One of our tables, which runs in bulkload+read mode, has hit this issue. I 
think there are two main problems here. One is that a bulkload-only write 
pattern prevents the region from compacting and splitting as soon as possible: 
the split has to wait for the regular compaction thread to complete a 
compaction first. The other is that the first compaction of the child regions 
must be a major compaction, regardless of whether the split files are already 
in the correct stripes. 

The problem is cumulative, and the major compaction after split makes the 
situation worse. It started before we applied the changes in HBASE-25302, 
which compacts only the reference files after a split instead of running a 
major compaction. We discovered it only recently, and the L0 size is already 
in the TBs. To resolve the split and compaction problem for such a huge 
region, we still need to spin off the L0 files compaction. 

> Make L0 files compaction in StripeCompactionPolicy faster and size under 
> control 
> ---------------------------------------------------------------------------------
>
>                 Key: HBASE-26229
>                 URL: https://issues.apache.org/jira/browse/HBASE-26229
>             Project: HBase
>          Issue Type: Improvement
>          Components: Compaction
>            Reporter: Xiaolin Ha
>            Assignee: Xiaolin Ha
>            Priority: Major
>
> When L0 files are selected for compaction in the stripe store file manager, 
> all of them are selected. This is the key problem: there is currently no 
> file count limit and no compaction size limit for L0 compactions. If the 
> compaction size is large, e.g. several TBs, the L0 compaction will take a 
> long time to complete. 
> L0 contains not only recently flushed files; bulk loaded files are also put 
> into L0. What's more, when a daughter region is opened, if the parent 
> stripes cannot be rebuilt in the daughter, all the files are put into L0.
> So when the files in L0 are large enough, compacting all of them takes quite 
> a long time. If the compaction speed is lower than the speed at which files 
> are flushed to L0, subsequent compactions grow even larger. This is a big 
> problem, especially when bulk loading files. 
>  
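
To make the "no file count control and no compaction size control" point 
concrete, here is a minimal sketch of what a bounded L0 selection could look 
like. It is not the actual StripeCompactionPolicy code and not the patch for 
this issue: the class L0SelectionSketch, the L0File type, the selectBounded 
method, and the maxFiles/maxBytes limits are all hypothetical names made up 
for illustration, and the input list is assumed to be ordered oldest first.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration only; none of these names exist in HBase.
final class L0SelectionSketch {

    /** Stand-in for a store file in L0; only its size matters here. */
    static final class L0File {
        final String name;
        final long sizeBytes;

        L0File(String name, long sizeBytes) {
            this.name = name;
            this.sizeBytes = sizeBytes;
        }
    }

    /**
     * Picks a prefix of the L0 files (assumed oldest first) that stays within
     * both a file count limit and a total size limit. At least one file is
     * always picked when the list is non-empty and maxFiles >= 1, so a single
     * oversized file cannot block L0 compaction forever.
     */
    static List<L0File> selectBounded(List<L0File> l0Files, int maxFiles, long maxBytes) {
        List<L0File> selected = new ArrayList<>();
        long totalBytes = 0;
        for (L0File file : l0Files) {
            if (selected.size() >= maxFiles) {
                break; // file count limit reached
            }
            if (!selected.isEmpty() && totalBytes + file.sizeBytes > maxBytes) {
                break; // adding this file would exceed the size limit
            }
            selected.add(file);
            totalBytes += file.sizeBytes;
        }
        return selected;
    }

    public static void main(String[] args) {
        List<L0File> l0 = List.of(
            new L0File("f1", 40), new L0File("f2", 40), new L0File("f3", 40));
        // With a 100-byte limit only the first two files fit in one compaction.
        for (L0File f : selectBounded(l0, 10, 100)) {
            System.out.println(f.name);
        }
    }
}
```

With limits like these, each L0 compaction has a bounded size, so a large 
backlog would be worked off in several shorter compactions rather than one 
that takes all L0 files at once.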



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
