[ 
https://issues.apache.org/jira/browse/HBASE-14735?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15029342#comment-15029342
 ] 

Shuaifeng Zhou commented on HBASE-14735:
----------------------------------------

Thanks a lot for the explanation, [~stack].
We hit this problem. The huge region cannot be compacted down to a few files 
because the input load is high, and since it cannot be split, the input load 
always stays on that region, so the situation gets worse and worse.
If the region is split in two, the input load is split and balanced across the 
two children.
What you worry about in the patch is reasonable; we also hit the reference file 
problem. After we applied the patch on our cluster, the huge region still could 
not be split because there was a reference file that, for some reason, was 
never selected for compaction, and we sent a major compaction request to solve 
the problem. The patch may not cure a region that is already huge, but it can 
prevent one.
In the patch, we respect the rule that compaction comes first, but give split a 
chance if the region is too big. 
If a region splits before it grows too big, compacting the children should be 
easier, and the reference files can be cleaned up in time, before the children 
grow too big. 
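
As a rough sketch of that ordering (this is not the attached patch; the class, 
enum, and parameter names below are hypothetical and are not HBase APIs), the 
decision after a compaction completes could look like this, assuming we know 
whether the "Recursive enqueue" request really produced a compaction runner:

```java
// Illustrative only: a standalone model of the decision, not HBase code.
final class PostCompactionSketch {

    enum Action { RECURSIVE_COMPACTION, SPLIT_REQUESTED, NOTHING }

    /**
     * @param compactPriority    store priority; <= 0 means the store is blocked
     * @param compactionQueued   true if the recursive enqueue built a runner
     *                           (false models the "returned null" case)
     * @param regionSizeBytes    current region size (assumed input)
     * @param maxRegionSizeBytes split threshold (assumed input)
     */
    static Action afterCompaction(int compactPriority, boolean compactionQueued,
                                  long regionSizeBytes, long maxRegionSizeBytes) {
        // Compaction still comes first, but only if a compaction was really queued.
        if (compactPriority <= 0 && compactionQueued) {
            return Action.RECURSIVE_COMPACTION;
        }
        // Otherwise give split a chance when the region is too big.
        if (regionSizeBytes > maxRegionSizeBytes) {
            return Action.SPLIT_REQUESTED;
        }
        return Action.NOTHING;
    }
}
```

With this ordering, a blocked store whose recursive request selects nothing 
still reaches the split check instead of silently doing nothing.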

> Region may grow too big and can not be split
> --------------------------------------------
>
>                 Key: HBASE-14735
>                 URL: https://issues.apache.org/jira/browse/HBASE-14735
>             Project: HBase
>          Issue Type: Bug
>          Components: Compaction, regionserver
>    Affects Versions: 1.1.2, 0.98.15
>            Reporter: Shuaifeng Zhou
>            Assignee: Shuaifeng Zhou
>         Attachments: 14735-0.98.patch, 14735-branch-1.1.patch, 
> 14735-branch-1.2.patch, 14735-branch-1.2.patch, 14735-master (2).patch, 
> 14735-master.patch, 14735-master.patch
>
>
> When a compaction completes, there may still be many storefiles in the store. 
> If CompactPriority <= 0, compactSplitThread then issues a "Recursive 
> enqueue" compaction request instead of requesting a split:
> {code:title=CompactSplitThread.java|borderStyle=solid}
>         if (completed) {
>           // degenerate case: blocked regions require recursive enqueues
>           if (store.getCompactPriority() <= 0) {
>             requestSystemCompaction(region, store, "Recursive enqueue");
>           } else {
>             // see if the compaction has caused us to exceed max region size
>             requestSplit(region);
>           }
>         }
> {code}
> But in some situations, the "recursive enqueue" request may return null and 
> not build a new compaction runner. For example, if another compaction of the 
> same region is running, compaction selection excludes all files older than 
> the newest file currently being compacted, so the "recursive enqueue" request 
> may not find enough files to select. When this happens, a split is not 
> triggered. If the input load is high enough, compactions are always running 
> on the region, and a split is never triggered.
> In our cluster, this situation occurred, and a huge region of more than 400GB 
> with 100+ storefiles appeared. The version is 0.98.10, and trunk also has the 
> problem.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
