[ 
https://issues.apache.org/jira/browse/HBASE-14735?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15031504#comment-15031504
 ] 

Shuaifeng Zhou commented on HBASE-14735:
----------------------------------------

Hi, [~stack]
We are running with this patch applied on our clusters. We have many clusters, 
some on 0.94 and some on 0.98; an upgrade is in progress but not finished. The 
patch really works. 
The reference problem does not occur in 0.98, but it does in 0.94: in 0.98, if 
a store contains any reference, the compaction is forced to be a major 
compaction, while 0.94 does not do this. Both versions have the huge-region 
problem. Because 0.94 is old and is going to be upgraded, I haven't provided a 
patch for 0.94.
Below are parts of the du and lsr output for one example on 0.94. The 
configured region size is 40GB. Even after splitting once, there is still a 
huge 200G+ region containing a file of more than 100G that is always selected 
during compaction, and the region still has 2 references after several 
compactions.
du:
{noformat}
32796614610   hdfs://hm101:9000/hbase/TAB_INTERESTING/effa8658177d023f4001b5d169bca149
24719467342   hdfs://hm101:9000/hbase/TAB_INTERESTING/f0819cb446cbdf785fb85638553605c5
210031594622  hdfs://hm101:9000/hbase/TAB_INTERESTING/f0c180d817bd74e1743c56f6478ac019
40210595441   hdfs://hm101:9000/hbase/TAB_INTERESTING/f0e08b1a4b1169a7b1f537c068a577bb
50824015435   hdfs://hm101:9000/hbase/TAB_INTERESTING/f0e710bb05dbc394d11524fa6dc34016
21566277612   hdfs://hm101:9000/hbase/TAB_INTERESTING/f11affc0f157e8f4cacce13c6faefe52
{noformat}
lsr:
{noformat}
-rw-r--r--   2 root supergroup   4181396311 2015-11-23 09:48 /hbase/TAB_INTERESTING/f0c180d817bd74e1743c56f6478ac019/F/1f5b326bbbe64b178ce98783fe8223af
-rw-r--r--   2 root supergroup   4128995550 2015-11-23 10:03 /hbase/TAB_INTERESTING/f0c180d817bd74e1743c56f6478ac019/F/29289c0ea8a746a284d585a928611d65
-rw-r--r--   2 root supergroup   4137771163 2015-11-22 08:05 /hbase/TAB_INTERESTING/f0c180d817bd74e1743c56f6478ac019/F/3458f12dd5d842fa8629ade59fbc5443
-rw-r--r--   2 root supergroup   4122308215 2015-11-23 10:08 /hbase/TAB_INTERESTING/f0c180d817bd74e1743c56f6478ac019/F/4ecdd970f2d845d680d5273b13a4d463
-rw-r--r--   2 root supergroup           74 2015-11-22 01:34 /hbase/TAB_INTERESTING/f0c180d817bd74e1743c56f6478ac019/F/5b4fa0edcd37427cadc50602b0a0758a.78b89f6a03d5e5f61e7e49b2cb1bb0a8
-rw-r--r--   2 root supergroup 122997494766 2015-11-22 22:22 /hbase/TAB_INTERESTING/f0c180d817bd74e1743c56f6478ac019/F/6517dbf39bd449c1ae97cdcc0f341100
-rw-r--r--   2 root supergroup   4121185787 2015-11-22 07:57 /hbase/TAB_INTERESTING/f0c180d817bd74e1743c56f6478ac019/F/72c864a6b36148b98a26f6e9fd52e89c
-rw-r--r--   2 root supergroup   4131467137 2015-11-23 09:58 /hbase/TAB_INTERESTING/f0c180d817bd74e1743c56f6478ac019/F/74ff9ced889f43839fa520dcaba1744a
-rw-r--r--   2 root supergroup   1963236714 2015-11-23 10:34 /hbase/TAB_INTERESTING/f0c180d817bd74e1743c56f6478ac019/F/75e5ab25679e4bc6bc7490577e90b166
-rw-r--r--   2 root supergroup   4141563183 2015-11-23 09:54 /hbase/TAB_INTERESTING/f0c180d817bd74e1743c56f6478ac019/F/7c5d0db31f92424fa06e6070dc4d0817
{noformat}
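
To make the comparison against the configured region size explicit, here is a 
minimal sketch that parses the du listing and reports the regions above the 
limit. It rests on assumptions: "40GB" is read as 40 GiB (40 * 1024^3 bytes), 
the class and method names are made up for illustration, and the du total of a 
region directory only approximates the store size that the split check really 
looks at.

```java
import java.util.ArrayList;
import java.util.List;

class OversizedRegions {
    // Assumption: the configured 40GB region size means 40 GiB.
    static final long MAX_REGION_BYTES = 40L * 1024 * 1024 * 1024;

    // du output from this comment, one "<bytes> <region directory>" per line.
    static final String DU_OUTPUT = String.join("\n",
        "32796614610 hdfs://hm101:9000/hbase/TAB_INTERESTING/effa8658177d023f4001b5d169bca149",
        "24719467342 hdfs://hm101:9000/hbase/TAB_INTERESTING/f0819cb446cbdf785fb85638553605c5",
        "210031594622 hdfs://hm101:9000/hbase/TAB_INTERESTING/f0c180d817bd74e1743c56f6478ac019",
        "40210595441 hdfs://hm101:9000/hbase/TAB_INTERESTING/f0e08b1a4b1169a7b1f537c068a577bb",
        "50824015435 hdfs://hm101:9000/hbase/TAB_INTERESTING/f0e710bb05dbc394d11524fa6dc34016",
        "21566277612 hdfs://hm101:9000/hbase/TAB_INTERESTING/f11affc0f157e8f4cacce13c6faefe52");

    /** Returns the encoded names of regions whose du size exceeds limitBytes. */
    static List<String> oversized(String duOutput, long limitBytes) {
        List<String> result = new ArrayList<>();
        for (String line : duOutput.split("\n")) {
            String[] parts = line.trim().split("\\s+");
            if (parts.length != 2) {
                continue; // skip anything that is not "<bytes> <path>"
            }
            long bytes = Long.parseLong(parts[0]);
            if (bytes > limitBytes) {
                String path = parts[1];
                // The last path component is the encoded region name.
                result.add(path.substring(path.lastIndexOf('/') + 1));
            }
        }
        return result;
    }

    public static void main(String[] args) {
        for (String region : oversized(DU_OUTPUT, MAX_REGION_BYTES)) {
            System.out.println(region);
        }
    }
}
```

Under the 40 GiB assumption this flags f0c180d817bd74e1743c56f6478ac019 (about 
4.9 times the limit) and f0e710bb05dbc394d11524fa6dc34016. If 40GB is taken as 
40 * 10^9 bytes instead, f0e08b1a4b1169a7b1f537c068a577bb is over the limit as 
well.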

> Region may grow too big and can not be split
> --------------------------------------------
>
>                 Key: HBASE-14735
>                 URL: https://issues.apache.org/jira/browse/HBASE-14735
>             Project: HBase
>          Issue Type: Bug
>          Components: Compaction, regionserver
>    Affects Versions: 1.1.2, 0.98.15
>            Reporter: Shuaifeng Zhou
>            Assignee: Shuaifeng Zhou
>         Attachments: 14735-0.98.patch, 14735-branch-1.1.patch, 
> 14735-branch-1.2.patch, 14735-branch-1.2.patch, 14735-master (2).patch, 
> 14735-master.patch, 14735-master.patch
>
>
> When a compaction completes, there may still be many storefiles in the store. 
> If CompactPriority <= 0, CompactSplitThread issues a "Recursive enqueue" 
> compaction request instead of requesting a split:
> {code:title=CompactSplitThread.java|borderStyle=solid}
>         if (completed) {
>           // degenerate case: blocked regions require recursive enqueues
>           if (store.getCompactPriority() <= 0) {
>             requestSystemCompaction(region, store, "Recursive enqueue");
>           } else {
>             // see if the compaction has caused us to exceed max region size
>             requestSplit(region);
>           }
> {code}
> But in some situations the "Recursive enqueue" request may return null and 
> not build a new compaction runner. For example, when another compaction of 
> the same region is running, compaction selection excludes all files older 
> than the newest file currently being compacted, so the "Recursive enqueue" 
> request may not find enough files to select. When this happens, no split is 
> triggered. If the input load is high enough, compactions are always running 
> on the region, and a split is never triggered.
> In our cluster this situation happened, and a huge region of more than 400GB 
> with 100+ storefiles appeared. The version is 0.98.10, and trunk also has the 
> problem.
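
The failure mode described in the quoted issue can be modelled in a few lines. 
This is a toy sketch, not HBase code: StoreView, FixedStore, Action and both 
methods are hypothetical names. afterCompaction mirrors the quoted branch, 
where a store with priority <= 0 only ever asks for another compaction. 
afterCompactionWithFallback shows one possible way to avoid the dead end, 
which is not necessarily what the attached patches do.

```java
class RecursiveEnqueueSketch {
    /** Hypothetical stand-in for a store; not the HBase Store interface. */
    interface StoreView {
        int compactPriority();
        // False when a running compaction leaves too few files to select.
        boolean canSelectFiles();
    }

    /** Simple immutable StoreView for experiments. */
    static final class FixedStore implements StoreView {
        private final int priority;
        private final boolean selectable;

        FixedStore(int priority, boolean selectable) {
            this.priority = priority;
            this.selectable = selectable;
        }

        public int compactPriority() { return priority; }
        public boolean canSelectFiles() { return selectable; }
    }

    enum Action { COMPACTION_QUEUED, SPLIT_REQUESTED, NOTHING }

    /** Models the quoted branch: priority <= 0 never reaches the split request. */
    static Action afterCompaction(StoreView store) {
        if (store.compactPriority() <= 0) {
            // The "Recursive enqueue" request; it yields nothing if no files are selected.
            return store.canSelectFiles() ? Action.COMPACTION_QUEUED : Action.NOTHING;
        }
        return Action.SPLIT_REQUESTED;
    }

    /** Assumed mitigation: request a split when the recursive enqueue selects nothing. */
    static Action afterCompactionWithFallback(StoreView store) {
        Action action = afterCompaction(store);
        return action == Action.NOTHING ? Action.SPLIT_REQUESTED : action;
    }

    public static void main(String[] args) {
        StoreView blocked = new FixedStore(-1, false);
        System.out.println("quoted logic:  " + afterCompaction(blocked));
        System.out.println("with fallback: " + afterCompactionWithFallback(blocked));
    }
}
```

In this model a blocked store (priority <= 0, nothing selectable) ends up with 
neither a queued compaction nor a split request under the quoted logic, which 
is the state that lets a region keep growing while other compactions run.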



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
