[
https://issues.apache.org/jira/browse/HBASE-14735?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15031504#comment-15031504
]
Shuaifeng Zhou commented on HBASE-14735:
----------------------------------------
Hi, [~stack]
We are running with this patch applied on our clusters. We have many clusters,
some on 0.94 and some on 0.98. We are upgrading them at the moment, and the
upgrade is not finished yet. This patch really works.
The 0.98 version has no reference problem, but 0.94 does: in 0.98, if a store
has any reference files, the compaction is forced to be a major one, while in
0.94 it is not. Both versions have the huge region problem. Because 0.94 is too
old and is going to be upgraded, I haven't provided a patch for 0.94.
Below are some of the du and lsr results for one example on 0.94 (after
splitting once, it still has a 200G+ huge region with a file of more than 100G
that is nevertheless always being selected during compaction, and it also still
has 2 references after several compactions). The configured region size is 40GB.
du:
{noformat}
32796614610   hdfs://hm101:9000/hbase/TAB_INTERESTING/effa8658177d023f4001b5d169bca149
24719467342   hdfs://hm101:9000/hbase/TAB_INTERESTING/f0819cb446cbdf785fb85638553605c5
210031594622  hdfs://hm101:9000/hbase/TAB_INTERESTING/f0c180d817bd74e1743c56f6478ac019
40210595441   hdfs://hm101:9000/hbase/TAB_INTERESTING/f0e08b1a4b1169a7b1f537c068a577bb
50824015435   hdfs://hm101:9000/hbase/TAB_INTERESTING/f0e710bb05dbc394d11524fa6dc34016
21566277612   hdfs://hm101:9000/hbase/TAB_INTERESTING/f11affc0f157e8f4cacce13c6faefe52
{noformat}
lsr:
{noformat}
-rw-r--r-- 2 root supergroup 4181396311 2015-11-23 09:48 /hbase/TAB_INTERESTING/f0c180d817bd74e1743c56f6478ac019/F/1f5b326bbbe64b178ce98783fe8223af
-rw-r--r-- 2 root supergroup 4128995550 2015-11-23 10:03 /hbase/TAB_INTERESTING/f0c180d817bd74e1743c56f6478ac019/F/29289c0ea8a746a284d585a928611d65
-rw-r--r-- 2 root supergroup 4137771163 2015-11-22 08:05 /hbase/TAB_INTERESTING/f0c180d817bd74e1743c56f6478ac019/F/3458f12dd5d842fa8629ade59fbc5443
-rw-r--r-- 2 root supergroup 4122308215 2015-11-23 10:08 /hbase/TAB_INTERESTING/f0c180d817bd74e1743c56f6478ac019/F/4ecdd970f2d845d680d5273b13a4d463
-rw-r--r-- 2 root supergroup 74 2015-11-22 01:34 /hbase/TAB_INTERESTING/f0c180d817bd74e1743c56f6478ac019/F/5b4fa0edcd37427cadc50602b0a0758a.78b89f6a03d5e5f61e7e49b2cb1bb0a8
-rw-r--r-- 2 root supergroup 122997494766 2015-11-22 22:22 /hbase/TAB_INTERESTING/f0c180d817bd74e1743c56f6478ac019/F/6517dbf39bd449c1ae97cdcc0f341100
-rw-r--r-- 2 root supergroup 4121185787 2015-11-22 07:57 /hbase/TAB_INTERESTING/f0c180d817bd74e1743c56f6478ac019/F/72c864a6b36148b98a26f6e9fd52e89c
-rw-r--r-- 2 root supergroup 4131467137 2015-11-23 09:58 /hbase/TAB_INTERESTING/f0c180d817bd74e1743c56f6478ac019/F/74ff9ced889f43839fa520dcaba1744a
-rw-r--r-- 2 root supergroup 1963236714 2015-11-23 10:34 /hbase/TAB_INTERESTING/f0c180d817bd74e1743c56f6478ac019/F/75e5ab25679e4bc6bc7490577e90b166
-rw-r--r-- 2 root supergroup 4141563183 2015-11-23 09:54 /hbase/TAB_INTERESTING/f0c180d817bd74e1743c56f6478ac019/F/7c5d0db31f92424fa06e6070dc4d0817
{noformat}
> Region may grow too big and can not be split
> --------------------------------------------
>
> Key: HBASE-14735
> URL: https://issues.apache.org/jira/browse/HBASE-14735
> Project: HBase
> Issue Type: Bug
> Components: Compaction, regionserver
> Affects Versions: 1.1.2, 0.98.15
> Reporter: Shuaifeng Zhou
> Assignee: Shuaifeng Zhou
> Attachments: 14735-0.98.patch, 14735-branch-1.1.patch,
> 14735-branch-1.2.patch, 14735-branch-1.2.patch, 14735-master (2).patch,
> 14735-master.patch, 14735-master.patch
>
>
> When a compaction completes, there may still be many storefiles in the store.
> If CompactPriority <= 0, compactSplitThread then issues a "Recursive enqueue"
> compaction request instead of requesting a split:
> {code:title=CompactSplitThread.java|borderStyle=solid}
> if (completed) {
>   // degenerate case: blocked regions require recursive enqueues
>   if (store.getCompactPriority() <= 0) {
>     requestSystemCompaction(region, store, "Recursive enqueue");
>   } else {
>     // see if the compaction has caused us to exceed max region size
>     requestSplit(region);
>   }
> }
> {code}
> But in some situations the "recursive enqueue" request may return null and
> not build a new compaction runner. For example, when another compaction of
> the same region is running, compaction selection excludes all files older
> than the newest file currently being compacted, so the "recursive enqueue"
> request may not find enough files to select. When this happens, the split is
> not triggered. If the input load is high enough, compactions are always
> running on the region and a split is never triggered.
> In our cluster this situation happened, and a huge region of more than 400GB
> with 100+ storefiles appeared. The version is 0.98.10, and trunk also has the
> problem.
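
The control flow in the description above can be modelled in a few lines of
plain Java. This is only a sketch under stated assumptions: the class, the
StoreState fields, the MIN_FILES_TO_COMPACT threshold and the 40 GiB limit are
hypothetical stand-ins, not HBase classes or settings, and the fallback shown
is one possible way to reach the split check, not necessarily what the attached
patches do.

```java
import java.util.Optional;

// Hypothetical, simplified model of the post-compaction decision.
// None of these types are HBase classes.
final class RecursiveEnqueueSketch {

    enum Action { COMPACTION_QUEUED, SPLIT_REQUESTED, NOTHING }

    /** Minimal stand-in for a store right after a compaction has completed. */
    static final class StoreState {
        final int compactPriority;   // <= 0 models a "blocked" store
        final int selectableFiles;   // files not excluded by a running compaction
        final long regionSizeBytes;

        StoreState(int compactPriority, int selectableFiles, long regionSizeBytes) {
            this.compactPriority = compactPriority;
            this.selectableFiles = selectableFiles;
            this.regionSizeBytes = regionSizeBytes;
        }
    }

    // Assumed minimum number of files a compaction must select.
    static final int MIN_FILES_TO_COMPACT = 3;

    /** Empty result models the "request returns null" case from the description. */
    static Optional<String> requestSystemCompaction(StoreState s) {
        return s.selectableFiles >= MIN_FILES_TO_COMPACT
                ? Optional.of("Recursive enqueue")
                : Optional.empty();
    }

    static boolean needsSplit(StoreState s, long maxRegionSizeBytes) {
        return s.regionSizeBytes > maxRegionSizeBytes;
    }

    /** Flow as described in the issue: a blocked store never reaches the split check. */
    static Action originalFlow(StoreState s, long maxRegionSizeBytes) {
        if (s.compactPriority <= 0) {
            return requestSystemCompaction(s).isPresent()
                    ? Action.COMPACTION_QUEUED
                    : Action.NOTHING;
        }
        return needsSplit(s, maxRegionSizeBytes) ? Action.SPLIT_REQUESTED : Action.NOTHING;
    }

    /** Possible fallback: run the split check when the recursive enqueue selects nothing. */
    static Action fallbackFlow(StoreState s, long maxRegionSizeBytes) {
        if (s.compactPriority <= 0 && requestSystemCompaction(s).isPresent()) {
            return Action.COMPACTION_QUEUED;
        }
        return needsSplit(s, maxRegionSizeBytes) ? Action.SPLIT_REQUESTED : Action.NOTHING;
    }

    public static void main(String[] args) {
        long maxRegionSize = 40L * 1024 * 1024 * 1024; // assumed 40 GiB limit
        // Blocked store, only one selectable file, region far above the limit.
        StoreState blocked = new StoreState(-1, 1, 210031594622L);
        System.out.println(originalFlow(blocked, maxRegionSize)); // NOTHING
        System.out.println(fallbackFlow(blocked, maxRegionSize)); // SPLIT_REQUESTED
    }
}
```

With a blocked store that has too few selectable files, the original flow does
nothing at all, which matches the "split is never triggered" symptom, while the
fallback flow still reaches the size check.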