[
https://issues.apache.org/jira/browse/HBASE-14735?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14989897#comment-14989897
]
stack commented on HBASE-14735:
-------------------------------
FYI, the build passed. It found a zombie and tried to get a stack trace on it
but got this:
Suspicious java process found - waiting 30s to see if there are just slow to
stop
There appear to be 1 zombie tests, they should have been killed by surefire but
survived
************ BEGIN zombies jstack extract
8270: Unable to open socket file: target process not responding or HotSpot VM
not loaded
The -F option can be used when the target process is not responding
************ END zombies jstack extract
Which is interesting. Could be an actual, real, live zombie... all hung up so
can't even take a signal.
> Region may grow too big and can not be split
> --------------------------------------------
>
> Key: HBASE-14735
> URL: https://issues.apache.org/jira/browse/HBASE-14735
> Project: HBase
> Issue Type: Bug
> Components: Compaction, regionserver
> Affects Versions: 1.1.2, 0.98.15
> Reporter: Shuaifeng Zhou
> Assignee: Shuaifeng Zhou
> Attachments: 14735-0.98.patch, 14735-branch-1.1.patch,
> 14735-branch-1.2.patch, 14735-master (2).patch, 14735-master.patch,
> 14735-master.patch
>
>
> When a compaction completes, there may still be many storefiles in the store.
> If CompactPriority <= 0, compactSplitThread issues a "Recursive enqueue"
> compaction request instead of requesting a split:
> {code:title=CompactSplitThread.java|borderStyle=solid}
> if (completed) {
>   // degenerate case: blocked regions require recursive enqueues
>   if (store.getCompactPriority() <= 0) {
>     requestSystemCompaction(region, store, "Recursive enqueue");
>   } else {
>     // see if the compaction has caused us to exceed max region size
>     requestSplit(region);
>   }
> }
> {code}
> But in some situations the "recursive enqueue" request may return null and
> not build a new compaction runner. For example, if another compaction of the
> same region is running, compaction selection excludes all files older than
> the newest file currently being compacted, so the "recursive enqueue" request
> may not find enough files to select. When this happens, a split is not
> triggered. If the input load is high enough, compactions are always running
> on the region and a split is never triggered.
> In our cluster this situation occurred, and a huge region of more than 400GB
> with 100+ storefiles appeared. The version is 0.98.10, and trunk also has the
> problem.
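
The failure mode in the quoted description can be sketched in isolation. This is a minimal model, not the attached patch and not HBase code: the class, the enum, and the parameters `selectableFiles` and `minFilesToCompact` are hypothetical stand-ins. It assumes one possible remedy, namely falling through to the split check whenever the recursive enqueue produces no compaction.

```java
/** Hypothetical model of the post-compaction decision; not actual HBase code. */
final class RecursiveEnqueueSketch {

    /** What the sketch decides to do once a compaction has completed. */
    enum Action { COMPACT_AGAIN, SPLIT_CHECK }

    /**
     * @param compactPriority   stand-in for store.getCompactPriority(); <= 0 means the store is blocked
     * @param selectableFiles   files not excluded by a compaction already running on the region (assumed input)
     * @param minFilesToCompact minimum number of files a compaction needs (assumed input)
     */
    static Action afterCompaction(int compactPriority, int selectableFiles, int minFilesToCompact) {
        if (compactPriority <= 0) {
            // Models the recursive enqueue: it only yields a compaction when
            // enough files can be selected; otherwise the real request returns null.
            boolean enqueued = selectableFiles >= minFilesToCompact;
            if (enqueued) {
                return Action.COMPACT_AGAIN;
            }
            // Assumed remedy: fall through to the split check instead of doing nothing.
        }
        return Action.SPLIT_CHECK;
    }

    public static void main(String[] args) {
        System.out.println(afterCompaction(-5, 10, 3)); // blocked store, enough files
        System.out.println(afterCompaction(-5, 1, 3));  // blocked store, too few files
        System.out.println(afterCompaction(2, 1, 3));   // store not blocked
    }
}
```

In the behavior the issue reports, the second case would result in neither a compaction nor a split check, which is how the region keeps growing under sustained write load.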