[ https://issues.apache.org/jira/browse/HBASE-19794?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16329492#comment-16329492 ]
stack edited comment on HBASE-19794 at 1/17/18 9:44 PM:
--------------------------------------------------------
I can't make this hang locally or on a test machine. I see it failing 16% of
the time according to
[https://builds.apache.org/view/H-L/view/HBase/job/HBase-Find-Flaky-Tests-branch2.0/lastSuccessfulBuild/artifact/dashboard.html]
It's a timeout.
Log has loads of threads hanging out. Some Proc workers blocked:
Thread 2268 (RS_CLOSE_REGION-asf903:58756-1):
  State: BLOCKED
  Blocked count: 12
  Waited count: 17
  Blocked on org.apache.hadoop.hbase.master.assignment.RegionStates$RegionStateNode@1c0991d8
  Blocked by 2083 (ProcExecWrkr-6)
  Stack:
    org.apache.hadoop.hbase.master.assignment.AssignmentManager.reportTransition(AssignmentManager.java:869)
    org.apache.hadoop.hbase.master.assignment.AssignmentManager.updateRegionTransition(AssignmentManager.java:857)
    org.apache.hadoop.hbase.master.assignment.AssignmentManager.reportRegionStateTransition(AssignmentManager.java:801)
    org.apache.hadoop.hbase.master.MasterRpcServices.reportRegionStateTransition(MasterRpcServices.java:1561)
    org.apache.hadoop.hbase.regionserver.HRegionServer.reportRegionStateTransition(HRegionServer.java:2263)
    org.apache.hadoop.hbase.regionserver.handler.CloseRegionHandler.process(CloseRegionHandler.java:121)
    org.apache.hadoop.hbase.executor.EventHandler.run(EventHandler.java:104)
    java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
    java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
    java.lang.Thread.run(Thread.java:748)

Thread 2267 (RS_CLOSE_REGION-asf903:58756-0):
  State: BLOCKED
  Blocked count: 14
  Waited count: 17
  Blocked on org.apache.hadoop.hbase.master.assignment.RegionStates$RegionStateNode@75cdbae3
  Blocked by 2086 (ProcExecWrkr-9)
  Stack:
    org.apache.hadoop.hbase.master.assignment.AssignmentManager.reportTransition(AssignmentManager.java:869)
    org.apache.hadoop.hbase.master.assignment.AssignmentManager.updateRegionTransition(AssignmentManager.java:857)
    org.apache.hadoop.hbase.master.assignment.AssignmentManager.reportRegionStateTransition(AssignmentManager.java:801)
    org.apache.hadoop.hbase.master.MasterRpcServices.reportRegionStateTransition(MasterRpcServices.java:1561)
    org.apache.hadoop.hbase.regionserver.HRegionServer.reportRegionStateTransition(HRegionServer.java:2263)
    org.apache.hadoop.hbase.regionserver.handler.CloseRegionHandler.process(CloseRegionHandler.java:121)
    org.apache.hadoop.hbase.executor.EventHandler.run(EventHandler.java:104)
    java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
    java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
    java.lang.Thread.run(Thread.java:748)

The Proc Workers are not daemon threads. Let me change that so at least we stop
timing out. See HBASE-19527
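
For anyone following along, here is a rough sketch of why the daemon flag matters for the timeout. It is not the HBase code: DaemonWorkerSketch, newWorker and the "ProcExecWrkr-sketch" thread name are made up for illustration. The only real API it relies on is java.lang.Thread#setDaemon, which must be called before start(). A non-daemon thread that is stuck waiting keeps the JVM alive after main returns; a daemon thread does not.

```java
import java.util.concurrent.CountDownLatch;

public class DaemonWorkerSketch {

    /**
     * Hypothetical stand-in for a procedure worker; NOT HBase's actual class.
     * The returned thread, once started, waits on a latch that is never
     * released, simulating a worker that never finishes.
     */
    static Thread newWorker(String name, boolean daemon, CountDownLatch never) {
        Thread t = new Thread(() -> {
            try {
                never.await(); // stuck until the latch is counted down
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt(); // preserve interrupt status
            }
        }, name);
        t.setDaemon(daemon); // must be set before start()
        return t;
    }

    public static void main(String[] args) {
        CountDownLatch never = new CountDownLatch(1);
        Thread worker = newWorker("ProcExecWrkr-sketch", true, never);
        worker.start();
        System.out.println(worker.getName() + " daemon=" + worker.isDaemon());
        // main returns here. Because the stuck worker is a daemon thread,
        // the JVM is free to exit; with daemon=false it would stay up.
    }
}
```

Note this only lets the process exit; it does nothing about whatever the stuck worker was waiting on.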
> TestZooKeeper hangs
> -------------------
>
> Key: HBASE-19794
> URL: https://issues.apache.org/jira/browse/HBASE-19794
> Project: HBase
> Issue Type: Bug
> Reporter: Duo Zhang
> Assignee: stack
> Priority: Critical
> Fix For: 2.0.0-beta-2
>
> Attachments: org.apache.hadoop.hbase.TestZooKeeper-output.txt
>
>
> Seems like it is TestZKAsyncRegistry that hangs in shutdown.
--
This message was sent by Atlassian JIRA
(v7.6.3#76005)