stack created HBASE-14284:
-----------------------------
Summary: In TRUNK, AsyncRpcClient does not timeout; hangs TestDistributedLogReplay, etc.
Key: HBASE-14284
URL: https://issues.apache.org/jira/browse/HBASE-14284
Project: HBase
Issue Type: Bug
Reporter: stack
Assignee: stack
TestDistributedLogReplay puts up regionservers with *40* priority handlers
each. This makes for TDLR running with many hundreds of threads. Trying to
figure out why 40, I see the test can hang with fewer handlers, with all
client calls stuck and never timing out:
{code}
"RS:2;localhost:58498" prio=5 tid=0x00007fd284d4e800 nid=0x416af in Object.wait() [0x000000012952e000]
   java.lang.Thread.State: TIMED_WAITING (on object monitor)
    at java.lang.Object.wait(Native Method)
    at java.lang.Object.wait(Object.java:461)
    at io.netty.util.concurrent.DefaultPromise.await0(DefaultPromise.java:355)
    - locked <0x00000007dff93ea0> (a org.apache.hadoop.hbase.ipc.AsyncCall)
    at io.netty.util.concurrent.DefaultPromise.await(DefaultPromise.java:266)
    at io.netty.util.concurrent.AbstractFuture.get(AbstractFuture.java:42)
    at org.apache.hadoop.hbase.ipc.AsyncRpcClient.call(AsyncRpcClient.java:231)
    at org.apache.hadoop.hbase.ipc.AbstractRpcClient.callBlockingMethod(AbstractRpcClient.java:214)
    at org.apache.hadoop.hbase.ipc.AbstractRpcClient$BlockingRpcChannelImplementation.callBlockingMethod(AbstractRpcClient.java:288)
    at org.apache.hadoop.hbase.protobuf.generated.RegionServerStatusProtos$RegionServerStatusService$BlockingStub.regionServerReport(RegionServerStatusProtos.java:8994)
    at org.apache.hadoop.hbase.regionserver.HRegionServer.tryRegionServerReport(HRegionServer.java:1148)
    at org.apache.hadoop.hbase.regionserver.HRegionServer.run(HRegionServer.java:957)
    at org.apache.hadoop.hbase.MiniHBaseCluster$MiniHBaseClusterRegionServer.runRegionServer(MiniHBaseCluster.java:156)
    at org.apache.hadoop.hbase.MiniHBaseCluster$MiniHBaseClusterRegionServer.access$000(MiniHBaseCluster.java:108)
    at org.apache.hadoop.hbase.MiniHBaseCluster$MiniHBaseClusterRegionServer$1.run(MiniHBaseCluster.java:140)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:356)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1594)
    at org.apache.hadoop.hbase.security.User$SecureHadoopUser.runAs(User.java:279)
    at org.apache.hadoop.hbase.MiniHBaseCluster$MiniHBaseClusterRegionServer.run(MiniHBaseCluster.java:138)
    at java.lang.Thread.run(Thread.java:744)
{code}
We never recover.
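
For illustration only: the trace goes through the no-argument AbstractFuture.get(), which waits until the promise completes. A minimal sketch of the alternative, a bounded wait on a pending call, follows. It uses only java.util.concurrent so it runs standalone; the class BoundedCallSketch and the helper awaitWithTimeout are hypothetical names, not HBase or Netty code, and this is not necessarily how the issue gets fixed.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

// Hypothetical illustration; not part of HBase.
class BoundedCallSketch {

    /**
     * Waits for a pending call for at most timeoutMs milliseconds.
     * On timeout the call is cancelled and the TimeoutException is
     * rethrown, so the caller can retry or fail instead of hanging.
     */
    static <T> T awaitWithTimeout(Future<T> call, long timeoutMs)
            throws InterruptedException, ExecutionException, TimeoutException {
        try {
            // Bounded wait: returns the result or throws after timeoutMs.
            return call.get(timeoutMs, TimeUnit.MILLISECONDS);
        } catch (TimeoutException e) {
            // Give up on the call so nothing keeps waiting on it.
            call.cancel(true);
            throw e;
        }
    }

    public static void main(String[] args) throws Exception {
        // Stand-in for an RPC whose response never arrives.
        CompletableFuture<String> neverAnswered = new CompletableFuture<>();
        try {
            awaitWithTimeout(neverAnswered, 100);
        } catch (TimeoutException e) {
            System.out.println("call timed out, cancelled=" + neverAnswered.isCancelled());
        }
    }
}
```

With a bounded wait like this, a caller such as the regionserver report loop would get an exception it can handle rather than blocking on a response that never comes.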