[ https://issues.apache.org/jira/browse/HBASE-3621?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13005438#comment-13005438 ]
Jean-Daniel Cryans commented on HBASE-3621:
-------------------------------------------
For example:
{code}
"somenode.prod.twitter.com:60000.timeoutMonitor" daemon prio=10 tid=0x00002aacb8567800 nid=0x772 in Object.wait() [0x0000000045bf1000]
   java.lang.Thread.State: WAITING (on object monitor)
        at java.lang.Object.wait(Native Method)
        at java.lang.Object.wait(Object.java:485)
        at org.apache.hadoop.hbase.ipc.HBaseClient.call(HBaseClient.java:757)
        - locked <0x00002aaab2a10da8> (a org.apache.hadoop.hbase.ipc.HBaseClient$Call)
        at org.apache.hadoop.hbase.ipc.HBaseRPC$Invoker.invoke(HBaseRPC.java:257)
        at $Proxy6.closeRegion(Unknown Source)
        at org.apache.hadoop.hbase.master.ServerManager.sendRegionClose(ServerManager.java:589)
        at org.apache.hadoop.hbase.master.AssignmentManager.unassign(AssignmentManager.java:1093)
        at org.apache.hadoop.hbase.master.AssignmentManager$TimeoutMonitor.chore(AssignmentManager.java:1672)
        - locked <0x00002aaabf759858> (a java.util.concurrent.ConcurrentSkipListMap)
        at org.apache.hadoop.hbase.Chore.run(Chore.java:66)
...
"main-EventThread" daemon prio=10 tid=0x00002aacb850b000 nid=0x761 waiting for monitor entry [0x00000000455eb000]
   java.lang.Thread.State: BLOCKED (on object monitor)
        at org.apache.hadoop.hbase.master.AssignmentManager.nodeDataChanged(AssignmentManager.java:525)
        - waiting to lock <0x00002aaabf759858> (a java.util.concurrent.ConcurrentSkipListMap)
        at org.apache.hadoop.hbase.zookeeper.ZooKeeperWatcher.process(ZooKeeperWatcher.java:268)
        at org.apache.zookeeper.ClientCnxn$EventThread.processEvent(ClientCnxn.java:530)
        at org.apache.zookeeper.ClientCnxn$EventThread.run(ClientCnxn.java:506)
{code}
The ZK event thread is blocked by the timeoutMonitor thread, which holds the
lock on regionsInTransition while it waits on an RPC to a RS that doesn't
answer. All ZK events get severely delayed as a result.
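
One possible shape for a fix, as a minimal sketch and not the actual AssignmentManager code: collect the timed-out regions while holding the lock, release it, and only then do the RPCs. The class TimeoutMonitorSketch, the RegionCloser interface and all method signatures below are hypothetical stand-ins invented for illustration; only the idea of not RPC'ing under synchronized(regionsInTransition) comes from this issue.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentSkipListMap;

/** Hypothetical sketch of a timeout chore that never does an RPC under the RIT lock. */
class TimeoutMonitorSketch {

    /** Hypothetical stand-in for the blocking closeRegion RPC to a region server. */
    interface RegionCloser {
        void closeRegion(String regionName);
    }

    // Region name -> time (ms) at which the region entered transition.
    private final ConcurrentSkipListMap<String, Long> regionsInTransition =
        new ConcurrentSkipListMap<>();
    private final RegionCloser closer;
    private final long timeoutMs;

    TimeoutMonitorSketch(RegionCloser closer, long timeoutMs) {
        this.closer = closer;
        this.timeoutMs = timeoutMs;
    }

    void addRegionInTransition(String regionName, long startedAtMs) {
        synchronized (regionsInTransition) {
            regionsInTransition.put(regionName, startedAtMs);
        }
    }

    /** Roughly what the ZK event thread does: it needs the same lock, but only briefly. */
    void nodeDataChanged(String regionName, long nowMs) {
        synchronized (regionsInTransition) {
            regionsInTransition.replace(regionName, nowMs);
        }
    }

    /** True if the calling thread currently holds the RIT lock (used to check the sketch). */
    boolean holdsRitLock() {
        return Thread.holdsLock(regionsInTransition);
    }

    /** Returns the regions for which a close was sent during this run. */
    List<String> chore(long nowMs) {
        List<String> timedOut = new ArrayList<>();
        // Step 1: under the lock, only decide which regions have timed out.
        synchronized (regionsInTransition) {
            for (Map.Entry<String, Long> e : regionsInTransition.entrySet()) {
                if (nowMs - e.getValue() >= timeoutMs) {
                    timedOut.add(e.getKey());
                }
            }
        }
        // Step 2: the lock is released here, so a region server that does not
        // answer stalls only this thread and not the ZK event thread.
        for (String regionName : timedOut) {
            closer.closeRegion(regionName);
        }
        return timedOut;
    }
}
```

The tradeoff is that a region's state can change between the snapshot and the RPC, so a real fix would have to re-check state or tolerate a stale close request.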
> The timeout handler in AssignmentManager does an RPC while holding lock on RIT; a big no-no
> -------------------------------------------------------------------------------------------
>
> Key: HBASE-3621
> URL: https://issues.apache.org/jira/browse/HBASE-3621
> Project: HBase
> Issue Type: Bug
> Reporter: stack
> Fix For: 0.90.2
>
>
> J-D found this debugging a failure on Dmitriy's cluster; we're RPC'ing under
> a synchronized(regionsInTransition). Fix.