[ 
https://issues.apache.org/jira/browse/HBASE-2966?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12914096#action_12914096
 ] 

Patrick Hunt commented on HBASE-2966:
-------------------------------------

A fix release (3.3.2) is in progress on the ZK dev list if you'd like to 
follow along; hopefully we'll get it out soon. In the meantime, you could try 
the current ZK branch-3.3, which includes fixes for this issue (and others).

> HBase client stuck on 
> org.apache.zookeeper.ClientCnxn.submitRequest(ClientCnxn.java:1278) holding 
> regionLockObject lock
> -----------------------------------------------------------------------------------------------------------------------
>
>                 Key: HBASE-2966
>                 URL: https://issues.apache.org/jira/browse/HBASE-2966
>             Project: HBase
>          Issue Type: Bug
>            Reporter: Kannan Muthukkaruppan
>         Attachments: stack.txt
>
>
> We noticed one case in which the HBase client program got stuck in a 
> ZooKeeper.exists() call.
>  
> One of the threads was stuck here on the ZK call while holding an HBase level 
> lock (regionLockObject in locateRegionInMeta()).
> {code} 
> "thrift-0-thread-8" prio=10 tid=0x00007f189ca4c000 nid=0x550f in Object.wait() [0x0000000044241000]
>    java.lang.Thread.State: WAITING (on object monitor)
>                 at java.lang.Object.wait(Native Method)
>                 at java.lang.Object.wait(Object.java:485)
>                 at org.apache.zookeeper.ClientCnxn.submitRequest(ClientCnxn.java:1278)
>                 - locked <0x00007f1903a0c280> (a org.apache.zookeeper.ClientCnxn$Packet)
>                 at org.apache.zookeeper.ZooKeeper.exists(ZooKeeper.java:804)
>                 at org.apache.zookeeper.ZooKeeper.exists(ZooKeeper.java:837)
>                 at org.apache.hadoop.hbase.zookeeper.ZooKeeperWrapper.getRSDirectoryCount(ZooKeeperWrapper.java:765)
>                 at org.apache.hadoop.hbase.client.HTable.getCurrentNrHRS(HTable.java:173)
>                 at org.apache.hadoop.hbase.client.HTable.<init>(HTable.java:147)
>                 at org.apache.hadoop.hbase.client.MetaScanner.metaScan(MetaScanner.java:124)
>                 at org.apache.hadoop.hbase.client.MetaScanner.metaScan(MetaScanner.java:89)
>                 at org.apache.hadoop.hbase.client.HConnectionManager$TableServers.prefetchRegionCache(HConnectionManager.java:734)
>                 at org.apache.hadoop.hbase.client.HConnectionManager$TableServers.locateRegionInMeta(HConnectionManager.java:785)
>                 - locked <0x00007f190d868848> (a java.lang.Object)
>                 at org.apache.hadoop.hbase.client.HConnectionManager$TableServers.locateRegion(HConnectionManager.java:679)
>                 at org.apache.hadoop.hbase.client.HConnectionManager$TableServers.locateRegion(HConnectionManager.java:646)
>                 at org.apache.hadoop.hbase.client.HConnectionManager$TableServers.getRegionLocation(HConnectionManager.java:472)
>                 at org.apache.hadoop.hbase.client.ServerCallable.instantiateServer(ServerCallable.java:57)
>                 at org.apache.hadoop.hbase.client.HConnectionManager$TableServers.getRegionServerWithRetries(HConnectionManager.java:1147)
>                 at org.apache.hadoop.hbase.client.HTable.get(HTable.java:503)
> {code} 
> The remaining threads are all waiting on the regionLockObject lock 
> (held by the above thread), with stacks like:
>  
> {code}
> "thrift-0-thread-7" prio=10 tid=0x00007f189ca4a800 nid=0x550e waiting for monitor entry [0x0000000044141000]
>    java.lang.Thread.State: BLOCKED (on object monitor)
>                 at org.apache.hadoop.hbase.client.HConnectionManager$TableServers.locateRegionInMeta(HConnectionManager.java:783)
>                 - waiting to lock <0x00007f190d868848> (a java.lang.Object)
>                 at org.apache.hadoop.hbase.client.HConnectionManager$TableServers.locateRegion(HConnectionManager.java:679)
>                 at org.apache.hadoop.hbase.client.HConnectionManager$TableServers.locateRegion(HConnectionManager.java:646)
>                 at org.apache.hadoop.hbase.client.HConnectionManager$TableServers.getRegionLocation(HConnectionManager.java:472)
>                 at org.apache.hadoop.hbase.client.ServerCallable.instantiateServer(ServerCallable.java:57)
>                 at org.apache.hadoop.hbase.client.HConnectionManager$TableServers.getRegionServerWithRetries(HConnectionManager.java:1147)
>                 at org.apache.hadoop.hbase.client.HTable.get(HTable.java:503)
> {code}
> Any ideas?
>  
> Meanwhile, I'll look into the ZK logs from the relevant time some more and 
> get back if I have more information.
>  
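
The two stacks in the issue description show one thread waiting in
Object.wait() for a ZooKeeper reply while holding regionLockObject, and the
other threads blocked on that same monitor. As a rough illustration of why a
bounded wait limits the damage, here is a minimal Java sketch. It is not HBase
or ZooKeeper code: the class BoundedWaitUnderLock, the method lookup, and the
use of a CountDownLatch to stand in for the pending reply are all hypothetical
names chosen for this example, and it does not claim to be how the fix on
branch-3.3 works.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;

// Hypothetical illustration only; not HBase or ZooKeeper source.
class BoundedWaitUnderLock {
    // Stands in for regionLockObject: every lookup serializes on it.
    private static final Object regionLock = new Object();

    // Waits for a reply while holding the shared lock, but gives up after
    // timeoutMs so the lock is released even if the reply never arrives.
    // Returns true if the reply arrived, false if the wait timed out.
    static boolean lookup(CountDownLatch reply, long timeoutMs)
            throws InterruptedException {
        synchronized (regionLock) {
            return reply.await(timeoutMs, TimeUnit.MILLISECONDS);
        }
    }

    public static void main(String[] args) throws Exception {
        // A reply that is never delivered (the latch is never counted down).
        CountDownLatch lostReply = new CountDownLatch(1);

        // First caller: holds the lock while waiting on the lost reply.
        Thread first = new Thread(() -> {
            try {
                lookup(lostReply, 200);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });
        first.start();

        // Second caller: may queue behind the first one, but only for a
        // bounded time, because the first wait cannot outlive its timeout.
        boolean arrived = lookup(new CountDownLatch(0), 200);
        first.join();
        System.out.println("second lookup completed, reply arrived: " + arrived);
    }
}
```

With an unbounded wait in place of the timed await, the first caller would
never release the lock and every later caller would stay BLOCKED, which is the
shape of the thread dump above. A timeout only limits how long other threads
are held up; it does not explain why the reply was lost.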

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
