[ 
https://issues.apache.org/jira/browse/HBASE-3654?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13007350#comment-13007350
 ] 

Jean-Daniel Cryans commented on HBASE-3654:
-------------------------------------------

Makes sense, as always Todd :)

I'm debugging an MR job that does a huge number of Gets in each task. We
couldn't explain the slowness (the jstacks show the Get on the client side but
never on the RS), so we measured the latency of each query. It turns out 99% of
the Gets finish in 0-1ms and the rest jump to 170ms. I suspect that's due to
this contention.
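
For anyone wanting to reproduce that kind of measurement, here is a minimal sketch of per-call latency bucketing. It is not the code used in the job: the class name `GetLatencyHistogram`, the bucket bounds, and the method names are all made up for illustration.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.atomic.AtomicLongArray;

public class GetLatencyHistogram {
    // Hypothetical bucket upper bounds in milliseconds; the last bucket is open-ended.
    private static final long[] BOUNDS_MS = {1, 10, 100, 1000};
    private final AtomicLongArray counts = new AtomicLongArray(BOUNDS_MS.length + 1);

    /** Runs the call, records its wall-clock latency, and returns its result. */
    public <T> T time(Callable<T> call) throws Exception {
        long start = System.nanoTime();
        try {
            return call.call();
        } finally {
            record((System.nanoTime() - start) / 1_000_000L);
        }
    }

    /** Adds one observation, in milliseconds, to the matching bucket. */
    public void record(long elapsedMs) {
        counts.incrementAndGet(bucketFor(elapsedMs));
    }

    /** Returns the index of the first bucket whose upper bound covers elapsedMs. */
    static int bucketFor(long elapsedMs) {
        for (int i = 0; i < BOUNDS_MS.length; i++) {
            if (elapsedMs <= BOUNDS_MS[i]) {
                return i;
            }
        }
        return BOUNDS_MS.length;
    }

    /** Number of observations recorded in the given bucket. */
    public long count(int bucket) {
        return counts.get(bucket);
    }
}
```

Each Get would be wrapped in `time(...)`, and the bucket counts dumped at the end of the task. A split like the one above (almost everything in the first bucket, the rest well beyond 100ms) would then show up directly in the counts.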

> Weird blocking between getOnlineRegion and createRegionLoad
> -----------------------------------------------------------
>
>                 Key: HBASE-3654
>                 URL: https://issues.apache.org/jira/browse/HBASE-3654
>             Project: HBase
>          Issue Type: Bug
>    Affects Versions: 0.90.1
>            Reporter: Jean-Daniel Cryans
>            Priority: Blocker
>             Fix For: 0.90.2
>
>
> Saw this when debugging something else:
> {code}
> "regionserver60020" prio=10 tid=0x00007f538c1c0000 nid=0x4c7 runnable [0x00007f53931da000]
>    java.lang.Thread.State: RUNNABLE
>       at org.apache.hadoop.hbase.regionserver.Store.getStorefilesIndexSize(Store.java:1380)
>       at org.apache.hadoop.hbase.regionserver.HRegionServer.createRegionLoad(HRegionServer.java:916)
>       - locked <0x0000000672aa0a00> (a java.util.concurrent.ConcurrentSkipListMap)
>       at org.apache.hadoop.hbase.regionserver.HRegionServer.buildServerLoad(HRegionServer.java:767)
>       - locked <0x0000000656f62710> (a java.util.HashMap)
>       at org.apache.hadoop.hbase.regionserver.HRegionServer.tryRegionServerReport(HRegionServer.java:722)
>       at org.apache.hadoop.hbase.regionserver.HRegionServer.run(HRegionServer.java:591)
>       at java.lang.Thread.run(Thread.java:662)
> "IPC Reader 9 on port 60020" prio=10 tid=0x00007f538c1be000 nid=0x4c6 waiting for monitor entry [0x00007f53932db000]
>    java.lang.Thread.State: BLOCKED (on object monitor)
>       at org.apache.hadoop.hbase.regionserver.HRegionServer.getFromOnlineRegions(HRegionServer.java:2295)
>       - waiting to lock <0x0000000656f62710> (a java.util.HashMap)
>       at org.apache.hadoop.hbase.regionserver.HRegionServer.getOnlineRegion(HRegionServer.java:2307)
>       at org.apache.hadoop.hbase.regionserver.HRegionServer.getRegion(HRegionServer.java:2333)
>       at org.apache.hadoop.hbase.regionserver.HRegionServer$QosFunction.isMetaRegion(HRegionServer.java:379)
>       at org.apache.hadoop.hbase.regionserver.HRegionServer$QosFunction.apply(HRegionServer.java:422)
>       at org.apache.hadoop.hbase.regionserver.HRegionServer$QosFunction.apply(HRegionServer.java:361)
>       at org.apache.hadoop.hbase.ipc.HBaseServer.getQosLevel(HBaseServer.java:1126)
>       at org.apache.hadoop.hbase.ipc.HBaseServer$Connection.processData(HBaseServer.java:982)
>       at org.apache.hadoop.hbase.ipc.HBaseServer$Connection.readAndProcess(HBaseServer.java:946)
>       at org.apache.hadoop.hbase.ipc.HBaseServer$Listener.doRead(HBaseServer.java:522)
>       at org.apache.hadoop.hbase.ipc.HBaseServer$Listener$Reader.run(HBaseServer.java:316)
>       - locked <0x0000000656e60068> (a org.apache.hadoop.hbase.ipc.HBaseServer$Listener$Reader)
>       at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
>       at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
>       at java.lang.Thread.run(Thread.java:662)
> ...
> "IPC Reader 0 on port 60020" prio=10 tid=0x00007f538c08b000 nid=0x4bd waiting for monitor entry [0x00007f5393be4000]
>    java.lang.Thread.State: BLOCKED (on object monitor)
>       at org.apache.hadoop.hbase.regionserver.HRegionServer.getFromOnlineRegions(HRegionServer.java:2295)
>       - waiting to lock <0x0000000656f62710> (a java.util.HashMap)
>       at org.apache.hadoop.hbase.regionserver.HRegionServer.getOnlineRegion(HRegionServer.java:2307)
>       at org.apache.hadoop.hbase.regionserver.HRegionServer.getRegion(HRegionServer.java:2333)
>       at org.apache.hadoop.hbase.regionserver.HRegionServer$QosFunction.isMetaRegion(HRegionServer.java:379)
>       at org.apache.hadoop.hbase.regionserver.HRegionServer$QosFunction.apply(HRegionServer.java:422)
>       at org.apache.hadoop.hbase.regionserver.HRegionServer$QosFunction.apply(HRegionServer.java:361)
>       at org.apache.hadoop.hbase.ipc.HBaseServer.getQosLevel(HBaseServer.java:1126)
>       at org.apache.hadoop.hbase.ipc.HBaseServer$Connection.processData(HBaseServer.java:982)
>       at org.apache.hadoop.hbase.ipc.HBaseServer$Connection.readAndProcess(HBaseServer.java:946)
>       at org.apache.hadoop.hbase.ipc.HBaseServer$Listener.doRead(HBaseServer.java:522)
>       at org.apache.hadoop.hbase.ipc.HBaseServer$Listener$Reader.run(HBaseServer.java:316)
>       - locked <0x0000000656e635c8> (a org.apache.hadoop.hbase.ipc.HBaseServer$Listener$Reader)
>       at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
>       at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
>       at java.lang.Thread.run(Thread.java:662)
> {code}
> All the readers are blocked! I have the feeling something much better could 
> be done.
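
In the dump quoted above, the reporting thread holds the monitor on a java.util.HashMap (0x0000000656f62710) inside buildServerLoad while the IPC readers wait for that same monitor in getFromOnlineRegions. One possible direction, sketched below, is to keep the regions in a ConcurrentHashMap so that lookups never wait on the load report. This is only an illustration of the idea, not the HBase code and not necessarily the change made for this issue; `OnlineRegionsSketch`, `Region`, and `totalIndexSize` are hypothetical names.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

public class OnlineRegionsSketch {
    /** Hypothetical stand-in for a region; only the fields needed for the sketch. */
    static final class Region {
        final String name;
        final long indexSizeBytes;

        Region(String name, long indexSizeBytes) {
            this.name = name;
            this.indexSizeBytes = indexSizeBytes;
        }
    }

    // A concurrent map instead of a HashMap guarded by synchronized blocks.
    private final Map<String, Region> onlineRegions =
        new ConcurrentHashMap<String, Region>();

    void addRegion(Region region) {
        onlineRegions.put(region.name, region);
    }

    /** Per-request lookup path: takes no monitor, so it cannot block behind the report. */
    Region getFromOnlineRegions(String name) {
        return onlineRegions.get(name);
    }

    /**
     * Periodic report path: walks a weakly consistent view of the map.
     * Regions added or removed during the walk may or may not be counted.
     */
    long totalIndexSize() {
        long total = 0;
        for (Region region : onlineRegions.values()) {
            total += region.indexSizeBytes;
        }
        return total;
    }
}
```

The trade-off is that the report is no longer an exact snapshot of the map at one instant, which seems acceptable for a periodic load report but is an assumption on my part.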

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira
