Re: all threads frequently getting stuck in Result.getNoVersionMap, HTable.get and Scanner.hasNext

T Vinod Gupta Thu, 19 Jan 2012 00:58:03 -0800

The stack traces are from the client.
hbase version - 0.90.3-cdh3u1, r, Mon Jul 18 08:23:50 PDT 2011


java version "1.6.0_20"
OpenJDK Runtime Environment (IcedTea6 1.9.8)
(amazon-52.1.9.8.36.amzn1-x86_64)
OpenJDK 64-Bit Server VM (build 19.0-b09, mixed mode)

So I'll tell you the symptoms. My writes are pretty smooth and don't face
problems. That datapath typically involves row gets (with no filters and
such), row puts and flushes in multiple threads (about 11 threads per
process, and I have 2 such processes). It's the bursts in my read codepath
that cause these symptoms. In the read codepath, I am doing scans (with
start/end row and a regex qualifier filter) and row/column gets in multiple
threads (about 11 threads per process, 2 such processes). Whenever there is
a burst, I see this happening. Other accompanying events include a region
server shutting down because the master thinks it is offline. Sometimes,
around this sequence of events, I see long GC pauses (in the 5-40 sec
range). I'm following the GC recommendations of the HBase book. There are
33 regions currently, the max region file size is set to 2GB, and 3GB of
heap is given to the RS and the master each.
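
One way to check whether long pauses also hit the client JVM, and not only
the region server, is to sample the JVM's own GC counters from inside the
app. This is only a minimal sketch using the standard java.lang.management
API. The class name GcSnapshot and the 5-second sample interval are made up
for illustration, and the number is the accumulated collection time as the
JVM reports it, which is not necessarily all stop-the-world pause time.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;

// Hypothetical helper, not part of HBase: samples this JVM's GC counters.
public class GcSnapshot {

    // Accumulated GC time in milliseconds across all collectors of this JVM.
    static long totalGcMillis() {
        long total = 0;
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            long t = gc.getCollectionTime(); // -1 when the collector does not report it
            if (t > 0) {
                total += t;
            }
        }
        return total;
    }

    // GC time spent between two samples; never negative.
    static long delta(long before, long after) {
        return Math.max(0L, after - before);
    }

    public static void main(String[] args) throws InterruptedException {
        long before = totalGcMillis();
        Thread.sleep(5000L); // illustrative sample interval
        long after = totalGcMillis();
        System.out.println("GC time in last interval: " + delta(before, after) + " ms");
    }
}
```

If the delta stays small on the client while the reads are stuck, the client
JVM's own GC is probably not what is holding the threads up.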

thanks

On Wed, Jan 18, 2012 at 8:55 PM, Stack <[email protected]> wrote:

> On Wed, Jan 18, 2012 at 10:37 AM, T Vinod Gupta <[email protected]>
> wrote:
> > Has anyone seen this kind of behavior? I am wondering what is triggering
> > it and how I can resolve it.
> > I have a multithreaded app which is reading stuff out of HBase. I recently
> > made a change to switch from using Get for each row I am interested in to
> > using Scan with row key ranges. What I see is that the process will pause
> > forever and start taking up a lot of CPU (100% on a 4 core machine, so
> > effectively 25%). All stacks look like this -
> >
>
> In the client or over on the regionserver (below seems to be from
> client)? What version of hbase?  What version of jvm?
> St.Ack
>
> > "pool-1-thread-13" prio=10 tid=0x00007fa094013000 nid=0x56b7 waiting for monitor entry [0x00007fa0e4c56000]
> >   java.lang.Thread.State: BLOCKED (on object monitor)
> >        at java.util.TreeMap.put(TreeMap.java:571)
> >        at org.apache.hadoop.hbase.client.Result.getNoVersionMap(Result.java:360)
> >
> > "pool-1-thread-11" prio=10 tid=0x00007fa0ec284000 nid=0x1c7 in Object.wait() [0x00007fa0e5969000]
> >   java.lang.Thread.State: WAITING (on object monitor)
> >        at java.lang.Object.wait(Native Method)
> >        - waiting on <0x00000000f026f3b0> (a org.apache.hadoop.hbase.ipc.HBaseClient$Call)
> >        at java.lang.Object.wait(Object.java:502)
> >        at org.apache.hadoop.hbase.ipc.HBaseClient.call(HBaseClient.java:757)
> >        - locked <0x00000000f026f3b0> (a org.apache.hadoop.hbase.ipc.HBaseClient$Call)
> >        at org.apache.hadoop.hbase.ipc.HBaseRPC$Invoker.invoke(HBaseRPC.java:257)
> >        at $Proxy4.get(Unknown Source)
> >        at org.apache.hadoop.hbase.client.HTable$4.call(HTable.java:549)
> >        at org.apache.hadoop.hbase.client.HTable$4.call(HTable.java:547)
> >        at org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.getRegionServerWithRetries(HConnectionManager.java:1000)
> >        at org.apache.hadoop.hbase.client.HTable.get(HTable.java:546)
> >
> > "pool-1-thread-10" prio=10 tid=0x00007fa0ec282000 nid=0x1c6 waiting for monitor entry [0x00007fa0e5a6a000]
> >   java.lang.Thread.State: BLOCKED (on object monitor)
> >        at org.apache.hadoop.hbase.KeyValue.split(KeyValue.java:1048)
> >        at org.apache.hadoop.hbase.client.Result.getMap(Result.java:309)
> >        at org.apache.hadoop.hbase.client.Result.getNoVersionMap(Result.java:345)
> >
> > "pool-1-thread-6" prio=10 tid=0x00007fa0ec21f000 nid=0x1c2 in Object.wait() [0x00007fa0e5e6d000]
> >   java.lang.Thread.State: WAITING (on object monitor)
> >        at java.lang.Object.wait(Native Method)
> >        - waiting on <0x00000000f0766b10> (a org.apache.hadoop.hbase.ipc.HBaseClient$Call)
> >        at java.lang.Object.wait(Object.java:502)
> >        at org.apache.hadoop.hbase.ipc.HBaseClient.call(HBaseClient.java:757)
> >        - locked <0x00000000f0766b10> (a org.apache.hadoop.hbase.ipc.HBaseClient$Call)
> >        at org.apache.hadoop.hbase.ipc.HBaseRPC$Invoker.invoke(HBaseRPC.java:257)
> >        at $Proxy4.next(Unknown Source)
> >        at org.apache.hadoop.hbase.client.ScannerCallable.call(ScannerCallable.java:79)
> >        at org.apache.hadoop.hbase.client.ScannerCallable.call(ScannerCallable.java:38)
> >        at org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.getRegionServerWithRetries(HConnectionManager.java:1000)
> >        at org.apache.hadoop.hbase.client.HTable$ClientScanner.next(HTable.java:1100)
> >        at org.apache.hadoop.hbase.client.HTable$ClientScanner$1.hasNext(HTable.java:1210)
> >
> >
> > The only thing I observed in the RS logs is that there were some compactions
> > going on. The GC logs did show some pauses as high as 5 sec, but it is past
> > that state and the app still doesn't make progress. Any insights?
> >
> > thanks
>
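
The traces quoted above show threads BLOCKED on an object monitor but do not
show which thread owns it. Below is a minimal sketch, using only the standard
java.lang.management API, of how the client app could log that from inside
the process. The class and method names are hypothetical, and it assumes the
code runs in the same JVM as the stuck worker threads.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadInfo;
import java.lang.management.ThreadMXBean;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of HBase: lists BLOCKED threads and lock owners.
public class BlockedThreadReport {

    // Formats one report line; owner may be null when the JVM cannot name it.
    static String describe(String thread, String lock, String owner) {
        String holder = (owner == null) ? "(unknown)" : owner;
        return thread + " is BLOCKED on " + lock + " held by " + holder;
    }

    // One line per thread that is currently BLOCKED in this JVM.
    static List<String> blockedThreads() {
        ThreadMXBean mx = ManagementFactory.getThreadMXBean();
        List<String> lines = new ArrayList<String>();
        // Second argument is the stack depth to capture; 8 frames is arbitrary.
        for (ThreadInfo info : mx.getThreadInfo(mx.getAllThreadIds(), 8)) {
            if (info == null) {
                continue; // thread exited between the two calls
            }
            if (info.getThreadState() != Thread.State.BLOCKED) {
                continue;
            }
            lines.add(describe(info.getThreadName(), info.getLockName(),
                    info.getLockOwnerName()));
        }
        return lines;
    }

    public static void main(String[] args) {
        List<String> lines = blockedThreads();
        if (lines.isEmpty()) {
            System.out.println("no BLOCKED threads");
        }
        for (String line : lines) {
            System.out.println(line);
        }
    }
}
```

Calling blockedThreads() periodically from a watchdog thread and logging the
result would show whether the same owner thread keeps holding the monitor
while the pool threads sit in Result.getNoVersionMap.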
