On Thu, Jan 19, 2012 at 12:57 AM, T Vinod Gupta <[email protected]> wrote:

> The stack traces are from the client.
> hbase version - 0.90.3-cdh3u1, r, Mon Jul 18 08:23:50 PDT 2011
>
> java version "1.6.0_20"
> OpenJDK Runtime Environment (IcedTea6 1.9.8)
> (amazon-52.1.9.8.36.amzn1-x86_64)
> OpenJDK 64-Bit Server VM (build 19.0-b09, mixed mode)
>

The stack traces never move on from their current locations no matter how
often you thread dump?
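
One way to check this from inside the client process, rather than by
comparing successive jstack outputs by eye, is to take two programmatic
dumps and diff them. Below is a minimal sketch that uses only the JDK's
java.lang.management API. The class and method names (StuckThreadCheck,
snapshot, unchanged) are made up for illustration and are not part of
HBase. An idle thread parked on a queue will also show up as unchanged,
so treat the result as a hint, not a verdict.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadInfo;
import java.lang.management.ThreadMXBean;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper, not an HBase class: compares two in-process thread dumps.
final class StuckThreadCheck {

    // Capture the current stack of every live thread, keyed by thread id.
    static Map<Long, ThreadInfo> snapshot() {
        ThreadMXBean mx = ManagementFactory.getThreadMXBean();
        Map<Long, ThreadInfo> byId = new HashMap<>();
        // false, false: skip lock details, we only need states and frames.
        for (ThreadInfo info : mx.dumpAllThreads(false, false)) {
            if (info != null) {
                byId.put(info.getThreadId(), info);
            }
        }
        return byId;
    }

    // Names of threads whose state and full stack trace are identical
    // in both snapshots, i.e. candidates for "never moves on".
    static List<String> unchanged(Map<Long, ThreadInfo> before,
                                  Map<Long, ThreadInfo> after) {
        List<String> names = new ArrayList<>();
        for (Map.Entry<Long, ThreadInfo> e : after.entrySet()) {
            ThreadInfo prev = before.get(e.getKey());
            ThreadInfo cur = e.getValue();
            if (prev != null
                    && prev.getThreadState() == cur.getThreadState()
                    && Arrays.equals(prev.getStackTrace(), cur.getStackTrace())) {
                names.add(cur.getThreadName());
            }
        }
        return names;
    }
}
```

Take one snapshot(), wait several seconds, take another, and pass both
to unchanged(). If the pool-1-thread-N workers are listed every time,
they are parked in the same frames. A thread that spins inside a single
method may still differ by line number, so it can escape this check.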

Have you tried a more recent Oracle JVM?

St.Ack


>
> So I'll tell you the symptoms. My writes are pretty smooth and don't face
> problems. That datapath typically involves row gets (with no filters and
> such), row puts, and flushes in multiple threads (about 11 threads per
> process, and I have 2 such processes). It's the bursts in my read codepath
> that cause these symptoms. In the read codepath, I am doing scans (with
> start/end row and a regex qualifier filter) and row/column gets in multiple
> threads (about 11 threads per process, 2 such processes). Whenever there is
> a burst, I see this happening. Other accompanying events are the region
> server shutting down since the master thinks it is offline. Sometimes,
> around this sequence of events, I see long GC pauses (5-40 sec range). I'm
> following the GC recommendations of the HBase book. There are 33 regions
> currently, max region file size is set to 2GB, and a 3GB heap is given to
> the RS and the master each.
>
> thanks
>
> On Wed, Jan 18, 2012 at 8:55 PM, Stack <[email protected]> wrote:
>
> > On Wed, Jan 18, 2012 at 10:37 AM, T Vinod Gupta <[email protected]>
> > wrote:
> > > Has anyone seen this kind of behavior? I am wondering what is triggering
> > > it and how I can resolve it.
> > > I have a multithreaded app which is reading stuff out of HBase. I recently
> > > made a change to switch from using Get for each row I am interested in to
> > > using Scan with row key ranges. What I see is that the process will pause
> > > forever and start taking up a lot of CPU (100% on a 4 core machine, so
> > > effectively 25%). All stacks look like this -
> > >
> >
> > In the client or over on the regionserver (below seems to be from
> > client)? What version of hbase?  What version of jvm?
> > St.Ack
> >
> > > "pool-1-thread-13" prio=10 tid=0x00007fa094013000 nid=0x56b7 waiting for monitor entry [0x00007fa0e4c56000]
> > >   java.lang.Thread.State: BLOCKED (on object monitor)
> > >        at java.util.TreeMap.put(TreeMap.java:571)
> > >        at org.apache.hadoop.hbase.client.Result.getNoVersionMap(Result.java:360)
> > >
> > > "pool-1-thread-11" prio=10 tid=0x00007fa0ec284000 nid=0x1c7 in Object.wait() [0x00007fa0e5969000]
> > >   java.lang.Thread.State: WAITING (on object monitor)
> > >        at java.lang.Object.wait(Native Method)
> > >        - waiting on <0x00000000f026f3b0> (a org.apache.hadoop.hbase.ipc.HBaseClient$Call)
> > >        at java.lang.Object.wait(Object.java:502)
> > >        at org.apache.hadoop.hbase.ipc.HBaseClient.call(HBaseClient.java:757)
> > >        - locked <0x00000000f026f3b0> (a org.apache.hadoop.hbase.ipc.HBaseClient$Call)
> > >        at org.apache.hadoop.hbase.ipc.HBaseRPC$Invoker.invoke(HBaseRPC.java:257)
> > >        at $Proxy4.get(Unknown Source)
> > >        at org.apache.hadoop.hbase.client.HTable$4.call(HTable.java:549)
> > >        at org.apache.hadoop.hbase.client.HTable$4.call(HTable.java:547)
> > >        at org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.getRegionServerWithRetries(HConnectionManager.java:1000)
> > >        at org.apache.hadoop.hbase.client.HTable.get(HTable.java:546)
> > >
> > > "pool-1-thread-10" prio=10 tid=0x00007fa0ec282000 nid=0x1c6 waiting for monitor entry [0x00007fa0e5a6a000]
> > >   java.lang.Thread.State: BLOCKED (on object monitor)
> > >        at org.apache.hadoop.hbase.KeyValue.split(KeyValue.java:1048)
> > >        at org.apache.hadoop.hbase.client.Result.getMap(Result.java:309)
> > >        at org.apache.hadoop.hbase.client.Result.getNoVersionMap(Result.java:345)
> > >
> > > "pool-1-thread-6" prio=10 tid=0x00007fa0ec21f000 nid=0x1c2 in Object.wait() [0x00007fa0e5e6d000]
> > >   java.lang.Thread.State: WAITING (on object monitor)
> > >        at java.lang.Object.wait(Native Method)
> > >        - waiting on <0x00000000f0766b10> (a org.apache.hadoop.hbase.ipc.HBaseClient$Call)
> > >        at java.lang.Object.wait(Object.java:502)
> > >        at org.apache.hadoop.hbase.ipc.HBaseClient.call(HBaseClient.java:757)
> > >        - locked <0x00000000f0766b10> (a org.apache.hadoop.hbase.ipc.HBaseClient$Call)
> > >        at org.apache.hadoop.hbase.ipc.HBaseRPC$Invoker.invoke(HBaseRPC.java:257)
> > >        at $Proxy4.next(Unknown Source)
> > >        at org.apache.hadoop.hbase.client.ScannerCallable.call(ScannerCallable.java:79)
> > >        at org.apache.hadoop.hbase.client.ScannerCallable.call(ScannerCallable.java:38)
> > >        at org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.getRegionServerWithRetries(HConnectionManager.java:1000)
> > >        at org.apache.hadoop.hbase.client.HTable$ClientScanner.next(HTable.java:1100)
> > >        at org.apache.hadoop.hbase.client.HTable$ClientScanner$1.hasNext(HTable.java:1210)
> > >
> > >
> > > The only thing I observed in the RS logs is that there were some
> > > compactions going on. The GC logs did show some pauses as high as 5 sec,
> > > but it is past that state and the app still doesn't make progress. Any
> > > insights?
> > >
> > > thanks
> >
>
