The following scan command took about 8 minutes to return:

hbase(main):002:0> scan ".META.", {LIMIT => 10}
The response included an erroneous report that "10 row(s) in 0.1770 seconds". It definitely was not subsecond. As I said, it was many minutes before it returned.

-geoff

-----Original Message-----
From: saint....@gmail.com [mailto:saint....@gmail.com] On Behalf Of Stack
Sent: Tuesday, April 20, 2010 12:45 PM
To: hbase-user@hadoop.apache.org
Subject: Re: extremely sluggish hbase

If you scan the '.META.' table, is it slow also? You could have a case of hbase-2451? There is a script in the patch to that issue. Try it. See if that helps.
St.Ack

On Tue, Apr 20, 2010 at 12:02 PM, Geoff Hendrey <ghend...@decarta.com> wrote:
> Answers below, prefixed by "geoff:"
>
> -----Original Message-----
> From: saint....@gmail.com [mailto:saint....@gmail.com] On Behalf Of Stack
> Sent: Tuesday, April 20, 2010 11:23 AM
> To: hbase-user@hadoop.apache.org
> Subject: Re: extremely sluggish hbase
>
> On Tue, Apr 20, 2010 at 10:29 AM, Geoff Hendrey <ghend...@decarta.com> wrote:
>> The HBase shell is taking 63 seconds to scan a table with {LIMIT => 1}!
>
> Is an MR job running concurrently?
> Geoff: no
>
> What's happening on your servers? High load?
> Geoff: no, 99% idle on both servers
>
>> I see this error occur frequently in the region server logs. Any ideas on what this might be?
>>
>> 2010-04-20 04:19:41,401 INFO org.apache.hadoop.ipc.HBaseServer: IPC Server handler 2 on 60020, call next(-750587486574522252) from 10.241.6.80:51850: error: org.apache.hadoop.hbase.UnknownScannerException: Name: -750587486574522252
>>
>> I also see this in the region server logs:
>>
>> 2010-04-20 04:21:44,559 INFO org.apache.hadoop.hbase.regionserver.HRegionServer: Scanner 5849633296569445699 lease expired
>> 2010-04-20 04:21:44,560 INFO org.apache.hadoop.hdfs.DFSClient: Could not obtain block blk_1799401938583830364_69702 from any node: java.io.IOException: No live nodes contain current block
>
> So, this is usually because the client took too long between 'next' invocations on the scanner, or the server is under such load that it holds on to the 'next' call for so long that by the next time 'next' is called, the scanner lease has expired.
>
>> However, "hadoop dfsadmin -report" doesn't show any HDFS issues. It looks totally healthy. When I do "status" from the HBase shell I get:
>>
>> hbase(main):008:0> status
>> 2 servers, 0 dead, 484.0000 average load
>>
>> which also seems healthy to me.
>
> Your servers are carrying roughly 500 regions each.
> Geoff: Is this high, moderate, or low for a typical installation?
>
>> Any suggestions?
>
> Look at top. Look for loading. Are you swapping?
> Geoff: I will look into the swapping and see if I can get some numbers.
>
> Look in the hbase logs. What's it say it's doing? Fat GC pauses?
> Geoff: I monitor all the logs and I don't see any GC pauses. I am running 64-bit Java with 8GB of heap. I'll look into GC further and see if I can get some concrete data.
>
> St.Ack
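[Editor's note] The "lease expired" / UnknownScannerException pattern Stack describes above is governed by a region server timeout. A minimal hbase-site.xml fragment to lengthen it is sketched below; the key name hbase.regionserver.lease.period and its 60000 ms default are assumed from the HBase 0.20-era configuration (later releases renamed this setting), so verify against your version's hbase-default.xml:

```xml
<!-- hbase-site.xml (sketch, not from the thread above):
     lengthen the scanner lease so a client that is slow between
     next() calls does not hit UnknownScannerException.
     Key name assumed from 0.20-era HBase; value is milliseconds. -->
<property>
  <name>hbase.regionserver.lease.period</name>
  <value>120000</value> <!-- 2 minutes, up from the assumed 60000 ms default -->
</property>
```

Note that raising the lease only masks the symptom if the real cause is a swapping or GC-stalled region server, which is why Stack's advice is to check top, swap, and the GC logs first.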