Re: Scanner call crashed regionserver under load

Stack Wed, 14 Jul 2010 21:47:22 -0700

Was it HBASE-2545 (fixed in 0.20.5)?
St.Ack


On Wed, Jul 14, 2010 at 9:14 PM, Ryan Rawson <[email protected]> wrote:
> One thing to watch out for is closing scanners when you are done with
> them to release server-side resources.  Without doing that it would be
> possible to overwhelm the server.  We currently don't protect against
> this, but it's quite possible we should.
>
> On Wed, Jul 14, 2010 at 9:07 PM, Jonathan Gray <[email protected]> wrote:
>> Hard to imagine a scan of 1500 rows (~100k total KVs) taking down anything, 
>> but it is of course possible.
>>
>> Did the RSs actually die, or did the increments just get slow enough to
>> time out your application?  If they got slow, how slow did they get and
>> how fast do they usually go?
>>
>> If you put up logs from the RSs during this time we might be able to see if 
>> there's anything strange going on.
>>
>> On this scanner, did you have the block cache enabled or disabled?  I'd 
>> recommend disabling the block cache on the Scan object, just in case it was 
>> increased GC activity that hurt performance.  On big scans I've seen this 
>> make a difference.
>>
>> JG
>>
>>> -----Original Message-----
>>> From: Vaibhav Puranik [mailto:[email protected]]
>>> Sent: Wednesday, July 14, 2010 8:57 PM
>>> To: [email protected]
>>> Subject: Scanner call crashed regionserver under load
>>>
>>> Hi all,
>>>
>>>
>>> We experienced a downtime in our HBase installation today.
>>>
>>> We have our HBase hosted in EC2 with 1 master (with ZK) and 3 slaves
>>> (all of them are m1.large). We are using HBase version 0.20.4.
>>>
>>> We have a method that opens a scanner and retrieves some values. The
>>> method scans approximately 300 rows. Each row has three column
>>> families and approximately 75 longs.
>>> The table is fairly small: in total it has approximately 1500 rows.
>>>
>>> We tried calling this method under huge traffic and the CPU of the
>>> regionserver that hosted this table spiked to 100%. It brought down our
>>> application.
>>>
>>> We are doing multiple incrementColumnValue calls for every request for
>>> this traffic and HBase seems to take it well.
>>>
>>> So does that mean it's a bad idea to call a scanner under huge
>>> traffic? Will this problem get solved if we make a new table and store
>>> the values with a different key so that the exact value can be
>>> retrieved (with a Get call)? Are there any other ways to resolve this
>>> hotspotting without duplicating data?
>>>
>>> Are we limited by what a machine can handle if we have a fairly small
>>> table (that can fit in a region server or possibly in a single
>>> region)? Are there any creative solutions people are using?
>>>
>>> Regards,
>>> Vaibhav Puranik
>>> GumGum
>>
>
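
For reference, a minimal sketch of the point made in the thread about closing scanners so that server-side resources are released. It does not use the HBase client: ToyScanner and OPEN_SCANNERS are hypothetical stand-ins that only model "opening holds a resource, closing releases it". In real client code the same try/finally shape would presumably wrap the ResultScanner returned by HTable.getScanner(Scan), and the block cache suggestion would be a call to Scan.setCacheBlocks(false) before opening the scanner.

```java
import java.io.Closeable;
import java.util.Iterator;
import java.util.List;
import java.util.concurrent.atomic.AtomicInteger;

// Toy model only: none of these types are HBase API.
class ScannerCloseSketch {

    /** Number of scanners currently held open in this toy model. */
    static final AtomicInteger OPEN_SCANNERS = new AtomicInteger();

    /** Hypothetical stand-in for a scanner over rows of long values. */
    static final class ToyScanner implements Closeable, Iterable<Long> {
        private final List<Long> rows;
        private boolean closed;

        ToyScanner(List<Long> rows) {
            this.rows = rows;
            OPEN_SCANNERS.incrementAndGet(); // opening holds a resource
        }

        @Override
        public Iterator<Long> iterator() {
            return rows.iterator();
        }

        @Override
        public void close() {
            if (!closed) { // closing twice releases only once
                closed = true;
                OPEN_SCANNERS.decrementAndGet();
            }
        }
    }

    /** Sums all values and releases the scanner even if iteration throws. */
    static long sumAndClose(List<Long> rows) {
        ToyScanner scanner = new ToyScanner(rows);
        try {
            long total = 0;
            for (long value : scanner) { // a null row throws on unboxing
                total += value;
            }
            return total;
        } finally {
            scanner.close(); // runs on both the normal and the failing path
        }
    }
}
```

The finally block is the part that matters: under heavy traffic, a request that fails midway would otherwise leave its scanner open until the lease expires on the server.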
