No need for pardon.
I mean, it's good to hear about the changes to help improve performance. :-)

I just wanted to try to answer the OP's question and set a realistic 
expectation. 

-Mike

On Apr 12, 2012, at 1:14 AM, Andrew Purtell wrote:

> Pardon, yes, that is probably true. I hijacked this thread anyway. /eot
> 
> Best regards,
> 
>    - Andy
> 
> 
> On Apr 11, 2012, at 11:04 PM, Michael Segel <[email protected]> wrote:
> 
>> Uhm, 
>> Let's take a look back at the original post:
>> "I'm confused with a read latency I got, comparing to what YCSB team achieved
>> and showed in their YCSB paper. They achieved throughput of up to 7000
>> ops/sec with a latency of 15 ms (page 10, read latency chart). I can't get
>> throughput higher than 2000 ops/sec on 90% reads/10% writes workload. Writes
>> are really fast with auto commit disabled (response within a few ms), while
>> read latency doesn't go lower than 70 ms in average.
>> "
>> While it's great to look at how to reduce the read latency, something you 
>> will have to consider is that you won't get the same low latency as you 
>> would with drives local to your node. 
>> 
>> So I have to ask: is there an unrealistic expectation on the part of the OP?
>> 
>> On Apr 12, 2012, at 12:40 AM, Andrew Purtell wrote:
>> 
>>> Hi Otis,
>>> 
>>>> You mention "Linux AMI (2012.03.1)", but which AMI is that?  Is this some 
>>>> specific AMI prepared by Amazon? 
>>> 
>>> Yes.
>>> 
>>> $ ec2-describe-images -a | grep amzn | grep 2012.03.1
>>> 
>>> should give you results. Use this as your base and install Hadoop etc. on top. 
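>>> 
>>> (If the full -a listing is unwieldy, filtering by owner should also work. 
>>> Roughly, assuming the standard EC2 API tools:
>>> 
>>> $ ec2-describe-images -o amazon | grep 2012.03.1
>>> 
>>> where -o restricts the listing to Amazon-owned images.)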
>>> 
>>> I used the pv x86_64 variant and tested the direct-attached instance store 
>>> devices. 
>>> 
>>> Unfortunately I'm at the airport now and don't have an instance handy to 
>>> get you the command output you want.
>>> 
>>> For comparison I launched m1.xlarge instances, our usual choice for testing, 
>>> all in the same region (us-west-1), so the results should be roughly 
>>> comparable. I ran each test three times, each time with a new instance, and 
>>> warmed up the instance store devices with a preliminary FIO run. 
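>>> 
>>> (Something along these lines would serve as the warm-up pass. This is a 
>>> minimal sketch, assuming fio is installed and /dev/xvdb is the instance 
>>> store device; the device name and parameters are illustrative, not the 
>>> exact ones used:
>>> 
>>> $ sudo fio --name=warmup --filename=/dev/xvdb --rw=read --bs=1M \
>>>       --direct=1 --ioengine=libaio --iodepth=8
>>> 
>>> The idea is just to touch every block once before the timed runs.)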
>>> 
>>> As you know, EC2 isn't really good for performance benchmarking; the 
>>> variability is quite high. However, I did take the basic steps above to try 
>>> to get a useful (albeit unscientific) result. 
>>> 
>>> It would be interesting if someone else finds similar results, or not, as 
>>> the case may be. 
>>> 
>>> Best regards,
>>> 
>>>  - Andy
>>> 
>>> 
>>> On Apr 11, 2012, at 2:31 PM, Otis Gospodnetic <[email protected]> 
>>> wrote:
>>> 
>>>> Hi Andy,
>>>> 
>>>> This email must have caught the attention of a number of people...
>>>> You mention "Linux AMI (2012.03.1)", but which AMI is that?  Is this some 
>>>> specific AMI prepared by Amazon?  Or some AMI that somebody like Cloudera 
>>>> prepared?  Or are you saying it's just "some Linux" AMI that somebody 
>>>> built on 2012-03-01 and that you found in AWS?
>>>> 
>>>> Could you please share the outputs of:
>>>> 
>>>> $ cat /etc/*release
>>>> $ uname -a
>>>> 
>>>> $ df -T
>>>> 
>>>> Also, could it be that your old EC2 instance was unlucky and had a very 
>>>> noisy neighbour, while the new EC2 instance does not?  Not sure how one 
>>>> could run tests to get around this - perhaps by terminating the instance 
>>>> and restarting it a few times in order to get it hosted on different 
>>>> physical hosts?
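>>>> 
>>>> (For what it's worth: with an EBS-backed instance, a stop/start cycle will 
>>>> usually land it on a different physical host. Roughly, with the classic EC2 
>>>> API tools and a placeholder instance id:
>>>> 
>>>> $ ec2-stop-instances i-xxxxxxxx
>>>> $ ec2-start-instances i-xxxxxxxx
>>>> 
>>>> An instance-store-backed instance can't be stopped, so it would have to be 
>>>> terminated and relaunched instead.)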
>>>> 
>>>> Thanks,
>>>> Otis 
>>>> ----
>>>> Performance Monitoring SaaS for HBase - 
>>>> http://sematext.com/spm/hbase-performance-monitoring/index.html
>>>> 
>>>> 
>>>> 
>>>>> ________________________________
>>>>> From: Andrew Purtell <[email protected]>
>>>>> To: "[email protected]" <[email protected]> 
>>>>> Cc: Jack Levin <[email protected]>; "[email protected]" 
>>>>> <[email protected]> 
>>>>> Sent: Tuesday, April 10, 2012 2:14 PM
>>>>> Subject: Re: Speeding up HBase read response
>>>>> 
>>>>> What AMI are you using as your base?
>>>>> 
>>>>> I recently started using the new Linux AMI (2012.03.1) and noticed what 
>>>>> looks like a significant improvement over what I had been using before 
>>>>> (2011.02 IIRC). I ran four simple tests, repeated three times each, with 
>>>>> FIO: a read bandwidth test, a write bandwidth test, a read IOPS test, and a 
>>>>> write IOPS test. The write IOPS test was inconclusive, but for the others 
>>>>> there was a consistent difference: reduced disk op latency (a shorter tail) 
>>>>> and increased device bandwidth. I don't run anything in production in EC2, 
>>>>> so this was the extent of my curiosity.
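>>>>> 
>>>>> (To give a concrete idea of the kind of run involved, here is a minimal 
>>>>> sketch of a random-read IOPS test; fio availability, the /dev/xvdb device 
>>>>> name, and the parameters are assumptions, not the exact job used. The 
>>>>> other tests would swap --rw for randwrite, read, or write and use a larger 
>>>>> block size for the bandwidth runs:
>>>>> 
>>>>> $ sudo fio --name=randread-iops --filename=/dev/xvdb --rw=randread \
>>>>>       --bs=4k --direct=1 --ioengine=libaio --iodepth=32 \
>>>>>       --runtime=60 --time_based --group_reporting
>>>>> 
>>>>> fio reports bandwidth, IOPS, and a latency distribution for each run.)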
>>>>> 
>>>>> 
>>>>> Best regards,
>>>>> 
>>>>>  - Andy
>>>>> 
>>>>> Problems worthy of attack prove their worth by hitting back. - Piet Hein 
>>>>> (via Tom White)
>>>>> 
>>>>> 
>>>>> 
>>>>> ----- Original Message -----
>>>>>> From: Jeff Whiting <[email protected]>
>>>>>> To: [email protected]
>>>>>> Cc: Jack Levin <[email protected]>; [email protected]
>>>>>> Sent: Tuesday, April 10, 2012 11:03 AM
>>>>>> Subject: Re: Speeding up HBase read response
>>>>>> 
>>>>>> Do you have bloom filters enabled?  And compression?  Both of those can 
>>>>>> help reduce disk I/O load, which seems to be the main issue you are having 
>>>>>> on the EC2 cluster.
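>>>>>> 
>>>>>> (For example, something like this in the HBase shell turns both on for a 
>>>>>> column family; the table and family names are placeholders, and on 0.90 
>>>>>> the table has to be disabled before altering:
>>>>>> 
>>>>>> disable 'usertable'
>>>>>> alter 'usertable', {NAME => 'family', BLOOMFILTER => 'ROW', COMPRESSION => 'LZO'}
>>>>>> enable 'usertable'
>>>>>> 
>>>>>> Existing HFiles only pick up the new settings as compactions rewrite them, 
>>>>>> so forcing a major compaction may be needed before the effect shows up.)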
>>>>>> 
>>>>>> ~Jeff
>>>>>> 
>>>>>> On 4/9/2012 8:28 AM, Jack Levin wrote:
>>>>>>> Yes, from %util you can see that your disks are working at pretty much
>>>>>>> 100%, which means you can't push them any faster. So the solution is to
>>>>>>> add more disks, add faster disks, or add more nodes (and their disks).
>>>>>>> This type of overload is not really an HBase issue, but rather a matter
>>>>>>> of your hardware setup.
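>>>>>>> 
>>>>>>> (A rough back-of-the-envelope check, using approximate figures from the
>>>>>>> iostat output below: each node sustains about 250-350 reads/sec on a
>>>>>>> single device at ~3 ms service time, so four nodes top out around
>>>>>>> 1000-1400 disk reads/sec; with some reads served from the block cache,
>>>>>>> a ceiling near 2000 ops/sec is roughly what one busy disk per node would
>>>>>>> predict.)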
>>>>>>> 
>>>>>>> -Jack
>>>>>>> 
>>>>>>> On Mon, Apr 9, 2012 at 2:29 AM, ijanitran<[email protected]>  wrote:
>>>>>>>> Hi, the results of iostat are very similar on all nodes:
>>>>>>>> 
>>>>>>>> Device:  rrqm/s  wrqm/s     r/s   w/s  rMB/s  wMB/s  avgrq-sz  avgqu-sz  await  svctm   %util
>>>>>>>> xvdap1     0.00    0.00  294.00  0.00   9.27   0.00     64.54     21.97  75.44   3.40  100.10
>>>>>>>> xvdap1     0.00    4.00  286.00  8.00   9.11   0.27     65.33      7.16  25.32   2.88   84.70
>>>>>>>> xvdap1     0.00    0.00  283.00  0.00   8.29   0.00     59.99     10.31  35.43   2.97   84.10
>>>>>>>> xvdap1     0.00    0.00  320.00  0.00   9.12   0.00     58.38     12.32  39.56   2.79   89.40
>>>>>>>> xvdap1     0.00    0.00  336.63  0.00   9.18   0.00     55.84     10.67  31.42   2.78   93.47
>>>>>>>> xvdap1     0.00    0.00  312.00  0.00  10.00   0.00     65.62     11.07  35.49   2.91   90.70
>>>>>>>> xvdap1     0.00    0.00  356.00  0.00  10.72   0.00     61.66      9.38  26.63   2.57   91.40
>>>>>>>> xvdap1     0.00    0.00  258.00  0.00   8.20   0.00     65.05     13.37  51.24   3.64   93.90
>>>>>>>> xvdap1     0.00    0.00  246.00  0.00   7.31   0.00     60.88      5.87  24.53   3.14   77.30
>>>>>>>> xvdap1     0.00    2.00  297.00  3.00   9.11   0.02     62.29     13.02  42.40   3.12   93.60
>>>>>>>> xvdap1     0.00    0.00  292.00  0.00   9.60   0.00     67.32     11.30  39.51   3.36   98.00
>>>>>>>> xvdap1     0.00    4.00  261.00  8.00   7.84   0.27     61.74     16.07  55.72   3.39   91.30
>>>>>>>> 
>>>>>>>> 
>>>>>>>> Jack Levin wrote:
>>>>>>>>> Please email iostat -xdm 1, run for one minute during load on each node.
>>>>>>>>> --
>>>>>>>>> Sent from my Android phone with K-9 Mail. Please excuse my brevity.
>>>>>>>>> 
>>>>>>>>> ijanitran<[email protected]> wrote:
>>>>>>>>> 
>>>>>>>>> 
>>>>>>>>> I have a 4-node HBase v0.90.4-cdh3u3 cluster deployed on Amazon XLarge
>>>>>>>>> instances (16 GB RAM, 4 CPU cores), with an 8 GB heap (-Xmx) allocated
>>>>>>>>> to the HRegionServers and 2 GB to the DataNodes. HMaster/ZK/NameNode run
>>>>>>>>> on a separate XLarge instance. The target dataset is 100 million records
>>>>>>>>> (each record is 10 fields of 100 bytes). Benchmarking is performed
>>>>>>>>> concurrently from 100 parallel threads.
>>>>>>>>> 
>>>>>>>>> I'm confused by the read latency I got compared to what the YCSB team
>>>>>>>>> achieved and showed in their YCSB paper. They achieved throughput of up
>>>>>>>>> to 7000 ops/sec with a latency of 15 ms (page 10, read latency chart).
>>>>>>>>> I can't get throughput higher than 2000 ops/sec on a 90% read/10% write
>>>>>>>>> workload. Writes are really fast with auto-commit disabled (response
>>>>>>>>> within a few ms), while read latency doesn't go lower than 70 ms on
>>>>>>>>> average.
>>>>>>>>> 
>>>>>>>>> These are some HBase settings I used:
>>>>>>>>> 
>>>>>>>>> hbase.regionserver.handler.count=50
>>>>>>>>> hfile.block.cache.size=0.4
>>>>>>>>> hbase.hregion.max.filesize=1073741824
>>>>>>>>> hbase.regionserver.codecs=lzo
>>>>>>>>> hbase.hregion.memstore.mslab.enabled=true
>>>>>>>>> hfile.min.blocksize.size=16384
>>>>>>>>> hbase.hregion.memstore.block.multiplier=4
>>>>>>>>> hbase.regionserver.global.memstore.upperLimit=0.35
>>>>>>>>> hbase.zookeeper.property.maxClientCnxns=100
>>>>>>>>> 
>>>>>>>>> Which settings do you recommend looking at or tuning to speed up reads
>>>>>>>>> with HBase?
>>>>>>>>> 
>>>>>>>>> --
>>>>>>>>> View this message in context:
>>>>>>>>> http://old.nabble.com/Speeding-up-HBase-read-response-tp33635226p33635226.html
>>>>>>>>> Sent from the HBase User mailing list archive at Nabble.com.
>>>>>>>>> 
>>>>>>>>> 
>>>>>>>>> 
>>>>>>>> --
>>>>>>>> View this message in context:
>>>>>>>> http://old.nabble.com/Speeding-up-HBase-read-response-tp33635226p33654666.html
>>>>>>>> Sent from the HBase User mailing list archive at Nabble.com.
>>>>>>>> 
>>>>>> 
>>>>>> -- 
>>>>>> Jeff Whiting
>>>>>> Qualtrics Senior Software Engineer
>>>>>> [email protected]
>>>>>> 
>>>>> 
>>>>> 
>>> 
>> 
> 
