See http://hbase.apache.org/book.html#performance and the notes over
in the other thread, "How to improve HBase throughput with YCSB?"
St.Ack
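
As a rough sketch of the kind of server-side knobs that chapter covers (the
property names are real, but the values below are illustrative assumptions,
not recommendations), read-path tuning mostly lives in hbase-site.xml:

  <!-- fraction of the region server heap given to the HFile block cache -->
  <property>
    <name>hfile.block.cache.size</name>
    <value>0.4</value>
  </property>

  <!-- number of RPC handler threads per region server -->
  <property>
    <name>hbase.regionserver.handler.count</name>
    <value>50</value>
  </property>

A bigger block cache only helps when reads repeat keys; as Sean notes below,
truly random reads over a data set much larger than RAM stay disk-seek bound
regardless.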

On Sun, May 29, 2011 at 2:28 PM, Sean Bigdatafun
<[email protected]> wrote:
> For pure random reads, I do not think there is a good way to improve
> latency: essentially, every single read has to go through a disk seek.
> The latency is almost certainly on the server side (HBase server / HDFS
> client) rather than the client (YCSB).
>
>
> On Sun, May 29, 2011 at 1:23 PM, Harold Lim <[email protected]> wrote:
>
>> Are there any configurations that I need to set to improve read latency?
>> I'm running HBase on 10 EC2 m1.large instances (7.5 GB RAM each).
>>
>> Also, as the data size gets bigger, is it normal to see higher read
>> latency?
>>
>> I'm testing out the YCSB benchmark workload.
>>
>> With a data size of ~40-50 GB (200+ regions), I get around 10-20 ms read
>> latency and can push the throughput of a read-only workload to around
>> 3000+ operations per second.
>>
>> However, with a data size of 200 GB (1k+ regions), the lowest latency I
>> can get is 30+ ms (at 100 operations per second), and I can't push the
>> throughput beyond 400+ operations per second (110+ ms latency).
>>
>> I tried increasing hbase.hregion.max.filesize to 2 GB to reduce the number
>> of regions, but that seems to make things worse.
>>
>>
>> I also tried increasing the heap size to 4 GB, setting
>> hbase.regionserver.handler.count = 100, and setting vm.swappiness = 0.
>> However, that still didn't improve performance.
>>
>>
>> I'm also sure that the YCSB client benchmark driver is not the bottleneck,
>> because its CPU utilization is low.
>>
>> Thanks,
>> Harold
>>
>
>
>
> --
> --Sean
>
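
For reference, the settings Harold mentions above are applied roughly as
follows (the values are just the ones reported in this thread, not
recommendations):

  hbase-env.sh (HBase JVM heap, in MB):
    export HBASE_HEAPSIZE=4096

  hbase-site.xml (maximum store file size, in bytes, before a region splits):
    <property>
      <name>hbase.hregion.max.filesize</name>
      <value>2147483648</value>
    </property>

  OS level, as root (discourage the kernel from swapping out the JVM heap):
    sysctl -w vm.swappiness=0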
