Are there any configurations I need to set to improve read latency? I'm 
running HBase on 10 EC2 m1.large instances (7.5GB RAM each).

Also, as the data size grows, is it normal to see higher read latency?

I'm testing out the YCSB benchmark workload.

With a data size of ~40-50GB (200+ regions), I get latencies of around 
10-20ms and can push the throughput of a read-only workload to around 3000+ 
operations per second.

However, with a data size of ~200GB (1k+ regions), the lowest latency I can 
get is 30+ms (at 100 operations per second), and I can't push the throughput 
beyond 400+ operations per second (110+ms latency).

I tried increasing hbase.hregion.max.filesize to 2GB to reduce the number of 
regions, but that seems to make things worse.
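For reference, this is the change I made in hbase-site.xml (the value is in 
bytes; 2147483648 = 2GB):

```xml
<!-- hbase-site.xml: raise the region split threshold so fewer,
     larger regions are created -->
<property>
  <name>hbase.hregion.max.filesize</name>
  <value>2147483648</value>
</property>
```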


I also tried increasing the heap size to 4GB, setting 
hbase.regionserver.handler.count = 100, and setting vm.swappiness = 0. None of 
these improved performance either.
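Concretely, these are the changes I made (a sketch of my setup, not exact 
commands; adjust paths for your install):

```shell
# hbase-env.sh: raise the region server JVM heap to 4GB (value in MB)
export HBASE_HEAPSIZE=4096

# hbase-site.xml: more RPC handler threads per region server
#   <property>
#     <name>hbase.regionserver.handler.count</name>
#     <value>100</value>
#   </property>

# Kernel: tell the VM subsystem to avoid swapping
# (run as root, or persist in /etc/sysctl.conf)
sysctl -w vm.swappiness=0
```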


I'm also sure that the YCSB client benchmark driver is not the bottleneck, 
because its CPU utilization is low.

Thanks,
Harold
