For purely random reads, I don't think there is a good way to improve
latency: essentially, every single read has to go through a disk seek.
The latency is definitely a server-side issue (HBase server/HDFS
client) rather than a client-side (YCSB) one.
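A back-of-envelope sketch shows why the throughput caps out roughly where you're seeing it. The numbers below are assumptions for illustration (not measured on your cluster): ~10 ms per random seek on a spinning disk and one data disk per node.

```python
# Rough model of random-read throughput when every read misses cache
# and costs one disk seek.
# Assumed (illustrative, not measured): 10 ms/seek, 1 disk/node, 10 nodes.
seek_time_s = 0.010                      # ~10 ms seek + rotational latency
disks_per_node = 1
nodes = 10

seeks_per_disk_per_s = 1 / seek_time_s   # ~100 random IOs/s per disk
cluster_ops_per_s = seeks_per_disk_per_s * disks_per_node * nodes

print(int(cluster_ops_per_s))            # ~1000 ops/s ceiling for seek-bound reads
```

With the ~40-50GB dataset, a large fraction of the working set fits in the aggregate block cache across 10 nodes, so many reads never touch disk and you can push well past this ceiling; at 200GB nearly every read misses cache and pays the seek, which matches the drop you observed.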


On Sun, May 29, 2011 at 1:23 PM, Harold Lim <[email protected]> wrote:

> Are there any configurations that I need to set to improve read latency?
> I'm running HBase on 10 ec2 m1.large instances (7.5GB RAM).
>
> Also, as the size of the data gets bigger, is it normal to get higher
> read latency?
>
> I'm testing out the YCSB benchmark workload.
>
> With a data size of ~40-50GB (~200+ regions), I can get around 10-20ms and
> I can push the throughput of a read-only workload to around 3000+ operations
> per second.
>
> However, with a data size of 200GB (~1k+ regions), the smallest latency I
> can get is 30+ms (with 100 operations per second) and I can't get the
> throughput to go beyond 400+ operations per second (110+ms latency).
>
> I tried increasing hbase.hregion.max.filesize to 2GB to reduce the
> number of regions, but it seemed to make things worse.
>
>
> I also tried increasing the heap size to 4GB, setting
> hbase.regionserver.handler.count = 100, and vm.swappiness = 0. However,
> none of it improved performance.
>
>
> I'm also sure that the YCSB client benchmark driver is not the
> bottleneck, since its CPU utilization is low.
>
> Thanks,
> Harold
>
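(For reference, the region-size change tried above corresponds to an hbase-site.xml entry like the following; 2 GB must be given in bytes. This is a sketch of the setting as described, not a recommendation to keep it.)

```xml
<!-- hbase-site.xml: raise the region split threshold to 2 GB (value in bytes) -->
<property>
  <name>hbase.hregion.max.filesize</name>
  <value>2147483648</value>
</property>
```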



-- 
--Sean
