The primary unit of load distribution in HBase is the region, so make
sure your table has more than one. This is well documented in the manual:
http://hbase.apache.org/book/perf.writing.html
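
As a rough sketch of what pre-splitting could look like (assuming your row
keys start with a uniformly distributed hex prefix, e.g. a hash, which may
not match your schema), the Python below computes evenly spaced split
points and prints an HBase shell "create ... SPLITS => [...]" statement.
The table name 'events', the column family 'd', and the choice of 12
regions (two per RegionServer) are hypothetical placeholders, not taken
from your setup.

```python
def hex_split_points(num_regions, width=2):
    """Return num_regions - 1 evenly spaced hex split keys.

    Assumes row keys begin with a uniformly distributed hex prefix of
    `width` characters (an assumption for illustration only).
    """
    if num_regions < 2:
        raise ValueError("need at least two regions to spread load")
    keyspace = 16 ** width
    if num_regions > keyspace:
        raise ValueError("width too small for that many regions")
    return [
        format(i * keyspace // num_regions, "0{}x".format(width))
        for i in range(1, num_regions)
    ]


def shell_create_statement(table, family, splits):
    """Build an HBase shell 'create' statement with explicit split keys."""
    quoted = ", ".join("'{}'".format(s) for s in splits)
    return "create '{}', '{}', SPLITS => [{}]".format(table, family, quoted)


if __name__ == "__main__":
    # Hypothetical: 6 RegionServers, two regions each.
    splits = hex_split_points(12)
    print(shell_create_statement("events", "d", splits))
```

Paste the printed statement into the HBase shell to create the table
already split. If your keys are not evenly distributed over that prefix,
the split points would need to come from the real key distribution.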

J-D

On Fri, Mar 1, 2013 at 4:17 AM, Dan Crosta <[email protected]> wrote:
> We are using a 6-node HBase cluster with a Thrift Server on each of the 
> RegionServer nodes, and trying to evaluate maximum write throughput for our 
> use case (which involves many processes sending mutateRowsTs commands). 
> Somewhere between about 30 and 40 processes writing into the system we cross 
> the threshold where adding additional writers yields only very limited 
> returns to throughput, and I'm not sure why. We see that the CPU and Disk on 
> the DataNode/RegionServer/ThriftServer machines are not saturated, nor is the 
> NIC in those machines. I'm a little unsure where to look next.
>
> A little more detail about our deployment:
>
> * CDH 4.1.2
> * DataNode/RegionServer/ThriftServer class: EC2 m1.xlarge
> ** RegionServer: 8GB heap
> ** ThriftServer: 1GB heap
> ** DataNode: 4GB heap
> ** EC2 ephemeral (i.e. local, not EBS) volumes used for HDFS
>
> If there's any other information that I can provide, or any other 
> configuration or system settings I should look at, I'd appreciate the 
> pointers.
>
> Thanks,
>  - Dan
