I'm experimenting with the HBase (0.94.4) and Hadoop (1.0.4) runtime balancers on a tiny 4-node cluster, and I'm finding that read performance stays poor for roughly an hour after adding new nodes, even though all the data appears to be offloaded within a few minutes.
Using YCSB-generated requests, I load the database with about 18 GB of data across 12 million records. The keyspace is initially pre-split to give ~30 regions per node, with a replication factor of 1. I then hit HBase with read requests from a set of clients such that 2000 requests are outstanding at all times, with a new request issued as soon as a reply comes back.

About 2 minutes into the read workload, I double the cluster from 4 to 8 nodes: I add the new hostnames to the slaves and regionservers files, start the new DataNode and RegionServer processes, and run the HDFS balancer with the smallest possible threshold of 1. hbase.balancer.period is set to 10 seconds so the region balancer responds quickly to the new nodes, and dfs.balance.bandwidthPerSec is set to 8 MB/s per node (I have also tried higher values that don't bottleneck the offload rate and gotten similar results).

The expected behavior is that about half the data and regions are offloaded to the new nodes over the next few minutes, and the logs (and dfsadmin reports) confirm that this is indeed happening. However, throughput drops from 3000 to 500 requests/sec as soon as the nodes are added, latencies jump from ~1 s to 4-5 s, and this poor performance persists for almost 45 minutes, at which point it abruptly improves to 4000 requests/sec and 500 ms latencies. By contrast, if I add the new nodes and balance onto them before starting any read requests, performance is good from the start.

I'd appreciate any ideas as to what could be causing that 45-minute performance delay, and how I can debug it further.

Thanks,
Eric
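
P.S. For reference, the per-node procedure and the relevant settings look roughly like the sketch below (hostnames and paths are placeholders, and the config values are paraphrased rather than pasted from the actual files):

    # On the master: register the new node in the topology files
    echo "newnode5.example.com" >> $HADOOP_HOME/conf/slaves
    echo "newnode5.example.com" >> $HBASE_HOME/conf/regionservers

    # On the new node: bring up the HDFS and HBase daemons
    $HADOOP_HOME/bin/hadoop-daemon.sh start datanode
    $HBASE_HOME/bin/hbase-daemon.sh start regionserver

    # From the master: run the HDFS balancer with the smallest threshold (a percentage)
    $HADOOP_HOME/bin/hadoop balancer -threshold 1

    # Settings already in place before the run:
    #   hdfs-site.xml:  dfs.balance.bandwidthPerSec = 8388608   (8 MB/s, in bytes/sec)
    #   hbase-site.xml: hbase.balancer.period       = 10000     (10 s, in ms)

    # How I watch the data move onto the new nodes
    $HADOOP_HOME/bin/hadoop dfsadmin -report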
