HBase and hadoop cluster rebalance

Daniel Ploeg Wed, 15 Oct 2008 14:48:40 -0700

Hi all,

I performed a cluster rebalance on my test cluster yesterday (5 regionserver/
datanode machines, each with approx 400GB, for approx 2TB of HDFS in total)
and I would like to know whether anyone on the list has seen results similar
to mine.


I had a single table with a single column family and loaded it up so that it
just about filled the entire cluster. In fact, one or two of the nodes had
run out of space, yet the fifth machine only had 50% of its disks utilised
(which is why I thought a rebalance was in order). There are a total of 1475
regions in the cluster. Prior to starting the rebalance the cluster only had
about 250GB left at its disposal. After the rebalance I now have almost
800GB free.

Furthermore, I was performing read tests prior to the rebalance and getting
a response time of approx 500ms per row (each row has 10000 column instances
of the column family, which were deserialised as part of the test). After the
rebalance my read times dropped to around 340ms per row.

Has anybody experienced something like this, or can anyone explain why I
would see such a benefit? Does anybody regularly run a cluster rebalance on
a Hadoop cluster running HBase?

Thanks,
Daniel
