Daniel Ploeg wrote:
Hi all,

I performed a cluster rebalance on my test cluster yesterday (5 regionserver
/ datanodes each with approx 400GB - total approx 2TB HDFS) and I would like
to know if the mailing lists have seen similar results to what I've seen.

I talked to the lads running HBase here at Powerset. They believe they have seen something similar when they grow the cluster by a significant percentage (20-30%). Adding new machines triggers a rebalance, and thereafter HBase runs "faster".

I had a single table with a single column family and loaded it up so that it
just about filled the entire cluster. Actually one or two of the nodes had
run out of space, yet the fifth machine only had 50% of its disks utilised
(which is why I thought a rebalance was in order). There are a total of 1475
regions in the cluster. Prior to starting the rebalance the cluster only had
about 250GB left at its disposal. After the rebalance I now have almost
800GB free.

With 1475 regions, you should update to 0.18.1 (coming soon).

Furthermore, I was performing read tests prior to the rebalance and getting
a response time of approx 500ms per row (each row has 10000 column instances
of the column family which were deserialised as part of the test). After the
rebalance my read times reduced to around 340ms.

If you can have fewer columns per column family, you'll get somewhat better performance: see HBASE-867.

Good on you Daniel,
St.Ack
