Hi,

I wanted to hear the group's experience with HBase performance as the number of regions per node increases. Is there an optimal number of regions one should aim for?

We are currently running a 17-node HBase (0.20.4) cluster on a 20-node Hadoop (0.20.2) cluster:

  RAM per node: 16G, of which 4G is for HBase
  Space available per node for (Hadoop + HBase): ~1.5T

We are loading 2 tables, each with ~100m rows. This results in ~4000 regions with the default hbase.hregion.max.filesize=256m, and about half that number when we double hbase.hregion.max.filesize to 512m. The two runs did not differ in the time taken, though: ~9 hrs each.

With the current load we are using only 10% of the available disk space. Full utilization would result in a much larger number of regions, hence my interest in the group's experience and suggestions in this regard.

~Jacob
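
P.S. To make the concern concrete, here is a rough back-of-the-envelope sketch in Python. It assumes the region count grows linearly with the amount of data stored, which is a simplification and not something we have measured. The function and parameter names are made up for illustration; only the input numbers come from the figures above.

```python
def projected_regions(current_regions, disk_fraction_used, region_servers):
    """Extrapolate region counts to full disk utilization.

    Assumption (not measured): regions scale linearly with stored data.
    Returns (total regions, regions per region server) at 100% disk usage.
    """
    total = current_regions / disk_fraction_used
    per_server = total / region_servers
    return total, per_server


if __name__ == "__main__":
    # ~4000 regions at 256m, about half that at 512m; 10% of disk used; 17 HBase nodes.
    for max_filesize, regions in (("256m", 4000), ("512m", 2000)):
        total, per_server = projected_regions(regions, 0.10, 17)
        print(f"{max_filesize}: ~{round(total)} regions total, "
              f"~{round(per_server)} per region server")
```

Under that assumption, filling the disks would mean on the order of 40,000 regions at 256m (roughly 2,350 per region server), or about half that at 512m.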