It is a bad idea only because it will temporarily disturb the otherwise perfect data locality of the regions hosted by each RegionServer. This does eventually get corrected, but only once the next major compaction of all regions has finished. Both events (the balancer run and the compaction) would cause small performance dips and increased network use and I/O until they are done.
There's no escaping the fact that the more HBase data you write, the faster the 3 RS nodes will fill up compared to the others. What we could do as an enhancement, to help rebalance the remaining replica nodes without affecting the RS, is provide an exclude-nodes feature in the Balancer. By asking the Balancer to exclude the RS nodes, you could rebalance the rest of the cluster without causing a performance problem on the RS in the meantime.

Most clusters run the RS+DN pair across all nodes, so this kind of imbalance won't really occur there.

I filed https://issues.apache.org/jira/browse/HDFS-4509 with some ideas you could use (see comments).

On Sun, Dec 9, 2012 at 4:57 PM, Jabir Ahmed <[email protected]> wrote:
> Our cluster has around 12 data-nodes
>
> 9 nodes run datanodes + task trackers
> 3 nodes run datanodes + region servers
> 1 Namenode and Jobtracker
>
> In this kind of a cluster setup is it advisable to run a re-balancer since
> running a balancer affects the performance of hbase.
>
> Thnx
>
> Jabir

--
Harsh J
