Hi Ganesh,

Based on my recent experience adding a node to an HBase cluster, I have the following observations: (1) First, you need to schedule downtime. (2) You need to balance your cluster by running the HDFS balancer; it will take 4 to 5 hours (here I am assuming all your data nodes have the same hardware configuration). (3) After the cluster is balanced, perform a major compaction.
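The sequence above can be sketched as shell commands. `hdfs balancer` and `hbase shell` are the real tool entry points, but the `-threshold 10` value and the table name `my_table` are placeholder assumptions for illustration, and the `DRY_RUN` guard is only there so the sketch can be read without touching a live cluster:

```shell
# Sketch of the add-node sequence described above (steps 2 and 3).
# With DRY_RUN=true the commands are only printed, not executed.
DRY_RUN=true

run() {
  if [ "$DRY_RUN" = true ]; then
    echo "would run: $*"
  else
    "$@"
  fi
}

# Step 2: rebalance HDFS block placement across data nodes.
# -threshold 10 asks the balancer to bring every DataNode within 10%
# of the average cluster utilization (tune this for your cluster).
run hdfs balancer -threshold 10

# Step 3: once balancing finishes, major-compact the affected table so
# region servers rewrite their HFiles locally ('my_table' is a placeholder).
echo "major_compact 'my_table'" | run hbase shell
```

Note the ordering: compacting before the HDFS balancer finishes would just have the balancer move the freshly localized blocks again.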
I hope this answers your questions.

Thanks,
Manjeet

On Fri, Jan 6, 2017 at 7:57 AM, Ted Yu <[email protected]> wrote:

> Question #1 seems better suited to the Ambari mailing list.
>
> Have you checked whether the hdfs balancer (not the hbase balancer) was
> active from the restart to the observation of the locality drop?
>
> For StochasticLoadBalancer, there is this cost factor:
>
>     private static final String LOCALITY_COST_KEY =
>         "hbase.master.balancer.stochastic.localityCost";
>     private static final float DEFAULT_LOCALITY_COST = 25;
>
> The default weight is very low. If you don't want to see a drop in
> locality, you can increase the weight (to a ballpark of 500, e.g.) so
> that the hbase balancer doesn't move many regions.
>
> What you described in #2 (locality drop prevention) and #3 (rapid
> increase of locality for moved regions) are two sides of the same coin:
> the reason for adding new nodes is to offload from existing region
> servers, which results in a drop of locality.
>
> Cheers
>
> On Thu, Jan 5, 2017 at 4:47 PM, Ganesh Viswanathan <[email protected]> wrote:
>
> > Hello,
> >
> > I have three questions related to HBase major compactions:
> >
> > 1) During a scheduled maintenance event on the HBase cluster to add 2
> > new regionservers, Ambari said a restart of all HDFS nodes (both name
> > and data) was required. In the logs, it looks like the HBase balancer
> > became active after the two nodes registered. Is it normal to restart
> > all HDFS nodes to add a new node into the cluster? I am using HDP 2.4.
> >
> > 2) Should I turn off the HBase balancer before adding new nodes? If
> > so, when should I turn it back on, and what would be the impact? Would
> > it cause a large drop in locality again?
> >
> > 3) When all the nodes in the cluster were restarted with Ambari,
> > locality dropped to ~13% and HBase was almost non-responsive. Only
> > triggering a manual major compaction seemed to help improve the
> > locality after this. But the data-locality increase is very gradual
> > (about 4% every hour). Is there any way to speed up major compaction
> > (increase the number of threads, etc.) in the HDP distribution?
> >
> > Thanks,
> > Ganesh

--
luv all
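Both Ted's localityCost suggestion and the question about speeding up major compaction come down to `hbase-site.xml` settings. The property names below are the standard HBase ones; the values are only the ballpark Ted mentions (500) plus illustrative thread counts, not recommendations measured on this cluster:

```xml
<!-- Raise the weight of locality in the stochastic balancer's cost
     function so the balancer avoids moves that hurt locality
     (default is 25; 500 is the ballpark suggested above). -->
<property>
  <name>hbase.master.balancer.stochastic.localityCost</name>
  <value>500</value>
</property>

<!-- More compaction threads per region server can speed up compaction
     throughput. Values here are illustrative; the HBase 1.x defaults
     are 1 large and 1 small compaction thread. -->
<property>
  <name>hbase.regionserver.thread.compaction.large</name>
  <value>2</value>
</property>
<property>
  <name>hbase.regionserver.thread.compaction.small</name>
  <value>4</value>
</property>
```

These take effect after a region server restart; raising thread counts trades extra disk and network I/O during compaction for faster locality recovery.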
