They work if you remove the last digit of each link.

On Thu, Aug 11, 2016 at 10:49 AM, Zhe Zhang <z...@apache.org> wrote:
> Thanks Nicholas for the pointers; the first 2 links are not working.
>
> On Thu, Aug 11, 2016 at 10:09 AM Tsz Wo Sze <szets...@yahoo.com.invalid> wrote:
> >
> > Hi Senthil,
> >
> > The Balancer performance was improved dramatically recently [1]. I am not
> > sure if you know about the new conf and parameters; see [2]. If you are
> > interested in more details on how the Balancer works, please see [3].
> > Thanks.
> >
> > 1. https://community.hortonworks.com/content/kbentry/43615/hdfs-balancer-1-100x-performance-improvement.html
> > 2. https://community.hortonworks.com/content/kbentry/43849/hdfs-balancer-2-configurations-cli-options.html
> > 3. https://community.hortonworks.com/content/kbentry/44148/hdfs-balancer-3-cluster-balancing-algorithm.html
> >
> > Regards,
> > Tsz-Wo
> >
> > On Thursday, August 11, 2016 6:21 AM, Senthil Kumar <senthilec...@gmail.com> wrote:
> >
> > Hi Team, please add your suggestion(s) here, so that I can tune parameters
> > to balance the cluster, which is in bad shape now :( ..
> >
> > --Senthil
> >
> > On Thu, Aug 11, 2016 at 3:51 PM, Senthil Kumar <senthilec...@gmail.com> wrote:
> >
> > > Thanks Lars for your quick response!
> > >
> > > Here is my cluster utilization:
> > >
> > > DFS Used%: 74.39%
> > > DFS Remaining%: 25.60%
> > > Block Pool Used%: 74.39%
> > > DataNode usages: Min 1.25%, Median 99.72%, Max 99.99%, stdev 22.53%
> > > Hadoop version: 2.4.1
> > >
> > > Let's take an example:
> > >
> > > Cluster live nodes: 1000
> > > Capacity used 95-99%: 700
> > > Capacity used 90-95%: 50
> > > Capacity used < 90%: 250
> > >
> > > I'm looking for an option to balance the data quickly from the nodes in
> > > the 90-95% category to the < 90% category. I know there are options like
> > > -include and -exclude, but they are not helping me (or am I not using
> > > them effectively?
> > > Pls advise here how to use these options properly if I want to balance
> > > my cluster as described above.)
> > >
> > > Is there any option like --force-balance (taking two additional inputs,
> > > force-balance-source-hosts(file) and force-balance-dest-hosts(file))?
> > > That way I believe we could achieve balancing in urgency mode when 90%
> > > of the nodes are hitting 99% disk usage, or when the median is 95% or
> > > above. Pls add your thoughts here.
> > >
> > > Here is the code that constructs the network topology by categorizing
> > > nodes as over-utilized, above-average, below-average and under-utilized.
> > > Sometimes I see nodes with 70% usage also classified as over-utilized
> > > (tried thresholds from 10 to 30). Correct me if anything is wrong in my
> > > understanding.
> > >
> > > https://github.com/apache/hadoop/tree/release-2.4.1/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/balancer
> > >
> > > /* create network topology and all data node lists:
> > >  * overloaded, above-average, below-average, and underloaded.
> > >  * We alternate the accessing of the given datanodes array either by
> > >  * an increasing order or a decreasing order.
> > >  */
> > > long overLoadedBytes = 0L, underLoadedBytes = 0L;
> > > for (DatanodeInfo datanode : DFSUtil.shuffle(datanodes)) {
> > >   if (datanode.isDecommissioned() || datanode.isDecommissionInProgress()) {
> > >     continue; // ignore decommissioning or decommissioned nodes
> > >   }
> > >   cluster.add(datanode);
> > >   BalancerDatanode datanodeS;
> > >   final double avg = policy.getAvgUtilization();
> > >   if (policy.getUtilization(datanode) > avg) {
> > >     datanodeS = new Source(datanode, policy, threshold);
> > >     if (isAboveAvgUtilized(datanodeS)) {
> > >       this.aboveAvgUtilizedDatanodes.add((Source)datanodeS);
> > >     } else {
> > >       assert(isOverUtilized(datanodeS)) :
> > >         datanodeS.getDisplayName() + " is not an overUtilized node";
> > >       this.overUtilizedDatanodes.add((Source)datanodeS);
> > >       overLoadedBytes += (long)((datanodeS.utilization - avg
> > >           - threshold) * datanodeS.datanode.getCapacity() / 100.0);
> > >     }
> > >   } else {
> > >     datanodeS = new BalancerDatanode(datanode, policy, threshold);
> > >     if (isBelowOrEqualAvgUtilized(datanodeS)) {
> > >       this.belowAvgUtilizedDatanodes.add(datanodeS);
> > >     } else {
> > >       assert isUnderUtilized(datanodeS) : "isUnderUtilized("
> > >           + datanodeS.getDisplayName() + ")=" + isUnderUtilized(datanodeS)
> > >           + ", utilization=" + datanodeS.utilization;
> > >       this.underUtilizedDatanodes.add(datanodeS);
> > >       underLoadedBytes += (long)((avg - threshold
> > >           - datanodeS.utilization) * datanodeS.datanode.getCapacity() / 100.0);
> > >     }
> > >   }
> > >   datanodeMap.put(datanode.getDatanodeUuid(), datanodeS);
> > > }
> > >
> > > Could someone help me understand the balancing policy, and which
> > > parameters I should use to balance the cluster (bring down the median)?
> > >
> > > --Senthil
> > >
> > > On Wed, Aug 10, 2016 at 8:21 PM, Lars Francke <lars.fran...@gmail.com> wrote:
> > >
> > >> Hi Senthil,
> > >>
> > >> I'm not sure I fully understand.
> > >>
> > >> If you're using a threshold of 30, that means you have a range of 60%
> > >> that the balancer would consider to be okay.
> > >>
> > >> Example: the used space divided by the total available space in the
> > >> cluster is 80%. Then, with a 30% threshold, the balancer would try to
> > >> bring all nodes within the range of 50-100% utilisation.
> > >>
> > >> The default threshold is 10%, and that is still a fairly huge range,
> > >> especially on clusters that are almost at capacity. So a threshold of 5
> > >> or even lower might work for you.
> > >>
> > >> What is your utilisation in the cluster (used space / available space)?
> > >>
> > >> Cheers,
> > >> Lars
> > >>
> > >> On Wed, Aug 10, 2016 at 3:27 PM, Senthil Kumar <senthilec...@gmail.com> wrote:
> > >>
> > >>> Hi Team, we are running a big cluster (3000 nodes), and many times the
> > >>> median climbs to 99.99% (on 80% of the DNs). The Balancer is running
> > >>> all the time in the cluster, but the median is still not coming down,
> > >>> i.e. below 90%.
> > >>>
> > >>> Here is how I start the balancer:
> > >>> /apache/hadoop/sbin/start-balancer.sh -Ddfs.balance.bandwidthPerSec=104857600 -threshold 30
> > >>>
> > >>> What is the recommended value for threshold? Is there any way to pass a
> > >>> param that only moves blocks from over-utilized (98-100%) to
> > >>> under-utilized nodes?
> > >>>
> > >>> Pls advise!
> > >>>
> > >>> Regards,
> > >>> Senthil
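[Editor's note] To follow the bucketing rule discussed in this thread, here is a minimal standalone sketch of the classification logic quoted from Balancer.java. The class and method names are illustrative, not the actual Hadoop 2.4.1 API; only the comparison rule (over-utilized means utilization > average + threshold, under-utilized means utilization < average - threshold) is taken from the quoted code. It shows why a 70% node should not land in the over-utilized bucket when the cluster average is ~74%, regardless of threshold:

```java
/**
 * Illustrative sketch of the HDFS Balancer's datanode bucketing
 * (names are hypothetical; only the comparison rule mirrors the
 * quoted Balancer.java excerpt).
 */
public class BalancerBuckets {
    public enum Bucket { OVER, ABOVE_AVG, BELOW_AVG, UNDER }

    /**
     * Classify one datanode by its utilization (percent) against the
     * cluster-average utilization and the -threshold option.
     */
    public static Bucket classify(double utilization, double avg, double threshold) {
        if (utilization > avg) {
            // above average: "over-utilized" only beyond avg + threshold
            return utilization > avg + threshold ? Bucket.OVER : Bucket.ABOVE_AVG;
        } else {
            // at or below average: "under-utilized" only below avg - threshold
            return utilization < avg - threshold ? Bucket.UNDER : Bucket.BELOW_AVG;
        }
    }

    public static void main(String[] args) {
        // Numbers from the thread: cluster average ~74.4%, default threshold 10
        double avg = 74.4, threshold = 10.0;
        System.out.println(classify(99.9, avg, threshold)); // OVER      (> 84.4)
        System.out.println(classify(80.0, avg, threshold)); // ABOVE_AVG (74.4..84.4)
        System.out.println(classify(70.0, avg, threshold)); // BELOW_AVG (64.4..74.4)
        System.out.println(classify(50.0, avg, threshold)); // UNDER     (< 64.4)
    }
}
```

Under this rule a 70% node can only be classified as over-utilized if the cluster average plus threshold is below 70, so with the poster's ~74% average it should fall in a middle bucket; the balancer then only moves data out of the OVER and ABOVE_AVG buckets into the other two.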