On Mon, Mar 25, 2013 at 4:29 AM, Tapas Sarangi <[email protected]> wrote:
> Hi,
>
> Thanks for the explanation. Where can I find the Java code for the
> balancer that uses the threshold value, so that I can do the calculation
> myself as you mentioned? I think I understand your calculation, but
> would like to see the code.

src/hdfs/org/apache/hadoop/hdfs/server/balancer/Balancer.java
See BalancerDatanode.

> If I set the threshold to 5 instead of 10, then the smaller nodes will
> be at most 95% full, whereas the larger nodes' disk usage will increase
> from 80% to 85%.
>
> Now my question to you and the experts is: when I run the balancer, is
> the following command enough to set the threshold to a different value?
>
> hadoop balancer -threshold 5

Yes.

> Thanks to all for the suggestions...
>
> -------
>
> > Today I thought about my advice to you and realized that I was wrong.
> >
> > For example, we have 100 nodes: 80 with 12 TB and 20 with 72 TB. All
> > nodes hold 10 TB of data.
> > Average cluster DFS used: 1000/2600*100 = 38.5
> >
> > For a 12 TB node, DFS used is 83.3% of capacity.
> > For a 72 TB node, it is 13.9%.
> >
> > A node is balanced if
> > average cluster DFS used + threshold > node DFS used
> > > average cluster DFS used - threshold.
> > Data will move from the 12 TB nodes to the 72 TB nodes, and when the
> > 12 TB nodes reach 48.5% of capacity the balancer will stop.
> > At that point the 72 TB nodes will be at 36.1% of capacity.
> >
> > As the cluster fills up, in the ideal case, when cluster DFS used
> > reaches 90% of capacity, the 72 TB nodes will be at about 80% of
> > capacity and the 12 TB nodes at about 100%. After that you have
> > about 288 TB of free space.
>
>> -----
>>
>> On Sun, Mar 24, 2013 at 11:01 PM, Tapas Sarangi
>> <[email protected]> wrote:
>>
>>> Yes, thanks for pointing that out, but I already know that it has
>>> completed the balancing when it exits; otherwise it shouldn't exit.
>>> Your answer doesn't solve the problem I mentioned earlier in my message.
>>> 'hdfs' is stalling and hadoop is not writing unless space is cleared up
>>> from the cluster, even though "df" shows the cluster has about 500 TB of
>>> free space.
>>>
>>> -------
>>>
>>> On Mar 24, 2013, at 1:54 PM, Balaji Narayanan (பாலாஜி நாராயணன்) <
>>> [email protected]> wrote:
>>>
>>> -setBalancerBandwidth <bandwidth in bytes per second>
>>>
>>> So the value is in bytes per second. If it is running and exiting, it
>>> means it has completed the balancing.
>>>
>>> On 24 March 2013 11:32, Tapas Sarangi <[email protected]> wrote:
>>>
>>>> Yes, we are running the balancer, though a balancer process runs for
>>>> almost a day or more before exiting and starting over.
>>>> The current dfs.balance.bandwidthPerSec value is set to 2x10^9. I
>>>> assume that's bytes, so about 2 gigabytes/sec. Shouldn't that be
>>>> reasonable? If it is in bits then we have a problem.
>>>> What's the unit for "dfs.balance.bandwidthPerSec"?
>>>>
>>>> -----
>>>>
>>>> On Mar 24, 2013, at 1:23 PM, Balaji Narayanan (பாலாஜி நாராயணன்) <
>>>> [email protected]> wrote:
>>>>
>>>> Are you running the balancer? If the balancer is running and it is
>>>> slow, try increasing the balancer bandwidth.
>>>>
>>>> On 24 March 2013 09:21, Tapas Sarangi <[email protected]> wrote:
>>>>
>>>>> Thanks for the follow-up. I don't know whether an attachment will
>>>>> pass through this mailing list, but I am attaching a PDF that
>>>>> contains the usage of all live nodes.
>>>>>
>>>>> All nodes starting with the letter "g" are the ones with smaller
>>>>> storage space, whereas nodes starting with the letter "s" have larger
>>>>> storage space. As you will see, most of the "gXX" nodes are
>>>>> completely full whereas the "sXX" nodes have a lot of unused space.
>>>>>
>>>>> Recently we have frequently been facing a crisis where 'hdfs' goes
>>>>> into a mode in which it cannot write any further, even though the
>>>>> total space available in the cluster is about 500 TB.
>>>>> We believe this has something to do with the way it is balancing the
>>>>> nodes, but we don't understand the problem yet. Maybe the attached
>>>>> PDF will help some of you (experts) to see what is going wrong here...
>>>>>
>>>>> Thanks
>>>>> ------
>>>>>
>>>>> The balancer knows about the topology, but when it calculates the
>>>>> balancing it operates only on nodes, not on racks.
>>>>> You can see how it works in Balancer.java, in BalancerDatanode,
>>>>> around line 509.
>>>>>
>>>>> I was wrong about 350 TB and 35 TB. It calculates in this way:
>>>>>
>>>>> For example:
>>>>> cluster_capacity=3.5Pb
>>>>> cluster_dfsused=2Pb
>>>>>
>>>>> avgutil=cluster_dfsused/cluster_capacity*100=57.14% of cluster
>>>>> capacity used
>>>>> Then we know each node's utilization
>>>>> (node_dfsused/node_capacity*100). The balancer considers a node fine
>>>>> if avgutil+10 > node_utilization >= avgutil-10.
>>>>>
>>>>> The ideal case is that every node uses avgutil of its capacity, but
>>>>> for a 12 TB node that is only 6.5 TB and for a 72 TB node it is
>>>>> about 40 TB.
>>>>>
>>>>> The balancer can't help you.
>>>>>
>>>>> Show me
>>>>> http://namenode.rambler.ru:50070/dfsnodelist.jsp?whatNodes=LIVE if
>>>>> you can.
>>>>>
>>>>>> In the ideal case, with replication factor 2 and two nodes of 12 TB
>>>>>> and 72 TB, you will be able to store only 12 TB of replicated data.
>>>>>>
>>>>>> Yes, this is true for exactly two nodes in the cluster with 12 TB and
>>>>>> 72 TB, but not true for more than two nodes in the cluster.
>>>>>>
>>>>>> The best way, in my opinion, is to use multiple racks. Nodes within
>>>>>> a rack must have identical capacity, and racks must have identical
>>>>>> capacity.
>>>>>> For example:
>>>>>>
>>>>>> rack1: 1 node with 72Tb
>>>>>> rack2: 6 nodes with 12Tb
>>>>>> rack3: 3 nodes with 24Tb
>>>>>>
>>>>>> It helps with balancing, because a duplicated block must be on
>>>>>> another rack.
>>>>>>
>>>>>> The same question I asked earlier in this message: does using
>>>>>> multiple racks with the default balancer threshold minimize the
>>>>>> difference between racks?
>>>>>>
>>>>>> Why did you select HDFS? Maybe Lustre, CephFS, or another file
>>>>>> system would be a better choice.
>>>>>>
>>>>>> It wasn't my decision, and I probably can't change it now. I am new
>>>>>> to this cluster and trying to understand a few issues. I will explore
>>>>>> other options as you mentioned.
>>>>>>
>>>>>> --
>>>>>> http://balajin.net/blog
>>>>>> http://flic.kr/balajijegan
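
The balance check described in this thread can be sketched in a few lines of Python. This is only an illustration of the rule as stated in the messages (a node counts as balanced when its utilization is within the threshold of the cluster average). It is not the code from Balancer.java, and the function names are made up for the example. One thing the sketch shows: with the node counts given in the example (80 nodes of 12 TB and 20 nodes of 72 TB, each holding 10 TB), the total capacity works out to 2400 TB, so the average comes to about 41.7% rather than the 38.5% quoted.

```python
# Sketch of the balance rule discussed in the thread.
# Assumption: all names here are hypothetical; this is not Hadoop code.

def utilization(used_tb, capacity_tb):
    """DFS used as a percentage of capacity."""
    return used_tb / capacity_tb * 100.0


def cluster_average(nodes):
    """Average utilization: total DFS used over total capacity.

    nodes is a list of (used_tb, capacity_tb) pairs.
    """
    total_used = sum(used for used, _ in nodes)
    total_capacity = sum(capacity for _, capacity in nodes)
    return utilization(total_used, total_capacity)


def is_balanced(used_tb, capacity_tb, avg, threshold=10.0):
    """True if node utilization lies within avg +/- threshold (in percent)."""
    util = utilization(used_tb, capacity_tb)
    return avg - threshold <= util <= avg + threshold


def ideal_used_tb(capacity_tb, avg):
    """Data a node would hold if it sat exactly at the cluster average."""
    return capacity_tb * avg / 100.0


# Example from the thread: 80 nodes of 12 TB and 20 nodes of 72 TB,
# each holding 10 TB of data.
nodes = [(10.0, 12.0)] * 80 + [(10.0, 72.0)] * 20
avg = cluster_average(nodes)
print("cluster average: %.1f%%" % avg)

for used, capacity in [(10.0, 12.0), (10.0, 72.0)]:
    print("%2.0f TB node: %.1f%% used, balanced=%s" % (
        capacity, utilization(used, capacity),
        is_balanced(used, capacity, avg)))

# Second example from the thread: 2 PB used out of 3.5 PB.
avg2 = utilization(2.0, 3.5)
print("ideal data on a 12 TB node: %.2f TB" % ideal_used_tb(12.0, avg2))
print("ideal data on a 72 TB node: %.2f TB" % ideal_used_tb(72.0, avg2))
```

Passing threshold=5.0 to is_balanced corresponds to running the balancer with -threshold 5; a smaller threshold narrows the band around the average that a node must fall into.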
