On Mon, Mar 25, 2013 at 12:48 AM, Tapas Sarangi <[email protected]> wrote:

> On Mar 24, 2013, at 3:40 PM, Alexey Babutin <[email protected]> wrote:
>
>> You said that threshold=10. Run the command manually: hadoop balancer
>> -threshold 9.5, then 9, and so on in 0.5 steps.
>
> We are not setting the threshold anywhere in our configuration, so we are
> using the default, which I believe is 10.
> Why do you suggest such steps need to be tested for the balancer? Please
> explain. I guess we had a discussion earlier on this thread and came to the
> conclusion that the threshold will not help in this situation.

Today I thought about my advice for you and I understood that I was wrong.

For example, say we have 100 nodes, 80 of them with 12Tb and 20 with 72Tb,
and every node holds 10Tb of data. The average cluster DFS used is
1000/2400*100 = 41.7%. For a 12Tb node, DFS used is 83.3% of capacity; for a
72Tb node it is 13.9%. A node is considered balanced if:

  average cluster DFS used + threshold > node DFS used > average cluster DFS used - threshold

Data will move from the 12Tb nodes to the 72Tb nodes, and once the 12Tb
nodes are at about 51.7% of capacity the balancer will stop. At that point
the 72Tb nodes are at about 35% of capacity. As the cluster fills up, in the
ideal case where the cluster is 90% used, the 72Tb nodes will be at about
83% of capacity and the 12Tb nodes at about 100%. After that you have about
240Tb of free space, all of it on the large nodes.
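
To make that arithmetic concrete, here is a rough sketch of the
balanced-node test described above. It is illustrative only, not the actual
code in Balancer.java; the class and method names are made up for the
example.

    // Rough sketch of the "balanced node" test described above -- illustrative
    // only, not the real Balancer.java logic. Class and method names are made up.
    public class BalancerSketch {

        // A node is considered balanced when its utilization lies within
        // `thresholdPct` percentage points of the average cluster utilization.
        static boolean isBalanced(double usedTb, double capacityTb,
                                  double avgUtilPct, double thresholdPct) {
            double nodeUtilPct = usedTb / capacityTb * 100.0;
            return nodeUtilPct < avgUtilPct + thresholdPct
                && nodeUtilPct > avgUtilPct - thresholdPct;
        }

        public static void main(String[] args) {
            // The example above: 80 nodes of 12Tb, 20 nodes of 72Tb, 10Tb of data each.
            double capacityTb = 80 * 12.0 + 20 * 72.0;      // 2400Tb total capacity
            double usedTb     = 100 * 10.0;                 // 1000Tb total data
            double avgUtil    = usedTb / capacityTb * 100;  // ~41.7%
            double threshold  = 10.0;                       // default balancer threshold

            System.out.printf("average cluster utilization: %.1f%%%n", avgUtil);
            // 12Tb node holding 10Tb (~83.3% used): over-utilized -> false
            System.out.println(isBalanced(10, 12, avgUtil, threshold));
            // 72Tb node holding 10Tb (~13.9% used): under-utilized -> false
            System.out.println(isBalanced(10, 72, avgUtil, threshold));
            // 12Tb node once it is down to 6Tb (50% used): inside the band -> true
            System.out.println(isBalanced(6, 12, avgUtil, threshold));
        }
    }

Running it shows that both node types start well outside the +/-10 band
around the ~41.7% average, which is why the balancer keeps moving blocks off
the small nodes until they reach the upper edge of the band and then stops.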
> -----
>
> On Sun, Mar 24, 2013 at 11:01 PM, Tapas Sarangi <[email protected]> wrote:
>
>> Yes, thanks for pointing that out, but I already know that it is
>> completing the balancing when it exits; otherwise it shouldn't exit.
>> Your answer doesn't solve the problem I mentioned earlier in my message.
>> 'hdfs' is stalling and hadoop is not writing unless space is cleared up
>> from the cluster, even though "df" shows the cluster has about 500 TB of
>> free space.
>>
>> -------
>>
>> On Mar 24, 2013, at 1:54 PM, Balaji Narayanan (பாலாஜி நாராயணன்)
>> <[email protected]> wrote:
>>
>> -setBalancerBandwidth <bandwidth in bytes per second>
>>
>> So the value is bytes per second. If it is running and exiting, it means
>> it has completed the balancing.
>>
>> On 24 March 2013 11:32, Tapas Sarangi <[email protected]> wrote:
>>
>>> Yes, we are running the balancer, though a balancer process runs for
>>> almost a day or more before exiting and starting over.
>>> The current dfs.balance.bandwidthPerSec value is set to 2x10^9. I assume
>>> that's bytes, so about 2 GigaBytes/sec. Shouldn't that be reasonable? If
>>> it is in bits then we have a problem.
>>> What's the unit for "dfs.balance.bandwidthPerSec"?
>>>
>>> -----
>>>
>>> On Mar 24, 2013, at 1:23 PM, Balaji Narayanan (பாலாஜி நாராயணன்)
>>> <[email protected]> wrote:
>>>
>>> Are you running the balancer? If the balancer is running and it is slow,
>>> try increasing the balancer bandwidth.
>>>
>>> On 24 March 2013 09:21, Tapas Sarangi <[email protected]> wrote:
>>>
>>>> Thanks for the follow-up. I don't know whether an attachment will pass
>>>> through this mailing list, but I am attaching a pdf that contains the
>>>> usage of all live nodes.
>>>>
>>>> All nodes starting with the letter "g" are the ones with smaller storage
>>>> space, whereas nodes starting with the letter "s" have larger storage
>>>> space. As you will see, most of the "gXX" nodes are completely full
>>>> whereas the "sXX" nodes have a lot of unused space.
>>>>
>>>> Recently, we are frequently facing a crisis where 'hdfs' goes into a
>>>> mode in which it is not able to write any further, even though the total
>>>> space available in the cluster is about 500 TB. We believe this has
>>>> something to do with the way it is balancing the nodes, but we don't
>>>> understand the problem yet. Maybe the attached PDF will help some of you
>>>> (experts) see what is going wrong here...
>>>>
>>>> Thanks
>>>> ------
>>>>
>>>> The Balancer knows about topology, but when it calculates balancing it
>>>> operates only on nodes, not on racks.
>>>> You can see how it works in Balancer.java, in BalancerDatanode, around
>>>> line 509.
>>>>
>>>> I was wrong about the 350Tb / 35Tb figure; it is calculated this way:
>>>>
>>>> For example:
>>>> cluster_capacity = 3.5Pb
>>>> cluster_dfsused = 2Pb
>>>>
>>>> avgutil = cluster_dfsused/cluster_capacity*100 = 57.14% of cluster
>>>> capacity used.
>>>> Then we know each node's utilization (node_dfsused/node_capacity*100).
>>>> The balancer thinks all is good if avgutil+10 > node_utilization >=
>>>> avgutil-10.
>>>>
>>>> The ideal case is that every node uses avgutil of its capacity, but for
>>>> a 12Tb node that is only about 6.9Tb and for a 72Tb node it is about
>>>> 41Tb.
>>>>
>>>> The balancer can't help you.
>>>>
>>>> Show me http://namenode.rambler.ru:50070/dfsnodelist.jsp?whatNodes=LIVE
>>>> if you can.
>>>>
>>>>> In the ideal case with replication factor 2, with two nodes of 12Tb and
>>>>> 72Tb you will be able to hold only 12Tb of replicated data.
>>>>>
>>>>> Yes, this is true for exactly two nodes in the cluster with 12 TB and
>>>>> 72 TB, but not true for more than two nodes in the cluster.
>>>>>
>>>>> The best way, in my opinion, is to use multiple racks. Nodes in a rack
>>>>> must have identical capacity, and racks must have identical capacity.
>>>>> For example:
>>>>>
>>>>> rack1: 1 node with 72Tb
>>>>> rack2: 6 nodes with 12Tb
>>>>> rack3: 3 nodes with 24Tb
>>>>>
>>>>> It helps with balancing, because the duplicated block must be on
>>>>> another rack.
>>>>>
>>>>> The same question I asked earlier in this message: do multiple racks
>>>>> with the default balancer threshold minimize the difference between
>>>>> racks?
>>>>>
>>>>> Why did you select hdfs? Maybe lustre, cephfs or something else is a
>>>>> better choice.
>>>>>
>>>>> It wasn't my decision, and I probably can't change it now. I am new to
>>>>> this cluster and trying to understand a few issues. I will explore the
>>>>> other options you mentioned.
>>>>>
>>>>> --
>>>>> http://balajin.net/blog
>>>>> http://flic.kr/balajijegan
>>
>> --
>> http://balajin.net/blog
>> http://flic.kr/balajijegan
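
One footnote on the bandwidth question earlier in the thread: both
dfs.balance.bandwidthPerSec and dfsadmin -setBalancerBandwidth take a value
in bytes per second per datanode, so the 2x10^9 mentioned above is roughly
2 GB/s. A minimal hdfs-site.xml sketch with that value (illustrative only;
the shipped default is much lower, on the order of 1 MB/s):

    <!-- hdfs-site.xml: limit on how fast each datanode may move blocks
         for the balancer, in bytes per second. 2000000000 is the ~2 GB/s
         figure mentioned in the thread. -->
    <property>
      <name>dfs.balance.bandwidthPerSec</name>
      <value>2000000000</value>
    </property>

As noted above, the same limit can also be changed at runtime with
hadoop dfsadmin -setBalancerBandwidth <bandwidth in bytes per second>.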
