On both types of nodes, what is your dfs.data.dir set to? Does it list multiple folders on the same set of drives, or is it one-to-one between folder and drive? If it lists multiple folders on the same drives, it is probably overcounting the "available capacity", because it assumes a one-to-one relationship between each folder and the total capacity of a drive.
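To check this on a datanode, something like the following might help. It is only a rough sketch, not what HDFS itself runs: it assumes you paste in the directories from dfs.data.dir by hand, and the function name capacity_report is made up for illustration. It groups the directories by the device they live on and compares the per-folder capacity sum with the per-drive sum.

```python
import os
import shutil
from collections import defaultdict


def capacity_report(data_dirs):
    """Compare per-folder and per-device capacity for a list of directories.

    data_dirs is assumed to be the comma-separated dfs.data.dir value,
    already split into individual paths (hypothetical input).
    """
    # Group directories by the device (filesystem) that holds them.
    by_device = defaultdict(list)
    for path in data_dirs:
        by_device[os.stat(path).st_dev].append(path)

    # Naive total: every folder counted as if it owned a whole drive.
    naive_total = sum(shutil.disk_usage(p).total for p in data_dirs)

    # De-duplicated total: each device counted exactly once.
    actual_total = sum(
        shutil.disk_usage(paths[0]).total for paths in by_device.values()
    )

    # Devices that back more than one configured folder.
    shared = {dev: paths for dev, paths in by_device.items() if len(paths) > 1}
    return naive_total, actual_total, shared
```

If the first number is larger than the second, at least one drive is being counted more than once, and the third value shows which folders share it.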
On Sun, Mar 24, 2013 at 3:01 PM, Tapas Sarangi <[email protected]> wrote:

> Yes, thanks for pointing, but I already know that it is completing the
> balancing when exiting otherwise it shouldn't exit.
> Your answer doesn't solve the problem I mentioned earlier in my message.
> 'hdfs' is stalling and hadoop is not writing unless space is cleared up
> from the cluster even though "df" shows the cluster has about 500 TB of
> free space.
>
> -------
>
> On Mar 24, 2013, at 1:54 PM, Balaji Narayanan (பாலாஜி நாராயணன்) <
> [email protected]> wrote:
>
> -setBalancerBandwidth <bandwidth in bytes per second>
>
> So the value is bytes per second. If it is running and exiting,it means it
> has completed the balancing.
>
> On 24 March 2013 11:32, Tapas Sarangi <[email protected]> wrote:
>
>> Yes, we are running balancer, though a balancer process runs for almost a
>> day or more before exiting and starting over.
>> Current dfs.balance.bandwidthPerSec value is set to 2x10^9. I assume
>> that's bytes so about 2 GigaByte/sec. Shouldn't that be reasonable ? If it
>> is in Bits then we have a problem.
>> What's the unit for "dfs.balance.bandwidthPerSec" ?
>>
>> -----
>>
>> On Mar 24, 2013, at 1:23 PM, Balaji Narayanan (பாலாஜி நாராயணன்) <
>> [email protected]> wrote:
>>
>> Are you running balancer? If balancer is running and if it is slow, try
>> increasing the balancer bandwidth
>>
>> On 24 March 2013 09:21, Tapas Sarangi <[email protected]> wrote:
>>
>>> Thanks for the follow up. I don't know whether attachment will pass
>>> through this mailing list, but I am attaching a pdf that contains the usage
>>> of all live nodes.
>>>
>>> All nodes starting with letter "g" are the ones with smaller storage
>>> space where as nodes starting with letter "s" have larger storage space. As
>>> you will see, most of the "gXX" nodes are completely full whereas "sXX"
>>> nodes have a lot of unused space.
>>>
>>> Recently, we are facing crisis frequently as 'hdfs' goes into a mode
>>> where it is not able to write any further even though the total space
>>> available in the cluster is about 500 TB. We believe this has something to
>>> do with the way it is balancing the nodes, but don't understand the problem
>>> yet. May be the attached PDF will help some of you (experts) to see what is
>>> going wrong here...
>>>
>>> Thanks
>>> ------
>>>
>>> Balancer know about topology,but when calculate balancing it operates
>>> only with nodes not with racks.
>>> You can see how it work in Balancer.java in BalancerDatanode about
>>> string 509.
>>>
>>> I was wrong about 350Tb,35Tb it calculates in such way :
>>>
>>> For example:
>>> cluster_capacity=3.5Pb
>>> cluster_dfsused=2Pb
>>>
>>> avgutil=cluster_dfsused/cluster_capacity*100=57.14% used cluster capacity
>>> Then we know avg node utilization (node_dfsused/node_capacity*100)
>>> .Balancer think that all good if
>>> avgutil+10>node_utilizazation>=avgutil-10.
>>>
>>> Ideal case that all node used avgutl of capacity.but for 12TB node its
>>> only 6.5Tb and for 72Tb its about 40Tb.
>>>
>>> Balancer cant help you.
>>>
>>> Show me http://namenode.rambler.ru:50070/dfsnodelist.jsp?whatNodes=LIVE if
>>> you can.
>>>
>>>>
>>>> In ideal case with replication factor 2 ,with two nodes 12Tb and 72Tb
>>>> you will be able to have only 12Tb replication data.
>>>>
>>>> Yes, this is true for exactly two nodes in the cluster with 12 TB and
>>>> 72 TB, but not true for more than two nodes in the cluster.
>>>>
>>>> Best way,on my opinion,it is using multiple racks.Nodes in rack must be
>>>> with identical capacity.Racks must be identical capacity.
>>>> For example:
>>>>
>>>> rack1: 1 node with 72Tb
>>>> rack2: 6 nodes with 12Tb
>>>> rack3: 3 nodes with 24Tb
>>>>
>>>> It helps with balancing,because dublicated block must be another rack.
>>>>
>>>> The same question I asked earlier in this message, does multiple racks
>>>> with default threshold for the balancer minimizes the difference between
>>>> racks ?
>>>>
>>>> Why did you select hdfs?May be lustre,cephfs and other is better
>>>> choise.
>>>>
>>>> It wasn't my decision, and I probably can't change it now. I am new to
>>>> this cluster and trying to understand few issues. I will explore other
>>>> options as you mentioned.
>>>>
>>>> --
>>>> http://balajin.net/blog
>>>> http://flic.kr/balajijegan
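
For reference, the balancer arithmetic quoted above can be worked through in a few lines. This is only a sketch of the rule as described in the thread (average utilization plus or minus a threshold of 10), not code taken from Balancer.java; the function names are made up, and the capacities are the example figures from the thread expressed in TB.

```python
def average_utilization(dfs_used, capacity):
    """Cluster-wide utilization in percent."""
    return dfs_used / capacity * 100.0


def ideal_usage(node_capacity, avg_util):
    """Space a node would hold if it sat exactly at the cluster average."""
    return node_capacity * avg_util / 100.0


def is_balanced(node_used, node_capacity, avg_util, threshold=10.0):
    """Rule from the thread: avgutil-10 <= node utilization < avgutil+10."""
    node_util = node_used / node_capacity * 100.0
    return avg_util - threshold <= node_util < avg_util + threshold


# Example figures from the thread: 3.5 PB capacity, 2 PB used (in TB here).
avg = average_utilization(2000.0, 3500.0)

for capacity_tb in (12.0, 72.0):
    print(capacity_tb, round(ideal_usage(capacity_tb, avg), 2))
```

With these numbers the average comes out at about 57.14%, so a 12 TB node would ideally hold roughly 6.9 TB and a 72 TB node roughly 41 TB, in line with the approximate figures quoted above. A completely full 12 TB node is at 100% and falls well outside the threshold.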
