Re: disk used percentage is not symmetric on datanodes (balancer)

Alexey Babutin Mon, 25 Mar 2013 07:14:32 -0700

On Mon, Mar 25, 2013 at 4:29 AM, Tapas Sarangi <[email protected]> wrote:


> Hi,
>
> Thanks for the explanation. Where can I find the Java code for the balancer
> that uses the threshold value, so that I can do the calculation myself as you
> mentioned? I think I understand your calculation, but would like to see the code.
>

src/hdfs/org/apache/hadoop/hdfs/server/balancer/Balancer.java

See BalancerDatanode in that file.


> If I set the threshold to 5 instead of 10, then the smaller nodes will
> be at most 95% full, while the larger nodes' disk usage will increase
> from 80% to 85%.
>
> Now my question to you and the experts is: when I run the balancer, is the
> following command enough to set the threshold to a different value?
>
> hadoop balancer -threshold 5
>
Yes.

>
> Thanks to all for the suggestions...
>
> -------
>
>
>
> Today I thought about my advice to you and realized that I was wrong.
>
> For example, say we have 100 nodes: 80 with 12 TB and 20 with 72 TB. Every
> node holds 10 TB of data.
> Average cluster DFS used: 1000/2400*100 = 41.7%
>
> For the 12 TB nodes, DFS used is 83.3% of capacity.
> For the 72 TB nodes, it is 13.9%.
>
> A node is balanced if
> average cluster DFS used + threshold > node DFS used > average cluster DFS used - threshold.
> Data will move from the 12 TB nodes to the 72 TB nodes, and the balancer
> will stop when the 12 TB nodes are at 51.7% of capacity.
> At that point the 72 TB nodes will be at 35.0% of capacity.
>
> As the cluster fills up, in the ideal case, when cluster DFS used reaches
> about 90% of capacity, the 72 TB nodes will be at about 80% of capacity and
> the 12 TB nodes at about 100%. At that point you still have about 288 TB of
> free space, all of it on the 72 TB nodes.
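
The arithmetic in the example above can be checked with a short Python sketch. It models only the simplified rule described in the message (small nodes give up data until they reach the cluster average plus the threshold) and is not the logic of Balancer.java. The function name and dictionary keys are made up for illustration. Summing the stated node sizes, 80 x 12 TB + 20 x 72 TB, gives a total capacity of 2400 TB.

```python
def balancer_stop_point(small_n, small_cap, large_n, large_cap,
                        data_per_node, threshold):
    """Simplified model: small nodes shed data to large nodes until the
    small nodes are at (cluster average + threshold) percent utilization."""
    capacity = small_n * small_cap + large_n * large_cap
    used = (small_n + large_n) * data_per_node
    avg = used / capacity * 100.0

    # Data left on each small node once it reaches the upper bound.
    small_after = min(data_per_node, small_cap * (avg + threshold) / 100.0)
    # Everything removed from the small nodes is spread over the large nodes.
    moved = small_n * (data_per_node - small_after)
    large_after = data_per_node + moved / large_n

    return {
        "capacity_tb": capacity,
        "avg_pct": avg,
        "small_before_pct": data_per_node / small_cap * 100.0,
        "large_before_pct": data_per_node / large_cap * 100.0,
        "small_after_pct": small_after / small_cap * 100.0,
        "large_after_pct": large_after / large_cap * 100.0,
    }


# 80 nodes of 12 TB, 20 nodes of 72 TB, 10 TB on every node, threshold 10.
result = balancer_stop_point(80, 12, 20, 72, 10, 10)
for key, value in result.items():
    print(f"{key}: {value:.1f}")
```

With these inputs the sketch reports an average of 41.7%, the 12 TB nodes dropping from 83.3% to 51.7%, and the 72 TB nodes rising from 13.9% to 35.0%.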
>
>>
>>
>> -----
>>
>>
>>
>>
>> On Sun, Mar 24, 2013 at 11:01 PM, Tapas Sarangi 
>> <[email protected]>wrote:
>>
>>> Yes, thanks for pointing that out, but I already know that the balancer
>>> has completed balancing when it exits; otherwise it shouldn't exit.
>>> Your answer doesn't solve the problem I mentioned earlier in my message.
>>> 'hdfs' is stalling and hadoop is not writing unless space is cleared up
>>> from the cluster even though "df" shows the cluster has about 500 TB of
>>> free space.
>>>
>>> -------
>>>
>>>
>>> On Mar 24, 2013, at 1:54 PM, Balaji Narayanan (பாலாஜி நாராயணன்) <
>>> [email protected]> wrote:
>>>
>>>  -setBalancerBandwidth <bandwidth in bytes per second>
>>>
>>> So the value is in bytes per second. If it is running and exiting, it
>>> means it has completed the balancing.
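
For a sense of scale, here is a minimal Python sketch that converts the 2x10^9 value mentioned in the quoted message below from bytes per second into decimal gigabytes and gigabits per second. The function name is made up for illustration, and decimal units (1 GB = 10^9 bytes) are an assumption.

```python
def describe_bandwidth(bytes_per_sec):
    """Convert a bytes-per-second limit to decimal GB/s and Gbit/s."""
    gb_per_sec = bytes_per_sec / 1e9        # gigabytes per second
    gbit_per_sec = bytes_per_sec * 8 / 1e9  # gigabits per second
    return gb_per_sec, gbit_per_sec


gb, gbit = describe_bandwidth(2e9)
print(f"{gb:.1f} GB/s = {gbit:.1f} Gbit/s")
```

This gives 2.0 GB/s, or 16.0 Gbit/s. If the datanodes have 1 or 10 Gbit/s network links (an assumption; the thread does not say), a limit that high is unlikely to be what slows the balancer down.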
>>>
>>>
>>> On 24 March 2013 11:32, Tapas Sarangi <[email protected]> wrote:
>>>
>>>> Yes, we are running the balancer, though a balancer process runs for
>>>> almost a day or more before exiting and starting over.
>>>> The current dfs.balance.bandwidthPerSec value is set to 2x10^9. I assume
>>>> that's bytes, so about 2 gigabytes/sec. Shouldn't that be reasonable? If it
>>>> is in bits, then we have a problem.
>>>> What's the unit for "dfs.balance.bandwidthPerSec"?
>>>>
>>>> -----
>>>>
>>>> On Mar 24, 2013, at 1:23 PM, Balaji Narayanan (பாலாஜி நாராயணன்) <
>>>> [email protected]> wrote:
>>>>
>>>> Are you running the balancer? If the balancer is running and it is slow,
>>>> try increasing the balancer bandwidth.
>>>>
>>>>
>>>> On 24 March 2013 09:21, Tapas Sarangi <[email protected]> wrote:
>>>>
>>>>> Thanks for the follow-up. I don't know whether the attachment will pass
>>>>> through this mailing list, but I am attaching a PDF that contains the
>>>>> usage of all live nodes.
>>>>>
>>>>> All nodes starting with the letter "g" are the ones with smaller storage
>>>>> space, whereas nodes starting with the letter "s" have larger storage
>>>>> space. As you will see, most of the "gXX" nodes are completely full,
>>>>> whereas the "sXX" nodes have a lot of unused space.
>>>>>
>>>>> Recently we have frequently been facing a crisis where 'hdfs' goes into a
>>>>> mode in which it is not able to write any further, even though the total
>>>>> space available in the cluster is about 500 TB. We believe this has
>>>>> something to do with the way it is balancing the nodes, but we don't
>>>>> understand the problem yet. Maybe the attached PDF will help some of you
>>>>> (experts) to see what is going wrong here...
>>>>>
>>>>> Thanks
>>>>> ------
>>>>>
>>>>> The balancer knows about the topology, but when it calculates balancing
>>>>> it operates only on nodes, not on racks.
>>>>> You can see how it works in Balancer.java, in BalancerDatanode, around
>>>>> line 509.
>>>>>
>>>>> I was wrong about the 350 TB and 35 TB figures. It is calculated this way:
>>>>>
>>>>> For example:
>>>>> cluster_capacity = 3.5 PB
>>>>> cluster_dfsused = 2 PB
>>>>>
>>>>> avgutil = cluster_dfsused/cluster_capacity*100 = 57.14% of cluster
>>>>> capacity used
>>>>> Then we know each node's utilization (node_dfsused/node_capacity*100).
>>>>> The balancer considers a node fine if
>>>>> avgutil+10 > node_utilization >= avgutil-10.
>>>>>
>>>>> In the ideal case every node uses avgutil of its capacity, but for a 12 TB
>>>>> node that is only about 6.9 TB, and for a 72 TB node it is about 41 TB.
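
The rule stated above can be written down as a small Python sketch. It only restates the check as described in this message, not the code in Balancer.java. The function names are made up for illustration, and 1 PB is taken as 1000 TB.

```python
def average_utilization(dfs_used_tb, capacity_tb):
    """Cluster-wide DFS used as a percentage of total capacity."""
    return dfs_used_tb / capacity_tb * 100.0


def is_balanced(node_util_pct, avg_util_pct, threshold=10.0):
    """The rule from the message: avgutil+10 > node_utilization >= avgutil-10."""
    return avg_util_pct - threshold <= node_util_pct < avg_util_pct + threshold


# Example figures from the message: 2 PB used out of 3.5 PB.
avgutil = average_utilization(2000.0, 3500.0)
print(f"avgutil = {avgutil:.2f}%")

# Data each node would hold if it sat exactly at the cluster average.
for node_capacity_tb in (12, 72):
    ideal_tb = node_capacity_tb * avgutil / 100.0
    print(f"{node_capacity_tb} TB node at avgutil holds {ideal_tb:.2f} TB")

# A completely full node is outside the band; one at 60% is inside it.
print(is_balanced(100.0, avgutil), is_balanced(60.0, avgutil))
```

For these figures avgutil is 57.14%, which corresponds to about 6.86 TB on a 12 TB node and about 41.14 TB on a 72 TB node.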
>>>>>
>>>>> The balancer can't help you.
>>>>>
>>>>> Show me
>>>>> http://namenode.rambler.ru:50070/dfsnodelist.jsp?whatNodes=LIVE if
>>>>> you can.
>>>>>
>>>>>
>>>>>
>>>>>>
>>>>>>
>>>>>> In the ideal case, with replication factor 2 and two nodes of 12 TB and
>>>>>> 72 TB, you will be able to store only 12 TB of replicated data.
>>>>>>
>>>>>>
>>>>>> Yes, this is true for exactly two nodes in the cluster with 12 TB and
>>>>>> 72 TB, but not true for more than two nodes in the cluster.
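
Both statements above can be illustrated with a rough Python sketch. It assumes that no node stores two replicas of the same block, that blocks are arbitrarily small, and that rack placement rules are ignored. Under those assumptions a node of capacity c can hold at most min(c, D) of D units of unique data, so the sum of min(c, D) over all nodes has to cover replication x D. The result is an upper-bound estimate, not what HDFS will actually achieve, and the function name is made up for illustration.

```python
def max_replicated_data(capacities, replication):
    """Upper bound on unique data D that fits with `replication` copies,
    with at most one replica of any block per node."""
    if len(capacities) < replication:
        return 0.0
    lo, hi = 0.0, sum(capacities) / replication
    # Bisection: D is feasible while sum(min(c, D)) >= replication * D.
    for _ in range(100):
        mid = (lo + hi) / 2.0
        if sum(min(c, mid) for c in capacities) >= replication * mid:
            lo = mid
        else:
            hi = mid
    return lo


two_nodes = max_replicated_data([12, 72], 2)
three_nodes = max_replicated_data([12, 12, 72], 2)
print(f"12 TB + 72 TB: {two_nodes:.1f} TB")
print(f"12 TB + 12 TB + 72 TB: {three_nodes:.1f} TB")
```

With two nodes the bound is 12 TB, as stated above. Adding a second 12 TB node raises it to 24 TB, so the two-node limit does not carry over to larger clusters.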
>>>>>>
>>>>>>
>>>>>> The best way, in my opinion, is to use multiple racks. Nodes within a
>>>>>> rack should have identical capacity, and the racks should have identical
>>>>>> total capacity.
>>>>>> For example:
>>>>>>
>>>>>> rack1: 1 node with 72Tb
>>>>>> rack2: 6 nodes with 12Tb
>>>>>> rack3: 3 nodes with 24Tb
>>>>>>
>>>>>> It helps with balancing, because a duplicated block must be on another
>>>>>> rack.
>>>>>>
>>>>>>
>>>>>> The same question I asked earlier in this message: do multiple racks
>>>>>> with the default balancer threshold minimize the difference between
>>>>>> racks?
>>>>>>
>>>>>> Why did you select HDFS? Maybe Lustre, CephFS, or another file system
>>>>>> is a better choice.
>>>>>>
>>>>>>
>>>>>> It wasn't my decision, and I probably can't change it now. I am new
>>>>>> to this cluster and trying to understand a few issues. I will explore
>>>>>> other options as you mentioned.
>>>>>>
>>>>>> --
>>>>>> http://balajin.net/blog
>>>>>> http://flic.kr/balajijegan
>>>>>>
>>>>>
>>>>
>>>
>>>
>>> --
>>> http://balajin.net/blog
>>> http://flic.kr/balajijegan
>>>
>>>
>>>
>>
>>
>
>
