Namenode Web UI capacity report is inconsistent with Balancer
-------------------------------------------------------------

                 Key: HADOOP-4430
                 URL: https://issues.apache.org/jira/browse/HADOOP-4430
             Project: Hadoop Core
          Issue Type: Bug
    Affects Versions: 0.19.0
            Reporter: Suresh Srinivas
            Assignee: Suresh Srinivas
             Fix For: 0.19.0


The solution to 2816 made the following changes:
- The definition of Total Capacity changed from (the disk space of all data 
directories) to (the disk space of all data directories - the reserved space).
- A new element, Present Capacity, was added to the report. It is set to (Used 
Capacity + Remaining Capacity).
- The reported Used Percentage changed from (Used Capacity)/(Total Capacity) 
to (Used Capacity)/(Present Capacity).
- All of these changes are displayed on the Namenode Web UI.
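
The definitions above can be sketched in a few lines of Python. This is only 
an illustration of the arithmetic, not the Namenode code: the class name, 
field names, and sample numbers are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class DatanodeReport:
    """Hypothetical per-datanode figures, all in the same unit (e.g. GB)."""
    raw_disk: float   # disk space of all data directories
    reserved: float   # reserved space
    used: float       # Used Capacity
    remaining: float  # Remaining Capacity

    @property
    def total_capacity(self) -> float:
        # Definition after 2816: raw disk space minus the reserved space.
        return self.raw_disk - self.reserved

    @property
    def present_capacity(self) -> float:
        # New report element: Used Capacity + Remaining Capacity.
        return self.used + self.remaining

    def used_percent_by_total(self) -> float:
        # Used Percentage as reported before the change.
        return 100.0 * self.used / self.total_capacity

    def used_percent_by_present(self) -> float:
        # Used Percentage as reported after the change.
        return 100.0 * self.used / self.present_capacity


# Made-up numbers: 300 of the 900 usable units are taken by other files,
# so only 300 remain available next to the 300 already used.
report = DatanodeReport(raw_disk=1000, reserved=100, used=300, remaining=300)
print(report.total_capacity)             # 900
print(report.present_capacity)           # 600
print(report.used_percent_by_present())  # 50.0
```

With these sample figures the two Used Percentage formulas give roughly 33% 
and exactly 50% for the same node.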

Balancer functionality
The balancer script is started with a threshold parameter. It tries to move 
blocks from the nodes whose Used % is more than (Cluster average + threshold) 
to the nodes whose Used % is less than (Cluster average - threshold). 
Essentially, the balancer brings the Used % of every datanode to within 
(Cluster average +/- threshold).
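
The threshold rule described above can be sketched as follows. This is a 
simplified illustration, not the actual Balancer implementation: the function 
name and input layout are hypothetical, and the cluster average is assumed 
here to be (sum of used) / (sum of capacity) over all nodes.

```python
def classify_nodes(nodes, threshold):
    """Split datanodes into over- and under-utilized sets.

    nodes: dict mapping a node name to a (used, capacity) pair.
    threshold: allowed deviation from the cluster average, in percent.
    Returns (cluster_average, sources, targets).
    """
    total_used = sum(used for used, _ in nodes.values())
    total_capacity = sum(capacity for _, capacity in nodes.values())
    # Assumption: the cluster average is computed over the whole cluster.
    average = 100.0 * total_used / total_capacity

    sources, targets = [], []
    for name, (used, capacity) in nodes.items():
        used_percent = 100.0 * used / capacity
        if used_percent > average + threshold:
            sources.append(name)   # blocks should move away from this node
        elif used_percent < average - threshold:
            targets.append(name)   # blocks can move onto this node
    return average, sources, targets


# Made-up cluster: three nodes of equal capacity.
cluster = {"dn1": (80, 100), "dn2": (50, 100), "dn3": (20, 100)}
average, sources, targets = classify_nodes(cluster, threshold=10)
print(average, sources, targets)  # 50.0 ['dn1'] ['dn3']
```

Nodes whose Used % already lies within (Cluster average +/- threshold), such 
as dn2 in this example, are left alone.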

Inconsistencies due to the change in 2816
When MapReduce jobs are run, temporary files are generated. These consume a 
lot of space that would otherwise count toward Present Capacity, so the 
difference between Total Capacity and Present Capacity can be huge. Currently 
the balancer computes Used Percentage as (Used Capacity)/(Total Capacity). The 
Used % the balancer uses can therefore differ significantly from the Used % 
displayed on the Namenode Web UI. When the balancer has finished balancing, the 
Used % values shown by the Namenode might still appear unbalanced.
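
A small worked example may make the mismatch concrete. The numbers and helper 
names below are invented for illustration; the two formulas are the ones 
quoted in this report.

```python
def used_percent_by_total(used, total_capacity):
    # Formula the balancer currently uses.
    return 100.0 * used / total_capacity


def used_percent_by_present(used, remaining):
    # Formula shown on the Namenode Web UI: Present = Used + Remaining.
    return 100.0 * used / (used + remaining)


# Two hypothetical datanodes with the same Total Capacity and Used Capacity.
# On dn2, MapReduce temporary files occupy 400 units, shrinking Remaining.
total_capacity = 1000
dn1 = {"used": 400, "remaining": 600}  # no temporary files
dn2 = {"used": 400, "remaining": 200}  # 400 units of temporary files

for name, dn in (("dn1", dn1), ("dn2", dn2)):
    balancer_view = used_percent_by_total(dn["used"], total_capacity)
    web_ui_view = used_percent_by_present(dn["used"], dn["remaining"])
    print(f"{name}: balancer {balancer_view:.1f}%, web UI {web_ui_view:.1f}%")
```

In this sketch the balancer sees both nodes at 40.0% and has nothing to move, 
while the Web UI formula puts dn1 at 40.0% and dn2 at about 66.7%.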



-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
