[ 
https://issues.apache.org/jira/browse/HDFS-2851?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13195323#comment-13195323
 ] 

Uma Maheswara Rao G commented on HDFS-2851:
-------------------------------------------

Hi Eli, this particular issue happens only in the HA branch. The case works 
fine in trunk.

From an initial look: two DNs (DN1, DN2) are registered with the NN and have 
sent their initial block reports. After the NN transitions to active, all 
blocks are marked as stale until the next block report arrives from these DNs. 
One new DN (DN3) is then added, and it registers with the active NN 
successfully. When we run the balancer, it needs to move some blocks around to 
balance the cluster. Some blocks end up on the old DNs, so the over-replicated 
blocks need to be processed as well. I think there is no immediate block 
report after the transition to active (this point needs to be confirmed: 
whether or not we trigger a block report immediately after the transition), so 
the blocks are still in the stale state. Processing of the over-replicated 
blocks is postponed for this reason. Since the over-replicated blocks on these 
nodes have not been processed, used space is a little higher than expected 
[usedSpace (current: 390, expected: 300)].

I reduced the block report interval to a very small value (10s), and with that 
change this particular case passes.
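
The sequence described above can be sketched with a small toy model. This is 
plain Java, not Hadoop code: the class StaleReplicaModel, its methods, and the 
per-node byte counts are hypothetical and chosen only so that the totals match 
the 390 and 300 figures from the test failure. It assumes that a balancer move 
leaves an excess replica on the source DN, and that deletion of that replica 
is postponed while the source DN is still stale.

```java
import java.util.HashMap;
import java.util.Map;

/** Toy model only; not Hadoop code. All names and numbers are illustrative. */
class StaleReplicaModel {
    /** DataNode name -> bytes currently used. */
    private final Map<String, Long> used = new HashMap<>();
    /** DataNode name -> true until a block report arrives after failover. */
    private final Map<String, Boolean> stale = new HashMap<>();
    /** Source DataNode -> bytes of excess replicas whose deletion is postponed. */
    private final Map<String, Long> postponed = new HashMap<>();

    void addDataNode(String name, long usedBytes, boolean isStale) {
        used.put(name, usedBytes);
        stale.put(name, isStale);
    }

    /** Balancer move: the target gains a copy, the source copy becomes excess. */
    void move(String source, String target, long bytes) {
        used.merge(target, bytes, Long::sum);
        if (stale.getOrDefault(source, false)) {
            // Source is stale: remember the excess replica, do not delete yet.
            postponed.merge(source, bytes, Long::sum);
        } else {
            // Source is up to date: the excess replica is removed right away.
            used.merge(source, -bytes, Long::sum);
        }
    }

    /** A block report clears staleness and releases the postponed deletions. */
    void blockReport(String name) {
        stale.put(name, false);
        Long bytes = postponed.remove(name);
        if (bytes != null) {
            used.merge(name, -bytes, Long::sum);
        }
    }

    long totalUsed() {
        return used.values().stream().mapToLong(Long::longValue).sum();
    }

    public static void main(String[] args) {
        StaleReplicaModel m = new StaleReplicaModel();
        m.addDataNode("DN1", 150, true);   // stale after the NN became active
        m.addDataNode("DN2", 150, true);
        m.addDataNode("DN3", 0, false);    // registered with the active NN
        m.move("DN1", "DN3", 45);
        m.move("DN2", "DN3", 45);
        System.out.println("before block reports: " + m.totalUsed());
        m.blockReport("DN1");
        m.blockReport("DN2");
        System.out.println("after block reports:  " + m.totalUsed());
    }
}
```

In this model the total stays at 390 until DN1 and DN2 send a block report and 
then drops to 300, which is why a shorter block report interval would let the 
test's 20000 msec wait succeed. In a real cluster the interval is normally 
controlled by dfs.blockreport.intervalMsec; whether that is the setting 
changed here is an assumption, since the comment does not name it.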

Thanks
Uma
                
> After Balancer runs, usedSpace is not balancing correctly.
> ----------------------------------------------------------
>
>                 Key: HDFS-2851
>                 URL: https://issues.apache.org/jira/browse/HDFS-2851
>             Project: Hadoop HDFS
>          Issue Type: Sub-task
>          Components: balancer, data-node, ha, name-node
>    Affects Versions: HA branch (HDFS-1623)
>            Reporter: Uma Maheswara Rao G
>            Assignee: Uma Maheswara Rao G
>
> After Balancer runs, usedSpace is not balancing correctly.
> {code}
> java.util.concurrent.TimeoutException: Cluster failed to reached expected 
> values of totalSpace (current: 1500, expected: 1500), or usedSpace (current: 
> 390, expected: 300), in more than 20000 msec.
>       at 
> org.apache.hadoop.hdfs.server.balancer.TestBalancer.waitForHeartBeat(TestBalancer.java:233)
>       at 
> org.apache.hadoop.hdfs.server.balancer.TestBalancerWithHANameNodes.testBalancerWithHANameNodes(TestBalancerWithHANameNodes.java:99)
> {code}
