[ 
https://issues.apache.org/jira/browse/HDFS-10716?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yiqun Lin updated HDFS-10716:
-----------------------------
    Summary: The target node should be removed in balancer when the size of 
bytes that need to move firstly reduced to less than 0  (was: The target node 
should be removed in balancer when the scheduled size of bytes that firstly 
reduced to less than 0)

> The target node should be removed in balancer when the size of bytes that 
> need to move firstly reduced to less than 0
> ---------------------------------------------------------------------------------------------------------------------
>
>                 Key: HDFS-10716
>                 URL: https://issues.apache.org/jira/browse/HDFS-10716
>             Project: Hadoop HDFS
>          Issue Type: Improvement
>          Components: balancer & mover
>            Reporter: Yiqun Lin
>            Assignee: Yiqun Lin
>         Attachments: HDFS-10716.001.patch, failing.log
>
>
> In HDFS-10602, we found a failing case that the balancer moves data always 
> between 2 DNs. And it made the balancer can't be finished. I debug the code 
> for this, I found there seems a bug in choosing pending blocks in 
> {{Dispatcher.Source.chooseNextMove}}.
> The codes:
> {code}
>     private PendingMove chooseNextMove() {
>       for (Iterator<Task> i = tasks.iterator(); i.hasNext();) {
>         final Task task = i.next();
>         final DDatanode target = task.target.getDDatanode();
>         final PendingMove pendingBlock = new PendingMove(this, task.target);
>         if (target.addPendingBlock(pendingBlock)) {
>           // target is not busy, so do a tentative block allocation
>           if (pendingBlock.chooseBlockAndProxy()) {
>             long blockSize = pendingBlock.reportedBlock.getNumBytes(this);
>             incScheduledSize(-blockSize);
>             task.size -= blockSize;
>             // If the size of bytes that need to be moved was first reduced 
> to less than 0
>             // it should also be removed.
>             if (task.size == 0) {
>               i.remove();
>             }
>             return pendingBlock;
>             //...
> {code}
> The value of task.size was assigned in 
> {{Balancer#matchSourceWithTargetToMove}}
> {code}
>     long size = Math.min(source.availableSizeToMove(), 
> target.availableSizeToMove());
>     final Task task = new Task(target, size);
> {code}
> This value was depended on the source and target node, and this value will 
> not always can be reduced to 0 in choosing pending blocks. And then, it will 
> still move the data to the target node even if the size of bytes that needed 
> to move has been already reduced less than 0. And finally it will make the 
> data imbalance again in cluster.
> We can opitimize for this as this title mentioned, I think this can speed the 
> balancer.
> Can see the logs for failling case, or see the HDFS-10602.(Concentrating on 
> the change record for the scheduled size of target node. That's my added info 
> for debug, like this).
> {code}
> 2016-08-01 16:51:57,492 [pool-51-thread-1] INFO  balancer.Dispatcher 
> (Dispatcher.java:chooseNextMove(799)) - TargetNode: 58794, bytes scheduled to 
> move, after: -67, before: 33
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

Reply via email to