[ 
https://issues.apache.org/jira/browse/HDFS-10716?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yiqun Lin updated HDFS-10716:
-----------------------------
    Attachment: failing.log

> The target node should be removed in balancer when the scheduled size of 
> bytes that firstly reduced to less than 0
> ------------------------------------------------------------------------------------------------------------------
>
>                 Key: HDFS-10716
>                 URL: https://issues.apache.org/jira/browse/HDFS-10716
>             Project: Hadoop HDFS
>          Issue Type: Improvement
>          Components: balancer & mover
>            Reporter: Yiqun Lin
>            Assignee: Yiqun Lin
>         Attachments: failing.log
>
>
> In HDFS-10602, we found a failing case in which the balancer keeps moving 
> data between the same 2 DNs, so the balancer never finishes. After debugging 
> the code, I found what seems to be a bug in how pending blocks are chosen in 
> {{Dispatcher.Source.chooseNextMove}}.
> The code:
> {code}
>     private PendingMove chooseNextMove() {
>       for (Iterator<Task> i = tasks.iterator(); i.hasNext();) {
>         final Task task = i.next();
>         final DDatanode target = task.target.getDDatanode();
>         final PendingMove pendingBlock = new PendingMove(this, task.target);
>         if (target.addPendingBlock(pendingBlock)) {
>           // target is not busy, so do a tentative block allocation
>           if (pendingBlock.chooseBlockAndProxy()) {
>             long blockSize = pendingBlock.reportedBlock.getNumBytes(this);
>             incScheduledSize(-blockSize);
>             task.size -= blockSize;
>             // If the size of bytes that need to be moved was first reduced
>             // to less than 0, it should also be removed.
>             if (task.size == 0) {
>               i.remove();
>             }
>             return pendingBlock;
>             //...
> {code}
> The value of task.size is assigned in 
> {{Balancer#matchSourceWithTargetToMove}}:
> {code}
>     long size = Math.min(source.availableSizeToMove(),
>         target.availableSizeToMove());
>     final Task task = new Task(target, size);
> {code}
> This value depends on both the source and the target node, and it cannot 
> always be reduced to exactly 0 while choosing pending blocks. When that 
> happens, the balancer keeps moving data to the target node even after the 
> number of bytes that needed to be moved has already dropped below 0, and 
> this eventually makes the data in the cluster imbalanced again.
> We can optimize this as the title describes; I think it can speed up the 
> balancer.
> See the attached failing log (failing.log) for details.
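
To illustrate the description above, here is a minimal, self-contained sketch. It is not the actual Hadoop code: `TaskSizeDemo`, its `Task` class, `countMoves`, and the fixed block size are hypothetical simplifications. It models only the `task.size` bookkeeping and compares the `== 0` check with a `<= 0` check, which is one reading of what the issue title proposes.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;

// Hypothetical model of the task bookkeeping; not the real Dispatcher classes.
final class TaskSizeDemo {

  /** Stand-in for a balancer task: only the remaining byte budget is kept. */
  static final class Task {
    long size;

    Task(long size) {
      this.size = size;
    }
  }

  /**
   * Schedules fixed-size block moves against a single task and returns how
   * many moves were scheduled before the task was removed from the list.
   * The count is capped at maxMoves so the loop always terminates.
   *
   * @param removeWhenNonPositive true to remove the task when size <= 0,
   *                              false to remove it only when size == 0
   */
  public static int countMoves(long taskSize, long blockSize,
      boolean removeWhenNonPositive, int maxMoves) {
    List<Task> tasks = new ArrayList<>();
    tasks.add(new Task(taskSize));
    int moves = 0;
    while (!tasks.isEmpty() && moves < maxMoves) {
      Iterator<Task> i = tasks.iterator();
      Task task = i.next();
      // One block is scheduled, so the remaining budget shrinks.
      task.size -= blockSize;
      moves++;
      boolean done = removeWhenNonPositive ? task.size <= 0 : task.size == 0;
      if (done) {
        i.remove();
      }
    }
    return moves;
  }

  public static void main(String[] args) {
    // 300 is not a multiple of 128, so size goes 172, 44, -84, ... and
    // never equals 0: the task stays until the cap of 10 moves is hit.
    System.out.println(countMoves(300, 128, false, 10)); // prints 10
    // With the <= 0 check the task is removed on the third move.
    System.out.println(countMoves(300, 128, true, 10)); // prints 3
  }
}
```

When the task size happens to be an exact multiple of the block size, both checks remove the task at the same point, which may be why the problem only shows up in some runs.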



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
