[
https://issues.apache.org/jira/browse/HDFS-10716?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Zhe Zhang updated HDFS-10716:
-----------------------------
Fix Version/s: 2.7.4
> In Balancer, the target task should be removed when its size < 0.
> -----------------------------------------------------------------
>
> Key: HDFS-10716
> URL: https://issues.apache.org/jira/browse/HDFS-10716
> Project: Hadoop HDFS
> Issue Type: Bug
> Components: balancer & mover
> Reporter: Yiqun Lin
> Assignee: Yiqun Lin
> Priority: Minor
> Fix For: 2.8.0, 2.7.4, 3.0.0-alpha1
>
> Attachments: HDFS-10716.001.patch, failing.log
>
>
> In HDFS-10602, we found a failing case in which the balancer keeps moving
> data between 2 DNs and therefore never finishes. While debugging the code, I
> found what seems to be a bug in how pending blocks are chosen in
> {{Dispatcher.Source.chooseNextMove}}.
> The code:
> {code}
> private PendingMove chooseNextMove() {
>   for (Iterator<Task> i = tasks.iterator(); i.hasNext();) {
>     final Task task = i.next();
>     final DDatanode target = task.target.getDDatanode();
>     final PendingMove pendingBlock = new PendingMove(this, task.target);
>     if (target.addPendingBlock(pendingBlock)) {
>       // target is not busy, so do a tentative block allocation
>       if (pendingBlock.chooseBlockAndProxy()) {
>         long blockSize = pendingBlock.reportedBlock.getNumBytes(this);
>         incScheduledSize(-blockSize);
>         task.size -= blockSize;
>         // If the size of bytes that need to be moved was first reduced
>         // to less than 0, it should also be removed.
>         if (task.size == 0) {
>           i.remove();
>         }
>         return pendingBlock;
>         //...
> {code}
> The value of task.size is assigned in
> {{Balancer#matchSourceWithTargetToMove}}:
> {code}
> long size = Math.min(source.availableSizeToMove(),
>     target.availableSizeToMove());
> final Task task = new Task(target, size);
> {code}
> This value depends on the source and target nodes, and it will not always be
> reduced to exactly 0 while choosing pending blocks. When it drops below 0
> instead, the task is not removed, so the balancer keeps moving data to the
> target node even though more bytes than needed have already been scheduled.
> In the end this makes the data in the cluster imbalanced again, which leads
> to another balancer run.
> We can optimize this as the title says; I think it can speed up the
> balancer.
> See the attached log for the failing case, or see HDFS-10602. (Focus on the
> change record for the scheduled size of the target node; that is debug info
> I added, like this.)
> {code}
> 2016-08-01 16:51:57,492 [pool-51-thread-1] INFO balancer.Dispatcher
> (Dispatcher.java:chooseNextMove(799)) - TargetNode: 58794, bytes scheduled to
> move, after: -67, before: 33
> {code}
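The effect described above can be reproduced with a small standalone sketch. It is not the Dispatcher code: the class TaskRemovalSketch, its Task type, and the method countScheduledMoves are hypothetical names made up for illustration, and the block size of 100 bytes is only inferred from the log line (before: 33, after: -67). It compares removing a task when its size is exactly 0 with removing it when its size is 0 or less.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;

// Hypothetical, simplified stand-in for the balancer's task bookkeeping.
// None of these names are the real Dispatcher classes.
final class TaskRemovalSketch {

  /** Remaining bytes still to be moved to one target. */
  static final class Task {
    long size;

    Task(long size) {
      this.size = size;
    }
  }

  /**
   * Schedules blocks of blockSize bytes against a single task until the task
   * is removed or maxMoves is reached, and returns the number of moves made.
   *
   * @param removeWhenNonPositive true removes the task once size <= 0,
   *        false removes it only when size == 0
   */
  static int countScheduledMoves(long initialSize, long blockSize,
      int maxMoves, boolean removeWhenNonPositive) {
    List<Task> tasks = new ArrayList<>();
    tasks.add(new Task(initialSize));
    int moves = 0;
    while (moves < maxMoves && !tasks.isEmpty()) {
      for (Iterator<Task> i = tasks.iterator();
          i.hasNext() && moves < maxMoves;) {
        Task task = i.next();
        task.size -= blockSize; // may overshoot and become negative
        moves++;
        boolean done = removeWhenNonPositive
            ? task.size <= 0
            : task.size == 0;
        if (done) {
          i.remove();
        }
      }
    }
    return moves;
  }

  public static void main(String[] args) {
    // 33 bytes left and a 100-byte block, as in "before: 33, after: -67".
    System.out.println(countScheduledMoves(33, 100, 5, false)); // prints 5
    System.out.println(countScheduledMoves(33, 100, 5, true));  // prints 1
  }
}
```

With the exact-zero check the task is never removed once its size skips past 0, so moves continue until the cap of 5 is reached; with the non-positive check the task is dropped after the first move.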
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)