[ 
https://issues.apache.org/jira/browse/HDFS-10598?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15392760#comment-15392760
 ] 

Wei-Chiu Chuang commented on HDFS-10598:
----------------------------------------

Hi [~eddyxu], thanks for identifying the bug and submitting the patch. I think the 
fix is straightforward and the unit test makes sense to me. I wonder if we need 
more unit tests to cover more scenarios, because in addition to the normal 
operation, the patch fixes the termination condition in these corner cases:
* {code}// Check for the max error count constraint.{code}
* {code}// we are not able to find any blocks to copy.{code}
* {code}// check if someone told us exit{code}
* {code}// Technically it is possible for us to find a smaller block and{code}

[~arpitagarwal], what's your take?

> DiskBalancer does not execute multi-steps plan.
> -----------------------------------------------
>
>                 Key: HDFS-10598
>                 URL: https://issues.apache.org/jira/browse/HDFS-10598
>             Project: Hadoop HDFS
>          Issue Type: Sub-task
>          Components: diskbalancer
>    Affects Versions: 3.0.0-beta1
>            Reporter: Lei (Eddy) Xu
>            Assignee: Lei (Eddy) Xu
>            Priority: Critical
>         Attachments: HDFS-10598.00.patch
>
>
> I set up a cluster of 3 DataNodes, each with 2 small disks.  After creating 
> some files to fill HDFS, I added two more small disks to one DN and ran the 
> diskbalancer on this DataNode.
> The disk usage before running diskbalancer:
> {code}
> /dev/loop0  3.9G  2.1G  1.6G 58%  /mnt/data1
> /dev/loop1  3.9G  2.6G  1.1G 71%  /mnt/data2
> /dev/loop2  3.9G  17M  3.6G 1%  /mnt/data3
> /dev/loop3  3.9G  17M  3.6G 1%  /mnt/data4
> {code}
> However, after running diskbalancer (i.e., {{-query}} shows {{PLAN_DONE}}):
> {code}
> /dev/loop0  3.9G  1.2G  2.5G 32%  /mnt/data1
> /dev/loop1  3.9G  2.6G  1.1G 71%  /mnt/data2
> /dev/loop2  3.9G  953M  2.7G 26%  /mnt/data3
> /dev/loop3  3.9G  17M  3.6G 1%   /mnt/data4
> {code}
> It is suspicious that in {{DiskBalancerMover#copyBlocks}}, every return calls 
> {{this.setExitFlag}}, which prevents {{copyBlocks()}} from being called 
> multiple times from {{DiskBalancer#executePlan}}. 


