[
https://issues.apache.org/jira/browse/HDFS-8278?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14700522#comment-14700522
]
Hadoop QA commented on HDFS-8278:
---------------------------------
\\
\\
| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | pre-patch | 17m 48s | Pre-patch trunk compilation is healthy. |
| {color:green}+1{color} | @author | 0m 0s | The patch does not contain any @author tags. |
| {color:red}-1{color} | tests included | 0m 0s | The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. |
| {color:green}+1{color} | javac | 7m 58s | There were no new javac warning messages. |
| {color:green}+1{color} | javadoc | 9m 56s | There were no new javadoc warning messages. |
| {color:green}+1{color} | release audit | 0m 23s | The applied patch does not increase the total number of release audit warnings. |
| {color:green}+1{color} | checkstyle | 1m 26s | There were no new checkstyle issues. |
| {color:green}+1{color} | whitespace | 0m 0s | The patch has no lines that end in whitespace. |
| {color:green}+1{color} | install | 1m 23s | mvn install still works. |
| {color:green}+1{color} | eclipse:eclipse | 0m 33s | The patch built with eclipse:eclipse. |
| {color:green}+1{color} | findbugs | 2m 32s | The patch does not introduce any new Findbugs (version 3.0.0) warnings. |
| {color:green}+1{color} | native | 3m 10s | Pre-build of native portion |
| {color:red}-1{color} | hdfs tests | 174m 22s | Tests failed in hadoop-hdfs. |
| | | 219m 35s | |
\\
\\
|| Reason || Tests ||
| Failed unit tests | hadoop.fs.viewfs.TestViewFsWithXAttrs |
| Timed out tests | org.apache.hadoop.cli.TestHDFSCLI |
\\
\\
|| Subsystem || Report/Notes ||
| Patch URL | http://issues.apache.org/jira/secure/attachment/12750868/h8278_20150817.patch |
| Optional Tests | javadoc javac unit findbugs checkstyle |
| git revision | trunk / c77bd6a |
| hadoop-hdfs test log | https://builds.apache.org/job/PreCommit-HDFS-Build/12012/artifact/patchprocess/testrun_hadoop-hdfs.txt |
| Test Results | https://builds.apache.org/job/PreCommit-HDFS-Build/12012/testReport/ |
| Java | 1.7.0_55 |
| uname | Linux asf903.gq1.ygridcore.net 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux |
| Console output | https://builds.apache.org/job/PreCommit-HDFS-Build/12012/console |
This message was automatically generated.
> HDFS Balancer should consider remaining storage % when checking for
> under-utilized machines
> -------------------------------------------------------------------------------------------
>
> Key: HDFS-8278
> URL: https://issues.apache.org/jira/browse/HDFS-8278
> Project: Hadoop HDFS
> Issue Type: Sub-task
> Components: balancer & mover
> Affects Versions: 2.8.0
> Reporter: Gopal V
> Assignee: Tsz Wo Nicholas Sze
> Attachments: h8278_20150817.patch
>
>
> The DFS Balancer mistakenly identifies a node with very little storage space
> remaining as an "underutilized" node and tries to move large amounts of data
> to that particular node.
> All these block moves fail to execute, since the absolute DFS space remaining
> on the node matters more than its percentage utilization.
> {code}
> 15/04/24 04:25:55 INFO balancer.Balancer: 0 over-utilized: []
> 15/04/24 04:25:55 INFO balancer.Balancer: 1 underutilized:
> [172.19.1.46:50010:DISK]
> 15/04/24 04:25:55 INFO balancer.Balancer: Need to move 47.68 GB to make the
> cluster balanced.
> 15/04/24 04:25:55 INFO balancer.Balancer: Decided to move 413.08 MB bytes
> from 172.19.1.52:50010:DISK to 172.19.1.46:50010:DISK
> 15/04/24 04:25:55 INFO balancer.Balancer: Will move 413.08 MB in this
> iteration
> 15/04/24 04:25:55 WARN balancer.Dispatcher: Failed to move
> blk_1078689321_1099517353638 with size=131146 from 172.19.1.52:50010:DISK to
> 172.19.1.46:50010:DISK through 172.19.1.53:50010: Got error, status message
> opReplaceBlock
> BP-942051088-172.18.1.41-1370508013893:blk_1078689321_1099517353638 received
> exception org.apache.hadoop.util.DiskChecker$DiskOutOfSpaceException: Out of
> space: The volume with the most available space (=225042432 B) is less than
> the block size (=268435456 B)., block move is failed
> {code}
> The machine in question is under-full in terms of block-pool utilization, but
> has very little free space available for blocks.
> {code}
> Decommission Status : Normal
> Configured Capacity: 3826907185152 (3.48 TB)
> DFS Used: 2817262833664 (2.56 TB)
> Non DFS Used: 1000621305856 (931.90 GB)
> DFS Remaining: 9023045632 (8.40 GB)
> DFS Used%: 73.62%
> DFS Remaining%: 0.24%
> Configured Cache Capacity: 8589934592 (8 GB)
> Cache Used: 0 (0 B)
> Cache Remaining: 8589934592 (8 GB)
> Cache Used%: 0.00%
> Cache Remaining%: 100.00%
> Xceivers: 3
> Last contact: Fri Apr 24 04:28:36 PDT 2015
> {code}
> The machine has only 8.40 GB of DFS storage remaining on that node, so it is
> futile to attempt to move any blocks to that particular machine.
> A similar concern arises when a machine loses disks, since the utilization
> comparisons are always percentages per node. Even that scenario needs to cap
> the data moved to a node at its "DFS Remaining %"; trying to move any more
> data than that to a given node will always fail.
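The target-selection check described above can be sketched as follows. This is a minimal illustration of the idea, not the HDFS-8278 patch or the actual Balancer code; the class, method names, and the 1% eligibility threshold are hypothetical, while the capacity/used/remaining numbers come from the node report quoted in the description.

```java
// Hypothetical sketch: treat a node as a usable under-utilized target only if
// it also has a meaningful "DFS Remaining %", and never schedule more bytes to
// it than it can hold. Names and thresholds are illustrative, not HDFS code.
public class BalancerTargetCheck {

    /** Illustrative minimum "DFS Remaining %" for a node to accept blocks. */
    static final double MIN_REMAINING_PERCENT = 1.0;

    /** Percent of configured capacity currently used for DFS. */
    static double dfsUsedPercent(long capacity, long dfsUsed) {
        return 100.0 * dfsUsed / capacity;
    }

    /** Percent of configured capacity still free for DFS blocks. */
    static double dfsRemainingPercent(long capacity, long dfsRemaining) {
        return 100.0 * dfsRemaining / capacity;
    }

    /**
     * Under-utilized by percentage alone is not enough: the node must also
     * have room. This is the extra condition the issue argues for.
     */
    static boolean isUsableTarget(long capacity, long dfsUsed, long dfsRemaining,
                                  double avgUsedPercent, double threshold) {
        boolean underUtilized =
                dfsUsedPercent(capacity, dfsUsed) < avgUsedPercent - threshold;
        boolean hasRoom =
                dfsRemainingPercent(capacity, dfsRemaining) >= MIN_REMAINING_PERCENT;
        return underUtilized && hasRoom;
    }

    /** Cap the bytes scheduled toward a node at its actual remaining space. */
    static long capBytesToMove(long requested, long dfsRemaining) {
        return Math.min(requested, dfsRemaining);
    }

    public static void main(String[] args) {
        // Figures from the quoted report: 3.48 TB capacity, 2.56 TB used,
        // 8.40 GB remaining (0.24%). Cluster average and threshold are made up.
        long capacity = 3826907185152L;
        long used = 2817262833664L;
        long remaining = 9023045632L;
        // 73.62% used is below (85% - 10%), so percentage-only logic would pick
        // this node; the remaining-space check rejects it.
        System.out.println(isUsableTarget(capacity, used, remaining, 85.0, 10.0));
        // Asking to move 47.68 GB here would be capped to the ~8.4 GB available.
        System.out.println(capBytesToMove(51200L * 1024 * 1024, remaining));
    }
}
```

With these two guards, the node from the report is never selected as a target, and even a node that is legitimately under-utilized can only be scheduled up to its DFS Remaining bytes.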
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)