[ https://issues.apache.org/jira/browse/HDFS-11377?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15840896#comment-15840896 ]
Hadoop QA commented on HDFS-11377:
----------------------------------

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 13s{color} | {color:blue} Docker mode activated. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:red}-1{color} | {color:red} test4tests {color} | {color:red} 0m 0s{color} | {color:red} The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 13m 33s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 51s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 28s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 1m 5s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 15s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 2m 0s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 42s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 0m 58s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 52s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 52s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 31s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 1m 1s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 13s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 1m 57s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 36s{color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 79m 40s{color} | {color:red} hadoop-hdfs in the patch failed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 27s{color} | {color:green} The patch does not generate ASF License warnings. {color} |
| {color:black}{color} | {color:black} {color} | {color:black}106m 41s{color} | {color:black} {color} |
\\
\\
|| Reason || Tests ||
| Failed junit tests | hadoop.hdfs.TestAclsEndToEnd |
| | hadoop.hdfs.server.datanode.TestDataNodeVolumeFailure |
| | hadoop.hdfs.server.namenode.TestCacheDirectives |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Image:yetus/hadoop:a9ad5d6 |
| JIRA Issue | HDFS-11377 |
| JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12849622/HDFS-11377.001.patch |
| Optional Tests | asflicense compile javac javadoc mvninstall mvnsite unit findbugs checkstyle |
| uname | Linux fcf3ffa082ad 3.13.0-103-generic #150-Ubuntu SMP Thu Nov 24 10:34:17 UTC 2016 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/hadoop/patchprocess/precommit/personality/provided.sh |
| git revision | trunk / 7bc333a |
| Default Java | 1.8.0_121 |
| findbugs | v3.0.0 |
| unit | https://builds.apache.org/job/PreCommit-HDFS-Build/18280/artifact/patchprocess/patch-unit-hadoop-hdfs-project_hadoop-hdfs.txt |
| Test Results | https://builds.apache.org/job/PreCommit-HDFS-Build/18280/testReport/ |
| modules | C: hadoop-hdfs-project/hadoop-hdfs U: hadoop-hdfs-project/hadoop-hdfs |
| Console output | https://builds.apache.org/job/PreCommit-HDFS-Build/18280/console |
| Powered by | Apache Yetus 0.5.0-SNAPSHOT http://yetus.apache.org |

This message was automatically generated.

> Balancer hung due to "No mover threads available"
> -------------------------------------------------
>
> Key: HDFS-11377
> URL: https://issues.apache.org/jira/browse/HDFS-11377
> Project: Hadoop HDFS
> Issue Type: Bug
> Affects Versions: 2.7.3
> Reporter: yunjiong zhao
> Assignee: yunjiong zhao
> Attachments: HDFS-11377.001.patch
>
>
> When running the balancer on a large cluster with more than 3000 DataNodes,
> it may hang due to "No mover threads available".
> The stack trace shows it waiting forever, as shown below.
> {code}
> "main" #1 prio=5 os_prio=0 tid=0x00007ff6cc014800 nid=0x6b2c waiting on condition [0x00007ff6d1bad000]
>    java.lang.Thread.State: TIMED_WAITING (sleeping)
>         at java.lang.Thread.sleep(Native Method)
>         at org.apache.hadoop.hdfs.server.balancer.Dispatcher.waitForMoveCompletion(Dispatcher.java:1043)
>         at org.apache.hadoop.hdfs.server.balancer.Dispatcher.dispatchBlockMoves(Dispatcher.java:1017)
>         at org.apache.hadoop.hdfs.server.balancer.Dispatcher.dispatchAndCheckContinue(Dispatcher.java:981)
>         at org.apache.hadoop.hdfs.server.balancer.Balancer.runOneIteration(Balancer.java:611)
>         at org.apache.hadoop.hdfs.server.balancer.Balancer.run(Balancer.java:663)
>         at org.apache.hadoop.hdfs.server.balancer.Balancer$Cli.run(Balancer.java:776)
>         at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:76)
>         at org.apache.hadoop.hdfs.server.balancer.Balancer.main(Balancer.java:905)
> {code}
> In the log, there are many WARN messages about "No mover threads available".
> {quote}
> 2017-01-26 15:36:40,085 WARN org.apache.hadoop.hdfs.server.balancer.Dispatcher: No mover threads available: skip moving blk_13700554102_1112815018180 with size=268435456 from 10.115.67.137:50010:DISK to 10.140.21.55:50010:DISK through 10.115.67.137:50010
> 2017-01-26 15:36:40,085 WARN org.apache.hadoop.hdfs.server.balancer.Dispatcher: No mover threads available: skip moving blk_4009558842_1103118359883 with size=268435456 from 10.115.67.137:50010:DISK to 10.140.21.55:50010:DISK through 10.115.67.137:50010
> 2017-01-26 15:36:40,085 WARN org.apache.hadoop.hdfs.server.balancer.Dispatcher: No mover threads available: skip moving blk_13881956058_1112996460026 with size=133509566 from 10.115.67.137:50010:DISK to 10.140.21.55:50010:DISK through 10.115.67.36:50010
> {quote}
> What happens here is that when no mover threads are available,
> DDatanode.isPendingQEmpty() returns false, so the Balancer hangs.
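
The failure mode described in the issue can be illustrated with a small standalone model. This is only a sketch under stated assumptions, not the Hadoop Dispatcher code and not the attached patch: it assumes that a move skipped because the executor rejected it stays in the pending queue, so a check like isPendingQEmpty() can never become true. The class name PendingMoveSketch, the method pendingAfterRejection, and the removeOnReject flag are hypothetical names chosen for this illustration.

```java
import java.util.Queue;
import java.util.concurrent.ConcurrentLinkedQueue;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.RejectedExecutionException;
import java.util.concurrent.SynchronousQueue;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

// Simplified model of the reported hang; NOT the actual Hadoop Dispatcher code.
// All names in this class are hypothetical.
class PendingMoveSketch {

  // Returns how many moves are still queued as pending once all movers are done.
  static int pendingAfterRejection(boolean removeOnReject) {
    Queue<String> pendingQ = new ConcurrentLinkedQueue<>();
    // One mover thread and no task queue: a second submission is rejected
    // while the first one is still running.
    ThreadPoolExecutor movers = new ThreadPoolExecutor(
        1, 1, 0L, TimeUnit.MILLISECONDS, new SynchronousQueue<>());
    CountDownLatch release = new CountDownLatch(1);

    for (String block : new String[] {"blk_1", "blk_2"}) {
      pendingQ.add(block);
      try {
        movers.execute(() -> {
          try {
            release.await();            // keep the only mover thread busy
          } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
          }
          pendingQ.remove(block);       // a completed move leaves the queue
        });
      } catch (RejectedExecutionException e) {
        // Corresponds to "No mover threads available": the move is skipped.
        if (removeOnReject) {
          pendingQ.remove(block);       // assumed cleanup that avoids the hang
        }
      }
    }

    release.countDown();                // let the running move finish
    movers.shutdown();
    try {
      movers.awaitTermination(10, TimeUnit.SECONDS);
    } catch (InterruptedException e) {
      Thread.currentThread().interrupt();
    }
    return pendingQ.size();
  }

  public static void main(String[] args) {
    System.out.println("left pending without cleanup: " + pendingAfterRejection(false));
    System.out.println("left pending with cleanup: " + pendingAfterRejection(true));
  }
}
```

In this model, a loop that sleeps until the pending queue is empty would never exit when removeOnReject is false, because the rejected move is never executed and never removed. Whether the real fix removes the skipped move or handles the rejection some other way is for the patch itself to show.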