[jira] [Commented] (HDFS-10906) Add unit tests for Trash with HDFS encryption zones
[ https://issues.apache.org/jira/browse/HDFS-10906?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15564335#comment-15564335 ] Hadoop QA commented on HDFS-10906: -- | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 24s{color} | {color:blue} Docker mode activated. {color} | | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s{color} | {color:green} The patch does not contain any @author tags. {color} | | {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 0s{color} | {color:green} The patch appears to include 2 new or modified test files. {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 9m 30s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 1m 0s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 32s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 1m 3s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 13s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 1m 55s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 47s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 0m 56s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 51s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 51s{color} | {color:green} the 
patch passed {color} | | {color:orange}-0{color} | {color:orange} checkstyle {color} | {color:orange} 0m 36s{color} | {color:orange} hadoop-hdfs-project/hadoop-hdfs: The patch generated 598 new + 0 unchanged - 0 fixed = 598 total (was 0) {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 1m 0s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 11s{color} | {color:green} the patch passed {color} | | {color:red}-1{color} | {color:red} whitespace {color} | {color:red} 0m 0s{color} | {color:red} The patch 468 line(s) with tabs. {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 2m 2s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 41s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} unit {color} | {color:green} 78m 5s{color} | {color:green} hadoop-hdfs in the patch passed. {color} | | {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 25s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} | | {color:black}{color} | {color:black} {color} | {color:black}101m 56s{color} | {color:black} {color} | \\ \\ || Subsystem || Report/Notes || | Docker | Image:yetus/hadoop:9560f25 | | JIRA Issue | HDFS-10906 | | JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12832592/HDFS-10906.000.patch | | Optional Tests | asflicense compile javac javadoc mvninstall mvnsite unit findbugs checkstyle | | uname | Linux ca54550115a4 3.13.0-96-generic #143-Ubuntu SMP Mon Aug 29 20:15:20 UTC 2016 x86_64 x86_64 x86_64 GNU/Linux | | Build tool | maven | | Personality | /testptch/hadoop/patchprocess/precommit/personality/provided.sh | | git revision | trunk / 96b1266 | | Default Java | 1.8.0_101 | | findbugs | v3.0.0 | | checkstyle | https://builds.apache.org/job/PreCommit-HDFS-Build/17095/artifact/patchprocess/diff-checkstyle-hadoop-hdfs-project_hadoop-hdfs.txt | | whitespace | https://builds.apache.org/job/PreCommit-HDFS-Build/17095/artifact/patchprocess/whitespace-tabs.txt | | Test Results | https://builds.apache.org/job/PreCommit-HDFS-Build/17095/testReport/ | | modules | C: hadoop-hdfs-project/hadoop-hdfs U: hadoop-hdfs-project/hadoop-hdfs | | Console output | https://builds.apache.org/job/PreCommit-HDFS-Build/17095/console | | Powered by | Apache Yetus 0.4.0-SNAPSHOT http://yetus.apache.org | This message was automatically generated. > Add unit tests for Trash with HDFS encryption zones > --- > > Key: HDFS-10906 > URL: https://issues.apache.org/jira/browse/HDFS-10906 > Project: Hadoop HDFS > Issue Type: Sub-task >
[jira] [Created] (HDFS-10994) Support "XOR-2-1-64k" policy in "hdfs erasurecode" command
SammiChen created HDFS-10994: Summary: Support "XOR-2-1-64k" policy in "hdfs erasurecode" command Key: HDFS-10994 URL: https://issues.apache.org/jira/browse/HDFS-10994 Project: Hadoop HDFS Issue Type: Task Reporter: SammiChen Assignee: SammiChen So far, "hdfs erasurecode" command supports three policies, RS-DEFAULT-3-2-64k, RS-DEFAULT-6-3-64k and RS-LEGACY-6-3-64k. This task is going to add XOR-2-1-64k policy to this command. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-10986) DFSAdmin should log detailed error message if any
[ https://issues.apache.org/jira/browse/HDFS-10986?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15564257#comment-15564257 ] Hadoop QA commented on HDFS-10986: -- | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 22s{color} | {color:blue} Docker mode activated. {color} | | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s{color} | {color:green} The patch does not contain any @author tags. {color} | | {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 0s{color} | {color:green} The patch appears to include 1 new or modified test files. {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 6m 48s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 43s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 28s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 51s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 12s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 1m 40s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 40s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 0m 44s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 41s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 41s{color} | {color:green} the 
patch passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 25s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 48s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 10s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s{color} | {color:green} The patch has no whitespace issues. {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 1m 46s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 37s{color} | {color:green} the patch passed {color} | | {color:red}-1{color} | {color:red} unit {color} | {color:red} 78m 19s{color} | {color:red} hadoop-hdfs in the patch failed. {color} | | {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 23s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} | | {color:black}{color} | {color:black} {color} | {color:black} 96m 46s{color} | {color:black} {color} | \\ \\ || Reason || Tests || | Failed junit tests | hadoop.hdfs.server.namenode.ha.TestDFSUpgradeWithHA | \\ \\ || Subsystem || Report/Notes || | Docker | Image:yetus/hadoop:9560f25 | | JIRA Issue | HDFS-10986 | | JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12832584/HDFS-10986.002.patch | | Optional Tests | asflicense compile javac javadoc mvninstall mvnsite unit findbugs checkstyle | | uname | Linux 7c60a3d22a5e 3.13.0-96-generic #143-Ubuntu SMP Mon Aug 29 20:15:20 UTC 2016 x86_64 x86_64 x86_64 GNU/Linux | | Build tool | maven | | Personality | /testptch/hadoop/patchprocess/precommit/personality/provided.sh | | git revision | trunk / 96b1266 | | Default Java | 1.8.0_101 | | findbugs | v3.0.0 | | unit | https://builds.apache.org/job/PreCommit-HDFS-Build/17094/artifact/patchprocess/patch-unit-hadoop-hdfs-project_hadoop-hdfs.txt | | Test Results | https://builds.apache.org/job/PreCommit-HDFS-Build/17094/testReport/ | | modules | C: hadoop-hdfs-project/hadoop-hdfs U: hadoop-hdfs-project/hadoop-hdfs | | Console output | https://builds.apache.org/job/PreCommit-HDFS-Build/17094/console | | Powered by | Apache Yetus 0.4.0-SNAPSHOT http://yetus.apache.org | This message was automatically generated. > DFSAdmin should log detailed error message if any > - > > Key: HDFS-10986 > URL: https://issues.apache.org/jira/browse/HDFS-10986 > Project: Hadoop HDFS > Issue Type: Bug > Components: tools >Reporter: Mingliang Liu >Assignee: Mingliang Liu > Attachments:
[jira] [Commented] (HDFS-10991) libhdfs : Client compilation is failing for hdfsTruncateFile API
[ https://issues.apache.org/jira/browse/HDFS-10991?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15564190#comment-15564190 ] Surendra Singh Lilhore commented on HDFS-10991: --- Thanks [~James C] for the review. > libhdfs : Client compilation is failing for hdfsTruncateFile API > - > > Key: HDFS-10991 > URL: https://issues.apache.org/jira/browse/HDFS-10991 > Project: Hadoop HDFS > Issue Type: Bug > Components: libhdfs >Affects Versions: 2.7.0 >Reporter: Surendra Singh Lilhore >Assignee: Surendra Singh Lilhore >Priority: Blocker > Attachments: HDFS-10991.patch > > > {noformat} > /tmp/ccJNUj6m.o: In function `main': > test.c:(.text+0x812): undefined reference to `hdfsTruncateFile' > collect2: ld returned 1 exit status > {noformat}
[jira] [Updated] (HDFS-10991) libhdfs : Client compilation is failing for hdfsTruncateFile API
[ https://issues.apache.org/jira/browse/HDFS-10991?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Surendra Singh Lilhore updated HDFS-10991: -- Affects Version/s: 2.7.0 > libhdfs : Client compilation is failing for hdfsTruncateFile API > - > > Key: HDFS-10991 > URL: https://issues.apache.org/jira/browse/HDFS-10991 > Project: Hadoop HDFS > Issue Type: Bug > Components: libhdfs >Affects Versions: 2.7.0 >Reporter: Surendra Singh Lilhore >Assignee: Surendra Singh Lilhore >Priority: Blocker > Attachments: HDFS-10991.patch > > > {noformat} > /tmp/ccJNUj6m.o: In function `main': > test.c:(.text+0x812): undefined reference to `hdfsTruncateFile' > collect2: ld returned 1 exit status > {noformat}
[jira] [Commented] (HDFS-10988) Refactor TestBalancerBandwidth
[ https://issues.apache.org/jira/browse/HDFS-10988?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15564168#comment-15564168 ] Brahma Reddy Battula commented on HDFS-10988: - [~liuml07] thanks for the review and commit. > Refactor TestBalancerBandwidth > -- > > Key: HDFS-10988 > URL: https://issues.apache.org/jira/browse/HDFS-10988 > Project: Hadoop HDFS > Issue Type: Sub-task > Components: balancer & mover, test >Reporter: Brahma Reddy Battula >Assignee: Brahma Reddy Battula > Fix For: 2.8.0, 3.0.0-alpha2 > > Attachments: HDFS-10988-002.patch, HDFS-10988.patch > > > This jira will deal with the following: > 1) Remove the fixed sleep > 2) Remove the unused dnproxy > 3) Use try-with-resources
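The two test-hygiene patterns listed above can be sketched in plain Java. This is a hypothetical illustration, not the actual HDFS-10988 patch: `FakeCluster`, `waitFor`, and `demo` are invented stand-ins (the real test uses MiniDFSCluster), but the shape is the same: poll for the expected state instead of a fixed `Thread.sleep`, and let try-with-resources shut the cluster down.

```java
import java.util.concurrent.atomic.AtomicLong;
import java.util.function.LongSupplier;

// Hypothetical sketch of the refactorings named in the jira, not Hadoop code.
public class RefactorSketch {

  // Stand-in for a resource such as MiniDFSCluster; the name is illustrative.
  static class FakeCluster implements AutoCloseable {
    final AtomicLong bandwidth = new AtomicLong(0);
    void setBalancerBandwidth(long b) { bandwidth.set(b); }
    @Override public void close() { /* shutdown work would go here */ }
  }

  // Poll until the observed value matches, instead of sleeping a fixed
  // interval and hoping the change has propagated by then.
  static boolean waitFor(LongSupplier actual, long expected, long timeoutMs) {
    long deadline = System.currentTimeMillis() + timeoutMs;
    while (System.currentTimeMillis() < deadline) {
      if (actual.getAsLong() == expected) {
        return true;
      }
      try {
        Thread.sleep(10);
      } catch (InterruptedException e) {
        Thread.currentThread().interrupt();
        return false;
      }
    }
    return false;
  }

  public static boolean demo() {
    // try-with-resources: the cluster is closed even if a check fails.
    try (FakeCluster cluster = new FakeCluster()) {
      cluster.setBalancerBandwidth(12345L);
      return waitFor(cluster.bandwidth::get, 12345L, 1000L);
    }
  }
}
```

The polling wait trades a fixed worst-case delay for an early exit as soon as the condition holds, which is why removing fixed sleeps usually makes a test both faster and less flaky.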
[jira] [Commented] (HDFS-10986) DFSAdmin should log detailed error message if any
[ https://issues.apache.org/jira/browse/HDFS-10986?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15564161#comment-15564161 ] Hadoop QA commented on HDFS-10986: -- | (/) *{color:green}+1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 12s{color} | {color:blue} Docker mode activated. {color} | | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s{color} | {color:green} The patch does not contain any @author tags. {color} | | {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 0s{color} | {color:green} The patch appears to include 1 new or modified test files. {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 7m 28s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 42s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 27s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 50s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 12s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 1m 38s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 40s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 0m 44s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 42s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 42s{color} | {color:green} 
the patch passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 24s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 48s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 10s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s{color} | {color:green} The patch has no whitespace issues. {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 1m 44s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 36s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} unit {color} | {color:green} 59m 20s{color} | {color:green} hadoop-hdfs in the patch passed. {color} | | {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 19s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} | | {color:black}{color} | {color:black} {color} | {color:black} 78m 11s{color} | {color:black} {color} | \\ \\ || Subsystem || Report/Notes || | Docker | Image:yetus/hadoop:9560f25 | | JIRA Issue | HDFS-10986 | | JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12832579/HDFS-10986.001.patch | | Optional Tests | asflicense compile javac javadoc mvninstall mvnsite unit findbugs checkstyle | | uname | Linux ba22a0b56351 3.13.0-93-generic #140-Ubuntu SMP Mon Jul 18 21:21:05 UTC 2016 x86_64 x86_64 x86_64 GNU/Linux | | Build tool | maven | | Personality | /testptch/hadoop/patchprocess/precommit/personality/provided.sh | | git revision | trunk / 96b1266 | | Default Java | 1.8.0_101 | | findbugs | v3.0.0 | | Test Results | https://builds.apache.org/job/PreCommit-HDFS-Build/17093/testReport/ | | modules | C: hadoop-hdfs-project/hadoop-hdfs U: hadoop-hdfs-project/hadoop-hdfs | | Console output | https://builds.apache.org/job/PreCommit-HDFS-Build/17093/console | | Powered by | Apache Yetus 0.4.0-SNAPSHOT http://yetus.apache.org | This message was automatically generated. > DFSAdmin should log detailed error message if any > - > > Key: HDFS-10986 > URL: https://issues.apache.org/jira/browse/HDFS-10986 > Project: Hadoop HDFS > Issue Type: Bug > Components: tools >Reporter: Mingliang Liu >Assignee: Mingliang Liu > Attachments: HDFS-10986-branch-2.8.002.patch, HDFS-10986.000.patch, > HDFS-10986.001.patch, HDFS-10986.002.patch > > > There are some subcommands in {{DFSAdmin}} that swallow IOException and give > very limited error message, if any, to the stderr. > {code} >
[jira] [Commented] (HDFS-10906) Add unit tests for Trash with HDFS encryption zones
[ https://issues.apache.org/jira/browse/HDFS-10906?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15564152#comment-15564152 ] Hanisha Koneru commented on HDFS-10906: --- Thank you [~xyao]. I have added two tests to cover the list of test cases. > Add unit tests for Trash with HDFS encryption zones > --- > > Key: HDFS-10906 > URL: https://issues.apache.org/jira/browse/HDFS-10906 > Project: Hadoop HDFS > Issue Type: Sub-task > Components: encryption >Affects Versions: 2.8.0 >Reporter: Xiaoyu Yao >Assignee: Hanisha Koneru > Attachments: HDFS-10906.000.patch > > > The goal is to improve unit test coverage for HDFS trash with encryption zone > especially under Kerberos environment. The current unit test > TestEncryptionZones#testEncryptionZonewithTrash() has limited coverage on > non-Kerberos case.
[jira] [Updated] (HDFS-10906) Add unit tests for Trash with HDFS encryption zones
[ https://issues.apache.org/jira/browse/HDFS-10906?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hanisha Koneru updated HDFS-10906: -- Status: Patch Available (was: Open) > Add unit tests for Trash with HDFS encryption zones > --- > > Key: HDFS-10906 > URL: https://issues.apache.org/jira/browse/HDFS-10906 > Project: Hadoop HDFS > Issue Type: Sub-task > Components: encryption >Affects Versions: 2.8.0 >Reporter: Xiaoyu Yao >Assignee: Hanisha Koneru > Attachments: HDFS-10906.000.patch > > > The goal is to improve unit test coverage for HDFS trash with encryption zone > especially under Kerberos environment. The current unit test > TestEncryptionZones#testEncryptionZonewithTrash() has limited coverage on > non-Kerberos case.
[jira] [Commented] (HDFS-9668) Optimize the locking in FsDatasetImpl
[ https://issues.apache.org/jira/browse/HDFS-9668?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15564153#comment-15564153 ] Jingcheng Du commented on HDFS-9668: Thanks [~eddyxu]! I will update the patch to address the comments after HADOOP-13702 is committed. I think I can coordinate this JIRA with HDFS-10804: the latest patch here does some similar things, and I believe the changes can address the concerns raised in HDFS-10804. Thanks. > Optimize the locking in FsDatasetImpl > - > > Key: HDFS-9668 > URL: https://issues.apache.org/jira/browse/HDFS-9668 > Project: Hadoop HDFS > Issue Type: Improvement > Components: datanode >Reporter: Jingcheng Du >Assignee: Jingcheng Du > Attachments: HDFS-9668-1.patch, HDFS-9668-10.patch, > HDFS-9668-11.patch, HDFS-9668-12.patch, HDFS-9668-13.patch, > HDFS-9668-14.patch, HDFS-9668-14.patch, HDFS-9668-15.patch, > HDFS-9668-16.patch, HDFS-9668-17.patch, HDFS-9668-2.patch, HDFS-9668-3.patch, > HDFS-9668-4.patch, HDFS-9668-5.patch, HDFS-9668-6.patch, HDFS-9668-7.patch, > HDFS-9668-8.patch, HDFS-9668-9.patch, execution_time.png > > > During the HBase test on a tiered storage of HDFS (WAL is stored in > SSD/RAMDISK, and all other files are stored in HDD), we observe many > long-time BLOCKED threads on FsDatasetImpl in DataNode.
The following is part > of the jstack result: > {noformat} > "DataXceiver for client DFSClient_NONMAPREDUCE_-1626037897_1 at > /192.168.50.16:48521 [Receiving block > BP-1042877462-192.168.50.13-1446173170517:blk_1073779272_40852]" - Thread > t@93336 >java.lang.Thread.State: BLOCKED > at > org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl.createRbw(FsDatasetImpl.java:) > - waiting to lock <18324c9> (a > org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl) owned by > "DataXceiver for client DFSClient_NONMAPREDUCE_-1626037897_1 at > /192.168.50.16:48520 [Receiving block > BP-1042877462-192.168.50.13-1446173170517:blk_1073779271_40851]" t@93335 > at > org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl.createRbw(FsDatasetImpl.java:113) > at > org.apache.hadoop.hdfs.server.datanode.BlockReceiver.<init>(BlockReceiver.java:183) > at > org.apache.hadoop.hdfs.server.datanode.DataXceiver.writeBlock(DataXceiver.java:615) > at > org.apache.hadoop.hdfs.protocol.datatransfer.Receiver.opWriteBlock(Receiver.java:137) > at > org.apache.hadoop.hdfs.protocol.datatransfer.Receiver.processOp(Receiver.java:74) > at > org.apache.hadoop.hdfs.server.datanode.DataXceiver.run(DataXceiver.java:235) > at java.lang.Thread.run(Thread.java:745) >Locked ownable synchronizers: > - None > > "DataXceiver for client DFSClient_NONMAPREDUCE_-1626037897_1 at > /192.168.50.16:48520 [Receiving block > BP-1042877462-192.168.50.13-1446173170517:blk_1073779271_40851]" - Thread > t@93335 >java.lang.Thread.State: RUNNABLE > at java.io.UnixFileSystem.createFileExclusively(Native Method) > at java.io.File.createNewFile(File.java:1012) > at > org.apache.hadoop.hdfs.server.datanode.DatanodeUtil.createTmpFile(DatanodeUtil.java:66) > at > org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.BlockPoolSlice.createRbwFile(BlockPoolSlice.java:271) > at > org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsVolumeImpl.createRbwFile(FsVolumeImpl.java:286) > at > 
org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl.createRbw(FsDatasetImpl.java:1140) > - locked <18324c9> (a > org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl) > at > org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl.createRbw(FsDatasetImpl.java:113) > at > org.apache.hadoop.hdfs.server.datanode.BlockReceiver.<init>(BlockReceiver.java:183) > at > org.apache.hadoop.hdfs.server.datanode.DataXceiver.writeBlock(DataXceiver.java:615) > at > org.apache.hadoop.hdfs.protocol.datatransfer.Receiver.opWriteBlock(Receiver.java:137) > at > org.apache.hadoop.hdfs.protocol.datatransfer.Receiver.processOp(Receiver.java:74) > at > org.apache.hadoop.hdfs.server.datanode.DataXceiver.run(DataXceiver.java:235) > at java.lang.Thread.run(Thread.java:745) >Locked ownable synchronizers: > - None > {noformat} > We measured the execution time of some operations in FsDatasetImpl during the > test; the result follows. > !execution_time.png! > Under heavy load, the finalizeBlock, addBlock and createRbw operations on HDD > take a really long time. > This means one slow finalizeBlock, addBlock or createRbw operation on a > slow storage can block all other such operations in the same DataNode, > especially in HBase when many wal/flusher/compactor threads are configured. >
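The contention above comes from every operation serializing on FsDatasetImpl's single monitor lock. A generic Java sketch (illustrative only, not the actual HDFS-9668 patch, whose lock structure is more involved) of the basic remedy: replace the coarse `synchronized` monitor with a `ReentrantReadWriteLock` so read-only lookups no longer queue behind slow mutating operations.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.locks.ReentrantReadWriteLock;

// Generic sketch of the locking refinement, not Hadoop code: a coarse
// "synchronized" dataset forces every caller through one monitor, while a
// read-write lock lets lookups run concurrently and only serializes writes.
public class LockSketch {
  private final Map<String, String> blocks = new HashMap<>();
  private final ReentrantReadWriteLock lock = new ReentrantReadWriteLock();

  // Mutation: exclusive write lock (analogous to createRbw/finalizeBlock).
  public void addBlock(String id, String meta) {
    lock.writeLock().lock();
    try {
      blocks.put(id, meta);
    } finally {
      lock.writeLock().unlock();
    }
  }

  // Read-only: shared read lock; many readers may hold it simultaneously.
  public String getBlock(String id) {
    lock.readLock().lock();
    try {
      return blocks.get(id);
    } finally {
      lock.readLock().unlock();
    }
  }
}
```

Finer granularity still, such as per-volume or per-block-pool locks so a slow HDD operation cannot stall SSD/RAMDISK traffic, is the direction this jira and HDFS-10804 discuss; the sketch only shows the read/write split.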
[jira] [Updated] (HDFS-10906) Add unit tests for Trash with HDFS encryption zones
[ https://issues.apache.org/jira/browse/HDFS-10906?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hanisha Koneru updated HDFS-10906: -- Attachment: HDFS-10906.000.patch > Add unit tests for Trash with HDFS encryption zones > --- > > Key: HDFS-10906 > URL: https://issues.apache.org/jira/browse/HDFS-10906 > Project: Hadoop HDFS > Issue Type: Sub-task > Components: encryption >Affects Versions: 2.8.0 >Reporter: Xiaoyu Yao >Assignee: Hanisha Koneru > Attachments: HDFS-10906.000.patch > > > The goal is to improve unit test coverage for HDFS trash with encryption zone > especially under Kerberos environment. The current unit test > TestEncryptionZones#testEncryptionZonewithTrash() has limited coverage on > non-Kerberos case.
[jira] [Updated] (HDFS-10986) DFSAdmin should log detailed error message if any
[ https://issues.apache.org/jira/browse/HDFS-10986?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Mingliang Liu updated HDFS-10986: - Attachment: HDFS-10986-branch-2.8.002.patch > DFSAdmin should log detailed error message if any > - > > Key: HDFS-10986 > URL: https://issues.apache.org/jira/browse/HDFS-10986 > Project: Hadoop HDFS > Issue Type: Bug > Components: tools >Reporter: Mingliang Liu >Assignee: Mingliang Liu > Attachments: HDFS-10986-branch-2.8.002.patch, HDFS-10986.000.patch, > HDFS-10986.001.patch, HDFS-10986.002.patch > > > There are some subcommands in {{DFSAdmin}} that swallow IOException and give > very limited error message, if any, to the stderr. > {code} > $ hdfs dfsadmin -getBalancerBandwidth 127.0.0.1:9866 > Datanode unreachable. > $ hdfs dfsadmin -getDatanodeInfo localhost:9866 > Datanode unreachable. > $ hdfs dfsadmin -evictWriters 127.0.0.1:9866 > $ echo $? > -1 > {code} > User is not able to get the exception stack even the LOG level is DEBUG. This > is not very user friendly. Fortunately, if the port number is not accessible > (say ), users can infer the detailed error message by IPC logs: > {code} > $ hdfs dfsadmin -getBalancerBandwidth 127.0.0.1: > 2016-10-07 18:01:35,115 WARN util.NativeCodeLoader: Unable to load > native-hadoop library for your platform... using builtin-java classes where > applicable > 2016-10-07 18:01:36,335 INFO ipc.Client: Retrying connect to server: > localhost/127.0.0.1:. Already tried 0 time(s); retry policy is > RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 > MILLISECONDS) > . > 2016-10-07 18:01:45,361 INFO ipc.Client: Retrying connect to server: > localhost/127.0.0.1:. 
Already tried 9 time(s); retry policy is > RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 > MILLISECONDS) > 2016-10-07 18:01:45,362 WARN ipc.Client: Failed to connect to server: > localhost/127.0.0.1:: retries get failed due to exceeded maximum allowed > retries number: 10 > java.net.ConnectException: Connection refused > at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method) > at > sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:717) > ... > at org.apache.hadoop.hdfs.tools.DFSAdmin.run(DFSAdmin.java:2073) > at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:76) > at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:90) > at org.apache.hadoop.hdfs.tools.DFSAdmin.main(DFSAdmin.java:2225) > Datanode unreachable. > {code} > We should fix this by providing detailed error message. Actually, the > {{DFSAdmin#run}} already handles exception carefully, including: > # set the exit ret value to -1 > # print the error message > # log the exception stack trace (in DEBUG level) > All we need to do is to not swallow exceptions without good reason.
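The fix described above can be sketched in a few lines of self-contained Java. All names here are hypothetical stand-ins (the real code lives in DFSAdmin): the point is that a subcommand should let the IOException propagate to the one dispatcher that already prints the message, logs the stack at DEBUG, and sets the exit value to -1, rather than catching it and printing a generic line.

```java
import java.io.IOException;

// Hypothetical sketch of the anti-pattern and the fix, not the DFSAdmin source.
public class ExitSketch {

  // Anti-pattern: swallow the cause and print a generic message; the real
  // reason ("Connection refused") never reaches the user.
  static int getBalancerBandwidthSwallowing() {
    try {
      throw new IOException("Connection refused");
    } catch (IOException e) {
      System.err.println("Datanode unreachable.");
      return -1; // detail lost here
    }
  }

  // Fix: declare the exception and let it propagate to the dispatcher.
  static void getBalancerBandwidth() throws IOException {
    throw new IOException("Connection refused");
  }

  // Analogous to a run() dispatcher: one place that surfaces the detailed
  // message and maps any failure to exit value -1.
  public static int run() {
    try {
      getBalancerBandwidth();
      return 0;
    } catch (IOException e) {
      System.err.println("getBalancerBandwidth: " + e.getMessage());
      return -1;
    }
  }
}
```

Centralizing the handling this way means every subcommand gets consistent error output and exit codes for free, which is the rationale the issue description gives.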
[jira] [Updated] (HDFS-10986) DFSAdmin should log detailed error message if any
[ https://issues.apache.org/jira/browse/HDFS-10986?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Mingliang Liu updated HDFS-10986: - Attachment: HDFS-10986.002.patch > DFSAdmin should log detailed error message if any > - > > Key: HDFS-10986 > URL: https://issues.apache.org/jira/browse/HDFS-10986 > Project: Hadoop HDFS > Issue Type: Bug > Components: tools >Reporter: Mingliang Liu >Assignee: Mingliang Liu > Attachments: HDFS-10986-branch-2.8.002.patch, HDFS-10986.000.patch, > HDFS-10986.001.patch, HDFS-10986.002.patch > > > There are some subcommands in {{DFSAdmin}} that swallow IOException and give > very limited error message, if any, to the stderr. > {code} > $ hdfs dfsadmin -getBalancerBandwidth 127.0.0.1:9866 > Datanode unreachable. > $ hdfs dfsadmin -getDatanodeInfo localhost:9866 > Datanode unreachable. > $ hdfs dfsadmin -evictWriters 127.0.0.1:9866 > $ echo $? > -1 > {code} > User is not able to get the exception stack even the LOG level is DEBUG. This > is not very user friendly. Fortunately, if the port number is not accessible > (say ), users can infer the detailed error message by IPC logs: > {code} > $ hdfs dfsadmin -getBalancerBandwidth 127.0.0.1: > 2016-10-07 18:01:35,115 WARN util.NativeCodeLoader: Unable to load > native-hadoop library for your platform... using builtin-java classes where > applicable > 2016-10-07 18:01:36,335 INFO ipc.Client: Retrying connect to server: > localhost/127.0.0.1:. Already tried 0 time(s); retry policy is > RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 > MILLISECONDS) > . > 2016-10-07 18:01:45,361 INFO ipc.Client: Retrying connect to server: > localhost/127.0.0.1:. 
Already tried 9 time(s); retry policy is > RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 > MILLISECONDS) > 2016-10-07 18:01:45,362 WARN ipc.Client: Failed to connect to server: > localhost/127.0.0.1:: retries get failed due to exceeded maximum allowed > retries number: 10 > java.net.ConnectException: Connection refused > at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method) > at > sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:717) > ... > at org.apache.hadoop.hdfs.tools.DFSAdmin.run(DFSAdmin.java:2073) > at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:76) > at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:90) > at org.apache.hadoop.hdfs.tools.DFSAdmin.main(DFSAdmin.java:2225) > Datanode unreachable. > {code} > We should fix this by providing detailed error message. Actually, the > {{DFSAdmin#run}} already handles exception carefully, including: > # set the exit ret value to -1 > # print the error message > # log the exception stack trace (in DEBUG level) > All we need to do is to not swallow exceptions without good reason.
[jira] [Updated] (HDFS-10986) DFSAdmin should log detailed error message if any
[ https://issues.apache.org/jira/browse/HDFS-10986?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Mingliang Liu updated HDFS-10986: - Target Version/s: 2.8.0 (was: 3.0.0-alpha2) > DFSAdmin should log detailed error message if any > - > > Key: HDFS-10986 > URL: https://issues.apache.org/jira/browse/HDFS-10986 > Project: Hadoop HDFS > Issue Type: Bug > Components: tools >Reporter: Mingliang Liu >Assignee: Mingliang Liu > Attachments: HDFS-10986.000.patch, HDFS-10986.001.patch > > > There are some subcommands in {{DFSAdmin}} that swallow IOException and give > very limited error message, if any, to the stderr. > {code} > $ hdfs dfsadmin -getBalancerBandwidth 127.0.0.1:9866 > Datanode unreachable. > $ hdfs dfsadmin -getDatanodeInfo localhost:9866 > Datanode unreachable. > $ hdfs dfsadmin -evictWriters 127.0.0.1:9866 > $ echo $? > -1 > {code} > User is not able to get the exception stack even the LOG level is DEBUG. This > is not very user friendly. Fortunately, if the port number is not accessible > (say ), users can infer the detailed error message by IPC logs: > {code} > $ hdfs dfsadmin -getBalancerBandwidth 127.0.0.1: > 2016-10-07 18:01:35,115 WARN util.NativeCodeLoader: Unable to load > native-hadoop library for your platform... using builtin-java classes where > applicable > 2016-10-07 18:01:36,335 INFO ipc.Client: Retrying connect to server: > localhost/127.0.0.1:. Already tried 0 time(s); retry policy is > RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 > MILLISECONDS) > . > 2016-10-07 18:01:45,361 INFO ipc.Client: Retrying connect to server: > localhost/127.0.0.1:. 
Already tried 9 time(s); retry policy is > RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 > MILLISECONDS) > 2016-10-07 18:01:45,362 WARN ipc.Client: Failed to connect to server: > localhost/127.0.0.1:: retries get failed due to exceeded maximum allowed > retries number: 10 > java.net.ConnectException: Connection refused > at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method) > at > sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:717) > ... > at org.apache.hadoop.hdfs.tools.DFSAdmin.run(DFSAdmin.java:2073) > at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:76) > at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:90) > at org.apache.hadoop.hdfs.tools.DFSAdmin.main(DFSAdmin.java:2225) > Datanode unreachable. > {code} > We should fix this by providing detailed error message. Actually, the > {{DFSAdmin#run}} already handles exception carefully, including: > # set the exit ret value to -1 > # print the error message > # log the exception stack trace (in DEBUG level) > All we need to do is to not swallow exceptions without good reason.
[jira] [Updated] (HDFS-10972) Add unit test for HDFS command 'dfsadmin -getDatanodeInfo'
[ https://issues.apache.org/jira/browse/HDFS-10972?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Mingliang Liu updated HDFS-10972: - Resolution: Fixed Hadoop Flags: Reviewed Fix Version/s: 2.8.0 Status: Resolved (was: Patch Available) Committed to {{branch-2.8}}. Thanks for your contribution, [~xiaobingo]. > Add unit test for HDFS command 'dfsadmin -getDatanodeInfo' > -- > > Key: HDFS-10972 > URL: https://issues.apache.org/jira/browse/HDFS-10972 > Project: Hadoop HDFS > Issue Type: Sub-task > Components: fs, shell, test >Reporter: Xiaobing Zhou >Assignee: Xiaobing Zhou > Fix For: 2.8.0, 3.0.0-alpha2 > > Attachments: HDFS-10972-branch-2.8.003.patch, HDFS-10972.000.patch, > HDFS-10972.001.patch, HDFS-10972.002.patch, HDFS-10972.003.patch > > > getDatanodeInfo should be tested in admin CLI.
[jira] [Commented] (HDFS-10933) Refactor TestFsck
[ https://issues.apache.org/jira/browse/HDFS-10933?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15564047#comment-15564047 ] Takanobu Asanuma commented on HDFS-10933: - Thank you for reviewing and committing, [~jojochuang]! I will create a branch-2 patch soon. > Refactor TestFsck > - > > Key: HDFS-10933 > URL: https://issues.apache.org/jira/browse/HDFS-10933 > Project: Hadoop HDFS > Issue Type: Improvement >Reporter: Takanobu Asanuma >Assignee: Takanobu Asanuma >Priority: Minor > Attachments: HDFS-10933.1.patch, HDFS-10933.2.patch, > HDFS-10933.3.patch, HDFS-10933.WIP.1.patch > > > {{TestFsck}} should be refactored. > - use @Before @After annotations > - improve loggings > - fix checkstyle warnings > etc.
[jira] [Comment Edited] (HDFS-10965) Add unit test for HDFS command 'dfsadmin -printTopology'
[ https://issues.apache.org/jira/browse/HDFS-10965?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15563889#comment-15563889 ] Mingliang Liu edited comment on HDFS-10965 at 10/11/16 12:43 AM: - {code} 384 /* init reused vars */ 385 List outs = null; 386 int ret; 387 388 /** 389* test normal run 390*/ {code} No reuse found. Make them final. {code} 400 assertEquals( 401 "three lines per Datanode: the 1st line is rack info, 2nd node info," 402 + " 3rd empty line.", 403 12, outs.size()); {code} "There should be three lines per Datanode: the 1st line " 12 -> 3 * numDn {code} 376 /* init cluster using topology */ 377 try (MiniDFSCluster miniCluster = new MiniDFSCluster.Builder(dfsConf) 378 .numDataNodes(numDn).racks(racks).build()) { {code} You created a new MiniDFSCluster in the test using the default cluster directory, which conflicts with the pre-setup class variable cluster (e.g. not able to find the edits dir etc). The reason is that the MiniDFSCluster will format every time we build a new one. Please have a look at [HDFS-10986] for more information to use the pre-set {{cluster}}. was (Author: liuml07): {code} 384 /* init reused vars */ 385 List outs = null; 386 int ret; 387 388 /** 389* test normal run 390*/ {code} No reuse found. Make them final. {code} 400 assertEquals( 401 "three lines per Datanode: the 1st line is rack info, 2nd node info," 402 + " 3rd empty line.", 403 12, outs.size()); {code} "There should be three lines per Datanode: the 1st line " 12 -> 3 * numDn Otherwise +1 pending on Jenkins. > Add unit test for HDFS command 'dfsadmin -printTopology' > > > Key: HDFS-10965 > URL: https://issues.apache.org/jira/browse/HDFS-10965 > Project: Hadoop HDFS > Issue Type: Sub-task > Components: fs, shell, test >Reporter: Xiaobing Zhou >Assignee: Xiaobing Zhou > Attachments: HDFS-10965.000.patch, HDFS-10965.001.patch, > HDFS-10965.002.patch, HDFS-10965.003.patch, HDFS-10965.004.patch > > > DFSAdmin#printTopology should also be tested. 
This proposes adding it in > TestDFSAdmin.
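The assertion change suggested in the review above (derive the expected count from numDn instead of hard-coding 12) can be shown in isolation. This is a stand-alone sketch, not the actual TestDFSAdmin fixture: outs here is a fabricated stand-in for the captured -printTopology output.

```java
import java.util.ArrayList;
import java.util.List;

// Stand-alone illustration of the review suggestion on HDFS-10965:
// expect 3 * numDn output lines rather than the literal 12.
public class TopologyAssertSketch {
    public static void main(String[] args) {
        int numDn = 4;
        List<String> outs = new ArrayList<>();
        for (int i = 0; i < numDn; i++) {
            outs.add("Rack: /rack" + i);              // 1st line: rack info
            outs.add("127.0.0.1:9866 (datanode" + i + ")"); // 2nd line: node info
            outs.add("");                             // 3rd line: empty separator
        }
        // Tying the expectation to numDn keeps the test valid if the
        // datanode count ever changes.
        if (outs.size() != 3 * numDn) {
            throw new AssertionError(
                "There should be three lines per Datanode in the output");
        }
    }
}
```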
[jira] [Updated] (HDFS-10986) DFSAdmin should log detailed error message if any
[ https://issues.apache.org/jira/browse/HDFS-10986?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Mingliang Liu updated HDFS-10986: - Attachment: HDFS-10986.001.patch Thanks for your review, [~brahmareddy]. The v1 patch is to address both of your comments (which are very valid). [~xiaobingo] In [HDFS-10972] you created a new MiniDFSCluster in the test using the default cluster directory, which conflicts with the pre-setup class variable {{cluster}} (e.g. not able to find the edits dir etc). The reason is that the MiniDFSCluster will format every time we build a new one. Your test can pass regardless of this. This v1 patch, while adding more failing cases, also addressed that problem. Please confirm and review the test here. Thanks. > DFSAdmin should log detailed error message if any > - > > Key: HDFS-10986 > URL: https://issues.apache.org/jira/browse/HDFS-10986 > Project: Hadoop HDFS > Issue Type: Bug > Components: tools >Reporter: Mingliang Liu >Assignee: Mingliang Liu > Attachments: HDFS-10986.000.patch, HDFS-10986.001.patch > > > There are some subcommands in {{DFSAdmin}} that swallow IOException and give > very limited error message, if any, to the stderr. > {code} > $ hdfs dfsadmin -getBalancerBandwidth 127.0.0.1:9866 > Datanode unreachable. > $ hdfs dfsadmin -getDatanodeInfo localhost:9866 > Datanode unreachable. > $ hdfs dfsadmin -evictWriters 127.0.0.1:9866 > $ echo $? > -1 > {code} > User is not able to get the exception stack even the LOG level is DEBUG. This > is not very user friendly. Fortunately, if the port number is not accessible > (say ), users can infer the detailed error message by IPC logs: > {code} > $ hdfs dfsadmin -getBalancerBandwidth 127.0.0.1: > 2016-10-07 18:01:35,115 WARN util.NativeCodeLoader: Unable to load > native-hadoop library for your platform... using builtin-java classes where > applicable > 2016-10-07 18:01:36,335 INFO ipc.Client: Retrying connect to server: > localhost/127.0.0.1:. 
Already tried 0 time(s); retry policy is > RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 > MILLISECONDS) > . > 2016-10-07 18:01:45,361 INFO ipc.Client: Retrying connect to server: > localhost/127.0.0.1:. Already tried 9 time(s); retry policy is > RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 > MILLISECONDS) > 2016-10-07 18:01:45,362 WARN ipc.Client: Failed to connect to server: > localhost/127.0.0.1:: retries get failed due to exceeded maximum allowed > retries number: 10 > java.net.ConnectException: Connection refused > at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method) > at > sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:717) > ... > at org.apache.hadoop.hdfs.tools.DFSAdmin.run(DFSAdmin.java:2073) > at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:76) > at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:90) > at org.apache.hadoop.hdfs.tools.DFSAdmin.main(DFSAdmin.java:2225) > Datanode unreachable. > {code} > We should fix this by providing detailed error message. Actually, the > {{DFSAdmin#run}} already handles exception carefully, including: > # set the exit ret value to -1 > # print the error message > # log the exception stack trace (in DEBUG level) > All we need to do is to not swallow exceptions without good reason. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-10903) Replace config key literal strings with config key names II: hadoop hdfs
[ https://issues.apache.org/jira/browse/HDFS-10903?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15564009#comment-15564009 ] Mingliang Liu commented on HDFS-10903: -- {{ipc.client.connect.max.retries}} is used in some tests as string literals (e.g. {{TestFileAppend4}}), please also address them in this JIRA. Thanks. > Replace config key literal strings with config key names II: hadoop hdfs > > > Key: HDFS-10903 > URL: https://issues.apache.org/jira/browse/HDFS-10903 > Project: Hadoop HDFS > Issue Type: Improvement >Reporter: Mingliang Liu >Assignee: Chen Liang >Priority: Minor > Attachments: HADOOP-13644.001.patch, HDFS-10903.002.patch > > > In *Hadoop HDFS*, there are some places that use config key literal strings > instead of config key names, e.g. > {code:title=IOUtils.java} > copyBytes(in, out, conf.getInt("io.file.buffer.size", 4096), true); > {code} > We should replace places like this. 
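The kind of change HDFS-10903 asks for can be sketched generically. The Configuration and ConfigKeys classes below are minimal stand-ins written for this sketch; in real Hadoop code the named constant would come from the project's config-key classes (e.g. CommonConfigurationKeysPublic for io.file.buffer.size).

```java
import java.util.HashMap;
import java.util.Map;

// Minimal stand-ins used only to illustrate replacing a config key
// literal with a named constant; not the real Hadoop classes.
public class ConfigKeySketch {
    static final class ConfigKeys {
        static final String IO_FILE_BUFFER_SIZE_KEY = "io.file.buffer.size";
        static final int IO_FILE_BUFFER_SIZE_DEFAULT = 4096;
    }

    static final class Configuration {
        private final Map<String, String> props = new HashMap<>();
        int getInt(String key, int def) {
            String v = props.get(key);
            return v == null ? def : Integer.parseInt(v);
        }
    }

    public static void main(String[] args) {
        Configuration conf = new Configuration();
        // Before: conf.getInt("io.file.buffer.size", 4096)
        // After: key and default live in one place, so a typo in a
        // scattered literal cannot silently fall back to the default.
        int bufferSize = conf.getInt(ConfigKeys.IO_FILE_BUFFER_SIZE_KEY,
            ConfigKeys.IO_FILE_BUFFER_SIZE_DEFAULT);
        if (bufferSize != ConfigKeys.IO_FILE_BUFFER_SIZE_DEFAULT) {
            throw new AssertionError();
        }
    }
}
```

The same reasoning applies to the test-only literals mentioned in the comment, such as ipc.client.connect.max.retries in TestFileAppend4.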
[jira] [Commented] (HDFS-10965) Add unit test for HDFS command 'dfsadmin -printTopology'
[ https://issues.apache.org/jira/browse/HDFS-10965?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15563986#comment-15563986 ] Hadoop QA commented on HDFS-10965: -- | (/) *{color:green}+1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 15s{color} | {color:blue} Docker mode activated. {color} | | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s{color} | {color:green} The patch does not contain any @author tags. {color} | | {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 1s{color} | {color:green} The patch appears to include 1 new or modified test files. {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 8m 45s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 53s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 32s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 1m 10s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 16s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 2m 12s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 47s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 0m 58s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 56s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 56s{color} | {color:green} 
the patch passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 30s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 1m 3s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 12s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s{color} | {color:green} The patch has no whitespace issues. {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 1m 52s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 38s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} unit {color} | {color:green} 57m 43s{color} | {color:green} hadoop-hdfs in the patch passed. {color} | | {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 17s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} | | {color:black}{color} | {color:black} {color} | {color:black} 80m 11s{color} | {color:black} {color} | \\ \\ || Subsystem || Report/Notes || | Docker | Image:yetus/hadoop:9560f25 | | JIRA Issue | HDFS-10965 | | JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12832554/HDFS-10965.004.patch | | Optional Tests | asflicense compile javac javadoc mvninstall mvnsite unit findbugs checkstyle | | uname | Linux a7b78250c754 3.13.0-95-generic #142-Ubuntu SMP Fri Aug 12 17:00:09 UTC 2016 x86_64 x86_64 x86_64 GNU/Linux | | Build tool | maven | | Personality | /testptch/hadoop/patchprocess/precommit/personality/provided.sh | | git revision | trunk / 96b1266 | | Default Java | 1.8.0_101 | | findbugs | v3.0.0 | | Test Results | https://builds.apache.org/job/PreCommit-HDFS-Build/17092/testReport/ | | modules | C: hadoop-hdfs-project/hadoop-hdfs U: hadoop-hdfs-project/hadoop-hdfs | | Console output | https://builds.apache.org/job/PreCommit-HDFS-Build/17092/console | | Powered by | Apache Yetus 0.4.0-SNAPSHOT http://yetus.apache.org | This message was automatically generated. > Add unit test for HDFS command 'dfsadmin -printTopology' > > > Key: HDFS-10965 > URL: https://issues.apache.org/jira/browse/HDFS-10965 > Project: Hadoop HDFS > Issue Type: Sub-task > Components: fs, shell, test >Reporter: Xiaobing Zhou >Assignee: Xiaobing Zhou > Attachments: HDFS-10965.000.patch, HDFS-10965.001.patch, > HDFS-10965.002.patch, HDFS-10965.003.patch, HDFS-10965.004.patch > > > DFSAdmin#printTopology should also be tested. This proposes adding it in > TestDFSAdmin. -- This
[jira] [Commented] (HDFS-10967) Add configuration for BlockPlacementPolicy to avoid near-full DataNodes
[ https://issues.apache.org/jira/browse/HDFS-10967?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15563969#comment-15563969 ] Hadoop QA commented on HDFS-10967: -- | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 18s{color} | {color:blue} Docker mode activated. {color} | | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s{color} | {color:green} The patch does not contain any @author tags. {color} | | {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 0s{color} | {color:green} The patch appears to include 2 new or modified test files. {color} | | {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 29s{color} | {color:blue} Maven dependency ordering for branch {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 7m 19s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 1m 30s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 44s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 1m 29s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 25s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 3m 13s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 1s{color} | {color:green} trunk passed {color} | | {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 7s{color} | {color:blue} Maven dependency ordering for patch {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 1m 
20s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 1m 21s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} cc {color} | {color:green} 1m 21s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 1m 21s{color} | {color:green} the patch passed {color} | | {color:orange}-0{color} | {color:orange} checkstyle {color} | {color:orange} 0m 43s{color} | {color:orange} hadoop-hdfs-project: The patch generated 14 new + 1052 unchanged - 1 fixed = 1066 total (was 1053) {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 1m 25s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 19s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s{color} | {color:green} The patch has no whitespace issues. {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 3m 32s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 55s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} unit {color} | {color:green} 0m 55s{color} | {color:green} hadoop-hdfs-client in the patch passed. {color} | | {color:red}-1{color} | {color:red} unit {color} | {color:red} 63m 1s{color} | {color:red} hadoop-hdfs in the patch failed. {color} | | {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 21s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} | | {color:black}{color} | {color:black} {color} | {color:black} 91m 48s{color} | {color:black} {color} | \\ \\ || Reason || Tests || | Failed junit tests | hadoop.tools.TestHdfsConfigFields | | Timed out junit tests | org.apache.hadoop.hdfs.TestLeaseRecovery2 | \\ \\ || Subsystem || Report/Notes || | Docker | Image:yetus/hadoop:9560f25 | | JIRA Issue | HDFS-10967 | | JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12832548/HDFS-10967.03.patch | | Optional Tests | asflicense compile javac javadoc mvninstall mvnsite unit findbugs checkstyle cc | | uname | Linux 7bb128d54181 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux | | Build tool | maven | | Personality | /testptch/hadoop/patchprocess/precommit/personality/provided.sh | | git revision | trunk / 96b1266 | | Default Java | 1.8.0_101 | | findbugs | v3.0.0 | | checkstyle | https://builds.apache.org/job/PreCommit-HDFS-Build/17091/artifact/patchprocess/diff-checkstyle-hadoop-hdfs-project.txt | | unit |
[jira] [Commented] (HDFS-10972) Add unit test for HDFS command 'dfsadmin -getDatanodeInfo'
[ https://issues.apache.org/jira/browse/HDFS-10972?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15563946#comment-15563946 ] Hadoop QA commented on HDFS-10972: -- | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 17m 50s{color} | {color:blue} Docker mode activated. {color} | | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s{color} | {color:green} The patch does not contain any @author tags. {color} | | {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 0s{color} | {color:green} The patch appears to include 1 new or modified test files. {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 8m 45s{color} | {color:green} branch-2.8 passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 38s{color} | {color:green} branch-2.8 passed with JDK v1.8.0_101 {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 43s{color} | {color:green} branch-2.8 passed with JDK v1.7.0_111 {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 21s{color} | {color:green} branch-2.8 passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 54s{color} | {color:green} branch-2.8 passed {color} | | {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 16s{color} | {color:green} branch-2.8 passed {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 1m 56s{color} | {color:green} branch-2.8 passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 55s{color} | {color:green} branch-2.8 passed with JDK v1.8.0_101 {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 34s{color} | {color:green} branch-2.8 passed with JDK v1.7.0_111 
{color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 0m 43s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 36s{color} | {color:green} the patch passed with JDK v1.8.0_101 {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 36s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 40s{color} | {color:green} the patch passed with JDK v1.7.0_111 {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 40s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 17s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 49s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 12s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s{color} | {color:green} The patch has no whitespace issues. {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 2m 6s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 53s{color} | {color:green} the patch passed with JDK v1.8.0_101 {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 33s{color} | {color:green} the patch passed with JDK v1.7.0_111 {color} | | {color:red}-1{color} | {color:red} unit {color} | {color:red} 64m 44s{color} | {color:red} hadoop-hdfs in the patch failed with JDK v1.7.0_111. {color} | | {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 32s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} | | {color:black}{color} | {color:black} {color} | {color:black}172m 30s{color} | {color:black} {color} | \\ \\ || Reason || Tests || | JDK v1.8.0_101 Failed junit tests | hadoop.hdfs.server.namenode.TestNameNodeMetadataConsistency | | | hadoop.hdfs.server.namenode.snapshot.TestSnapshotFileLength | | | hadoop.hdfs.server.datanode.fsdataset.impl.TestLazyPersistReplicaRecovery | | JDK v1.7.0_111 Failed junit tests | hadoop.hdfs.server.datanode.TestDataNodeHotSwapVolumes | \\ \\ || Subsystem || Report/Notes || | Docker | Image:yetus/hadoop:5af2af1 | | JIRA Issue | HDFS-10972 | | JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12832539/HDFS-10972-branch-2.8.003.patch | | Optional Tests | asflicense compile javac javadoc mvninstall mvnsite unit findbugs checkstyle | | uname | Linux bca529a9fc97 3.13.0-96-generic #143-Ubuntu SMP Mon Aug 29 20:15:20 UTC 2016
[jira] [Updated] (HDFS-10986) DFSAdmin should log detailed error message if any
[ https://issues.apache.org/jira/browse/HDFS-10986?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Mingliang Liu updated HDFS-10986: - Description: There are some subcommands in {{DFSAdmin}} that swallow IOException and give very limited error message, if any, to the stderr. {code} $ hdfs dfsadmin -getBalancerBandwidth 127.0.0.1:9866 Datanode unreachable. $ hdfs dfsadmin -getDatanodeInfo localhost:9866 Datanode unreachable. $ hdfs dfsadmin -evictWriters 127.0.0.1:9866 $ echo $? -1 {code} User is not able to get the exception stack even the LOG level is DEBUG. This is not very user friendly. Fortunately, if the port number is not accessible (say ), users can infer the detailed error message by IPC logs: {code} $ hdfs dfsadmin -getBalancerBandwidth 127.0.0.1: 2016-10-07 18:01:35,115 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable 2016-10-07 18:01:36,335 INFO ipc.Client: Retrying connect to server: localhost/127.0.0.1:. Already tried 0 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 MILLISECONDS) . 2016-10-07 18:01:45,361 INFO ipc.Client: Retrying connect to server: localhost/127.0.0.1:. Already tried 9 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 MILLISECONDS) 2016-10-07 18:01:45,362 WARN ipc.Client: Failed to connect to server: localhost/127.0.0.1:: retries get failed due to exceeded maximum allowed retries number: 10 java.net.ConnectException: Connection refused at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method) at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:717) ... at org.apache.hadoop.hdfs.tools.DFSAdmin.run(DFSAdmin.java:2073) at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:76) at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:90) at org.apache.hadoop.hdfs.tools.DFSAdmin.main(DFSAdmin.java:2225) Datanode unreachable. 
{code} We should fix this by providing detailed error message. Actually, the {{DFSAdmin#run}} already handles exception carefully, including: # set the exit ret value to -1 # print the error message # log the exception stack trace (in DEBUG level) All we need to do is to not swallow exceptions without good reason. was: There are some subcommands in {{DFSAdmin}} that swallow IOException and give very limited error message, if any, to the stderr. {code} $ hdfs dfsadmin -getBalancerBandwidth 127.0.0.1:9866 Datanode unreachable. $ hdfs dfsadmin -getDatanodeInfo localhost:9866 Datanode unreachable. $ hdfs dfsadmin -evictWriters 127.0.0.1:9866 $ echo $? -1 {code} User is not able to get the exception stack even the LOG level is DEBUG. This is not very user friendly. Fortunately, if the port number is not accessible (say ), users can infer the detailed error message by IPC logs: {code} $ hdfs dfsadmin -getBalancerBandwidth 127.0.0.1: 2016-10-07 18:01:35,115 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable 2016-10-07 18:01:36,335 INFO ipc.Client: Retrying connect to server: localhost/127.0.0.1:9690. Already tried 0 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 MILLISECONDS) . 2016-10-07 18:01:45,361 INFO ipc.Client: Retrying connect to server: localhost/127.0.0.1:9690. Already tried 9 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 MILLISECONDS) 2016-10-07 18:01:45,362 WARN ipc.Client: Failed to connect to server: localhost/127.0.0.1:9690: retries get failed due to exceeded maximum allowed retries number: 10 java.net.ConnectException: Connection refused at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method) at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:717) ... 
at org.apache.hadoop.hdfs.tools.DFSAdmin.run(DFSAdmin.java:2073) at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:76) at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:90) at org.apache.hadoop.hdfs.tools.DFSAdmin.main(DFSAdmin.java:2225) Datanode unreachable. {code} We should fix this by providing detailed error message. Actually, the {{DFSAdmin#run}} already handles exception carefully, including: # set the exit ret value to -1 # print the error message # log the exception stack trace (in DEBUG level) All we need to do is to not swallow exceptions without good reason. > DFSAdmin should log detailed error message if any > - > > Key: HDFS-10986 > URL: https://issues.apache.org/jira/browse/HDFS-10986 > Project: Hadoop HDFS > Issue Type: Bug > Components: tools >
[jira] [Commented] (HDFS-10965) Add unit test for HDFS command 'dfsadmin -printTopology'
[ https://issues.apache.org/jira/browse/HDFS-10965?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15563889#comment-15563889 ] Mingliang Liu commented on HDFS-10965: -- {code} 384 /* init reused vars */ 385 List outs = null; 386 int ret; 387 388 /** 389* test normal run 390*/ {code} No reuse found. Make them final. {code} 400 assertEquals( 401 "three lines per Datanode: the 1st line is rack info, 2nd node info," 402 + " 3rd empty line.", 403 12, outs.size()); {code} "There should be three lines per Datanode: the 1st line " 12 -> 3 * numDn Otherwise +1 pending on Jenkins. > Add unit test for HDFS command 'dfsadmin -printTopology' > > > Key: HDFS-10965 > URL: https://issues.apache.org/jira/browse/HDFS-10965 > Project: Hadoop HDFS > Issue Type: Sub-task > Components: fs, shell, test >Reporter: Xiaobing Zhou >Assignee: Xiaobing Zhou > Attachments: HDFS-10965.000.patch, HDFS-10965.001.patch, > HDFS-10965.002.patch, HDFS-10965.003.patch, HDFS-10965.004.patch > > > DFSAdmin#printTopology should also be tested. This proposes adding it in > TestDFSAdmin.
[jira] [Commented] (HDFS-10967) Add configuration for BlockPlacementPolicy to avoid near-full DataNodes
[ https://issues.apache.org/jira/browse/HDFS-10967?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15563882#comment-15563882 ] Konstantin Shvachko commented on HDFS-10967: This could be a good optimization of the placement policy for replicas other than the first when some nodes are close to full. It should only be turned on in a heterogeneous environment, though. With homogeneous nodes where node usage is balanced it will just be performance overhead. A few suggestions on the patch: # Set the default for the remaining-capacity threshold that triggers the new behavior to {{0.00}}. Homogeneous clusters should not require reconfiguration. # The config variable is more like a "threshold" than a "factor" as in {{considerLoad}}. So maybe call it {{dfs.namenode.replication.considerCapacity.threshold}}. # I would suggest using only one configuration variable. The Boolean {{considerCapacity}} is essentially redundant. # It would be good to have a JavaDoc for the new config variable and for the method where the feature is triggered. # Did you try adding an {{isNearFull()}} call into {{isGoodDatanode()}}? Then you would not need to retry 3 times. # JavaDoc for the unit test would also be useful. A separate discussion point is whether we should add a separate admin command for every configuration knob we introduce. This doesn't feel right. Maybe we should have a command, say {{refreshConfiguration()}}, which re-reads the config, updates variables, and logs what it updated. The main problem with an admin command is that it does not change the configuration, so when you restart the service or fail it over you lose the set value. So maybe anything that is controlled by configuration should be updated through configuration and the {{refreshConfiguration()}} call. We could have merged this with {{setBalancerBandwidth()}} at some point. Alternatively, we can use failover to pick up new configuration values. Then we don't need new admin commands at all. 
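A minimal sketch of the threshold check suggested above; the method name, config default, and ratio logic are illustrative assumptions, not the HDFS-10967 patch itself.

```java
// Sketch of an isNearFull()-style check driven by a single threshold config.
public class NearFullCheck {
    // Default 0.0 disables the feature, as suggested for homogeneous clusters.
    static final double DEFAULT_THRESHOLD = 0.0;

    /**
     * Returns true if the node's remaining-capacity ratio is below the
     * configured threshold, i.e. the node is "near full" and should be
     * deprioritized as a replica target.
     */
    static boolean isNearFull(long remaining, long capacity, double threshold) {
        if (threshold <= 0.0 || capacity <= 0) {
            return false;  // feature off, or degenerate capacity
        }
        return ((double) remaining / capacity) < threshold;
    }

    public static void main(String[] args) {
        // 5% remaining against a 10% threshold -> near full
        System.out.println(isNearFull(5, 100, 0.10));            // true
        // default threshold 0.0 -> feature off, never near full
        System.out.println(isNearFull(5, 100, DEFAULT_THRESHOLD)); // false
    }
}
```

A single threshold like this makes the Boolean {{considerCapacity}} switch redundant, matching suggestion 3 above.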
> Add configuration for BlockPlacementPolicy to avoid near-full DataNodes > --- > > Key: HDFS-10967 > URL: https://issues.apache.org/jira/browse/HDFS-10967 > Project: Hadoop HDFS > Issue Type: Improvement > Components: namenode >Reporter: Zhe Zhang >Assignee: Zhe Zhang > Labels: balancer > Attachments: HDFS-10967.00.patch, HDFS-10967.01.patch, > HDFS-10967.02.patch, HDFS-10967.03.patch > > > Large production clusters are likely to have heterogeneous nodes in terms of > storage capacity, memory, and CPU cores. It is not always possible to > proportionally ingest data into DataNodes based on their remaining storage > capacity. Therefore it's possible for a subset of DataNodes to be much closer > to full capacity than the rest. > This heterogeneity is most likely rack-by-rack -- i.e. _m_ whole racks of > low-storage nodes and _n_ whole racks of high-storage nodes. So It'd be very > useful if we can lower the chance for those near-full DataNodes to become > destinations for the 2nd and 3rd replicas. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-10977) Balancer should query NameNode with a timeout
[ https://issues.apache.org/jira/browse/HDFS-10977?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15563883#comment-15563883 ] Zhe Zhang commented on HDFS-10977: -- Thanks [~senthilec566] for reporting a similar issue. Yes, it's possible that decommissioning nodes were causing the infinite delay, in which case {{-include}} is a good workaround. I'll try that in our cluster. But in general, I think we should still add the timeout logic; e.g., the RPC request could encounter issues at the network layer. > Balancer should query NameNode with a timeout > - > > Key: HDFS-10977 > URL: https://issues.apache.org/jira/browse/HDFS-10977 > Project: Hadoop HDFS > Issue Type: Bug > Components: balancer & mover >Reporter: Zhe Zhang >Assignee: Zhe Zhang > Attachments: Gmail - HDFS Balancer Stuck after 10 Minz.pdf, > HDFS-10977-reproduce.patch > > > We found a case where {{Dispatcher}} was stuck at {{getBlockList}} *forever* > (well, several hours when we found it). -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
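The timeout logic proposed above could look roughly like this sketch, which bounds a blocking call with a {{Future}}; the helper name and timeout value are illustrative assumptions, not the actual Balancer/Dispatcher code.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

// Sketch: run the blocking NameNode query on a worker thread and bound the
// wait, so a hung RPC surfaces as TimeoutException instead of blocking forever.
public class TimedQuery {
    static <T> T callWithTimeout(Callable<T> call, long timeoutSec)
            throws Exception {
        ExecutorService exec = Executors.newSingleThreadExecutor();
        try {
            Future<T> f = exec.submit(call);
            // Fail after the bound instead of blocking indefinitely.
            return f.get(timeoutSec, TimeUnit.SECONDS);
        } finally {
            exec.shutdownNow();  // interrupt a still-hung call
        }
    }

    public static void main(String[] args) throws Exception {
        // A responsive call completes normally.
        System.out.println(callWithTimeout(() -> "blocks", 5));
        // A hung call triggers TimeoutException instead of waiting forever.
        try {
            callWithTimeout(() -> { Thread.sleep(60_000); return "never"; }, 1);
        } catch (TimeoutException e) {
            System.out.println("timed out");
        }
    }
}
```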
[jira] [Updated] (HDFS-10985) o.a.h.ha.TestZKFailoverController should not use fixed time sleep before assertions
[ https://issues.apache.org/jira/browse/HDFS-10985?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Mingliang Liu updated HDFS-10985: - Description: {{TestZKFailoverController#testGracefulFailoverMultipleZKfcs}} uses a fixed-time sleep before assertions. This may sometimes fail, though 10 seconds is generally long enough. I think we can use {{GenericTestUtils.waitFor()}} to retry the assertions. Meanwhile, it usually does not need 10 seconds to reach the condition (<1s on my laptop). Removing the 10s sleep will also make the UT run faster. This is also true for {{TestZKFailoverController#testGracefulFailover}}. was: {{TestZKFailoverController#testGracefulFailoverMultipleZKfcs}} uses a fixed-time sleep before assertions. This may sometimes fail, though 10 seconds is generally long enough. I think we can use {{GenericTestUtils.waitFor()}} to retry the assertions. This is also true for {{TestZKFailoverController#testGracefulFailover}}. > o.a.h.ha.TestZKFailoverController should not use fixed time sleep before > assertions > --- > > Key: HDFS-10985 > URL: https://issues.apache.org/jira/browse/HDFS-10985 > Project: Hadoop HDFS > Issue Type: Sub-task > Components: ha, test >Reporter: Mingliang Liu >Assignee: Mingliang Liu >Priority: Minor > Fix For: 2.8.0, 3.0.0-alpha2 > > Attachments: HDFS-10985-branch-2.8.001.patch, HDFS-10985.000.patch, > HDFS-10985.001.patch > > > {{TestZKFailoverController#testGracefulFailoverMultipleZKfcs}} uses a fixed- > time sleep before assertions. This may sometimes fail, though 10 seconds is > generally long enough. I think we can use {{GenericTestUtils.waitFor()}} to > retry the assertions. Meanwhile, it usually does not need 10 seconds to reach > the condition (<1s on my laptop). Removing the 10s sleep will also make the > UT run faster. > This is also true for {{TestZKFailoverController#testGracefulFailover}}. 
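The retry pattern of {{GenericTestUtils.waitFor()}} can be sketched in self-contained form as follows; this standalone helper only illustrates the idea and is not Hadoop's implementation.

```java
import java.util.function.Supplier;

// Sketch: poll a condition until it holds or a deadline passes, instead of
// a fixed Thread.sleep(10000) before the assertion. The test then finishes
// as soon as the condition is reached (often well under a second).
public class WaitFor {
    /** Polls check every checkEveryMillis until true or waitForMillis elapses. */
    static boolean waitFor(Supplier<Boolean> check, long checkEveryMillis,
                           long waitForMillis) throws InterruptedException {
        long deadline = System.currentTimeMillis() + waitForMillis;
        while (System.currentTimeMillis() < deadline) {
            if (check.get()) {
                return true;  // condition reached early; no fixed 10s wait
            }
            Thread.sleep(checkEveryMillis);
        }
        return check.get();  // last chance before reporting a timeout
    }

    public static void main(String[] args) throws InterruptedException {
        long start = System.currentTimeMillis();
        // Condition that becomes true after ~200ms; the loop returns promptly.
        boolean ok = waitFor(() -> System.currentTimeMillis() - start > 200,
                             50, 10_000);
        System.out.println(ok);  // true
    }
}
```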
[jira] [Comment Edited] (HDFS-10967) Add configuration for BlockPlacementPolicy to avoid near-full DataNodes
[ https://issues.apache.org/jira/browse/HDFS-10967?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15563835#comment-15563835 ] Zhe Zhang edited comment on HDFS-10967 at 10/10/16 11:07 PM: - Above Jenkins report was for v2 patch. We'll see one for v3 soon. I verified reported test failures and could only reproduce {{TestHdfsConfigFields}}. Once we agree on the overall structure I'll do the due diligence: # Add the item to {{hdfs-default.xml}} # Document the new dfsAdmin command # Clear up the checkStyle warnings was (Author: zhz): I verified reported test failures and could only reproduce {{TestHdfsConfigFields}}. Once we agree on the overall structure I'll do the due diligence: # Add the item to {{hdfs-default.xml}} # Document the new dfsAdmin command # Clear up the checkStyle warnings > Add configuration for BlockPlacementPolicy to avoid near-full DataNodes > --- > > Key: HDFS-10967 > URL: https://issues.apache.org/jira/browse/HDFS-10967 > Project: Hadoop HDFS > Issue Type: Improvement > Components: namenode >Reporter: Zhe Zhang >Assignee: Zhe Zhang > Labels: balancer > Attachments: HDFS-10967.00.patch, HDFS-10967.01.patch, > HDFS-10967.02.patch, HDFS-10967.03.patch > > > Large production clusters are likely to have heterogeneous nodes in terms of > storage capacity, memory, and CPU cores. It is not always possible to > proportionally ingest data into DataNodes based on their remaining storage > capacity. Therefore it's possible for a subset of DataNodes to be much closer > to full capacity than the rest. > This heterogeneity is most likely rack-by-rack -- i.e. _m_ whole racks of > low-storage nodes and _n_ whole racks of high-storage nodes. So It'd be very > useful if we can lower the chance for those near-full DataNodes to become > destinations for the 2nd and 3rd replicas. 
[jira] [Commented] (HDFS-10967) Add configuration for BlockPlacementPolicy to avoid near-full DataNodes
[ https://issues.apache.org/jira/browse/HDFS-10967?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15563835#comment-15563835 ] Zhe Zhang commented on HDFS-10967: -- I verified reported test failures and could only reproduce {{TestHdfsConfigFields}}. Once we agree on the overall structure I'll do the due diligence: # Add the item to {{hdfs-default.xml}} # Document the new dfsAdmin command # Clear up the checkStyle warnings > Add configuration for BlockPlacementPolicy to avoid near-full DataNodes > --- > > Key: HDFS-10967 > URL: https://issues.apache.org/jira/browse/HDFS-10967 > Project: Hadoop HDFS > Issue Type: Improvement > Components: namenode >Reporter: Zhe Zhang >Assignee: Zhe Zhang > Labels: balancer > Attachments: HDFS-10967.00.patch, HDFS-10967.01.patch, > HDFS-10967.02.patch, HDFS-10967.03.patch > > > Large production clusters are likely to have heterogeneous nodes in terms of > storage capacity, memory, and CPU cores. It is not always possible to > proportionally ingest data into DataNodes based on their remaining storage > capacity. Therefore it's possible for a subset of DataNodes to be much closer > to full capacity than the rest. > This heterogeneity is most likely rack-by-rack -- i.e. _m_ whole racks of > low-storage nodes and _n_ whole racks of high-storage nodes. So It'd be very > useful if we can lower the chance for those near-full DataNodes to become > destinations for the 2nd and 3rd replicas. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-10629) Federation Router
[ https://issues.apache.org/jira/browse/HDFS-10629?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15563827#comment-15563827 ] Hadoop QA commented on HDFS-10629: -- | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 17s{color} | {color:blue} Docker mode activated. {color} | | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s{color} | {color:green} The patch does not contain any @author tags. {color} | | {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 0s{color} | {color:green} The patch appears to include 7 new or modified test files. {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 8m 23s{color} | {color:green} HDFS-10467 passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 50s{color} | {color:green} HDFS-10467 passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 30s{color} | {color:green} HDFS-10467 passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 53s{color} | {color:green} HDFS-10467 passed {color} | | {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 12s{color} | {color:green} HDFS-10467 passed {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 1m 49s{color} | {color:green} HDFS-10467 passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 56s{color} | {color:green} HDFS-10467 passed {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 0m 50s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 54s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 
0m 54s{color} | {color:green} the patch passed {color} | | {color:orange}-0{color} | {color:orange} checkstyle {color} | {color:orange} 0m 29s{color} | {color:orange} hadoop-hdfs-project/hadoop-hdfs: The patch generated 2 new + 394 unchanged - 0 fixed = 396 total (was 394) {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 49s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 9s{color} | {color:green} the patch passed {color} | | {color:red}-1{color} | {color:red} whitespace {color} | {color:red} 0m 0s{color} | {color:red} The patch 1 line(s) with tabs. {color} | | {color:green}+1{color} | {color:green} xml {color} | {color:green} 0m 1s{color} | {color:green} The patch has no ill-formed XML file. {color} | | {color:red}-1{color} | {color:red} findbugs {color} | {color:red} 1m 52s{color} | {color:red} hadoop-hdfs-project/hadoop-hdfs generated 2 new + 0 unchanged - 0 fixed = 2 total (was 0) {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 55s{color} | {color:green} the patch passed {color} | | {color:red}-1{color} | {color:red} unit {color} | {color:red} 62m 54s{color} | {color:red} hadoop-hdfs in the patch failed. {color} | | {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 20s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} | | {color:black}{color} | {color:black} {color} | {color:black} 84m 23s{color} | {color:black} {color} | \\ \\ || Reason || Tests || | FindBugs | module:hadoop-hdfs-project/hadoop-hdfs | | | Sequence of calls to java.util.concurrent.ConcurrentHashMap may not be atomic in org.apache.hadoop.hdfs.server.federation.router.ConnectionManager.getConnection(UserGroupInformation, String) At ConnectionManager.java:may not be atomic in org.apache.hadoop.hdfs.server.federation.router.ConnectionManager.getConnection(UserGroupInformation, String) At ConnectionManager.java:[line 151] | | | org.apache.hadoop.hdfs.server.federation.router.Router.initAndStartRouter(Configuration, boolean) invokes System.exit(...), which shuts down the entire virtual machine At Router.java:shuts down the entire virtual machine At Router.java:[line 123] | | Failed junit tests | hadoop.hdfs.server.balancer.TestBalancer | | | hadoop.tools.TestHdfsConfigFields | \\ \\ || Subsystem || Report/Notes || | Docker | Image:yetus/hadoop:9560f25 | | JIRA Issue | HDFS-10629 | | JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12832545/HDFS-10629-HDFS-10467-007.patch | | Optional Tests | asflicense compile javac javadoc mvninstall mvnsite unit findbugs checkstyle xml | | uname | Linux 96155d4d43ed 3.13.0-92-generic #139-Ubuntu SMP Tue Jun 28 20:42:26 UTC 2016 x86_64 x86_64 x86_64 GNU/Linux | | Build tool | maven | |
[jira] [Commented] (HDFS-10984) Expose nntop output as metrics
[ https://issues.apache.org/jira/browse/HDFS-10984?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15563816#comment-15563816 ] Hadoop QA commented on HDFS-10984: -- | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 18s{color} | {color:blue} Docker mode activated. {color} | | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s{color} | {color:green} The patch does not contain any @author tags. {color} | | {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 0s{color} | {color:green} The patch appears to include 1 new or modified test files. {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 7m 24s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 44s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 28s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 52s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 12s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 1m 42s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 40s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 0m 46s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 42s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 42s{color} | {color:green} the 
patch passed {color} | | {color:orange}-0{color} | {color:orange} checkstyle {color} | {color:orange} 0m 25s{color} | {color:orange} hadoop-hdfs-project/hadoop-hdfs: The patch generated 3 new + 185 unchanged - 0 fixed = 188 total (was 185) {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 48s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 10s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s{color} | {color:green} The patch has no whitespace issues. {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 1m 46s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 37s{color} | {color:green} the patch passed {color} | | {color:red}-1{color} | {color:red} unit {color} | {color:red} 68m 53s{color} | {color:red} hadoop-hdfs in the patch failed. {color} | | {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 32s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} | | {color:black}{color} | {color:black} {color} | {color:black} 88m 9s{color} | {color:black} {color} | \\ \\ || Reason || Tests || | Failed junit tests | hadoop.hdfs.TestPersistBlocks | | | hadoop.hdfs.server.namenode.TestAddStripedBlockInFBR | | Timed out junit tests | org.apache.hadoop.hdfs.TestLeaseRecovery2 | \\ \\ || Subsystem || Report/Notes || | Docker | Image:yetus/hadoop:9560f25 | | JIRA Issue | HDFS-10984 | | JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12832544/HDFS-10984.v3.patch | | Optional Tests | asflicense compile javac javadoc mvninstall mvnsite unit findbugs checkstyle | | uname | Linux 349de580663e 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux | | Build tool | maven | | Personality | /testptch/hadoop/patchprocess/precommit/personality/provided.sh | | git revision | trunk / c874fa9 | | Default Java | 1.8.0_101 | | findbugs | v3.0.0 | | checkstyle | https://builds.apache.org/job/PreCommit-HDFS-Build/17089/artifact/patchprocess/diff-checkstyle-hadoop-hdfs-project_hadoop-hdfs.txt | | unit | https://builds.apache.org/job/PreCommit-HDFS-Build/17089/artifact/patchprocess/patch-unit-hadoop-hdfs-project_hadoop-hdfs.txt | | Test Results | https://builds.apache.org/job/PreCommit-HDFS-Build/17089/testReport/ | | modules | C: hadoop-hdfs-project/hadoop-hdfs U: hadoop-hdfs-project/hadoop-hdfs | | Console output | https://builds.apache.org/job/PreCommit-HDFS-Build/17089/console | | Powered by | Apache Yetus 0.4.0-SNAPSHOT http://yetus.apache.org | This message was automatically generated. > Expose nntop output as metrics >
[jira] [Commented] (HDFS-10967) Add configuration for BlockPlacementPolicy to avoid near-full DataNodes
[ https://issues.apache.org/jira/browse/HDFS-10967?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15563810#comment-15563810 ] Hadoop QA commented on HDFS-10967: -- | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 18s{color} | {color:blue} Docker mode activated. {color} | | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s{color} | {color:green} The patch does not contain any @author tags. {color} | | {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 0s{color} | {color:green} The patch appears to include 2 new or modified test files. {color} | | {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 38s{color} | {color:blue} Maven dependency ordering for branch {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 7m 56s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 1m 25s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 42s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 1m 23s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 24s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 3m 5s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 57s{color} | {color:green} trunk passed {color} | | {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 6s{color} | {color:blue} Maven dependency ordering for patch {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 1m 
20s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 1m 21s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} cc {color} | {color:green} 1m 21s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 1m 21s{color} | {color:green} the patch passed {color} | | {color:orange}-0{color} | {color:orange} checkstyle {color} | {color:orange} 0m 39s{color} | {color:orange} hadoop-hdfs-project: The patch generated 14 new + 1052 unchanged - 1 fixed = 1066 total (was 1053) {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 1m 23s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 19s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s{color} | {color:green} The patch has no whitespace issues. {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 3m 27s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 55s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} unit {color} | {color:green} 0m 56s{color} | {color:green} hadoop-hdfs-client in the patch passed. {color} | | {color:red}-1{color} | {color:red} unit {color} | {color:red} 67m 20s{color} | {color:red} hadoop-hdfs in the patch failed. {color} | | {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 27s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} | | {color:black}{color} | {color:black} {color} | {color:black} 96m 26s{color} | {color:black} {color} | \\ \\ || Reason || Tests || | Failed junit tests | hadoop.hdfs.server.namenode.ha.TestDFSUpgradeWithHA | | | hadoop.hdfs.server.namenode.TestReconstructStripedBlocks | | | hadoop.hdfs.server.datanode.TestLargeBlockReport | | | hadoop.tools.TestHdfsConfigFields | \\ \\ || Subsystem || Report/Notes || | Docker | Image:yetus/hadoop:9560f25 | | JIRA Issue | HDFS-10967 | | JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12832542/HDFS-10967.02.patch | | Optional Tests | asflicense compile javac javadoc mvninstall mvnsite unit findbugs checkstyle cc | | uname | Linux a91a71c163fa 3.13.0-95-generic #142-Ubuntu SMP Fri Aug 12 17:00:09 UTC 2016 x86_64 x86_64 x86_64 GNU/Linux | | Build tool | maven | | Personality | /testptch/hadoop/patchprocess/precommit/personality/provided.sh | | git revision | trunk / c874fa9 | | Default Java | 1.8.0_101 | | findbugs | v3.0.0 | | checkstyle |
[jira] [Commented] (HDFS-10637) Modifications to remove the assumption that FsVolumes are backed by java.io.File.
[ https://issues.apache.org/jira/browse/HDFS-10637?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15563788#comment-15563788 ] Hudson commented on HDFS-10637: --- SUCCESS: Integrated in Jenkins build Hadoop-trunk-Commit #10583 (See [https://builds.apache.org/job/Hadoop-trunk-Commit/10583/]) HDFS-10637. Modifications to remove the assumption that FsVolumes are (lei: rev 96b12662ea76e3ded4ef13944fc8df206cfb4613) * (edit) hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/blockmanagement/TestNameNodePrunesMissingStorages.java * (edit) hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/diskbalancer/TestDiskBalancerWithMockMover.java * (edit) hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/datanode/DataStorage.java * (edit) hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/datanode/fsdataset/FsDatasetSpi.java * (edit) hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/datanode/TestDiskError.java * (edit) hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/datanode/fsdataset/impl/FsDatasetImplTestUtils.java * (edit) hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/datanode/VolumeScanner.java * (edit) hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/datanode/TestDataNodeHotSwapVolumes.java * (edit) hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/datanode/TestDataNodeVolumeFailureReporting.java * (edit) hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/datanode/fsdataset/impl/FsVolumeImpl.java * (edit) hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/datanode/fsdataset/impl/TestFsDatasetImpl.java * (edit) hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/common/Storage.java * (edit) 
hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/datanode/TestBlockScanner.java * (edit) hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/datanode/TestDirectoryScanner.java * (edit) hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/datanode/fsdataset/impl/VolumeFailureInfo.java * (edit) hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/datanode/DiskBalancer.java * (edit) hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/datanode/fsdataset/FsVolumeSpi.java * (edit) hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/datanode/extdataset/ExternalDatasetImpl.java * (edit) hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/datanode/fsdataset/impl/TestFsVolumeList.java * (edit) hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/datanode/TestDataNodeVolumeFailure.java * (edit) hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/datanode/extdataset/ExternalVolumeImpl.java * (edit) hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/datanode/BlockScanner.java * (edit) hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/datanode/ReplicaInfo.java * (edit) hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/datanode/fsdataset/impl/RamDiskAsyncLazyPersistService.java * (edit) hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/datanode/BlockPoolSliceStorage.java * (edit) hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/datanode/DirectoryScanner.java * (add) hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/datanode/fsdataset/impl/FsVolumeImplBuilder.java * (edit) hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/datanode/DataNode.java * (edit) 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/datanode/StorageLocation.java * (edit) hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/datanode/LocalReplica.java * (edit) hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/datanode/fsdataset/impl/FsDatasetAsyncDiskService.java * (edit) hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/datanode/fsdataset/impl/FsDatasetImpl.java * (edit) hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/datanode/fsdataset/impl/FsVolumeList.java * (edit) hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/FSNamesystem.java * (edit) hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/datanode/SimulatedFSDataset.java > Modifications to remove the assumption that FsVolumes are backed by > java.io.File. > - > >
[jira] [Updated] (HDFS-10637) Modifications to remove the assumption that FsVolumes are backed by java.io.File.
[ https://issues.apache.org/jira/browse/HDFS-10637?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Lei (Eddy) Xu updated HDFS-10637: - Resolution: Fixed Hadoop Flags: Incompatible change Fix Version/s: 3.0.0-alpha2 Target Version/s: 3.0.0-alpha1 Status: Resolved (was: Patch Available) +1. The checkstyle warnings were pre-existing ones, caused by moving code around. Committed to trunk. Thanks for the great work, [~virajith]. > Modifications to remove the assumption that FsVolumes are backed by > java.io.File. > - > > Key: HDFS-10637 > URL: https://issues.apache.org/jira/browse/HDFS-10637 > Project: Hadoop HDFS > Issue Type: Sub-task > Components: datanode, fs >Reporter: Virajith Jalaparti >Assignee: Virajith Jalaparti > Fix For: 3.0.0-alpha2 > > Attachments: HDFS-10637.001.patch, HDFS-10637.002.patch, > HDFS-10637.003.patch, HDFS-10637.004.patch, HDFS-10637.005.patch, > HDFS-10637.006.patch, HDFS-10637.007.patch, HDFS-10637.008.patch, > HDFS-10637.009.patch, HDFS-10637.010.patch, HDFS-10637.011.patch > > > Modifications to {{FsVolumeSpi}} and {{FsVolumeImpl}} to remove references to > {{java.io.File}}. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-10965) Add unit test for HDFS command 'dfsadmin -printTopology'
[ https://issues.apache.org/jira/browse/HDFS-10965?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15563675#comment-15563675 ] Xiaobing Zhou commented on HDFS-10965: -- v004 is posted, which is based on some work committed in HDFS-10972. > Add unit test for HDFS command 'dfsadmin -printTopology' > > > Key: HDFS-10965 > URL: https://issues.apache.org/jira/browse/HDFS-10965 > Project: Hadoop HDFS > Issue Type: Sub-task > Components: fs, shell, test >Reporter: Xiaobing Zhou >Assignee: Xiaobing Zhou > Attachments: HDFS-10965.000.patch, HDFS-10965.001.patch, > HDFS-10965.002.patch, HDFS-10965.003.patch, HDFS-10965.004.patch > > > DFSAdmin#printTopology should also be tested. This proposes adding it in > TestDFSAdmin. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Updated] (HDFS-10965) Add unit test for HDFS command 'dfsadmin -printTopology'
[ https://issues.apache.org/jira/browse/HDFS-10965?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiaobing Zhou updated HDFS-10965: - Attachment: HDFS-10965.004.patch > Add unit test for HDFS command 'dfsadmin -printTopology' > > > Key: HDFS-10965 > URL: https://issues.apache.org/jira/browse/HDFS-10965 > Project: Hadoop HDFS > Issue Type: Sub-task > Components: fs, shell, test >Reporter: Xiaobing Zhou >Assignee: Xiaobing Zhou > Attachments: HDFS-10965.000.patch, HDFS-10965.001.patch, > HDFS-10965.002.patch, HDFS-10965.003.patch, HDFS-10965.004.patch > > > DFSAdmin#printTopology should also be tested. This proposes adding it in > TestDFSAdmin. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Comment Edited] (HDFS-10967) Add configuration for BlockPlacementPolicy to avoid near-full DataNodes
[ https://issues.apache.org/jira/browse/HDFS-10967?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15563541#comment-15563541 ] Zhe Zhang edited comment on HDFS-10967 at 10/10/16 9:48 PM: Thanks [~mingma] for the feedback! Attaching v2 patch to add dfsAdmin command to update the config without NN restart. bq. For balancer or over replicated scenarios, chooseReplicasToDelete uses absolute free space, maybe that needs to be changed to percentage based? Agreed. In general I think all capacity-based decisions (Balancer, placement policy etc.) should be consistent. Since the current patch is already big, let's address this issue separately? bq. What if we move this new policy to isGoodDatanode? That's a good thought. The challenge is that this capacity consideration is not a _hard_ restriction. I.e. if a DN's capacity usage is over the factor, it is not an ideal candidate from a balancing perspective, but if there are no other valid candidates, it should still be considered. I think I found a way to apply the logic to the remote writer and {{BlockPlacementPolicyRackFaultTolerant}}. But it requires some additional refactoring. Will attach v3 patch soon. was (Author: zhz): Thanks [~mingma] for the feedback! Attaching v2 patch to add dfsAdmin command to update the config without NN restart. bq. For balancer or over replicated scenarios, chooseReplicasToDelete uses absolute free space, maybe that needs to be changed to percentage based? Agreed. In general I think all capacity-based decisions (Balancer, placement policy etc.) should be consistent. bq. What if we move this new policy to isGoodDatanode? That's a good thought. The challenge is that this capacity consideration is not a _hard_ restriction. I.e. if a DN's capacity usage is over the factor, it is not an ideal candidate from a balancing perspective, but if there are no other valid candidates, it should still be considered. 
I think I found a way to apply the logic to the remote writer and {{BlockPlacementPolicyRackFaultTolerant}}. But it requires some additional refactoring. Will attach v3 patch soon. > Add configuration for BlockPlacementPolicy to avoid near-full DataNodes > --- > > Key: HDFS-10967 > URL: https://issues.apache.org/jira/browse/HDFS-10967 > Project: Hadoop HDFS > Issue Type: Improvement > Components: namenode >Reporter: Zhe Zhang >Assignee: Zhe Zhang > Labels: balancer > Attachments: HDFS-10967.00.patch, HDFS-10967.01.patch, > HDFS-10967.02.patch, HDFS-10967.03.patch > > > Large production clusters are likely to have heterogeneous nodes in terms of > storage capacity, memory, and CPU cores. It is not always possible to > proportionally ingest data into DataNodes based on their remaining storage > capacity. Therefore it's possible for a subset of DataNodes to be much closer > to full capacity than the rest. > This heterogeneity is most likely rack-by-rack -- i.e. _m_ whole racks of > low-storage nodes and _n_ whole racks of high-storage nodes. So it'd be very > useful if we can lower the chance for those near-full DataNodes to become > destinations for the 2nd and 3rd replicas. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
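The "soft restriction" described above — prefer DataNodes whose capacity usage is under the factor, but still accept near-full nodes when no other valid candidate exists — can be sketched stand-alone as follows (a simplified illustration only; the class name, the {{CAPACITY_FACTOR}} constant, and the {{long[]}} {used, capacity} encoding are hypothetical, not the patch's actual code):

```java
import java.util.ArrayList;
import java.util.List;

class CapacityAwareChooser {
    // Hypothetical threshold: DNs above this usage ratio are de-prioritized.
    static final double CAPACITY_FACTOR = 0.90;

    // Usage ratio of a DataNode; a zero-capacity node counts as full.
    static double usageRatio(long used, long capacity) {
        return capacity == 0 ? 1.0 : (double) used / capacity;
    }

    /**
     * Prefer nodes under the capacity factor. Soft restriction: if every
     * candidate is near-full, return the full candidate list unchanged
     * rather than rejecting all of them.
     */
    static List<long[]> filterCandidates(List<long[]> nodes) { // {used, capacity}
        List<long[]> preferred = new ArrayList<>();
        for (long[] n : nodes) {
            if (usageRatio(n[0], n[1]) < CAPACITY_FACTOR) {
                preferred.add(n);
            }
        }
        return preferred.isEmpty() ? nodes : preferred;
    }
}
```

This is why the check cannot simply live behind a hard predicate: the fallback branch is what keeps near-full nodes usable as a last resort.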
[jira] [Updated] (HDFS-10967) Add configuration for BlockPlacementPolicy to avoid near-full DataNodes
[ https://issues.apache.org/jira/browse/HDFS-10967?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Zhe Zhang updated HDFS-10967: - Attachment: HDFS-10967.03.patch Attaching v3 patch to include remote writer and {{BlockPlacementPolicyRackFaultTolerant}} scenarios as [~mingma] suggested above. When writing the v3 patch I realized an issue in the v2 patch: if a DN is considered in a random try, it will be put into the excluded nodes list. The v3 patch addresses that issue as well. > Add configuration for BlockPlacementPolicy to avoid near-full DataNodes > --- > > Key: HDFS-10967 > URL: https://issues.apache.org/jira/browse/HDFS-10967 > Project: Hadoop HDFS > Issue Type: Improvement > Components: namenode >Reporter: Zhe Zhang >Assignee: Zhe Zhang > Labels: balancer > Attachments: HDFS-10967.00.patch, HDFS-10967.01.patch, > HDFS-10967.02.patch, HDFS-10967.03.patch > > > Large production clusters are likely to have heterogeneous nodes in terms of > storage capacity, memory, and CPU cores. It is not always possible to > proportionally ingest data into DataNodes based on their remaining storage > capacity. Therefore it's possible for a subset of DataNodes to be much closer > to full capacity than the rest. > This heterogeneity is most likely rack-by-rack -- i.e. _m_ whole racks of > low-storage nodes and _n_ whole racks of high-storage nodes. So it'd be very > useful if we can lower the chance for those near-full DataNodes to become > destinations for the 2nd and 3rd replicas. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-10991) libhdfs : Client compilation is failing for hdfsTruncateFile API
[ https://issues.apache.org/jira/browse/HDFS-10991?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15563615#comment-15563615 ] James Clampffer commented on HDFS-10991: I normally stick to the HDFS-8707 branch so I'm not sure if my +1 counts here, but if it does this seems like a very straightforward fix, +1. > libhdfs : Client compilation is failing for hdfsTruncateFile API > - > > Key: HDFS-10991 > URL: https://issues.apache.org/jira/browse/HDFS-10991 > Project: Hadoop HDFS > Issue Type: Bug > Components: libhdfs >Reporter: Surendra Singh Lilhore >Assignee: Surendra Singh Lilhore >Priority: Blocker > Attachments: HDFS-10991.patch > > > {noformat} > /tmp/ccJNUj6m.o: In function `main': > test.c:(.text+0x812): undefined reference to `hdfsTruncateFile' > collect2: ld returned 1 exit status > {noformat} -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Updated] (HDFS-10629) Federation Router
[ https://issues.apache.org/jira/browse/HDFS-10629?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jason Kace updated HDFS-10629: -- Attachment: HDFS-10629-HDFS-10467-007.patch Updating patch to include: 1) Releasing router->NN client connections after use, which prevents the addition of unnecessary connections to the pool. 2) Separating ConnectionManager and ConnectionPool into separate Java files. 3) Fixing most findbugs issues and some style items. > Federation Router > - > > Key: HDFS-10629 > URL: https://issues.apache.org/jira/browse/HDFS-10629 > Project: Hadoop HDFS > Issue Type: Sub-task > Components: hdfs >Reporter: Inigo Goiri >Assignee: Jason Kace > Attachments: HDFS-10629-HDFS-10467-002.patch, > HDFS-10629-HDFS-10467-003.patch, HDFS-10629-HDFS-10467-004.patch, > HDFS-10629-HDFS-10467-005.patch, HDFS-10629-HDFS-10467-006.patch, > HDFS-10629-HDFS-10467-007.patch, HDFS-10629.000.patch, HDFS-10629.001.patch > > > Component that routes calls from the clients to the right Namespace. It > implements {{ClientProtocol}}. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
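Item 1 above — releasing router->NN client connections back to the pool after use so the pool does not grow unnecessarily — follows the usual acquire/release pool pattern, which can be sketched as (a simplified stand-in; {{ConnectionPoolSketch}} and {{PooledConnection}} are hypothetical names, not the patch's {{ConnectionManager}}/{{ConnectionPool}} classes):

```java
import java.util.ArrayDeque;
import java.util.Deque;

class PooledConnection {
    static int created = 0; // counts constructions, to show reuse below
    PooledConnection() { created++; }
}

class ConnectionPoolSketch {
    private final Deque<PooledConnection> idle = new ArrayDeque<>();

    /** Reuse an idle connection when one is available; create otherwise. */
    PooledConnection acquire() {
        PooledConnection c = idle.poll();
        return c != null ? c : new PooledConnection();
    }

    /** Releasing after use returns the connection for the next caller. */
    void release(PooledConnection c) {
        idle.push(c);
    }
}
```

A caller that forgets to release forces the next acquire to construct a fresh connection, which is exactly the pool growth the patch avoids.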
[jira] [Updated] (HDFS-10984) Expose nntop output as metrics
[ https://issues.apache.org/jira/browse/HDFS-10984?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Siddharth Wagle updated HDFS-10984: --- Status: Patch Available (was: Open) Re-submitting patch for review + tests. > Expose nntop output as metrics > - > > Key: HDFS-10984 > URL: https://issues.apache.org/jira/browse/HDFS-10984 > Project: Hadoop HDFS > Issue Type: Task > Components: namenode >Affects Versions: 2.7.0 >Reporter: Siddharth Wagle >Assignee: Siddharth Wagle > Fix For: 2.7.3 > > Attachments: HDFS-10984.patch, HDFS-10984.v1.patch, > HDFS-10984.v2.patch, HDFS-10984.v3.patch > > > The nntop output is already exposed via JMX with HDFS-6982. > However external metrics systems do not get this data. It would be valuable > to track this as a timeseries as well. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Updated] (HDFS-10984) Expose nntop output as metrics
[ https://issues.apache.org/jira/browse/HDFS-10984?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Siddharth Wagle updated HDFS-10984: --- Attachment: HDFS-10984.v3.patch > Expose nntop output as metrics > - > > Key: HDFS-10984 > URL: https://issues.apache.org/jira/browse/HDFS-10984 > Project: Hadoop HDFS > Issue Type: Task > Components: namenode >Affects Versions: 2.7.0 >Reporter: Siddharth Wagle >Assignee: Siddharth Wagle > Fix For: 2.7.3 > > Attachments: HDFS-10984.patch, HDFS-10984.v1.patch, > HDFS-10984.v2.patch, HDFS-10984.v3.patch > > > The nntop output is already exposed via JMX with HDFS-6982. > However external metrics systems do not get this data. It would be valuable > to track this as a timeseries as well. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Updated] (HDFS-10984) Expose nntop output as metrics
[ https://issues.apache.org/jira/browse/HDFS-10984?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Siddharth Wagle updated HDFS-10984: --- Attachment: (was: HDFS-10984.v3.patch) > Expose nntop output as metrics > - > > Key: HDFS-10984 > URL: https://issues.apache.org/jira/browse/HDFS-10984 > Project: Hadoop HDFS > Issue Type: Task > Components: namenode >Affects Versions: 2.7.0 >Reporter: Siddharth Wagle >Assignee: Siddharth Wagle > Fix For: 2.7.3 > > Attachments: HDFS-10984.patch, HDFS-10984.v1.patch, > HDFS-10984.v2.patch, HDFS-10984.v3.patch > > > The nntop output is already exposed via JMX with HDFS-6982. > However external metrics systems do not get this data. It would be valuable > to track this as a timeseries as well. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Updated] (HDFS-10984) Expose nntop output as metrics
[ https://issues.apache.org/jira/browse/HDFS-10984?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Siddharth Wagle updated HDFS-10984: --- Attachment: HDFS-10984.v3.patch Addressed review comments from [~xyao]. > Expose nntop output as metrics > - > > Key: HDFS-10984 > URL: https://issues.apache.org/jira/browse/HDFS-10984 > Project: Hadoop HDFS > Issue Type: Task > Components: namenode >Affects Versions: 2.7.0 >Reporter: Siddharth Wagle >Assignee: Siddharth Wagle > Fix For: 2.7.3 > > Attachments: HDFS-10984.patch, HDFS-10984.v1.patch, > HDFS-10984.v2.patch, HDFS-10984.v3.patch > > > The nntop output is already exposed via JMX with HDFS-6982. > However external metrics systems do not get this data. It would be valuable > to track this as a timeseries as well. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Updated] (HDFS-10967) Add configuration for BlockPlacementPolicy to avoid near-full DataNodes
[ https://issues.apache.org/jira/browse/HDFS-10967?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Zhe Zhang updated HDFS-10967: - Attachment: HDFS-10967.02.patch Thanks [~mingma] for the feedback! Attaching v2 patch to add dfsAdmin command to update the config without NN restart. bq. For balancer or over replicated scenarios, chooseReplicasToDelete uses absolute free space, maybe that needs to be changed to percentage based? Agreed. In general I think all capacity-based decisions (Balancer, placement policy etc.) should be consistent. bq. What if we move this new policy to isGoodDatanode? That's a good thought. The challenge is that this capacity consideration is not a _hard_ restriction. I.e. if a DN's capacity usage is over the factor, it is not an ideal candidate from a balancing perspective, but if there are no other valid candidates, it should still be considered. I think I found a way to apply the logic to the remote writer and {{BlockPlacementPolicyRackFaultTolerant}}. But it requires some additional refactoring. Will attach v3 patch soon. > Add configuration for BlockPlacementPolicy to avoid near-full DataNodes > --- > > Key: HDFS-10967 > URL: https://issues.apache.org/jira/browse/HDFS-10967 > Project: Hadoop HDFS > Issue Type: Improvement > Components: namenode >Reporter: Zhe Zhang >Assignee: Zhe Zhang > Labels: balancer > Attachments: HDFS-10967.00.patch, HDFS-10967.01.patch, > HDFS-10967.02.patch > > > Large production clusters are likely to have heterogeneous nodes in terms of > storage capacity, memory, and CPU cores. It is not always possible to > proportionally ingest data into DataNodes based on their remaining storage > capacity. Therefore it's possible for a subset of DataNodes to be much closer > to full capacity than the rest. > This heterogeneity is most likely rack-by-rack -- i.e. _m_ whole racks of > low-storage nodes and _n_ whole racks of high-storage nodes. 
So it'd be very > useful if we can lower the chance for those near-full DataNodes to become > destinations for the 2nd and 3rd replicas. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-10972) Add unit test for HDFS command 'dfsadmin -getDatanodeInfo'
[ https://issues.apache.org/jira/browse/HDFS-10972?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15563538#comment-15563538 ] Mingliang Liu commented on HDFS-10972: -- +1 pending on Jenkins. > Add unit test for HDFS command 'dfsadmin -getDatanodeInfo' > -- > > Key: HDFS-10972 > URL: https://issues.apache.org/jira/browse/HDFS-10972 > Project: Hadoop HDFS > Issue Type: Sub-task > Components: fs, shell, test >Reporter: Xiaobing Zhou >Assignee: Xiaobing Zhou > Fix For: 3.0.0-alpha2 > > Attachments: HDFS-10972-branch-2.8.003.patch, HDFS-10972.000.patch, > HDFS-10972.001.patch, HDFS-10972.002.patch, HDFS-10972.003.patch > > > getDatanodeInfo should be tested in admin CLI. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Updated] (HDFS-10984) Expose nntop output as metrics
[ https://issues.apache.org/jira/browse/HDFS-10984?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Siddharth Wagle updated HDFS-10984: --- Status: Open (was: Patch Available) Cancelling to incorporate review comments. > Expose nntop output as metrics > - > > Key: HDFS-10984 > URL: https://issues.apache.org/jira/browse/HDFS-10984 > Project: Hadoop HDFS > Issue Type: Task > Components: namenode >Affects Versions: 2.7.0 >Reporter: Siddharth Wagle >Assignee: Siddharth Wagle > Fix For: 2.7.3 > > Attachments: HDFS-10984.patch, HDFS-10984.v1.patch, > HDFS-10984.v2.patch > > > The nntop output is already exposed via JMX with HDFS-6982. > However external metrics systems do not get this data. It would be valuable > to track this as a timeseries as well. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-10985) o.a.h.ha.TestZKFailoverController should not use fixed time sleep before assertions
[ https://issues.apache.org/jira/browse/HDFS-10985?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15563493#comment-15563493 ] Hudson commented on HDFS-10985: --- SUCCESS: Integrated in Jenkins build Hadoop-trunk-Commit #10582 (See [https://builds.apache.org/job/Hadoop-trunk-Commit/10582/]) HDFS-10985. o.a.h.ha.TestZKFailoverController should not use fixed time (liuml07: rev c874fa914dfbf07d1731f5e87398607366675879) * (edit) hadoop-common-project/hadoop-common/src/test/java/org/apache/hadoop/ha/TestZKFailoverController.java > o.a.h.ha.TestZKFailoverController should not use fixed time sleep before > assertions > --- > > Key: HDFS-10985 > URL: https://issues.apache.org/jira/browse/HDFS-10985 > Project: Hadoop HDFS > Issue Type: Sub-task > Components: ha, test >Reporter: Mingliang Liu >Assignee: Mingliang Liu >Priority: Minor > Fix For: 2.8.0, 3.0.0-alpha2 > > Attachments: HDFS-10985-branch-2.8.001.patch, HDFS-10985.000.patch, > HDFS-10985.001.patch > > > {{TestZKFailoverController#testGracefulFailoverMultipleZKfcs}} uses fixed > time sleep before assertions. This may fail sometimes though 10 seconds are > generally long enough. I think we can use {{GenericTestUtils.waitFor()}} to > retry the assertions. > This is also true for {{TestZKFailoverController#testGracefulFailover}}. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
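The fix replaces a fixed sleep with a bounded poll-and-retry loop in the spirit of {{GenericTestUtils.waitFor(check, checkEveryMillis, waitForMillis)}}. A simplified stand-in for that utility (not the real Hadoop class, which throws its own exception types) looks like:

```java
import java.util.function.Supplier;

class WaitFor {
    /**
     * Polls check every checkEveryMillis until it returns true, or throws
     * once waitForMillis has elapsed. Unlike a fixed sleep, the test
     * proceeds as soon as the condition holds, and still fails loudly on
     * a genuine timeout.
     */
    static void waitFor(Supplier<Boolean> check, long checkEveryMillis,
                        long waitForMillis) {
        long deadline = System.currentTimeMillis() + waitForMillis;
        while (!check.get()) {
            if (System.currentTimeMillis() > deadline) {
                throw new IllegalStateException(
                    "Timed out waiting for condition after " + waitForMillis + " ms");
            }
            try {
                Thread.sleep(checkEveryMillis);
            } catch (InterruptedException ie) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException("Interrupted while waiting", ie);
            }
        }
    }
}
```

A test then asserts inside the supplier (or right after the wait) instead of hoping a 10-second sleep was long enough.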
[jira] [Commented] (HDFS-10972) Add unit test for HDFS command 'dfsadmin -getDatanodeInfo'
[ https://issues.apache.org/jira/browse/HDFS-10972?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15563487#comment-15563487 ] Xiaobing Zhou commented on HDFS-10972: -- branch-2.8 patch is posted [~liuml07] thanks. > Add unit test for HDFS command 'dfsadmin -getDatanodeInfo' > -- > > Key: HDFS-10972 > URL: https://issues.apache.org/jira/browse/HDFS-10972 > Project: Hadoop HDFS > Issue Type: Sub-task > Components: fs, shell, test >Reporter: Xiaobing Zhou >Assignee: Xiaobing Zhou > Fix For: 3.0.0-alpha2 > > Attachments: HDFS-10972-branch-2.8.003.patch, HDFS-10972.000.patch, > HDFS-10972.001.patch, HDFS-10972.002.patch, HDFS-10972.003.patch > > > getDatanodeInfo should be tested in admin CLI. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Updated] (HDFS-10972) Add unit test for HDFS command 'dfsadmin -getDatanodeInfo'
[ https://issues.apache.org/jira/browse/HDFS-10972?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiaobing Zhou updated HDFS-10972: - Attachment: HDFS-10972-branch-2.8.003.patch > Add unit test for HDFS command 'dfsadmin -getDatanodeInfo' > -- > > Key: HDFS-10972 > URL: https://issues.apache.org/jira/browse/HDFS-10972 > Project: Hadoop HDFS > Issue Type: Sub-task > Components: fs, shell, test >Reporter: Xiaobing Zhou >Assignee: Xiaobing Zhou > Fix For: 3.0.0-alpha2 > > Attachments: HDFS-10972-branch-2.8.003.patch, HDFS-10972.000.patch, > HDFS-10972.001.patch, HDFS-10972.002.patch, HDFS-10972.003.patch > > > getDatanodeInfo should be tested in admin CLI. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Comment Edited] (HDFS-10967) Add configuration for BlockPlacementPolicy to avoid near-full DataNodes
[ https://issues.apache.org/jira/browse/HDFS-10967?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15563120#comment-15563120 ] Ming Ma edited comment on HDFS-10967 at 10/10/16 8:56 PM: -- Thanks [~zhz]. Indeed that is an issue when the cluster has heterogeneous nodes, and the rack-based assumption makes sense to me. * For balancer or over replicated scenarios, {{chooseReplicasToDelete}} uses absolute free space, maybe that needs to be changed to percentage based? * What if we move this new policy to {{isGoodDatanode}}? It has several benefits: ** BlockPlacementPolicyRackFaultTolerant can use it. ** Cover the case where the writer is outside of the cluster, thus the call path is chooseLocalRack -> chooseRandom. was (Author: mingma): Thanks [~zhz]. Indeed that is an issue when the cluster has heterogeneous nodes, and the rack-based assumption makes sense to me. * For balancer or over replicated scenarios, {{chooseReplicasToDelete}} uses absolute free space, maybe that needs to be changed to percentage based? * What if we move this new policy to {{isGoodDatanode}}? It has several benefits: ** BlockPlacementPolicyRackFaultTolerant can use it. ** Cover the case where the writer is outside of the cluster, thus the call path is chooseLocalRack -> chooseRandom. * Typo below: you meant this.considerCapacity. {noformat} this.considerLoad = conf.getBoolean(...); {noformat} > Add configuration for BlockPlacementPolicy to avoid near-full DataNodes > --- > > Key: HDFS-10967 > URL: https://issues.apache.org/jira/browse/HDFS-10967 > Project: Hadoop HDFS > Issue Type: Improvement > Components: namenode >Reporter: Zhe Zhang >Assignee: Zhe Zhang > Labels: balancer > Attachments: HDFS-10967.00.patch, HDFS-10967.01.patch > > > Large production clusters are likely to have heterogeneous nodes in terms of > storage capacity, memory, and CPU cores. It is not always possible to > proportionally ingest data into DataNodes based on their remaining storage > capacity. 
Therefore it's possible for a subset of DataNodes to be much closer > to full capacity than the rest. > This heterogeneity is most likely rack-by-rack -- i.e. _m_ whole racks of > low-storage nodes and _n_ whole racks of high-storage nodes. So it'd be very > useful if we can lower the chance for those near-full DataNodes to become > destinations for the 2nd and 3rd replicas. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Updated] (HDFS-10985) o.a.h.ha.TestZKFailoverController should not use fixed time sleep before assertions
[ https://issues.apache.org/jira/browse/HDFS-10985?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Mingliang Liu updated HDFS-10985: - Resolution: Fixed Hadoop Flags: Reviewed Fix Version/s: 3.0.0-alpha2 2.8.0 Status: Resolved (was: Patch Available) Committed to {{trunk}} through {{branch-2.8}}. Thanks [~ste...@apache.org] for review. > o.a.h.ha.TestZKFailoverController should not use fixed time sleep before > assertions > --- > > Key: HDFS-10985 > URL: https://issues.apache.org/jira/browse/HDFS-10985 > Project: Hadoop HDFS > Issue Type: Sub-task > Components: ha, test >Reporter: Mingliang Liu >Assignee: Mingliang Liu >Priority: Minor > Fix For: 2.8.0, 3.0.0-alpha2 > > Attachments: HDFS-10985-branch-2.8.001.patch, HDFS-10985.000.patch, > HDFS-10985.001.patch > > > {{TestZKFailoverController#testGracefulFailoverMultipleZKfcs}} uses fixed > time sleep before assertions. This may fail sometimes though 10 seconds are > generally long enough. I think we can use {{GenericTestUtils.waitFor()}} to > retry the assertions. > This is also true for {{TestZKFailoverController#testGracefulFailover}}. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-10988) Refactor TestBalancerBandwidth
[ https://issues.apache.org/jira/browse/HDFS-10988?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15563430#comment-15563430 ] Hudson commented on HDFS-10988: --- SUCCESS: Integrated in Jenkins build Hadoop-trunk-Commit #10581 (See [https://builds.apache.org/job/Hadoop-trunk-Commit/10581/]) HDFS-10988. Refactor TestBalancerBandwidth. Contributed by Brahma Reddy (liuml07: rev b963818621c200160bb37624f177bdcb059de4eb) * (edit) hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/TestBalancerBandwidth.java > Refactor TestBalancerBandwidth > -- > > Key: HDFS-10988 > URL: https://issues.apache.org/jira/browse/HDFS-10988 > Project: Hadoop HDFS > Issue Type: Sub-task > Components: balancer & mover, test >Reporter: Brahma Reddy Battula >Assignee: Brahma Reddy Battula > Fix For: 2.8.0, 3.0.0-alpha2 > > Attachments: HDFS-10988-002.patch, HDFS-10988.patch > > > This jira will deal the following. > 1) Remove Fixed sleep > 2) Remove unused dnproxy > 3) use try with resources -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
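Item 3 in the JIRA description above, try-with-resources, guarantees the resource is closed even when the body throws. A minimal stand-alone illustration (the {{Resource}} class and its {{value()}}/{{closed}} members are hypothetical, not from the actual test):

```java
// A toy AutoCloseable that records whether close() ran.
class Resource implements AutoCloseable {
    static boolean closed = false;
    int value() { return 42; }
    @Override public void close() { closed = true; }
}

class TryWithResources {
    static int useResource() {
        // The resource is closed automatically when the block exits,
        // whether by return or by exception -- no finally needed.
        try (Resource r = new Resource()) {
            return r.value();
        }
    }
}
```

In a test like TestBalancerBandwidth this replaces manual cluster shutdown in a finally block.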
[jira] [Updated] (HDFS-10988) Refactor TestBalancerBandwidth
[ https://issues.apache.org/jira/browse/HDFS-10988?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Mingliang Liu updated HDFS-10988: - Component/s: balancer & mover > Refactor TestBalancerBandwidth > -- > > Key: HDFS-10988 > URL: https://issues.apache.org/jira/browse/HDFS-10988 > Project: Hadoop HDFS > Issue Type: Sub-task > Components: balancer & mover, test >Reporter: Brahma Reddy Battula >Assignee: Brahma Reddy Battula > Fix For: 2.8.0, 3.0.0-alpha2 > > Attachments: HDFS-10988-002.patch, HDFS-10988.patch > > > This jira will deal the following. > 1) Remove Fixed sleep > 2) Remove unused dnproxy > 3) use try with resources -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Updated] (HDFS-10988) Refactor TestBalancerBandwidth
[ https://issues.apache.org/jira/browse/HDFS-10988?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Mingliang Liu updated HDFS-10988: - Resolution: Fixed Hadoop Flags: Reviewed Fix Version/s: 3.0.0-alpha2 2.8.0 Status: Resolved (was: Patch Available) Committed to {{trunk}} through {{branch-2.8}}. Thanks for your contribution, [~brahmareddy]. > Refactor TestBalancerBandwidth > -- > > Key: HDFS-10988 > URL: https://issues.apache.org/jira/browse/HDFS-10988 > Project: Hadoop HDFS > Issue Type: Sub-task > Components: balancer & mover, test >Reporter: Brahma Reddy Battula >Assignee: Brahma Reddy Battula > Fix For: 2.8.0, 3.0.0-alpha2 > > Attachments: HDFS-10988-002.patch, HDFS-10988.patch > > > This jira will deal the following. > 1) Remove Fixed sleep > 2) Remove unused dnproxy > 3) use try with resources -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Comment Edited] (HDFS-10984) Expose nntop output as metrics
[ https://issues.apache.org/jira/browse/HDFS-10984?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15557062#comment-15557062 ] Siddharth Wagle edited comment on HDFS-10984 at 10/10/16 8:25 PM: -- Sample flattened output of the window metrics emitted to Ambari Metrics System: {quote} dfs.NNTopUserOpCounts.windowMs=6.op=listStatus.user=mapred.count {quote} was (Author: swagle): Sample flattened output of the window metrics emitted to Ambari Metrics System: {quote} dfs.TopUserOpCounts.windowMs=6.op=listStatus.user=mapred.count {quote} > Expose nntop output as metrics > - > > Key: HDFS-10984 > URL: https://issues.apache.org/jira/browse/HDFS-10984 > Project: Hadoop HDFS > Issue Type: Task > Components: namenode >Affects Versions: 2.7.0 >Reporter: Siddharth Wagle >Assignee: Siddharth Wagle > Fix For: 2.7.3 > > Attachments: HDFS-10984.patch, HDFS-10984.v1.patch, > HDFS-10984.v2.patch > > > The nntop output is already exposed via JMX with HDFS-6982. > However external metrics systems do not get this data. It would be valuable > to track this as a timeseries as well. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
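The flattened key shown above composes the window, operation, and user into one dotted metric name so external metrics systems can ingest it as a plain counter. A hypothetical helper illustrating that flattening (the method and the 60000 ms example window are illustrative assumptions, not the patch's actual code):

```java
class TopMetricName {
    /** Flatten (window, op, user) into a single dotted metric key. */
    static String name(long windowMs, String op, String user) {
        return String.format(
            "dfs.NNTopUserOpCounts.windowMs=%d.op=%s.user=%s.count",
            windowMs, op, user);
    }
}
```
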
[jira] [Commented] (HDFS-10984) Expose nntop output as metrics
[ https://issues.apache.org/jira/browse/HDFS-10984?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15563376#comment-15563376 ] Xiaoyu Yao commented on HDFS-10984: --- bq. Confused about point 4. I have linked the two tickets, did you mean a Jira comment or something for the javadoc as a class level commentary? I mean the description of the JIRA "The nntop output is already exposed via JMX with HDFS-6982." > Expose nntop output as metrics > - > > Key: HDFS-10984 > URL: https://issues.apache.org/jira/browse/HDFS-10984 > Project: Hadoop HDFS > Issue Type: Task > Components: namenode >Affects Versions: 2.7.0 >Reporter: Siddharth Wagle >Assignee: Siddharth Wagle > Fix For: 2.7.3 > > Attachments: HDFS-10984.patch, HDFS-10984.v1.patch, > HDFS-10984.v2.patch > > > The nntop output is already exposed via JMX with HDFS-6982. > However external metrics systems do not get this data. It would be valuable > to track this as a timeseries as well. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-10984) Expose nntop output as metrics
[ https://issues.apache.org/jira/browse/HDFS-10984?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15563351#comment-15563351 ] Hadoop QA commented on HDFS-10984: -- | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 17s{color} | {color:blue} Docker mode activated. {color} | | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s{color} | {color:green} The patch does not contain any @author tags. {color} | | {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 0s{color} | {color:green} The patch appears to include 1 new or modified test files. {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 8m 33s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 48s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 28s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 57s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 12s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 1m 46s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 40s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 0m 50s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 47s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 47s{color} | {color:green} the 
patch passed {color} | | {color:orange}-0{color} | {color:orange} checkstyle {color} | {color:orange} 0m 26s{color} | {color:orange} hadoop-hdfs-project/hadoop-hdfs: The patch generated 8 new + 185 unchanged - 0 fixed = 193 total (was 185) {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 53s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 10s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s{color} | {color:green} The patch has no whitespace issues. {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 1m 52s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 37s{color} | {color:green} the patch passed {color} | | {color:red}-1{color} | {color:red} unit {color} | {color:red} 63m 18s{color} | {color:red} hadoop-hdfs in the patch failed. {color} | | {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 23s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} | | {color:black}{color} | {color:black} {color} | {color:black} 84m 15s{color} | {color:black} {color} | \\ \\ || Reason || Tests || | Failed junit tests | hadoop.hdfs.server.datanode.TestFsDatasetCache | | | hadoop.hdfs.tools.TestDFSAdmin | \\ \\ || Subsystem || Report/Notes || | Docker | Image:yetus/hadoop:9560f25 | | JIRA Issue | HDFS-10984 | | JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12832522/HDFS-10984.v2.patch | | Optional Tests | asflicense compile javac javadoc mvninstall mvnsite unit findbugs checkstyle | | uname | Linux 358780967afa 3.13.0-92-generic #139-Ubuntu SMP Tue Jun 28 20:42:26 UTC 2016 x86_64 x86_64 x86_64 GNU/Linux | | Build tool | maven | | Personality | /testptch/hadoop/patchprocess/precommit/personality/provided.sh | | git revision | trunk / 3441c74 | | Default Java | 1.8.0_101 | | findbugs | v3.0.0 | | checkstyle | https://builds.apache.org/job/PreCommit-HDFS-Build/17086/artifact/patchprocess/diff-checkstyle-hadoop-hdfs-project_hadoop-hdfs.txt | | unit | https://builds.apache.org/job/PreCommit-HDFS-Build/17086/artifact/patchprocess/patch-unit-hadoop-hdfs-project_hadoop-hdfs.txt | | Test Results | https://builds.apache.org/job/PreCommit-HDFS-Build/17086/testReport/ | | modules | C: hadoop-hdfs-project/hadoop-hdfs U: hadoop-hdfs-project/hadoop-hdfs | | Console output | https://builds.apache.org/job/PreCommit-HDFS-Build/17086/console | | Powered by | Apache Yetus 0.4.0-SNAPSHOT http://yetus.apache.org | This message was automatically generated. > Expose nntop output as metrics > - > > Key: HDFS-10984 > URL:
[jira] [Commented] (HDFS-10967) Add configuration for BlockPlacementPolicy to avoid near-full DataNodes
[ https://issues.apache.org/jira/browse/HDFS-10967?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15563326#comment-15563326 ] Hadoop QA commented on HDFS-10967: -- | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 13s{color} | {color:blue} Docker mode activated. {color} | | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s{color} | {color:green} The patch does not contain any @author tags. {color} | | {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 0s{color} | {color:green} The patch appears to include 2 new or modified test files. {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 10m 0s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 45s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 30s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 53s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 12s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 1m 46s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 40s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 0m 45s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 43s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 43s{color} | {color:green} the 
patch passed {color} | | {color:orange}-0{color} | {color:orange} checkstyle {color} | {color:orange} 0m 30s{color} | {color:orange} hadoop-hdfs-project/hadoop-hdfs: The patch generated 10 new + 452 unchanged - 0 fixed = 462 total (was 452) {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 50s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 9s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s{color} | {color:green} The patch has no whitespace issues. {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 1m 47s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 38s{color} | {color:green} the patch passed {color} | | {color:red}-1{color} | {color:red} unit {color} | {color:red} 61m 27s{color} | {color:red} hadoop-hdfs in the patch failed. {color} | | {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 21s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} | | {color:black}{color} | {color:black} {color} | {color:black} 83m 26s{color} | {color:black} {color} | \\ \\ || Reason || Tests || | Failed junit tests | hadoop.hdfs.server.namenode.TestReconstructStripedBlocks | | | hadoop.hdfs.server.blockmanagement.TestReplicationPolicyWithUpgradeDomain | | | hadoop.tools.TestHdfsConfigFields | \\ \\ || Subsystem || Report/Notes || | Docker | Image:yetus/hadoop:9560f25 | | JIRA Issue | HDFS-10967 | | JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12832518/HDFS-10967.01.patch | | Optional Tests | asflicense compile javac javadoc mvninstall mvnsite unit findbugs checkstyle | | uname | Linux 0693f663ee00 3.13.0-93-generic #140-Ubuntu SMP Mon Jul 18 21:21:05 UTC 2016 x86_64 x86_64 x86_64 GNU/Linux | | Build tool | maven | | Personality | /testptch/hadoop/patchprocess/precommit/personality/provided.sh | | git revision | trunk / cef61d5 | | Default Java | 1.8.0_101 | | findbugs | v3.0.0 | | checkstyle | https://builds.apache.org/job/PreCommit-HDFS-Build/17084/artifact/patchprocess/diff-checkstyle-hadoop-hdfs-project_hadoop-hdfs.txt | | unit | https://builds.apache.org/job/PreCommit-HDFS-Build/17084/artifact/patchprocess/patch-unit-hadoop-hdfs-project_hadoop-hdfs.txt | | Test Results | https://builds.apache.org/job/PreCommit-HDFS-Build/17084/testReport/ | | modules | C: hadoop-hdfs-project/hadoop-hdfs U: hadoop-hdfs-project/hadoop-hdfs | | Console output | https://builds.apache.org/job/PreCommit-HDFS-Build/17084/console | | Powered by | Apache Yetus 0.4.0-SNAPSHOT http://yetus.apache.org | This message was automatically generated. > Add configuration for
[jira] [Commented] (HDFS-10984) Expose nntop output as metrics
[ https://issues.apache.org/jira/browse/HDFS-10984?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15563318#comment-15563318 ] Siddharth Wagle commented on HDFS-10984: Thanks [~xyao] for the review comments; I will work on incorporating them into the patch. I am confused about point 4: I have linked the two tickets, but did you mean a Jira comment, or class-level commentary in the javadoc? > Expose nntop output as metrics > - > > Key: HDFS-10984 > URL: https://issues.apache.org/jira/browse/HDFS-10984 > Project: Hadoop HDFS > Issue Type: Task > Components: namenode >Affects Versions: 2.7.0 >Reporter: Siddharth Wagle >Assignee: Siddharth Wagle > Fix For: 2.7.3 > > Attachments: HDFS-10984.patch, HDFS-10984.v1.patch, > HDFS-10984.v2.patch > > > The nntop output is already exposed via JMX with HDFS-6982. > However external metrics systems do not get this data. It would be valuable > to track this as a timeseries as well. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-10301) BlockReport retransmissions may lead to storages falsely being declared zombie if storage report processing happens out of order
[ https://issues.apache.org/jira/browse/HDFS-10301?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15563305#comment-15563305 ] Daryn Sharp commented on HDFS-10301: Will take a look this afternoon. > BlockReport retransmissions may lead to storages falsely being declared > zombie if storage report processing happens out of order > > > Key: HDFS-10301 > URL: https://issues.apache.org/jira/browse/HDFS-10301 > Project: Hadoop HDFS > Issue Type: Bug > Components: namenode >Affects Versions: 2.6.1 >Reporter: Konstantin Shvachko >Assignee: Vinitha Reddy Gankidi >Priority: Critical > Attachments: HDFS-10301.002.patch, HDFS-10301.003.patch, > HDFS-10301.004.patch, HDFS-10301.005.patch, HDFS-10301.006.patch, > HDFS-10301.007.patch, HDFS-10301.008.patch, HDFS-10301.009.patch, > HDFS-10301.01.patch, HDFS-10301.010.patch, HDFS-10301.011.patch, > HDFS-10301.012.patch, HDFS-10301.013.patch, HDFS-10301.014.patch, > HDFS-10301.015.patch, HDFS-10301.branch-2.7.patch, HDFS-10301.branch-2.patch, > HDFS-10301.sample.patch, zombieStorageLogs.rtf > > > When the NameNode is busy, a DataNode can time out sending a block report. Then it > sends the block report again. The NameNode, while processing these two reports > at the same time, can interleave processing of storages from different reports. > This corrupts the blockReportId field, which makes the NameNode think that some > storages are zombie. Replicas from zombie storages are immediately removed, > causing missing blocks. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-10985) o.a.h.ha.TestZKFailoverController should not use fixed time sleep before assertions
[ https://issues.apache.org/jira/browse/HDFS-10985?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15563289#comment-15563289 ] Hadoop QA commented on HDFS-10985: -- | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 20s{color} | {color:blue} Docker mode activated. {color} | | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s{color} | {color:green} The patch does not contain any @author tags. {color} | | {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 0s{color} | {color:green} The patch appears to include 1 new or modified test files. {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 9m 9s{color} | {color:green} branch-2.8 passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 7m 10s{color} | {color:green} branch-2.8 passed with JDK v1.8.0_101 {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 7m 23s{color} | {color:green} branch-2.8 passed with JDK v1.7.0_111 {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 22s{color} | {color:green} branch-2.8 passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 1m 2s{color} | {color:green} branch-2.8 passed {color} | | {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 15s{color} | {color:green} branch-2.8 passed {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 1m 43s{color} | {color:green} branch-2.8 passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 52s{color} | {color:green} branch-2.8 passed with JDK v1.8.0_101 {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 2s{color} | {color:green} branch-2.8 passed with JDK v1.7.0_111 
{color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 0m 45s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 7m 5s{color} | {color:green} the patch passed with JDK v1.8.0_101 {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 7m 5s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 7m 37s{color} | {color:green} the patch passed with JDK v1.7.0_111 {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 7m 37s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 22s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 1m 2s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 15s{color} | {color:green} the patch passed {color} | | {color:red}-1{color} | {color:red} whitespace {color} | {color:red} 0m 0s{color} | {color:red} The patch has 47 line(s) that end in whitespace. Use git apply --whitespace=fix <>. Refer https://git-scm.com/docs/git-apply {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 2m 0s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 49s{color} | {color:green} the patch passed with JDK v1.8.0_101 {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 2s{color} | {color:green} the patch passed with JDK v1.7.0_111 {color} | | {color:green}+1{color} | {color:green} unit {color} | {color:green} 8m 22s{color} | {color:green} hadoop-common in the patch passed with JDK v1.7.0_111. 
{color} | | {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 22s{color} | {color:green} The patch does not generate ASF License warnings. {color} | | {color:black}{color} | {color:black} {color} | {color:black} 68m 41s{color} | {color:black} {color} | \\ \\ || Subsystem || Report/Notes || | Docker | Image:yetus/hadoop:5af2af1 | | JIRA Issue | HDFS-10985 | | JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12832516/HDFS-10985-branch-2.8.001.patch | | Optional Tests | asflicense compile javac javadoc mvninstall mvnsite unit findbugs checkstyle | | uname | Linux 897f5c503ae1 3.13.0-92-generic #139-Ubuntu SMP Tue Jun 28 20:42:26 UTC 2016 x86_64 x86_64 x86_64 GNU/Linux | | Build tool | maven | | Personality | /testptch/hadoop/patchprocess/precommit/personality/provided.sh | | git revision | branch-2.8 / 583283d | | Default Java | 1.7.0_111 | | Multi-JDK versions | /usr/lib/jvm/java-8-oracle:1.8.0_101
[jira] [Commented] (HDFS-10985) o.a.h.ha.TestZKFailoverController should not use fixed time sleep before assertions
[ https://issues.apache.org/jira/browse/HDFS-10985?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15563210#comment-15563210 ] Steve Loughran commented on HDFS-10985: --- +1 > o.a.h.ha.TestZKFailoverController should not use fixed time sleep before > assertions > --- > > Key: HDFS-10985 > URL: https://issues.apache.org/jira/browse/HDFS-10985 > Project: Hadoop HDFS > Issue Type: Sub-task > Components: ha, test >Reporter: Mingliang Liu >Assignee: Mingliang Liu >Priority: Minor > Attachments: HDFS-10985-branch-2.8.001.patch, HDFS-10985.000.patch, > HDFS-10985.001.patch > > > {{TestZKFailoverController#testGracefulFailoverMultipleZKfcs}} uses a fixed > time sleep before assertions. This may fail sometimes, though 10 seconds is > generally long enough. I think we can use {{GenericTestUtils.waitFor()}} to > retry the assertions. > This is also true of {{TestZKFailoverController#testGracefulFailover}}. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
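The fix above replaces a fixed sleep with a condition poll. Hadoop's GenericTestUtils.waitFor() follows the pattern sketched below; this standalone class is illustrative only (the names and signature here are not Hadoop's actual implementation):

```java
import java.util.function.Supplier;

/** Minimal sketch of a waitFor-style polling helper, as used in place of fixed sleeps. */
public class WaitFor {
    /**
     * Re-checks {@code check} every {@code intervalMs} until it returns true
     * or {@code timeoutMs} elapses. Returns whether the condition was met.
     */
    public static boolean waitFor(Supplier<Boolean> check, long intervalMs, long timeoutMs)
            throws InterruptedException {
        long deadline = System.currentTimeMillis() + timeoutMs;
        while (System.currentTimeMillis() < deadline) {
            if (check.get()) {
                return true;          // condition met early: no need to sleep the full timeout
            }
            Thread.sleep(intervalMs); // short poll interval instead of one long fixed sleep
        }
        return check.get();           // one final check at the deadline
    }

    public static void main(String[] args) throws InterruptedException {
        long start = System.currentTimeMillis();
        // Condition becomes true after ~50ms; the wait ends shortly after, not at 2000ms.
        boolean ok = waitFor(() -> System.currentTimeMillis() - start > 50, 10, 2000);
        System.out.println(ok ? "condition met" : "timed out");
    }
}
```

A test that asserts as soon as the condition holds fails less often than one that sleeps a fixed 10 seconds and hopes the system has settled.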
[jira] [Comment Edited] (HDFS-10984) Expose nntop output as metrics
[ https://issues.apache.org/jira/browse/HDFS-10984?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15563190#comment-15563190 ] Xiaoyu Yao edited comment on HDFS-10984 at 10/10/16 7:16 PM: - Thanks [~swagle] for reporting the issue and posting the patch. The latest patch v2, with the JMXGet issue fixed, looks good to me overall. Here are a few minor issues: 1. In TopMetrics#getMetrics(), should we check whether top metrics is enabled, as we do in FSNamesystem#getTopUserOpCounts? Alternatively, add a unit test to verify that the metrics source works as expected when top metrics is not enabled. 2. I notice that the patch adds a metric record per window, so there would be multiple records per getMetrics() call. Can we elaborate on this in the comments? 3. Fix the checkstyle issue from Jenkins. 4. You may also add context info to this ticket: the original HDFS-6982 had a Metrics2 source implemented, but it was removed as part of HDFS-7426 "Change nntop JMX format to be a JSON blob". was (Author: xyao): Thanks [~swagle] for reporting the issue and posting the patch. The latest patch v2 with the JMXGet issue fixed looks good to me overall. Here are a few minor issues: 1. In TopMetrics#getMetrics(), should we check if top metrics is enabled or not like we do in FSNamesystem#getTopUserOpCounts. Or have a unit test verify that the metrics source works as expected when top metrics is not enabled. 2. I notice that the patch added a metric record per window. So there would be multiple record per getMetrics() call for each window. Can we elaborate this in the description of the comments? 3. Checkstyle issue from Jenkins. 
> Expose nntop output as metrics > - > > Key: HDFS-10984 > URL: https://issues.apache.org/jira/browse/HDFS-10984 > Project: Hadoop HDFS > Issue Type: Task > Components: namenode >Affects Versions: 2.7.0 >Reporter: Siddharth Wagle >Assignee: Siddharth Wagle > Fix For: 2.7.3 > > Attachments: HDFS-10984.patch, HDFS-10984.v1.patch, > HDFS-10984.v2.patch > > > The nntop output is already exposed via JMX with HDFS-6982. > However external metrics systems do not get this data. It would be valuable > to track this as a timeseries as well. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-10984) Expose nntop output as metrics
[ https://issues.apache.org/jira/browse/HDFS-10984?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15563190#comment-15563190 ] Xiaoyu Yao commented on HDFS-10984: --- Thanks [~swagle] for reporting the issue and posting the patch. The latest patch v2, with the JMXGet issue fixed, looks good to me overall. Here are a few minor issues: 1. In TopMetrics#getMetrics(), should we check whether top metrics is enabled, as we do in FSNamesystem#getTopUserOpCounts? Alternatively, add a unit test to verify that the metrics source works as expected when top metrics is not enabled. 2. I notice that the patch adds a metric record per window, so there would be multiple records per getMetrics() call. Can we elaborate on this in the comments? 3. Fix the checkstyle issue from Jenkins. > Expose nntop output as metrics > - > > Key: HDFS-10984 > URL: https://issues.apache.org/jira/browse/HDFS-10984 > Project: Hadoop HDFS > Issue Type: Task > Components: namenode >Affects Versions: 2.7.0 >Reporter: Siddharth Wagle >Assignee: Siddharth Wagle > Fix For: 2.7.3 > > Attachments: HDFS-10984.patch, HDFS-10984.v1.patch, > HDFS-10984.v2.patch > > > The nntop output is already exposed via JMX with HDFS-6982. > However external metrics systems do not get this data. It would be valuable > to track this as a timeseries as well. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
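Points 1 and 2 of the review above amount to an early-return guard plus one record per rolling window. The sketch below shows that shape only; the class, field, and window values are hypothetical stand-ins, not the actual TopMetrics code from the patch:

```java
import java.util.ArrayList;
import java.util.List;

/** Illustrative sketch of the reviewed behavior: no records when nntop is disabled. */
public class TopMetricsSketch {
    // Stand-in for the nntop enabled flag (dfs.namenode.top.enabled in HDFS config).
    private final boolean topMetricsEnabled;

    public TopMetricsSketch(boolean enabled) {
        this.topMetricsEnabled = enabled;
    }

    /** Emits one record per window, and nothing at all when top metrics is disabled. */
    public List<String> getMetrics() {
        List<String> records = new ArrayList<>();
        if (!topMetricsEnabled) {
            return records; // disabled: empty, mirroring the FSNamesystem#getTopUserOpCounts check
        }
        // nntop commonly tracks several windows (e.g. 1, 5, 25 minutes),
        // hence multiple records per getMetrics() call -- review point 2.
        for (int windowMs : new int[] {60_000, 300_000, 1_500_000}) {
            records.add("window=" + windowMs + "ms");
        }
        return records;
    }
}
```

A unit test along the lines of review point 1 would construct the source with the flag off and assert that no records are emitted.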
[jira] [Commented] (HDFS-10972) Add unit test for HDFS command 'dfsadmin -getDatanodeInfo'
[ https://issues.apache.org/jira/browse/HDFS-10972?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15563164#comment-15563164 ] Hudson commented on HDFS-10972: --- SUCCESS: Integrated in Jenkins build Hadoop-trunk-Commit #10578 (See [https://builds.apache.org/job/Hadoop-trunk-Commit/10578/]) HDFS-10972. Add unit test for HDFS command 'dfsadmin -getDatanodeInfo'. (liuml07: rev 3441c746b5f35c46fca5a0f252c86c8357fe932e) * (edit) hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/tools/TestDFSAdmin.java > Add unit test for HDFS command 'dfsadmin -getDatanodeInfo' > -- > > Key: HDFS-10972 > URL: https://issues.apache.org/jira/browse/HDFS-10972 > Project: Hadoop HDFS > Issue Type: Sub-task > Components: fs, shell, test >Reporter: Xiaobing Zhou >Assignee: Xiaobing Zhou > Fix For: 3.0.0-alpha2 > > Attachments: HDFS-10972.000.patch, HDFS-10972.001.patch, > HDFS-10972.002.patch, HDFS-10972.003.patch > > > getDatanodeInfo should be tested in admin CLI. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Updated] (HDFS-10972) Add unit test for HDFS command 'dfsadmin -getDatanodeInfo'
[ https://issues.apache.org/jira/browse/HDFS-10972?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Mingliang Liu updated HDFS-10972: - Fix Version/s: 3.0.0-alpha2 > Add unit test for HDFS command 'dfsadmin -getDatanodeInfo' > -- > > Key: HDFS-10972 > URL: https://issues.apache.org/jira/browse/HDFS-10972 > Project: Hadoop HDFS > Issue Type: Sub-task > Components: fs, shell, test >Reporter: Xiaobing Zhou >Assignee: Xiaobing Zhou > Fix For: 3.0.0-alpha2 > > Attachments: HDFS-10972.000.patch, HDFS-10972.001.patch, > HDFS-10972.002.patch, HDFS-10972.003.patch > > > getDatanodeInfo should be tested in admin CLI. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-10972) Add unit test for HDFS command 'dfsadmin -getDatanodeInfo'
[ https://issues.apache.org/jira/browse/HDFS-10972?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15563140#comment-15563140 ] Mingliang Liu commented on HDFS-10972: -- Committed to {{trunk}} and {{branch-2}}. Can you provide a patch for {{branch-2.8}}, if it applies? Thanks. > Add unit test for HDFS command 'dfsadmin -getDatanodeInfo' > -- > > Key: HDFS-10972 > URL: https://issues.apache.org/jira/browse/HDFS-10972 > Project: Hadoop HDFS > Issue Type: Sub-task > Components: fs, shell, test >Reporter: Xiaobing Zhou >Assignee: Xiaobing Zhou > Attachments: HDFS-10972.000.patch, HDFS-10972.001.patch, > HDFS-10972.002.patch, HDFS-10972.003.patch > > > getDatanodeInfo should be tested in admin CLI. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-10967) Add configuration for BlockPlacementPolicy to avoid near-full DataNodes
[ https://issues.apache.org/jira/browse/HDFS-10967?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15563120#comment-15563120 ] Ming Ma commented on HDFS-10967: Thanks [~zhz]. Indeed that is an issue when the cluster has heterogeneous nodes, and the rack-based assumption makes sense to me. * For balancer or over-replicated scenarios, {{chooseReplicasToDelete}} uses absolute free space; maybe that needs to be changed to percentage-based? * What if we move this new policy to {{isGoodDatanode}}? It has several benefits: ** BlockPlacementPolicyRackFaultTolerant can use it. ** It covers the case where the writer is outside of the cluster, so the call path is chooseLocalRack -> chooseRandom. * Typo below: you meant this.considerCapacity. {noformat} this.considerLoad = conf.getBoolean(...); {noformat} > Add configuration for BlockPlacementPolicy to avoid near-full DataNodes > --- > > Key: HDFS-10967 > URL: https://issues.apache.org/jira/browse/HDFS-10967 > Project: Hadoop HDFS > Issue Type: Improvement > Components: namenode >Reporter: Zhe Zhang >Assignee: Zhe Zhang > Labels: balancer > Attachments: HDFS-10967.00.patch, HDFS-10967.01.patch > > > Large production clusters are likely to have heterogeneous nodes in terms of > storage capacity, memory, and CPU cores. It is not always possible to > proportionally ingest data into DataNodes based on their remaining storage > capacity. Therefore it's possible for a subset of DataNodes to be much closer > to full capacity than the rest. > This heterogeneity is most likely rack-by-rack -- i.e. _m_ whole racks of > low-storage nodes and _n_ whole racks of high-storage nodes. So it'd be very > useful if we can lower the chance for those near-full DataNodes to become > destinations for the 2nd and 3rd replicas. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
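The percentage-based suggestion in the comment above can be illustrated with a small sketch. The method name, parameters, and threshold below are hypothetical, not the actual code from the HDFS-10967 patch:

```java
/** Illustrative sketch of a percentage-based "near-full" target check for block placement. */
public class NearFullCheck {
    /**
     * Returns true when the node keeps at least {@code minFreeRatio} of its
     * capacity free. Using a ratio instead of absolute bytes keeps the check
     * fair across heterogeneous nodes with very different total capacities.
     */
    static boolean isGoodTarget(long capacityBytes, long remainingBytes, double minFreeRatio) {
        return (double) remainingBytes / capacityBytes >= minFreeRatio;
    }

    public static void main(String[] args) {
        // With a 5% threshold: a node with only 4% free is rejected,
        // while a node with 6% free is accepted, regardless of absolute size.
        System.out.println(isGoodTarget(10_000L, 400L, 0.05)); // false: only 4% free
        System.out.println(isGoodTarget(10_000L, 600L, 0.05)); // true: 6% free
    }
}
```

An absolute-bytes check (e.g. "at least 100 GB free") would pass a near-full large node and fail a mostly-empty small node, which is exactly the heterogeneity problem the comment raises for {{chooseReplicasToDelete}}.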
[jira] [Commented] (HDFS-10988) Refactor TestBalancerBandwidth
[ https://issues.apache.org/jira/browse/HDFS-10988?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15563109#comment-15563109 ] Mingliang Liu commented on HDFS-10988: -- The test failure is not related. > Refactor TestBalancerBandwidth > -- > > Key: HDFS-10988 > URL: https://issues.apache.org/jira/browse/HDFS-10988 > Project: Hadoop HDFS > Issue Type: Sub-task > Components: test >Reporter: Brahma Reddy Battula >Assignee: Brahma Reddy Battula > Attachments: HDFS-10988-002.patch, HDFS-10988.patch > > > This jira will deal with the following. > 1) Remove fixed sleep > 2) Remove unused dnproxy > 3) Use try-with-resources -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
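Item 3 in the list above refers to Java's try-with-resources statement. A minimal sketch with a stand-in resource (in the real TestBalancerBandwidth the resource would be the MiniDFSCluster; FakeCluster here is purely illustrative):

```java
/** Sketch of the try-with-resources refactor: cleanup runs even when the test body throws. */
public class TryWithResources {
    /** Stand-in for a test cluster; any AutoCloseable works with try-with-resources. */
    static class FakeCluster implements AutoCloseable {
        boolean closed = false;

        @Override
        public void close() {
            closed = true; // in the real test this would be cluster.shutdown()
        }
    }

    public static void main(String[] args) {
        FakeCluster ref;
        try (FakeCluster cluster = new FakeCluster()) {
            ref = cluster; // run assertions against the cluster inside the block
        }                  // close() is invoked here automatically, even on exception
        System.out.println(ref.closed); // true
    }
}
```

Compared with a manual try/finally (or no cleanup at all), this guarantees the cluster is shut down on every exit path, which also avoids leaking ports between tests.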
[jira] [Commented] (HDFS-10988) Refactor TestBalancerBandwidth
[ https://issues.apache.org/jira/browse/HDFS-10988?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15563108#comment-15563108 ] Mingliang Liu commented on HDFS-10988: -- +1 Will commit shortly. > Refactor TestBalancerBandwidth > -- > > Key: HDFS-10988 > URL: https://issues.apache.org/jira/browse/HDFS-10988 > Project: Hadoop HDFS > Issue Type: Sub-task > Components: test >Reporter: Brahma Reddy Battula >Assignee: Brahma Reddy Battula > Attachments: HDFS-10988-002.patch, HDFS-10988.patch > > > This jira will deal with the following. > 1) Remove fixed sleep > 2) Remove unused dnproxy > 3) Use try-with-resources -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Updated] (HDFS-10984) Expose nntop output as metrics
[ https://issues.apache.org/jira/browse/HDFS-10984?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Siddharth Wagle updated HDFS-10984: --- Status: Patch Available (was: Open) > Expose nntop output as metrics > - > > Key: HDFS-10984 > URL: https://issues.apache.org/jira/browse/HDFS-10984 > Project: Hadoop HDFS > Issue Type: Task > Components: namenode >Affects Versions: 2.7.0 >Reporter: Siddharth Wagle >Assignee: Siddharth Wagle > Fix For: 2.7.3 > > Attachments: HDFS-10984.patch, HDFS-10984.v1.patch, > HDFS-10984.v2.patch > > > The nntop output is already exposed via JMX with HDFS-6982. > However external metrics systems do not get this data. It would be valuable > to track this as a timeseries as well. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Updated] (HDFS-10984) Expose nntop output as metrics
[ https://issues.apache.org/jira/browse/HDFS-10984?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Siddharth Wagle updated HDFS-10984: --- Attachment: HDFS-10984.v2.patch > Expose nntop output as metrics > - > > Key: HDFS-10984 > URL: https://issues.apache.org/jira/browse/HDFS-10984 > Project: Hadoop HDFS > Issue Type: Task > Components: namenode >Affects Versions: 2.7.0 >Reporter: Siddharth Wagle >Assignee: Siddharth Wagle > Fix For: 2.7.3 > > Attachments: HDFS-10984.patch, HDFS-10984.v1.patch, > HDFS-10984.v2.patch > > > The nntop output is already exposed via JMX with HDFS-6982. > However external metrics systems do not get this data. It would be valuable > to track this as a timeseries as well. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-10985) o.a.h.ha.TestZKFailoverController should not use fixed time sleep before assertions
[ https://issues.apache.org/jira/browse/HDFS-10985?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15563096#comment-15563096 ] Hadoop QA commented on HDFS-10985: -- | (/) *{color:green}+1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 19s{color} | {color:blue} Docker mode activated. {color} | | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s{color} | {color:green} The patch does not contain any @author tags. {color} | | {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 0s{color} | {color:green} The patch appears to include 1 new or modified test files. {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 7m 45s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 7m 43s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 25s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 1m 19s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 12s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 1m 39s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 45s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 0m 40s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 7m 24s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 7m 24s{color} | {color:green} 
the patch passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 22s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 57s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 13s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s{color} | {color:green} The patch has no whitespace issues. {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 1m 37s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 41s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} unit {color} | {color:green} 7m 32s{color} | {color:green} hadoop-common in the patch passed. {color} | | {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 22s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} | | {color:black}{color} | {color:black} {color} | {color:black} 40m 42s{color} | {color:black} {color} | \\ \\ || Subsystem || Report/Notes || | Docker | Image:yetus/hadoop:9560f25 | | JIRA Issue | HDFS-10985 | | JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12832511/HDFS-10985.001.patch | | Optional Tests | asflicense compile javac javadoc mvninstall mvnsite unit findbugs checkstyle | | uname | Linux 9e14d3db9866 3.13.0-92-generic #139-Ubuntu SMP Tue Jun 28 20:42:26 UTC 2016 x86_64 x86_64 x86_64 GNU/Linux | | Build tool | maven | | Personality | /testptch/hadoop/patchprocess/precommit/personality/provided.sh | | git revision | trunk / cef61d5 | | Default Java | 1.8.0_101 | | findbugs | v3.0.0 | | Test Results | https://builds.apache.org/job/PreCommit-HDFS-Build/17082/testReport/ | | modules | C: hadoop-common-project/hadoop-common U: hadoop-common-project/hadoop-common | | Console output | https://builds.apache.org/job/PreCommit-HDFS-Build/17082/console | | Powered by | Apache Yetus 0.4.0-SNAPSHOT http://yetus.apache.org | This message was automatically generated. > o.a.h.ha.TestZKFailoverController should not use fixed time sleep before > assertions > --- > > Key: HDFS-10985 > URL: https://issues.apache.org/jira/browse/HDFS-10985 > Project: Hadoop HDFS > Issue Type: Sub-task > Components: ha, test >Reporter: Mingliang Liu >Assignee: Mingliang Liu >Priority: Minor > Attachments: HDFS-10985-branch-2.8.001.patch, HDFS-10985.000.patch, > HDFS-10985.001.patch > > >
[jira] [Updated] (HDFS-10984) Expose nntop output as metrics
[ https://issues.apache.org/jira/browse/HDFS-10984?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Siddharth Wagle updated HDFS-10984: --- Attachment: (was: HDFS-10984.v2.patch) > Expose nntop output as metrics > - > > Key: HDFS-10984 > URL: https://issues.apache.org/jira/browse/HDFS-10984 > Project: Hadoop HDFS > Issue Type: Task > Components: namenode >Affects Versions: 2.7.0 >Reporter: Siddharth Wagle >Assignee: Siddharth Wagle > Fix For: 2.7.3 > > Attachments: HDFS-10984.patch, HDFS-10984.v1.patch > > > The nntop output is already exposed via JMX with HDFS-6982. > However external metrics systems do not get this data. It would be valuable > to track this as a timeseries as well. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Updated] (HDFS-10984) Expose nntop output as metrics
[ https://issues.apache.org/jira/browse/HDFS-10984?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Siddharth Wagle updated HDFS-10984: --- Status: Open (was: Patch Available) > Expose nntop output as metrics > - > > Key: HDFS-10984 > URL: https://issues.apache.org/jira/browse/HDFS-10984 > Project: Hadoop HDFS > Issue Type: Task > Components: namenode >Affects Versions: 2.7.0 >Reporter: Siddharth Wagle >Assignee: Siddharth Wagle > Fix For: 2.7.3 > > Attachments: HDFS-10984.patch, HDFS-10984.v1.patch, > HDFS-10984.v2.patch > > > The nntop output is already exposed via JMX with HDFS-6982. > However external metrics systems do not get this data. It would be valuable > to track this as a timeseries as well. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-10972) Add unit test for HDFS command 'dfsadmin -getDatanodeInfo'
[ https://issues.apache.org/jira/browse/HDFS-10972?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15563086#comment-15563086 ] Mingliang Liu commented on HDFS-10972: -- Test failures are not related. > Add unit test for HDFS command 'dfsadmin -getDatanodeInfo' > -- > > Key: HDFS-10972 > URL: https://issues.apache.org/jira/browse/HDFS-10972 > Project: Hadoop HDFS > Issue Type: Sub-task > Components: fs, shell, test >Reporter: Xiaobing Zhou >Assignee: Xiaobing Zhou > Attachments: HDFS-10972.000.patch, HDFS-10972.001.patch, > HDFS-10972.002.patch, HDFS-10972.003.patch > > > getDatanodeInfo should be tested in admin CLI. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-10972) Add unit test for HDFS command 'dfsadmin -getDatanodeInfo'
[ https://issues.apache.org/jira/browse/HDFS-10972?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15563084#comment-15563084 ] Mingliang Liu commented on HDFS-10972: -- +1 Will commit shortly. > Add unit test for HDFS command 'dfsadmin -getDatanodeInfo' > -- > > Key: HDFS-10972 > URL: https://issues.apache.org/jira/browse/HDFS-10972 > Project: Hadoop HDFS > Issue Type: Sub-task > Components: fs, shell, test >Reporter: Xiaobing Zhou >Assignee: Xiaobing Zhou > Attachments: HDFS-10972.000.patch, HDFS-10972.001.patch, > HDFS-10972.002.patch, HDFS-10972.003.patch > > > getDatanodeInfo should be tested in admin CLI. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Updated] (HDFS-10967) Add configuration for BlockPlacementPolicy to avoid near-full DataNodes
[ https://issues.apache.org/jira/browse/HDFS-10967?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Zhe Zhang updated HDFS-10967: - Attachment: HDFS-10967.01.patch Updating patch to include a unit test. Since the new config knob is meant to be used when the cluster is already imbalanced, I'll add a dfsAdmin command in the next rev to change the config value without restarting NN. > Add configuration for BlockPlacementPolicy to avoid near-full DataNodes > --- > > Key: HDFS-10967 > URL: https://issues.apache.org/jira/browse/HDFS-10967 > Project: Hadoop HDFS > Issue Type: Improvement > Components: namenode >Reporter: Zhe Zhang >Assignee: Zhe Zhang > Labels: balancer > Attachments: HDFS-10967.00.patch, HDFS-10967.01.patch > > > Large production clusters are likely to have heterogeneous nodes in terms of > storage capacity, memory, and CPU cores. It is not always possible to > proportionally ingest data into DataNodes based on their remaining storage > capacity. Therefore it's possible for a subset of DataNodes to be much closer > to full capacity than the rest. > This heterogeneity is most likely rack-by-rack -- i.e. _m_ whole racks of > low-storage nodes and _n_ whole racks of high-storage nodes. So It'd be very > useful if we can lower the chance for those near-full DataNodes to become > destinations for the 2nd and 3rd replicas. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
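The config knob discussed in HDFS-10967 boils down to filtering near-full DataNodes out of the candidate set when placing the 2nd and 3rd replicas. A minimal, self-contained sketch of that filtering step is below; the class and field names are hypothetical (the real change would live in `BlockPlacementPolicyDefault`), and the 0.95 threshold is only an example value for the proposed configuration:

```java
import java.util.ArrayList;
import java.util.List;

/**
 * Illustrative sketch of excluding near-full DataNodes from replica
 * placement. Names are hypothetical, not the actual HDFS classes.
 */
public class NearFullExclusion {
  static class DataNodeInfo {
    final String name;
    final long capacity, used;
    DataNodeInfo(String name, long capacity, long used) {
      this.name = name; this.capacity = capacity; this.used = used;
    }
  }

  /** Keep only nodes whose used-space ratio is below the threshold. */
  static List<DataNodeInfo> excludeNearFull(List<DataNodeInfo> nodes,
                                            double usedRatioThreshold) {
    List<DataNodeInfo> candidates = new ArrayList<>();
    for (DataNodeInfo dn : nodes) {
      if ((double) dn.used / dn.capacity < usedRatioThreshold) {
        candidates.add(dn);  // still has headroom; eligible target
      }
    }
    return candidates;
  }
}
```

This matches the rack-by-rack heterogeneity scenario in the description: low-storage racks fill up first and drop out of the candidate list, lowering their chance of receiving the 2nd and 3rd replicas.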
[jira] [Updated] (HDFS-10967) Add configuration for BlockPlacementPolicy to avoid near-full DataNodes
[ https://issues.apache.org/jira/browse/HDFS-10967?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Zhe Zhang updated HDFS-10967: - Attachment: (was: HDFS-10967.poc.patch) > Add configuration for BlockPlacementPolicy to avoid near-full DataNodes > --- > > Key: HDFS-10967 > URL: https://issues.apache.org/jira/browse/HDFS-10967 > Project: Hadoop HDFS > Issue Type: Improvement > Components: namenode >Reporter: Zhe Zhang >Assignee: Zhe Zhang > Labels: balancer > Attachments: HDFS-10967.00.patch, HDFS-10967.01.patch > > > Large production clusters are likely to have heterogeneous nodes in terms of > storage capacity, memory, and CPU cores. It is not always possible to > proportionally ingest data into DataNodes based on their remaining storage > capacity. Therefore it's possible for a subset of DataNodes to be much closer > to full capacity than the rest. > This heterogeneity is most likely rack-by-rack -- i.e. _m_ whole racks of > low-storage nodes and _n_ whole racks of high-storage nodes. So It'd be very > useful if we can lower the chance for those near-full DataNodes to become > destinations for the 2nd and 3rd replicas. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-10301) BlockReport retransmissions may lead to storages falsely being declared zombie if storage report processing happens out of order
[ https://issues.apache.org/jira/browse/HDFS-10301?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15563049#comment-15563049 ] Arpit Agarwal commented on HDFS-10301: -- Hi [~shv], the v15 patch lgtm. Thank you for waiting. Assuming Daryn is okay with this approach we can commit it. > BlockReport retransmissions may lead to storages falsely being declared > zombie if storage report processing happens out of order > > > Key: HDFS-10301 > URL: https://issues.apache.org/jira/browse/HDFS-10301 > Project: Hadoop HDFS > Issue Type: Bug > Components: namenode >Affects Versions: 2.6.1 >Reporter: Konstantin Shvachko >Assignee: Vinitha Reddy Gankidi >Priority: Critical > Attachments: HDFS-10301.002.patch, HDFS-10301.003.patch, > HDFS-10301.004.patch, HDFS-10301.005.patch, HDFS-10301.006.patch, > HDFS-10301.007.patch, HDFS-10301.008.patch, HDFS-10301.009.patch, > HDFS-10301.01.patch, HDFS-10301.010.patch, HDFS-10301.011.patch, > HDFS-10301.012.patch, HDFS-10301.013.patch, HDFS-10301.014.patch, > HDFS-10301.015.patch, HDFS-10301.branch-2.7.patch, HDFS-10301.branch-2.patch, > HDFS-10301.sample.patch, zombieStorageLogs.rtf > > > When NameNode is busy a DataNode can timeout sending a block report. Then it > sends the block report again. Then NameNode while process these two reports > at the same time can interleave processing storages from different reports. > This screws up the blockReportId field, which makes NameNode think that some > storages are zombie. Replicas from zombie storages are immediately removed, > causing missing blocks. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Updated] (HDFS-10985) o.a.h.ha.TestZKFailoverController should not use fixed time sleep before assertions
[ https://issues.apache.org/jira/browse/HDFS-10985?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Mingliang Liu updated HDFS-10985: - Attachment: HDFS-10985-branch-2.8.001.patch Thank you [~ste...@apache.org] for your review. Can you also have a look at the v1 patch that addressed one more place in the test? > o.a.h.ha.TestZKFailoverController should not use fixed time sleep before > assertions > --- > > Key: HDFS-10985 > URL: https://issues.apache.org/jira/browse/HDFS-10985 > Project: Hadoop HDFS > Issue Type: Sub-task > Components: ha, test >Reporter: Mingliang Liu >Assignee: Mingliang Liu >Priority: Minor > Attachments: HDFS-10985-branch-2.8.001.patch, HDFS-10985.000.patch, > HDFS-10985.001.patch > > > {{TestZKFailoverController#testGracefulFailoverMultipleZKfcs}} uses fixed > time sleep before assertions. This may fail sometimes though 10 seconds are > generally long enough. I think we can use {{GenericTestUtils.waitFor()}} to > retry the assertions. > This is also true to {{TestZKFailoverController#testGracefulFailover}}. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Updated] (HDFS-10984) Expose nntop output as metrics
[ https://issues.apache.org/jira/browse/HDFS-10984?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Siddharth Wagle updated HDFS-10984: --- Status: Patch Available (was: Open) Fixed JMXGet issue, verified that it works locally with several runs. > Expose nntop output as metrics > - > > Key: HDFS-10984 > URL: https://issues.apache.org/jira/browse/HDFS-10984 > Project: Hadoop HDFS > Issue Type: Task > Components: namenode >Affects Versions: 2.7.0 >Reporter: Siddharth Wagle >Assignee: Siddharth Wagle > Fix For: 2.7.3 > > Attachments: HDFS-10984.patch, HDFS-10984.v1.patch, > HDFS-10984.v2.patch > > > The nntop output is already exposed via JMX with HDFS-6982. > However external metrics systems do not get this data. It would be valuable > to track this as a timeseries as well. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Updated] (HDFS-10984) Expose nntop output as metrics
[ https://issues.apache.org/jira/browse/HDFS-10984?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Siddharth Wagle updated HDFS-10984: --- Status: Open (was: Patch Available) > Expose nntop output as metrics > - > > Key: HDFS-10984 > URL: https://issues.apache.org/jira/browse/HDFS-10984 > Project: Hadoop HDFS > Issue Type: Task > Components: namenode >Affects Versions: 2.7.0 >Reporter: Siddharth Wagle >Assignee: Siddharth Wagle > Fix For: 2.7.3 > > Attachments: HDFS-10984.patch, HDFS-10984.v1.patch, > HDFS-10984.v2.patch > > > The nntop output is already exposed via JMX with HDFS-6982. > However external metrics systems do not get this data. It would be valuable > to track this as a timeseries as well. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Updated] (HDFS-10984) Expose nntop output as metrics
[ https://issues.apache.org/jira/browse/HDFS-10984?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Siddharth Wagle updated HDFS-10984: --- Attachment: HDFS-10984.v2.patch > Expose nntop output as metrics > - > > Key: HDFS-10984 > URL: https://issues.apache.org/jira/browse/HDFS-10984 > Project: Hadoop HDFS > Issue Type: Task > Components: namenode >Affects Versions: 2.7.0 >Reporter: Siddharth Wagle >Assignee: Siddharth Wagle > Fix For: 2.7.3 > > Attachments: HDFS-10984.patch, HDFS-10984.v1.patch, > HDFS-10984.v2.patch > > > The nntop output is already exposed via JMX with HDFS-6982. > However external metrics systems do not get this data. It would be valuable > to track this as a timeseries as well. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Updated] (HDFS-10985) o.a.h.ha.TestZKFailoverController should not use fixed time sleep before assertions
[ https://issues.apache.org/jira/browse/HDFS-10985?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Mingliang Liu updated HDFS-10985: - Description: {{TestZKFailoverController#testGracefulFailoverMultipleZKfcs}} uses fixed time sleep before assertions. This may fail sometimes though 10 seconds are generally long enough. I think we can use {{GenericTestUtils.waitFor()}} to retry the assertions. This is also true to {{TestZKFailoverController#testGracefulFailover}}. was: {{TestZKFailoverController#testGracefulFailoverMultipleZKfcs}} uses fixed time sleep before assertions. This may fail sometimes though 10 seconds are generally long enough. I think we can use {{GenericTestUtils.waitFor()}} to retry the assertions. If this makes sense, we can address all other places in {{TestZKFailoverController}}, including {{testGracefulFailover}} and {{testDontFailoverToUnhealthyNode}}. Summary: o.a.h.ha.TestZKFailoverController should not use fixed time sleep before assertions (was: o.a.h.ha.TestZKFailoverController should not use fixed time sleep before assertsions) > o.a.h.ha.TestZKFailoverController should not use fixed time sleep before > assertions > --- > > Key: HDFS-10985 > URL: https://issues.apache.org/jira/browse/HDFS-10985 > Project: Hadoop HDFS > Issue Type: Sub-task > Components: ha, test >Reporter: Mingliang Liu >Assignee: Mingliang Liu >Priority: Minor > Attachments: HDFS-10985.000.patch, HDFS-10985.001.patch > > > {{TestZKFailoverController#testGracefulFailoverMultipleZKfcs}} uses fixed > time sleep before assertions. This may fail sometimes though 10 seconds are > generally long enough. I think we can use {{GenericTestUtils.waitFor()}} to > retry the assertions. > This is also true to {{TestZKFailoverController#testGracefulFailover}}. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
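The fix being discussed replaces a fixed `Thread.sleep()` before assertions with Hadoop's `GenericTestUtils.waitFor()`, which polls the condition until it holds or a deadline passes. The sketch below mirrors that polling-wait pattern with plain JDK code so it is self-contained; the class name is illustrative, not Hadoop's actual implementation:

```java
import java.util.function.Supplier;

/**
 * Minimal polling-wait sketch in the spirit of
 * GenericTestUtils.waitFor(check, checkEveryMillis, waitForMillis):
 * re-check the condition periodically instead of sleeping a fixed
 * 10 seconds and asserting once.
 */
public class PollingWait {
  public static boolean waitFor(Supplier<Boolean> check,
                                long checkEveryMillis,
                                long waitForMillis) throws InterruptedException {
    long deadline = System.currentTimeMillis() + waitForMillis;
    while (System.currentTimeMillis() < deadline) {
      if (check.get()) {
        return true;  // condition met early -- no wasted sleep time
      }
      Thread.sleep(checkEveryMillis);
    }
    return check.get();  // one final check at the deadline
  }
}
```

A test using this pattern passes as soon as the failover state becomes visible, instead of either flaking (sleep too short) or always paying the full sleep (sleep too long).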
[jira] [Commented] (HDFS-9668) Optimize the locking in FsDatasetImpl
[ https://issues.apache.org/jira/browse/HDFS-9668?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15563010#comment-15563010 ] Lei (Eddy) Xu commented on HDFS-9668: - Hi, [~jingcheng...@intel.com] Thanks for the updates. Some nits: {code} boolean useFairLock = conf.getBoolean("dfs.datanode.dataset.lock.fair", true); blockOpLocksSize = conf.getInt("dfs.datanode.dataset.lock.size", 1024); {code} Please define configuration keys and default values in {{DFSConfigKeys}}. Btw, what is your plan to coordinate this JIRA with HDFS-10804? +1 pending after addressing the comments. Thanks! > Optimize the locking in FsDatasetImpl > - > > Key: HDFS-9668 > URL: https://issues.apache.org/jira/browse/HDFS-9668 > Project: Hadoop HDFS > Issue Type: Improvement > Components: datanode >Reporter: Jingcheng Du >Assignee: Jingcheng Du > Attachments: HDFS-9668-1.patch, HDFS-9668-10.patch, > HDFS-9668-11.patch, HDFS-9668-12.patch, HDFS-9668-13.patch, > HDFS-9668-14.patch, HDFS-9668-14.patch, HDFS-9668-15.patch, > HDFS-9668-16.patch, HDFS-9668-17.patch, HDFS-9668-2.patch, HDFS-9668-3.patch, > HDFS-9668-4.patch, HDFS-9668-5.patch, HDFS-9668-6.patch, HDFS-9668-7.patch, > HDFS-9668-8.patch, HDFS-9668-9.patch, execution_time.png > > > During the HBase test on a tiered storage of HDFS (WAL is stored in > SSD/RAMDISK, and all other files are stored in HDD), we observe many > long-time BLOCKED threads on FsDatasetImpl in DataNode. 
The following is part > of the jstack result: > {noformat} > "DataXceiver for client DFSClient_NONMAPREDUCE_-1626037897_1 at > /192.168.50.16:48521 [Receiving block > BP-1042877462-192.168.50.13-1446173170517:blk_1073779272_40852]" - Thread > t@93336 >java.lang.Thread.State: BLOCKED > at > org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl.createRbw(FsDatasetImpl.java:) > - waiting to lock <18324c9> (a > org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl) owned by > "DataXceiver for client DFSClient_NONMAPREDUCE_-1626037897_1 at > /192.168.50.16:48520 [Receiving block > BP-1042877462-192.168.50.13-1446173170517:blk_1073779271_40851]" t@93335 > at > org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl.createRbw(FsDatasetImpl.java:113) > at > org.apache.hadoop.hdfs.server.datanode.BlockReceiver.(BlockReceiver.java:183) > at > org.apache.hadoop.hdfs.server.datanode.DataXceiver.writeBlock(DataXceiver.java:615) > at > org.apache.hadoop.hdfs.protocol.datatransfer.Receiver.opWriteBlock(Receiver.java:137) > at > org.apache.hadoop.hdfs.protocol.datatransfer.Receiver.processOp(Receiver.java:74) > at > org.apache.hadoop.hdfs.server.datanode.DataXceiver.run(DataXceiver.java:235) > at java.lang.Thread.run(Thread.java:745) >Locked ownable synchronizers: > - None > > "DataXceiver for client DFSClient_NONMAPREDUCE_-1626037897_1 at > /192.168.50.16:48520 [Receiving block > BP-1042877462-192.168.50.13-1446173170517:blk_1073779271_40851]" - Thread > t@93335 >java.lang.Thread.State: RUNNABLE > at java.io.UnixFileSystem.createFileExclusively(Native Method) > at java.io.File.createNewFile(File.java:1012) > at > org.apache.hadoop.hdfs.server.datanode.DatanodeUtil.createTmpFile(DatanodeUtil.java:66) > at > org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.BlockPoolSlice.createRbwFile(BlockPoolSlice.java:271) > at > org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsVolumeImpl.createRbwFile(FsVolumeImpl.java:286) > at > 
org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl.createRbw(FsDatasetImpl.java:1140) > - locked <18324c9> (a > org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl) > at > org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl.createRbw(FsDatasetImpl.java:113) > at > org.apache.hadoop.hdfs.server.datanode.BlockReceiver.(BlockReceiver.java:183) > at > org.apache.hadoop.hdfs.server.datanode.DataXceiver.writeBlock(DataXceiver.java:615) > at > org.apache.hadoop.hdfs.protocol.datatransfer.Receiver.opWriteBlock(Receiver.java:137) > at > org.apache.hadoop.hdfs.protocol.datatransfer.Receiver.processOp(Receiver.java:74) > at > org.apache.hadoop.hdfs.server.datanode.DataXceiver.run(DataXceiver.java:235) > at java.lang.Thread.run(Thread.java:745) >Locked ownable synchronizers: > - None > {noformat} > We measured the execution of some operations in FsDatasetImpl during the > test. Here following is the result. > !execution_time.png! > The operations of finalizeBlock, addBlock and createRbw on HDD in a heavy > load take a really long time. > It means one slow operation of finalizeBlock, addBlock and createRbw in a
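The review comments above mention a pool of fair locks sized by "dfs.datanode.dataset.lock.size". A self-contained sketch of that striped-lock idea follows: hash each block id onto one lock in a fixed pool so that operations on different blocks do not all serialize on a single dataset-wide monitor. This is an illustration of the technique under discussion, not the actual FsDatasetImpl patch:

```java
import java.util.concurrent.locks.Lock;
import java.util.concurrent.locks.ReentrantLock;

/**
 * Striped-lock pool sketch: a fixed array of (optionally fair)
 * ReentrantLocks, with block ids hashed onto pool slots. The
 * constructor arguments correspond to the proposed
 * "dfs.datanode.dataset.lock.size" and "...lock.fair" knobs.
 */
public class BlockOpLockPool {
  private final Lock[] locks;

  public BlockOpLockPool(int size, boolean fair) {
    locks = new Lock[size];
    for (int i = 0; i < size; i++) {
      locks[i] = new ReentrantLock(fair);
    }
  }

  /** Map a block id onto one of the pooled locks (same id, same lock). */
  public Lock lockFor(long blockId) {
    return locks[(int) Math.floorMod(blockId, (long) locks.length)];
  }
}
```

With this layout, the two `createRbw` calls in the jstack above would contend only if their block ids happened to hash to the same slot, instead of always blocking each other.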
[jira] [Updated] (HDFS-10985) o.a.h.ha.TestZKFailoverController should not use fixed time sleep before assertsions
[ https://issues.apache.org/jira/browse/HDFS-10985?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Mingliang Liu updated HDFS-10985: - Attachment: HDFS-10985.001.patch The v1 patch also fixed the {{testGracefulFailover}} test case. > o.a.h.ha.TestZKFailoverController should not use fixed time sleep before > assertsions > > > Key: HDFS-10985 > URL: https://issues.apache.org/jira/browse/HDFS-10985 > Project: Hadoop HDFS > Issue Type: Sub-task > Components: ha, test >Reporter: Mingliang Liu >Assignee: Mingliang Liu >Priority: Minor > Attachments: HDFS-10985.000.patch, HDFS-10985.001.patch > > > {{TestZKFailoverController#testGracefulFailoverMultipleZKfcs}} uses fixed > time sleep before assertions. This may fail sometimes though 10 seconds are > generally long enough. I think we can use {{GenericTestUtils.waitFor()}} to > retry the assertions. > If this makes sense, we can address all other places in > {{TestZKFailoverController}}, including {{testGracefulFailover}} and > {{testDontFailoverToUnhealthyNode}}. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Assigned] (HDFS-10993) rename may fail without a clear message indicating the failure reason.
[ https://issues.apache.org/jira/browse/HDFS-10993?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] John Zhuge reassigned HDFS-10993: - Assignee: John Zhuge > rename may fail without a clear message indicating the failure reason. > -- > > Key: HDFS-10993 > URL: https://issues.apache.org/jira/browse/HDFS-10993 > Project: Hadoop HDFS > Issue Type: Bug > Components: hdfs >Reporter: Yongjun Zhang >Assignee: John Zhuge > > Currently the FSDirRenameOp$unprotectedRenameTo looks like > {code} > static INodesInPath unprotectedRenameTo(FSDirectory fsd, > final INodesInPath srcIIP, final INodesInPath dstIIP, long timestamp) > throws IOException { > assert fsd.hasWriteLock(); > final INode srcInode = srcIIP.getLastINode(); > try { > validateRenameSource(fsd, srcIIP); > } catch (SnapshotException e) { > throw e; > } catch (IOException ignored) { > return null; > } > String src = srcIIP.getPath(); > String dst = dstIIP.getPath(); > // validate the destination > if (dst.equals(src)) { > return dstIIP; > } > try { > validateDestination(src, dst, srcInode); > } catch (IOException ignored) { > return null; > } > if (dstIIP.getLastINode() != null) { > NameNode.stateChangeLog.warn("DIR* FSDirectory.unprotectedRenameTo: " + > "failed to rename " + src + " to " + dst + " because destination " + > "exists"); > return null; > } > INode dstParent = dstIIP.getINode(-2); > if (dstParent == null) { > NameNode.stateChangeLog.warn("DIR* FSDirectory.unprotectedRenameTo: " + > "failed to rename " + src + " to " + dst + " because destination's > " + > "parent does not exist"); > return null; > } > fsd.ezManager.checkMoveValidity(srcIIP, dstIIP, src); > // Ensure dst has quota to accommodate rename > verifyFsLimitsForRename(fsd, srcIIP, dstIIP); > verifyQuotaForRename(fsd, srcIIP, dstIIP); > RenameOperation tx = new RenameOperation(fsd, srcIIP, dstIIP); > boolean added = false; > INodesInPath renamedIIP = null; > try { > // remove src > if (!tx.removeSrc4OldRename()) { > return 
null; > } > renamedIIP = tx.addSourceToDestination(); > added = (renamedIIP != null); > if (added) { > if (NameNode.stateChangeLog.isDebugEnabled()) { > NameNode.stateChangeLog.debug("DIR* FSDirectory" + > ".unprotectedRenameTo: " + src + " is renamed to " + dst); > } > tx.updateMtimeAndLease(timestamp); > tx.updateQuotasInSourceTree(fsd.getBlockStoragePolicySuite()); > return renamedIIP; > } > } finally { > if (!added) { > tx.restoreSource(); > } > } > NameNode.stateChangeLog.warn("DIR* FSDirectory.unprotectedRenameTo: " + > "failed to rename " + src + " to " + dst); > return null; > } > {code} > There are several places that return null without a clear message. Though > that seems to be on purpose in the code, it leaves the user to guess what's > going on. > It seems to make sense to have a warning for each failed scenario. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Created] (HDFS-10993) rename may fail without a clear message indicating the failure reason.
Yongjun Zhang created HDFS-10993: Summary: rename may fail without a clear message indicating the failure reason. Key: HDFS-10993 URL: https://issues.apache.org/jira/browse/HDFS-10993 Project: Hadoop HDFS Issue Type: Bug Components: hdfs Reporter: Yongjun Zhang Currently the FSDirRenameOp$unprotectedRenameTo looks like {code} static INodesInPath unprotectedRenameTo(FSDirectory fsd, final INodesInPath srcIIP, final INodesInPath dstIIP, long timestamp) throws IOException { assert fsd.hasWriteLock(); final INode srcInode = srcIIP.getLastINode(); try { validateRenameSource(fsd, srcIIP); } catch (SnapshotException e) { throw e; } catch (IOException ignored) { return null; } String src = srcIIP.getPath(); String dst = dstIIP.getPath(); // validate the destination if (dst.equals(src)) { return dstIIP; } try { validateDestination(src, dst, srcInode); } catch (IOException ignored) { return null; } if (dstIIP.getLastINode() != null) { NameNode.stateChangeLog.warn("DIR* FSDirectory.unprotectedRenameTo: " + "failed to rename " + src + " to " + dst + " because destination " + "exists"); return null; } INode dstParent = dstIIP.getINode(-2); if (dstParent == null) { NameNode.stateChangeLog.warn("DIR* FSDirectory.unprotectedRenameTo: " + "failed to rename " + src + " to " + dst + " because destination's " + "parent does not exist"); return null; } fsd.ezManager.checkMoveValidity(srcIIP, dstIIP, src); // Ensure dst has quota to accommodate rename verifyFsLimitsForRename(fsd, srcIIP, dstIIP); verifyQuotaForRename(fsd, srcIIP, dstIIP); RenameOperation tx = new RenameOperation(fsd, srcIIP, dstIIP); boolean added = false; INodesInPath renamedIIP = null; try { // remove src if (!tx.removeSrc4OldRename()) { return null; } renamedIIP = tx.addSourceToDestination(); added = (renamedIIP != null); if (added) { if (NameNode.stateChangeLog.isDebugEnabled()) { NameNode.stateChangeLog.debug("DIR* FSDirectory" + ".unprotectedRenameTo: " + src + " is renamed to " + dst); } 
tx.updateMtimeAndLease(timestamp); tx.updateQuotasInSourceTree(fsd.getBlockStoragePolicySuite()); return renamedIIP; } } finally { if (!added) { tx.restoreSource(); } } NameNode.stateChangeLog.warn("DIR* FSDirectory.unprotectedRenameTo: " + "failed to rename " + src + " to " + dst); return null; } {code} There are several places that return null without a clear message. Though that seems to be on purpose in the code, it leaves the user to guess what's going on. It seems to make sense to have a warning for each failed scenario. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
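The improvement the report asks for is to log a specific reason on every failing path instead of silently returning null. A minimal sketch of that pattern is below: funnel each failure through one helper that records the reason first. The helper name and `java.util.logging` logger are illustrative; the real code would use `NameNode.stateChangeLog.warn` as the existing branches already do:

```java
import java.util.logging.Logger;

/**
 * Sketch of giving every failed rename path a clear message before
 * returning null. Hypothetical helper, not the actual FSDirRenameOp.
 */
public class RenameFailureLogging {
  private static final Logger LOG = Logger.getLogger("FSDirRenameOp");
  static String lastReason;  // captured here for demonstration only

  static <T> T failRename(String src, String dst, String reason) {
    lastReason = reason;
    LOG.warning("DIR* FSDirectory.unprotectedRenameTo: failed to rename "
        + src + " to " + dst + " because " + reason);
    return null;  // same null contract as before, but the cause is logged
  }
}
```

Each bare `return null;` in the method (invalid source, invalid destination, failed `removeSrc4OldRename()`, and so on) would then become `return failRename(src, dst, "...");` with a scenario-specific reason.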
[jira] [Commented] (HDFS-10971) Distcp should not copy replication factor if source file is erasure coded
[ https://issues.apache.org/jira/browse/HDFS-10971?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15562891#comment-15562891 ] Wei-Chiu Chuang commented on HDFS-10971: I think it makes sense to me to add a CreateFlag for stripped files. Adding a new -p flag to preserve EC policy also makes sense to me. > Distcp should not copy replication factor if source file is erasure coded > - > > Key: HDFS-10971 > URL: https://issues.apache.org/jira/browse/HDFS-10971 > Project: Hadoop HDFS > Issue Type: Sub-task > Components: distcp >Affects Versions: 3.0.0-alpha1 >Reporter: Wei-Chiu Chuang >Assignee: Wei-Chiu Chuang > Labels: hdfs-ec-3.0-must-do > Attachments: HDFS-10971.testcase.patch > > > The current erasure coding implementation uses replication factor field to > store erasure coding policy. > Distcp copies the source file's replication factor to the destination if > {{-pr}} is specified. However, if the source file is EC, the replication > factor (which is EC policy) should not be replicated to the destination file. > When a HdfsFileStatus is converted to FileStatus, the replication factor is > set to 0 if it's an EC file. > In fact, I will attach a test case that shows trying to replicate the > replication factor of an EC file results in an IOException: "Requested > replication factor of 0 is less than the required minimum of 1 for > /tmp/dst/dest2" -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-10971) Distcp should not copy replication factor if source file is erasure coded
[ https://issues.apache.org/jira/browse/HDFS-10971?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15562859#comment-15562859 ] Andrew Wang commented on HDFS-10971: I wonder what the right distcp behavior is in this case of "-pr". It indicates that the user wants the dest file to be replicated if the src file is replicated, even if a dst parent directory has an EC policy set. I don't think we have a create API that supports that right now. Combinations: * no "-pr" specified: dst file is written with whatever is the default for that destination path, which could be EC or not * "-pr" on a replicated file: dst is written replicated, with the same replication factor * "-pr" on a striped file: Not sure, but I lean toward the same behavior as no "-pr" flag: dst file is written with whatever is the default for that destination path, which could be EC or not. We could then add a new "-p" flag to additionally preserve EC policy. > Distcp should not copy replication factor if source file is erasure coded > - > > Key: HDFS-10971 > URL: https://issues.apache.org/jira/browse/HDFS-10971 > Project: Hadoop HDFS > Issue Type: Sub-task > Components: distcp >Affects Versions: 3.0.0-alpha1 >Reporter: Wei-Chiu Chuang >Assignee: Wei-Chiu Chuang > Labels: hdfs-ec-3.0-must-do > Attachments: HDFS-10971.testcase.patch > > > The current erasure coding implementation uses replication factor field to > store erasure coding policy. > Distcp copies the source file's replication factor to the destination if > {{-pr}} is specified. However, if the source file is EC, the replication > factor (which is EC policy) should not be replicated to the destination file. > When a HdfsFileStatus is converted to FileStatus, the replication factor is > set to 0 if it's an EC file. 
> In fact, I will attach a test case that shows trying to replicate the > replication factor of an EC file results in an IOException: "Requested > replication factor of 0 is less than the required minimum of 1 for > /tmp/dst/dest2"
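A minimal, hypothetical sketch of the behavior being discussed above (not actual DistCp internals; the class and method names are illustrative): with "-pr", skip preserving the replication factor when the source is erasure coded, since an EC file's FileStatus reports a replication factor of 0 and applying it at the destination fails the minimum-replication check.

```java
// Hypothetical sketch of the "-pr" handling discussed in this thread.
// Names are illustrative; this is not the actual DistCp code.
public class PreserveReplicationSketch {
    // Default value of dfs.namenode.replication.min.
    static final short MIN_REPLICATION = 1;

    /**
     * Replication factor to apply at the destination, or -1 to keep the
     * destination's default (e.g. an EC policy inherited from the parent dir).
     */
    static short replicationToPreserve(boolean preserveReplication, short srcReplication) {
        if (!preserveReplication) {
            return -1; // no -pr: destination default applies
        }
        if (srcReplication < MIN_REPLICATION) {
            // EC files surface replication 0; preserving it would fail with
            // "Requested replication factor of 0 is less than the required minimum of 1".
            return -1;
        }
        return srcReplication;
    }

    public static void main(String[] args) {
        // Replicated source with -pr: the factor is carried over.
        assert replicationToPreserve(true, (short) 3) == 3;
        // EC source with -pr: fall back to the destination default.
        assert replicationToPreserve(true, (short) 0) == -1;
        System.out.println("ok");
    }
}
```

This matches the combination table in the comment above: an EC source under "-pr" behaves like the no-flag case, leaving room for a separate flag that preserves the EC policy itself.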
[jira] [Commented] (HDFS-10987) Make Decommission less expensive when lot of blocks present.
[ https://issues.apache.org/jira/browse/HDFS-10987?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15562723#comment-15562723 ] Brahma Reddy Battula commented on HDFS-10987: - Thanks [~kihwal] for taking a look. IIUC, [~daryn] was doing fine-grained locking, which is a big change? What's your view on the current patch? JFYI, I tested the patch: the NN was available (only the first pass takes 15 to 25 sec) while the decommission was running, and all the dependent services (HBase, Spark, Hive, ...) were able to communicate. > Make Decommission less expensive when lot of blocks present. > > > Key: HDFS-10987 > URL: https://issues.apache.org/jira/browse/HDFS-10987 > Project: Hadoop HDFS > Issue Type: Bug >Reporter: Brahma Reddy Battula >Assignee: Brahma Reddy Battula >Priority: Critical > Attachments: HDFS-10987.patch > > > When a user wants to decommission a node that has 50M+ blocks, it can hold > the namesystem lock for a long time; we've seen it take 36+ seconds. > During this time the NameNode is unavailable, and since decommissioning runs > continuously until all the blocks are replicated, the NameNode stays unavailable.
[jira] [Commented] (HDFS-10987) Make Decommission less expensive when lot of blocks present.
[ https://issues.apache.org/jira/browse/HDFS-10987?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15562642#comment-15562642 ] Kihwal Lee commented on HDFS-10987: --- We've seen this also. We don't have that many blocks per node, but still the lock time can be multiple seconds. [~daryn] was going to do something similar, but he was also improving locking in the replication monitor. > Make Decommission less expensive when lot of blocks present. > > > Key: HDFS-10987 > URL: https://issues.apache.org/jira/browse/HDFS-10987 > Project: Hadoop HDFS > Issue Type: Bug >Reporter: Brahma Reddy Battula >Assignee: Brahma Reddy Battula >Priority: Critical > Attachments: HDFS-10987.patch > > > When a user wants to decommission a node that has 50M+ blocks, it can hold > the namesystem lock for a long time; we've seen it take 36+ seconds. > During this time the NameNode is unavailable, and since decommissioning runs > continuously until all the blocks are replicated, the NameNode stays unavailable.
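A hypothetical sketch of the batching idea in this thread (not the actual NameNode code; names and the batch size are illustrative): instead of scanning all blocks of a decommissioning node under one long write-lock hold, process a bounded batch, release the lock, and let queued RPC handlers run before reacquiring it.

```java
import java.util.concurrent.locks.ReentrantLock;

// Hypothetical sketch: shorten namesystem lock holds during a decommission
// scan by processing blocks in bounded batches. Illustrative only.
public class BatchedDecommissionScan {
    static final int BLOCKS_PER_LOCK_HOLD = 1000; // illustrative tuning knob

    /** Scans blockCount blocks and returns how many times the lock was taken. */
    static int scan(long blockCount, ReentrantLock namesystemLock) {
        int acquisitions = 0;
        for (long done = 0; done < blockCount; done += BLOCKS_PER_LOCK_HOLD) {
            namesystemLock.lock(); // short hold per batch
            acquisitions++;
            try {
                long end = Math.min(done + BLOCKS_PER_LOCK_HOLD, blockCount);
                // ...check the replication state of blocks [done, end) here...
            } finally {
                // Releasing between batches lets waiting RPC handlers
                // (HBase, Spark, Hive clients) make progress.
                namesystemLock.unlock();
            }
        }
        return acquisitions;
    }

    public static void main(String[] args) {
        // 50M blocks become 50,000 short lock holds instead of one 36s hold.
        assert scan(50_000_000L, new ReentrantLock()) == 50_000;
        System.out.println("ok");
    }
}
```

The trade-off discussed above applies: each release/reacquire cycle adds overhead and allows the namespace to change mid-scan, which is why the fully fine-grained locking approach is the bigger change.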
[jira] [Commented] (HDFS-10992) file is under construction but no leases found
[ https://issues.apache.org/jira/browse/HDFS-10992?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15562569#comment-15562569 ] Rushabh S Shah commented on HDFS-10992: --- Is this a dupe of HDFS-10763 ? > file is under construction but no leases found > -- > > Key: HDFS-10992 > URL: https://issues.apache.org/jira/browse/HDFS-10992 > Project: Hadoop HDFS > Issue Type: Bug >Affects Versions: 2.7.1 > Environment: hortonworks 2.3 build 2557. 10 Datanodes , 2 NameNode in > auto failover >Reporter: Chernishev Aleksandr > > On hdfs after recording a small number of files (at least 1000) the size > (150Mb - 1,6Gb) found 13 damaged files with incomplete last block. > hadoop fsck /hadoop/files/load_tarifer-zf-4_20160902165521521.csv > -openforwrite -files -blocks -locations > DEPRECATED: Use of this script to execute hdfs command is deprecated. > Instead use the hdfs command for it. > Picked up JAVA_TOOL_OPTIONS: -Dfile.encoding=UTF-8 -Dsun.jnu.encoding=UTF-8 > Connecting to namenode via > http://hadoop-hdfs:50070/fsck?ugi=hdfs=1=1=1=1=%2Fstaging%2Flanding%2Fstream%2Fitc_dwh%2Ffiles%2Fload_tarifer-zf-4_20160902165521521.csv > FSCK started by hdfs (auth:SIMPLE) from /10.0.0.178 for path > /hadoop/files/load_tarifer-zf-4_20160902165521521.csv at Mon Oct 10 17:12:25 > MSK 2016 > /hadoop/files/load_tarifer-zf-4_20160902165521521.csv 920596121 bytes, 7 > block(s), OPENFORWRITE: MISSING 1 blocks of total size 115289753 B > 0. BP-1552885336-10.0.0.178-1446159880991:blk_1084952841_17798971 > len=134217728 repl=4 > [DatanodeInfoWithStorage[10.0.0.188:50010,DS-9ba44a76-113a-43ac-87dc-46aa97ba3267,DISK], > > DatanodeInfoWithStorage[10.0.0.183:50010,DS-eccd375a-ea32-491b-a4a3-5ea3faca4171,DISK], > > DatanodeInfoWithStorage[10.0.0.184:50010,DS-ec462491-6766-490a-a92f-38e9bb3be5ce,DISK], > > DatanodeInfoWithStorage[10.0.0.182:50010,DS-cef46399-bb70-4f1a-ac55-d71c7e820c29,DISK]] > 1. 
BP-1552885336-10.0.0.178-1446159880991:blk_1084952850_17799207 > len=134217728 repl=3 > [DatanodeInfoWithStorage[10.0.0.184:50010,DS-412769e0-0ec2-48d3-b644-b08a516b1c2c,DISK], > > DatanodeInfoWithStorage[10.0.0.181:50010,DS-97388b2f-c542-417d-ab06-c8d81b94fa9d,DISK], > > DatanodeInfoWithStorage[10.0.0.187:50010,DS-e7a11951-4315-4425-a88b-a9f6429cc058,DISK]] > 2. BP-1552885336-10.0.0.178-1446159880991:blk_1084952857_17799489 > len=134217728 repl=3 > [DatanodeInfoWithStorage[10.0.0.184:50010,DS-7a08c597-b0f4-46eb-9916-f028efac66d7,DISK], > > DatanodeInfoWithStorage[10.0.0.180:50010,DS-fa6a4630-1626-43d8-9988-955a86ac3736,DISK], > > DatanodeInfoWithStorage[10.0.0.182:50010,DS-8670e77d-c4db-4323-bb01-e0e64bd5b78e,DISK]] > 3. BP-1552885336-10.0.0.178-1446159880991:blk_1084952866_17799725 > len=134217728 repl=3 > [DatanodeInfoWithStorage[10.0.0.185:50010,DS-b5ff8ba0-275e-4846-b5a4-deda35aa0ad8,DISK], > > DatanodeInfoWithStorage[10.0.0.180:50010,DS-9cb6cade-9395-4f3a-ab7b-7fabd400b7f2,DISK], > > DatanodeInfoWithStorage[10.0.0.183:50010,DS-e277dcf3-1bce-4efd-a668-cd6fb2e10588,DISK]] > 4. BP-1552885336-10.0.0.178-1446159880991:blk_1084952872_17799891 > len=134217728 repl=4 > [DatanodeInfoWithStorage[10.0.0.184:50010,DS-e1d8f278-1a22-4294-ac7e-e12d554aef7f,DISK], > > DatanodeInfoWithStorage[10.0.0.186:50010,DS-5d9aeb2b-e677-41cd-844e-4b36b3c84092,DISK], > > DatanodeInfoWithStorage[10.0.0.183:50010,DS-eccd375a-ea32-491b-a4a3-5ea3faca4171,DISK], > > DatanodeInfoWithStorage[10.0.0.182:50010,DS-8670e77d-c4db-4323-bb01-e0e64bd5b78e,DISK]] > 5. BP-1552885336-10.0.0.78-1446159880991:blk_1084952880_17800120 > len=134217728 repl=3 > [DatanodeInfoWithStorage[10.0.0.181:50010,DS-79185b75-1938-4c91-a6d0-bb6687ca7e56,DISK], > > DatanodeInfoWithStorage[10.0.0.184:50010,DS-dcbd20aa-0334-49e0-b807-d2489f5923c6,DISK], > > DatanodeInfoWithStorage[10.0.0.183:50010,DS-f1d77328-f3af-483e-82e9-66ab0723a52c,DISK]] > 6. 
> BP-1552885336-10.0.0.178-1446159880991:blk_1084952887_17800316{UCState=COMMITTED, > truncateBlock=null, primaryNodeIndex=-1, > replicas=[ReplicaUC[[DISK]DS-5f3eac72-eb55-4df7-bcaa-a6fa35c166a0:NORMAL:10.0.0.188:50010|RBW], > > ReplicaUC[[DISK]DS-a2a0d8f0-772e-419f-b4ff-10b4966c57ca:NORMAL:10.0.0.184:50010|RBW], > > ReplicaUC[[DISK]DS-52984aa0-598e-4fff-acfa-8904ca7b585c:NORMAL:10.0.0.185:50010|RBW]]} > len=115289753 MISSING! > Status: CORRUPT > Total size: 920596121 B > Total dirs: 0 > Total files: 1 > Total symlinks: 0 > Total blocks (validated):7 (avg. block size 131513731 B) > > UNDER MIN REPL'D BLOCKS:1 (14.285714 %) > dfs.namenode.replication.min: 1 > CORRUPT FILES: 1 > MISSING BLOCKS: 1 > MISSING SIZE: 115289753 B > > Minimally replicated blocks: 6 (85.71429 %) > Over-replicated
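Separate from the root cause, a file stuck in this OPENFORWRITE/COMMITTED state can usually be closed by forcing lease recovery from the shell. A hedged example against the path from the report (cluster-specific; adjust the path and retry count as needed):

```shell
# Force lease recovery on the file left open for write (path from the report above).
hdfs debug recoverLease -path /hadoop/files/load_tarifer-zf-4_20160902165521521.csv -retries 3

# Re-run fsck afterwards to check whether the last block was finalized.
hdfs fsck /hadoop/files/load_tarifer-zf-4_20160902165521521.csv -openforwrite -files -blocks
```

`hdfs debug recoverLease` is available from Hadoop 2.7 onward, so it applies to the 2.7.1 environment reported here; it forces the NameNode to start block recovery for the last block rather than waiting for the hard lease limit.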
[jira] [Updated] (HDFS-10797) Disk usage summary of snapshots causes renamed blocks to get counted twice
[ https://issues.apache.org/jira/browse/HDFS-10797?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Mackrory updated HDFS-10797: - Release Note: Disk usage summaries previously incorrectly counted files twice if they had been renamed since being snapshotted. Summaries now include current data plus snapshotted data that is no longer in the directory, either due to deletion or being moved outside of it. Thanks [~xiaochen]. I added a release note... > Disk usage summary of snapshots causes renamed blocks to get counted twice > -- > > Key: HDFS-10797 > URL: https://issues.apache.org/jira/browse/HDFS-10797 > Project: Hadoop HDFS > Issue Type: Bug > Components: snapshots >Affects Versions: 2.8.0 >Reporter: Sean Mackrory >Assignee: Sean Mackrory > Fix For: 2.8.0, 3.0.0-alpha2 > > Attachments: HDFS-10797.001.patch, HDFS-10797.002.patch, > HDFS-10797.003.patch, HDFS-10797.004.patch, HDFS-10797.005.patch, > HDFS-10797.006.patch, HDFS-10797.007.patch, HDFS-10797.008.patch, > HDFS-10797.009.patch, HDFS-10797.010.patch, HDFS-10797.010.patch > > > DirectoryWithSnapshotFeature.computeContentSummary4Snapshot calculates how > much disk usage is used by a snapshot by tallying up the files in the > snapshot that have since been deleted (that way it won't overlap with regular > files whose disk usage is computed separately). However, that is determined > from a diff that shows moved (to Trash or otherwise) or renamed files as a > deletion and a creation operation that may overlap with the list of blocks. > Only the deletion operation is taken into consideration, and this causes > those blocks to get represented twice in the disk usage tally.
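A hypothetical sketch of the double-counting fix described above (not the actual DirectoryWithSnapshotFeature code; the names are illustrative): a renamed file appears in a snapshot diff as a deletion plus a creation, so a block on the "deleted" side should be tallied only if it has not already been counted.

```java
import java.util.HashSet;
import java.util.Set;

// Hypothetical sketch: deduplicate blocks when summing snapshot-only usage,
// so a rename (deletion + creation in the diff) is not counted twice.
public class SnapshotUsageSketch {
    /**
     * Sums the sizes of snapshot-only blocks, skipping any block id already
     * counted elsewhere (e.g. for the live, renamed copy of the file).
     * Each entry is {blockId, blockSize}.
     */
    static long tallyDeletedUsage(long[][] blocks, Set<Long> alreadyCounted) {
        long total = 0;
        for (long[] block : blocks) {
            long id = block[0], size = block[1];
            if (alreadyCounted.add(id)) { // false if the block was counted already
                total += size;
            }
        }
        return total;
    }

    public static void main(String[] args) {
        Set<Long> counted = new HashSet<>();
        counted.add(1L); // block 1 was already counted for the renamed live file
        long[][] diffDeleted = { {1L, 100L}, {2L, 50L} };
        // Only block 2 contributes: 50, not 150.
        assert tallyDeletedUsage(diffDeleted, counted) == 50L;
        System.out.println("ok");
    }
}
```

The shared `alreadyCounted` set is the key design point: it lets the snapshot tally and the regular-file tally overlap in input without overlapping in output.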
[jira] [Updated] (HDFS-10992) file is under construction but no leases found
[ https://issues.apache.org/jira/browse/HDFS-10992?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chernishev Aleksandr updated HDFS-10992: Description: On hdfs after recording a small number of files (at least 1000) the size (150Mb - 1,6Gb) found 13 damaged files with incomplete last block. hadoop fsck /hadoop/files/load_tarifer-zf-4_20160902165521521.csv -openforwrite -files -blocks -locations DEPRECATED: Use of this script to execute hdfs command is deprecated. Instead use the hdfs command for it. Picked up JAVA_TOOL_OPTIONS: -Dfile.encoding=UTF-8 -Dsun.jnu.encoding=UTF-8 Connecting to namenode via http://hadoop-hdfs:50070/fsck?ugi=hdfs=1=1=1=1=%2Fstaging%2Flanding%2Fstream%2Fitc_dwh%2Ffiles%2Fload_tarifer-zf-4_20160902165521521.csv FSCK started by hdfs (auth:SIMPLE) from /10.0.0.178 for path /hadoop/files/load_tarifer-zf-4_20160902165521521.csv at Mon Oct 10 17:12:25 MSK 2016 /hadoop/files/load_tarifer-zf-4_20160902165521521.csv 920596121 bytes, 7 block(s), OPENFORWRITE: MISSING 1 blocks of total size 115289753 B 0. BP-1552885336-10.0.0.178-1446159880991:blk_1084952841_17798971 len=134217728 repl=4 [DatanodeInfoWithStorage[10.0.0.188:50010,DS-9ba44a76-113a-43ac-87dc-46aa97ba3267,DISK], DatanodeInfoWithStorage[10.0.0.183:50010,DS-eccd375a-ea32-491b-a4a3-5ea3faca4171,DISK], DatanodeInfoWithStorage[10.0.0.184:50010,DS-ec462491-6766-490a-a92f-38e9bb3be5ce,DISK], DatanodeInfoWithStorage[10.0.0.182:50010,DS-cef46399-bb70-4f1a-ac55-d71c7e820c29,DISK]] 1. BP-1552885336-10.0.0.178-1446159880991:blk_1084952850_17799207 len=134217728 repl=3 [DatanodeInfoWithStorage[10.0.0.184:50010,DS-412769e0-0ec2-48d3-b644-b08a516b1c2c,DISK], DatanodeInfoWithStorage[10.0.0.181:50010,DS-97388b2f-c542-417d-ab06-c8d81b94fa9d,DISK], DatanodeInfoWithStorage[10.0.0.187:50010,DS-e7a11951-4315-4425-a88b-a9f6429cc058,DISK]] 2. 
BP-1552885336-10.0.0.178-1446159880991:blk_1084952857_17799489 len=134217728 repl=3 [DatanodeInfoWithStorage[10.0.0.184:50010,DS-7a08c597-b0f4-46eb-9916-f028efac66d7,DISK], DatanodeInfoWithStorage[10.0.0.180:50010,DS-fa6a4630-1626-43d8-9988-955a86ac3736,DISK], DatanodeInfoWithStorage[10.0.0.182:50010,DS-8670e77d-c4db-4323-bb01-e0e64bd5b78e,DISK]] 3. BP-1552885336-10.0.0.178-1446159880991:blk_1084952866_17799725 len=134217728 repl=3 [DatanodeInfoWithStorage[10.0.0.185:50010,DS-b5ff8ba0-275e-4846-b5a4-deda35aa0ad8,DISK], DatanodeInfoWithStorage[10.0.0.180:50010,DS-9cb6cade-9395-4f3a-ab7b-7fabd400b7f2,DISK], DatanodeInfoWithStorage[10.0.0.183:50010,DS-e277dcf3-1bce-4efd-a668-cd6fb2e10588,DISK]] 4. BP-1552885336-10.0.0.178-1446159880991:blk_1084952872_17799891 len=134217728 repl=4 [DatanodeInfoWithStorage[10.0.0.184:50010,DS-e1d8f278-1a22-4294-ac7e-e12d554aef7f,DISK], DatanodeInfoWithStorage[10.0.0.186:50010,DS-5d9aeb2b-e677-41cd-844e-4b36b3c84092,DISK], DatanodeInfoWithStorage[10.0.0.183:50010,DS-eccd375a-ea32-491b-a4a3-5ea3faca4171,DISK], DatanodeInfoWithStorage[10.0.0.182:50010,DS-8670e77d-c4db-4323-bb01-e0e64bd5b78e,DISK]] 5. BP-1552885336-10.0.0.78-1446159880991:blk_1084952880_17800120 len=134217728 repl=3 [DatanodeInfoWithStorage[10.0.0.181:50010,DS-79185b75-1938-4c91-a6d0-bb6687ca7e56,DISK], DatanodeInfoWithStorage[10.0.0.184:50010,DS-dcbd20aa-0334-49e0-b807-d2489f5923c6,DISK], DatanodeInfoWithStorage[10.0.0.183:50010,DS-f1d77328-f3af-483e-82e9-66ab0723a52c,DISK]] 6. BP-1552885336-10.0.0.178-1446159880991:blk_1084952887_17800316{UCState=COMMITTED, truncateBlock=null, primaryNodeIndex=-1, replicas=[ReplicaUC[[DISK]DS-5f3eac72-eb55-4df7-bcaa-a6fa35c166a0:NORMAL:10.0.0.188:50010|RBW], ReplicaUC[[DISK]DS-a2a0d8f0-772e-419f-b4ff-10b4966c57ca:NORMAL:10.0.0.184:50010|RBW], ReplicaUC[[DISK]DS-52984aa0-598e-4fff-acfa-8904ca7b585c:NORMAL:10.0.0.185:50010|RBW]]} len=115289753 MISSING! 
Status: CORRUPT
 Total size: 920596121 B
 Total dirs: 0
 Total files: 1
 Total symlinks: 0
 Total blocks (validated): 7 (avg. block size 131513731 B)
  UNDER MIN REPL'D BLOCKS: 1 (14.285714 %)
  dfs.namenode.replication.min: 1
  CORRUPT FILES: 1
  MISSING BLOCKS: 1
  MISSING SIZE: 115289753 B
 Minimally replicated blocks: 6 (85.71429 %)
 Over-replicated blocks: 2 (28.571428 %)
 Under-replicated blocks: 0 (0.0 %)
 Mis-replicated blocks: 0 (0.0 %)
 Default replication factor: 3
 Average block replication: 2.857143
 Corrupt blocks: 0
 Missing replicas: 0 (0.0 %)
 Number of data-nodes: 10
 Number of racks: 1
FSCK ended at Mon Oct 10 17:12:25 MSK 2016 in 0 milliseconds
The filesystem under path '/hadoop/files/load_tarifer-zf-4_20160902165521521.csv' is CORRUPT
The file is UNDER_RECOVERY: the NameNode thinks the last block is in the COMMITTED state, while the DataNode thinks the block is in the RBW state. Recovery is not executed. The last
[jira] [Updated] (HDFS-10992) file is under construction but no leases found
[ https://issues.apache.org/jira/browse/HDFS-10992?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chernishev Aleksandr updated HDFS-10992: Description: On hdfs after recording a small number of files (at least 1000) the size (150Mb - 1,6Gb) found 13 damaged files with incomplete last block. hadoop fsck /hadoop/811-ITF-ZO-P-bad/load_tarifer-zf-4_20160902165521521.csv -openforwrite -files -blocks -locations DEPRECATED: Use of this script to execute hdfs command is deprecated. Instead use the hdfs command for it. Picked up JAVA_TOOL_OPTIONS: -Dfile.encoding=UTF-8 -Dsun.jnu.encoding=UTF-8 Connecting to namenode via http://hadoop-hdfs:50070/fsck?ugi=hdfs=1=1=1=1=%2Fstaging%2Flanding%2Fstream%2Fitc_dwh%2F811-ITF-ZO-P-bad%2Fload_tarifer-zf-4_20160902165521521.csv FSCK started by hdfs (auth:SIMPLE) from /10.0.0.178 for path /hadoop/811-ITF-ZO-P-bad/load_tarifer-zf-4_20160902165521521.csv at Mon Oct 10 17:12:25 MSK 2016 /hadoop/811-ITF-ZO-P-bad/load_tarifer-zf-4_20160902165521521.csv 920596121 bytes, 7 block(s), OPENFORWRITE: MISSING 1 blocks of total size 115289753 B 0. BP-1552885336-10.0.0.178-1446159880991:blk_1084952841_17798971 len=134217728 repl=4 [DatanodeInfoWithStorage[10.0.0.188:50010,DS-9ba44a76-113a-43ac-87dc-46aa97ba3267,DISK], DatanodeInfoWithStorage[10.0.0.183:50010,DS-eccd375a-ea32-491b-a4a3-5ea3faca4171,DISK], DatanodeInfoWithStorage[10.0.0.184:50010,DS-ec462491-6766-490a-a92f-38e9bb3be5ce,DISK], DatanodeInfoWithStorage[10.0.0.182:50010,DS-cef46399-bb70-4f1a-ac55-d71c7e820c29,DISK]] 1. BP-1552885336-10.0.0.178-1446159880991:blk_1084952850_17799207 len=134217728 repl=3 [DatanodeInfoWithStorage[10.0.0.184:50010,DS-412769e0-0ec2-48d3-b644-b08a516b1c2c,DISK], DatanodeInfoWithStorage[10.0.0.181:50010,DS-97388b2f-c542-417d-ab06-c8d81b94fa9d,DISK], DatanodeInfoWithStorage[10.0.0.187:50010,DS-e7a11951-4315-4425-a88b-a9f6429cc058,DISK]] 2. 
BP-1552885336-10.0.0.178-1446159880991:blk_1084952857_17799489 len=134217728 repl=3 [DatanodeInfoWithStorage[10.0.0.184:50010,DS-7a08c597-b0f4-46eb-9916-f028efac66d7,DISK], DatanodeInfoWithStorage[10.0.0.180:50010,DS-fa6a4630-1626-43d8-9988-955a86ac3736,DISK], DatanodeInfoWithStorage[10.0.0.182:50010,DS-8670e77d-c4db-4323-bb01-e0e64bd5b78e,DISK]] 3. BP-1552885336-10.0.0.178-1446159880991:blk_1084952866_17799725 len=134217728 repl=3 [DatanodeInfoWithStorage[10.0.0.185:50010,DS-b5ff8ba0-275e-4846-b5a4-deda35aa0ad8,DISK], DatanodeInfoWithStorage[10.0.0.180:50010,DS-9cb6cade-9395-4f3a-ab7b-7fabd400b7f2,DISK], DatanodeInfoWithStorage[10.0.0.183:50010,DS-e277dcf3-1bce-4efd-a668-cd6fb2e10588,DISK]] 4. BP-1552885336-10.0.0.178-1446159880991:blk_1084952872_17799891 len=134217728 repl=4 [DatanodeInfoWithStorage[10.0.0.184:50010,DS-e1d8f278-1a22-4294-ac7e-e12d554aef7f,DISK], DatanodeInfoWithStorage[10.0.0.186:50010,DS-5d9aeb2b-e677-41cd-844e-4b36b3c84092,DISK], DatanodeInfoWithStorage[10.0.0.183:50010,DS-eccd375a-ea32-491b-a4a3-5ea3faca4171,DISK], DatanodeInfoWithStorage[10.0.0.182:50010,DS-8670e77d-c4db-4323-bb01-e0e64bd5b78e,DISK]] 5. BP-1552885336-10.0.0.78-1446159880991:blk_1084952880_17800120 len=134217728 repl=3 [DatanodeInfoWithStorage[10.0.0.181:50010,DS-79185b75-1938-4c91-a6d0-bb6687ca7e56,DISK], DatanodeInfoWithStorage[10.0.0.184:50010,DS-dcbd20aa-0334-49e0-b807-d2489f5923c6,DISK], DatanodeInfoWithStorage[10.0.0.183:50010,DS-f1d77328-f3af-483e-82e9-66ab0723a52c,DISK]] 6. BP-1552885336-10.0.0.178-1446159880991:blk_1084952887_17800316{UCState=COMMITTED, truncateBlock=null, primaryNodeIndex=-1, replicas=[ReplicaUC[[DISK]DS-5f3eac72-eb55-4df7-bcaa-a6fa35c166a0:NORMAL:10.0.0.188:50010|RBW], ReplicaUC[[DISK]DS-a2a0d8f0-772e-419f-b4ff-10b4966c57ca:NORMAL:10.0.0.184:50010|RBW], ReplicaUC[[DISK]DS-52984aa0-598e-4fff-acfa-8904ca7b585c:NORMAL:10.0.0.185:50010|RBW]]} len=115289753 MISSING! 
Status: CORRUPT
 Total size: 920596121 B
 Total dirs: 0
 Total files: 1
 Total symlinks: 0
 Total blocks (validated): 7 (avg. block size 131513731 B)
  UNDER MIN REPL'D BLOCKS: 1 (14.285714 %)
  dfs.namenode.replication.min: 1
  CORRUPT FILES: 1
  MISSING BLOCKS: 1
  MISSING SIZE: 115289753 B
 Minimally replicated blocks: 6 (85.71429 %)
 Over-replicated blocks: 2 (28.571428 %)
 Under-replicated blocks: 0 (0.0 %)
 Mis-replicated blocks: 0 (0.0 %)
 Default replication factor: 3
 Average block replication: 2.857143
 Corrupt blocks: 0
 Missing replicas: 0 (0.0 %)
 Number of data-nodes: 10
 Number of racks: 1
FSCK ended at Mon Oct 10 17:12:25 MSK 2016 in 0 milliseconds
The filesystem under path '/hadoop/811-ITF-ZO-P-bad/load_tarifer-zf-4_20160902165521521.csv' is CORRUPT
The file is UNDER_RECOVERY: the NameNode thinks the last block is in the COMMITTED state, while the DataNode thinks that