[jira] [Commented] (HDFS-14754) Erasure Coding : The number of Under-Replicated Blocks never reduced
[ https://issues.apache.org/jira/browse/HDFS-14754?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16941376#comment-16941376 ] Wei-Chiu Chuang commented on HDFS-14754: Too bad this one didn't land in 3.2.1 and 3.1.3. Let's make sure they get added to lower releases. > Erasure Coding : The number of Under-Replicated Blocks never reduced > - > > Key: HDFS-14754 > URL: https://issues.apache.org/jira/browse/HDFS-14754 > Project: Hadoop HDFS > Issue Type: Bug > Components: ec >Reporter: hemanthboyina >Assignee: hemanthboyina >Priority: Critical > Fix For: 3.3.0 > > Attachments: HDFS-14754-addendum.001.patch, HDFS-14754.001.patch, > HDFS-14754.002.patch, HDFS-14754.003.patch, HDFS-14754.004.patch, > HDFS-14754.005.patch, HDFS-14754.006.patch, HDFS-14754.007.patch, > HDFS-14754.008.patch > > > Using EC RS-3-2 on 6 DataNodes. > We came across a scenario where, in an EC file of 5 blocks, the same block was > replicated three times and two blocks went missing. > The redundant replicas were not being deleted, and the missing blocks could not be reconstructed. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Comment Edited] (HDFS-14754) Erasure Coding : The number of Under-Replicated Blocks never reduced
[ https://issues.apache.org/jira/browse/HDFS-14754?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16941376#comment-16941376 ] Wei-Chiu Chuang edited comment on HDFS-14754 at 9/30/19 10:20 PM: -- Too bad this one didn't land in 3.2.1 and 3.1.3. Let's make sure it gets added to lower releases. was (Author: jojochuang): Too bad this one didn't land in 3.2.1 and 3.1.3. Let's make sure they get added to lower releases.
[jira] [Updated] (HDFS-14808) EC: Improper size values for corrupt ec block in LOG
[ https://issues.apache.org/jira/browse/HDFS-14808?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wei-Chiu Chuang updated HDFS-14808: --- Component/s: ec > EC: Improper size values for corrupt ec block in LOG > - > > Key: HDFS-14808 > URL: https://issues.apache.org/jira/browse/HDFS-14808 > Project: Hadoop HDFS > Issue Type: Bug > Components: ec >Reporter: Harshakiran Reddy >Assignee: Ayush Saxena >Priority: Major > Fix For: 3.3.0 > > Attachments: HDFS-14808-01.patch > > > If the reason for block corruption is a size mismatch, the size values shown and > compared in the log are ambiguous.
[jira] [Commented] (HDFS-14305) Serial number in BlockTokenSecretManager could overlap between different namenodes
[ https://issues.apache.org/jira/browse/HDFS-14305?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16941381#comment-16941381 ] Hadoop QA commented on HDFS-14305: -- | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 2m 4s{color} | {color:blue} Docker mode activated. {color} | || || || || {color:brown} Prechecks {color} || | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s{color} | {color:green} The patch does not contain any @author tags. {color} | | {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 0s{color} | {color:green} The patch appears to include 2 new or modified test files. {color} | || || || || {color:brown} trunk Compile Tests {color} || | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 22m 29s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 1m 19s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 54s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 1m 27s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 14m 47s{color} | {color:green} branch has no errors when building and testing our client artifacts. 
{color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 2m 13s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 16s{color} | {color:green} trunk passed {color} | || || || || {color:brown} Patch Compile Tests {color} || | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 1m 0s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 55s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 55s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 39s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 1m 2s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s{color} | {color:green} The patch has no whitespace issues. {color} | | {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 13m 5s{color} | {color:green} patch has no errors when building and testing our client artifacts. {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 2m 21s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 11s{color} | {color:green} the patch passed {color} | || || || || {color:brown} Other Tests {color} || | {color:red}-1{color} | {color:red} unit {color} | {color:red}107m 47s{color} | {color:red} hadoop-hdfs in the patch failed. {color} | | {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 33s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} | | {color:black}{color} | {color:black} {color} | {color:black}174m 33s{color} | {color:black} {color} | \\ \\ || Reason || Tests || | Failed junit tests | hadoop.hdfs.server.blockmanagement.TestBlockStatsMXBean | | | hadoop.hdfs.tools.TestDFSZKFailoverController | \\ \\ || Subsystem || Report/Notes || | Docker | Client=18.09.7 Server=18.09.7 Image:yetus/hadoop:efed4450bf1 | | JIRA Issue | HDFS-14305 | | JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12981828/HDFS-14305-008.patch | | Optional Tests | dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient findbugs checkstyle | | uname | Linux bc13cb7fa98b 4.15.0-58-generic #64-Ubuntu SMP Tue Aug 6 11:12:41 UTC 2019 x86_64 x86_64 x86_64 GNU/Linux | | Build tool | maven | | Personality | /testptch/patchprocess/precommit/personality/provided.sh | | git revision | trunk / 4d3c580 | | maven | version: Apache Maven 3.3.9 | | Default Java | 1.8.0_222 | | findbugs | v3.1.0-RC1 | | unit | https://builds.apache.org/job/PreCommit-HDFS-Build/27989/artifact/out/patch-unit-hadoop-hdfs-project_hadoop-hdfs.txt | | Test Results | https://builds.apache.org/job/PreCommit-HDFS-Build/27989/testReport/ | | Max. process+thread count | 2864 (vs. ulimit of 5500) | | modules | C: hadoop-hdfs-project/hadoop-hdfs U: hadoop-hdfs-project/hadoop-hdfs | | Console output | https://builds.apache.org/job/PreCommit-HDFS-Build/27989/console | | Powered
[jira] [Commented] (HDFS-14305) Serial number in BlockTokenSecretManager could overlap between different namenodes
[ https://issues.apache.org/jira/browse/HDFS-14305?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16941393#comment-16941393 ] Chen Liang commented on HDFS-14305: --- Looks like the key idea of the v8 patch is to call {{nextInt(int bound)}}, which returns a non-negative value, instead of {{nextInt()}}, which can return a negative value. That way the range start is never negative, so we avoid the overlapping ranges. Assuming we will address the potential conflict issue separately, +1 for the v08 patch. > Serial number in BlockTokenSecretManager could overlap between different > namenodes > -- > > Key: HDFS-14305 > URL: https://issues.apache.org/jira/browse/HDFS-14305 > Project: Hadoop HDFS > Issue Type: Bug > Components: namenode, security >Reporter: Chao Sun >Assignee: Xiaoqiao He >Priority: Major > Labels: multi-sbnn, release-blocker > Fix For: 3.0.4, 3.3.0, 3.2.1, 3.1.3 > > Attachments: HDFS-14305-007.patch, HDFS-14305-008.patch, > HDFS-14305.001.patch, HDFS-14305.002.patch, HDFS-14305.003.patch, > HDFS-14305.004.patch, HDFS-14305.005.patch, HDFS-14305.006.patch > > > Currently, a {{BlockTokenSecretManager}} starts with a random integer as the > initial serial number, and then uses this formula to rotate it: > {code:java} > this.intRange = Integer.MAX_VALUE / numNNs; > this.nnRangeStart = intRange * nnIndex; > this.serialNo = (this.serialNo % intRange) + (nnRangeStart); > {code} > where {{numNNs}} is the total number of NameNodes in the cluster, and > {{nnIndex}} is the index of the current NameNode specified in the > configuration {{dfs.ha.namenodes.}}. > However, with this approach, different NameNodes could have overlapping ranges > for the serial number. For simplicity, let's assume {{Integer.MAX_VALUE}} is 100, > and we have 2 NameNodes {{nn1}} and {{nn2}} in configuration. Then the ranges > for these two are: > {code} > nn1 -> [-49, 49] > nn2 -> [1, 99] > {code} > This is because the initial serial number could be any negative integer. 
> Moreover, when the keys are updated, the serial number will again be updated > with the formula: > {code} > this.serialNo = (this.serialNo % intRange) + (nnRangeStart); > {code} > which means the new serial number could be updated into a range that belongs to > a different NameNode, again increasing the chance of collision. > When a collision happens, DataNodes could overwrite an existing key, which > will cause clients to fail with an {{InvalidToken}} error.
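The overlap described in the issue can be reproduced with a minimal standalone sketch. This is a hypothetical demo class, not Hadoop code; the class name, the {{rotate}} helper, and the scaled-down {{Integer.MAX_VALUE}} of 100 are illustrative assumptions taken from the example in the description:

```java
// Hypothetical standalone demo (not part of Hadoop); Integer.MAX_VALUE is
// scaled down to 100, matching the example in the issue description.
public class SerialRangeDemo {

    // The rotation formula quoted from BlockTokenSecretManager.
    static int rotate(int serialNo, int intRange, int nnRangeStart) {
        return (serialNo % intRange) + nnRangeStart;
    }

    public static void main(String[] args) {
        final int maxValue = 100;               // stand-in for Integer.MAX_VALUE
        final int numNNs = 2;
        final int intRange = maxValue / numNNs; // 50 serial numbers per NameNode

        // Old behavior: the initial serialNo comes from nextInt() and can be
        // negative, so serialNo % intRange ranges over [-49, 49] and the two
        // NameNodes' ranges overlap: nn1 -> [-49, 49], nn2 -> [1, 99].
        System.out.println(rotate(-49, intRange, 0));        // nn1 low end:  -49
        System.out.println(rotate(49, intRange, 0));         // nn1 high end:  49
        System.out.println(rotate(-49, intRange, intRange)); // nn2 low end:    1
        System.out.println(rotate(49, intRange, intRange));  // nn2 high end:  99
    }
}
```

With the v08 approach of seeding from {{nextInt(int bound)}}, the serial number stays non-negative, so {{serialNo % intRange}} falls in {{[0, intRange)}} and the ranges become disjoint: nn1 -> [0, 49], nn2 -> [50, 99].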
[jira] [Commented] (HDDS-2205) checkstyle.sh reports wrong failure count
[ https://issues.apache.org/jira/browse/HDDS-2205?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16941409#comment-16941409 ] Hudson commented on HDDS-2205: -- SUCCESS: Integrated in Jenkins build Hadoop-trunk-Commit #17420 (See [https://builds.apache.org/job/Hadoop-trunk-Commit/17420/]) HDDS-2205. checkstyle.sh reports wrong failure count (aengineer: rev e5bba592a84a94e0545479b668e6925eb4b8858c) * (edit) hadoop-ozone/dev-support/checks/checkstyle.sh > checkstyle.sh reports wrong failure count > - > > Key: HDDS-2205 > URL: https://issues.apache.org/jira/browse/HDDS-2205 > Project: Hadoop Distributed Data Store > Issue Type: Bug > Components: test >Reporter: Attila Doroszlai >Assignee: Attila Doroszlai >Priority: Trivial > Labels: pull-request-available > Fix For: 0.5.0 > > Time Spent: 1h > Remaining Estimate: 0h > > {{checkstyle.sh}} outputs files with checkstyle violations and the violations > themselves on separate lines. It then reports the total line count as the number of > failures. > {code:title=target/checkstyle/summary.txt} > hadoop-ozone/common/src/main/java/org/apache/hadoop/ozone/OmUtils.java > 49: Unused import - org.apache.hadoop.ozone.om.OMMetadataManager. > {code} > {code:title=target/checkstyle/failures} > 2 > {code}
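The miscount can be illustrated with a small sketch. This is a hypothetical Java rendering of the counting logic, not the actual shell script; the class, method names, and line-prefix heuristic are illustrative assumptions based on the summary format quoted in the issue:

```java
import java.util.List;

// Hypothetical Java rendering of the counting bug; the real logic lives in
// the shell script hadoop-ozone/dev-support/checks/checkstyle.sh.
public class FailureCountDemo {

    // Summary format from the issue: an unindented file name, then each
    // violation on its own indented line.
    static final List<String> SUMMARY = List.of(
        "hadoop-ozone/common/src/main/java/org/apache/hadoop/ozone/OmUtils.java",
        " 49: Unused import - org.apache.hadoop.ozone.om.OMMetadataManager.");

    // Buggy count: treats every summary line (file names included) as a failure.
    static long countAllLines(List<String> summary) {
        return summary.size();
    }

    // Fixed count: only the indented violation lines.
    static long countViolations(List<String> summary) {
        return summary.stream().filter(line -> line.startsWith(" ")).count();
    }

    public static void main(String[] args) {
        System.out.println(countAllLines(SUMMARY));   // 2 (wrong: file name counted)
        System.out.println(countViolations(SUMMARY)); // 1 (the actual violation)
    }
}
```

With one violation in one file, counting every line reports 2 failures, while counting only the violation lines reports the correct 1.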
[jira] [Work logged] (HDDS-1984) Fix listBucket API
[ https://issues.apache.org/jira/browse/HDDS-1984?focusedWorklogId=321037=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-321037 ] ASF GitHub Bot logged work on HDDS-1984: Author: ASF GitHub Bot Created on: 01/Oct/19 02:56 Start Date: 01/Oct/19 02:56 Worklog Time Spent: 10m Work Description: hadoop-yetus commented on issue #1555: HDDS-1984. Fix listBucket API. URL: https://github.com/apache/hadoop/pull/1555#issuecomment-536837875 :broken_heart: **-1 overall** | Vote | Subsystem | Runtime | Comment | |::|--:|:|:| | 0 | reexec | 40 | Docker mode activated. | ||| _ Prechecks _ | | +1 | dupname | 0 | No case conflicting files found. | | +1 | @author | 0 | The patch does not contain any @author tags. | | +1 | test4tests | 0 | The patch appears to include 1 new or modified test files. | ||| _ trunk Compile Tests _ | | 0 | mvndep | 25 | Maven dependency ordering for branch | | -1 | mvninstall | 31 | hadoop-hdds in trunk failed. | | -1 | mvninstall | 36 | hadoop-ozone in trunk failed. | | -1 | compile | 20 | hadoop-hdds in trunk failed. | | -1 | compile | 15 | hadoop-ozone in trunk failed. | | +1 | checkstyle | 51 | trunk passed | | +1 | mvnsite | 0 | trunk passed | | +1 | shadedclient | 818 | branch has no errors when building and testing our client artifacts. | | -1 | javadoc | 25 | hadoop-hdds in trunk failed. | | -1 | javadoc | 16 | hadoop-ozone in trunk failed. | | 0 | spotbugs | 919 | Used deprecated FindBugs config; considering switching to SpotBugs. | | -1 | findbugs | 34 | hadoop-hdds in trunk failed. | | -1 | findbugs | 22 | hadoop-ozone in trunk failed. | ||| _ Patch Compile Tests _ | | 0 | mvndep | 28 | Maven dependency ordering for patch | | -1 | mvninstall | 37 | hadoop-hdds in the patch failed. | | -1 | mvninstall | 39 | hadoop-ozone in the patch failed. | | -1 | compile | 25 | hadoop-hdds in the patch failed. | | -1 | compile | 16 | hadoop-ozone in the patch failed. | | -1 | javac | 25 | hadoop-hdds in the patch failed. 
| | -1 | javac | 16 | hadoop-ozone in the patch failed. | | -0 | checkstyle | 25 | hadoop-hdds: The patch generated 1 new + 0 unchanged - 0 fixed = 1 total (was 0) | | +1 | mvnsite | 0 | the patch passed | | +1 | whitespace | 0 | The patch has no whitespace issues. | | +1 | shadedclient | 707 | patch has no errors when building and testing our client artifacts. | | -1 | javadoc | 21 | hadoop-hdds in the patch failed. | | -1 | javadoc | 20 | hadoop-ozone in the patch failed. | | -1 | findbugs | 32 | hadoop-hdds in the patch failed. | | -1 | findbugs | 20 | hadoop-ozone in the patch failed. | ||| _ Other Tests _ | | -1 | unit | 30 | hadoop-hdds in the patch failed. | | -1 | unit | 28 | hadoop-ozone in the patch failed. | | +1 | asflicense | 34 | The patch does not generate ASF License warnings. | | | | 2360 | | | Subsystem | Report/Notes | |--:|:-| | Docker | Client=19.03.1 Server=19.03.1 base: https://builds.apache.org/job/hadoop-multibranch/job/PR-1555/2/artifact/out/Dockerfile | | GITHUB PR | https://github.com/apache/hadoop/pull/1555 | | Optional Tests | dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient findbugs checkstyle | | uname | Linux 883182c3dde4 4.15.0-58-generic #64-Ubuntu SMP Tue Aug 6 11:12:41 UTC 2019 x86_64 x86_64 x86_64 GNU/Linux | | Build tool | maven | | Personality | personality/hadoop.sh | | git revision | trunk / b3275ab | | Default Java | 1.8.0_222 | | mvninstall | https://builds.apache.org/job/hadoop-multibranch/job/PR-1555/2/artifact/out/branch-mvninstall-hadoop-hdds.txt | | mvninstall | https://builds.apache.org/job/hadoop-multibranch/job/PR-1555/2/artifact/out/branch-mvninstall-hadoop-ozone.txt | | compile | https://builds.apache.org/job/hadoop-multibranch/job/PR-1555/2/artifact/out/branch-compile-hadoop-hdds.txt | | compile | https://builds.apache.org/job/hadoop-multibranch/job/PR-1555/2/artifact/out/branch-compile-hadoop-ozone.txt | | javadoc | 
https://builds.apache.org/job/hadoop-multibranch/job/PR-1555/2/artifact/out/branch-javadoc-hadoop-hdds.txt | | javadoc | https://builds.apache.org/job/hadoop-multibranch/job/PR-1555/2/artifact/out/branch-javadoc-hadoop-ozone.txt | | findbugs | https://builds.apache.org/job/hadoop-multibranch/job/PR-1555/2/artifact/out/branch-findbugs-hadoop-hdds.txt | | findbugs | https://builds.apache.org/job/hadoop-multibranch/job/PR-1555/2/artifact/out/branch-findbugs-hadoop-ozone.txt | | mvninstall | https://builds.apache.org/job/hadoop-multibranch/job/PR-1555/2/artifact/out/patch-mvninstall-hadoop-hdds.txt | | mvninstall |
[jira] [Work logged] (HDDS-1984) Fix listBucket API
[ https://issues.apache.org/jira/browse/HDDS-1984?focusedWorklogId=321036=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-321036 ] ASF GitHub Bot logged work on HDDS-1984: Author: ASF GitHub Bot Created on: 01/Oct/19 02:56 Start Date: 01/Oct/19 02:56 Worklog Time Spent: 10m Work Description: hadoop-yetus commented on issue #1555: HDDS-1984. Fix listBucket API. URL: https://github.com/apache/hadoop/pull/1555#issuecomment-536837796 :broken_heart: **-1 overall** | Vote | Subsystem | Runtime | Comment | |::|--:|:|:| | 0 | reexec | 42 | Docker mode activated. | ||| _ Prechecks _ | | +1 | dupname | 1 | No case conflicting files found. | | +1 | @author | 0 | The patch does not contain any @author tags. | | +1 | test4tests | 0 | The patch appears to include 1 new or modified test files. | ||| _ trunk Compile Tests _ | | 0 | mvndep | 68 | Maven dependency ordering for branch | | -1 | mvninstall | 44 | hadoop-hdds in trunk failed. | | -1 | mvninstall | 39 | hadoop-ozone in trunk failed. | | -1 | compile | 21 | hadoop-hdds in trunk failed. | | -1 | compile | 14 | hadoop-ozone in trunk failed. | | +1 | checkstyle | 62 | trunk passed | | +1 | mvnsite | 0 | trunk passed | | +1 | shadedclient | 851 | branch has no errors when building and testing our client artifacts. | | -1 | javadoc | 21 | hadoop-hdds in trunk failed. | | -1 | javadoc | 18 | hadoop-ozone in trunk failed. | | 0 | spotbugs | 955 | Used deprecated FindBugs config; considering switching to SpotBugs. | | -1 | findbugs | 38 | hadoop-hdds in trunk failed. | | -1 | findbugs | 22 | hadoop-ozone in trunk failed. | ||| _ Patch Compile Tests _ | | 0 | mvndep | 30 | Maven dependency ordering for patch | | -1 | mvninstall | 36 | hadoop-hdds in the patch failed. | | -1 | mvninstall | 40 | hadoop-ozone in the patch failed. | | -1 | compile | 26 | hadoop-hdds in the patch failed. | | -1 | compile | 18 | hadoop-ozone in the patch failed. | | -1 | javac | 26 | hadoop-hdds in the patch failed. 
| | -1 | javac | 18 | hadoop-ozone in the patch failed. | | -0 | checkstyle | 28 | hadoop-hdds: The patch generated 1 new + 0 unchanged - 0 fixed = 1 total (was 0) | | +1 | mvnsite | 0 | the patch passed | | +1 | whitespace | 0 | The patch has no whitespace issues. | | +1 | shadedclient | 715 | patch has no errors when building and testing our client artifacts. | | -1 | javadoc | 24 | hadoop-hdds in the patch failed. | | -1 | javadoc | 18 | hadoop-ozone in the patch failed. | | -1 | findbugs | 32 | hadoop-hdds in the patch failed. | | -1 | findbugs | 20 | hadoop-ozone in the patch failed. | ||| _ Other Tests _ | | -1 | unit | 28 | hadoop-hdds in the patch failed. | | -1 | unit | 29 | hadoop-ozone in the patch failed. | | +1 | asflicense | 35 | The patch does not generate ASF License warnings. | | | | 2467 | | | Subsystem | Report/Notes | |--:|:-| | Docker | Client=19.03.1 Server=19.03.1 base: https://builds.apache.org/job/hadoop-multibranch/job/PR-1555/1/artifact/out/Dockerfile | | GITHUB PR | https://github.com/apache/hadoop/pull/1555 | | Optional Tests | dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient findbugs checkstyle | | uname | Linux 08929fca86df 4.15.0-58-generic #64-Ubuntu SMP Tue Aug 6 11:12:41 UTC 2019 x86_64 x86_64 x86_64 GNU/Linux | | Build tool | maven | | Personality | personality/hadoop.sh | | git revision | trunk / b3275ab | | Default Java | 1.8.0_222 | | mvninstall | https://builds.apache.org/job/hadoop-multibranch/job/PR-1555/1/artifact/out/branch-mvninstall-hadoop-hdds.txt | | mvninstall | https://builds.apache.org/job/hadoop-multibranch/job/PR-1555/1/artifact/out/branch-mvninstall-hadoop-ozone.txt | | compile | https://builds.apache.org/job/hadoop-multibranch/job/PR-1555/1/artifact/out/branch-compile-hadoop-hdds.txt | | compile | https://builds.apache.org/job/hadoop-multibranch/job/PR-1555/1/artifact/out/branch-compile-hadoop-ozone.txt | | javadoc | 
https://builds.apache.org/job/hadoop-multibranch/job/PR-1555/1/artifact/out/branch-javadoc-hadoop-hdds.txt | | javadoc | https://builds.apache.org/job/hadoop-multibranch/job/PR-1555/1/artifact/out/branch-javadoc-hadoop-ozone.txt | | findbugs | https://builds.apache.org/job/hadoop-multibranch/job/PR-1555/1/artifact/out/branch-findbugs-hadoop-hdds.txt | | findbugs | https://builds.apache.org/job/hadoop-multibranch/job/PR-1555/1/artifact/out/branch-findbugs-hadoop-ozone.txt | | mvninstall | https://builds.apache.org/job/hadoop-multibranch/job/PR-1555/1/artifact/out/patch-mvninstall-hadoop-hdds.txt | | mvninstall |
[jira] [Work logged] (HDDS-2169) Avoid buffer copies while submitting client requests in Ratis
[ https://issues.apache.org/jira/browse/HDDS-2169?focusedWorklogId=321044=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-321044 ] ASF GitHub Bot logged work on HDDS-2169: Author: ASF GitHub Bot Created on: 01/Oct/19 03:31 Start Date: 01/Oct/19 03:31 Worklog Time Spent: 10m Work Description: hadoop-yetus commented on issue #1517: HDDS-2169 URL: https://github.com/apache/hadoop/pull/1517#issuecomment-536845581 :broken_heart: **-1 overall** | Vote | Subsystem | Runtime | Comment | |::|--:|:|:| | 0 | reexec | 96 | Docker mode activated. | ||| _ Prechecks _ | | +1 | dupname | 0 | No case conflicting files found. | | +1 | @author | 0 | The patch does not contain any @author tags. | | +1 | test4tests | 0 | The patch appears to include 1 new or modified test files. | ||| _ trunk Compile Tests _ | | 0 | mvndep | 20 | Maven dependency ordering for branch | | -1 | mvninstall | 31 | hadoop-hdds in trunk failed. | | -1 | mvninstall | 34 | hadoop-ozone in trunk failed. | | -1 | compile | 19 | hadoop-hdds in trunk failed. | | -1 | compile | 13 | hadoop-ozone in trunk failed. | | +1 | checkstyle | 59 | trunk passed | | +1 | mvnsite | 0 | trunk passed | | +1 | shadedclient | 971 | branch has no errors when building and testing our client artifacts. | | -1 | javadoc | 18 | hadoop-hdds in trunk failed. | | -1 | javadoc | 16 | hadoop-ozone in trunk failed. | | 0 | spotbugs | 1058 | Used deprecated FindBugs config; considering switching to SpotBugs. | | -1 | findbugs | 31 | hadoop-hdds in trunk failed. | | -1 | findbugs | 17 | hadoop-ozone in trunk failed. | ||| _ Patch Compile Tests _ | | 0 | mvndep | 14 | Maven dependency ordering for patch | | -1 | mvninstall | 33 | hadoop-hdds in the patch failed. | | -1 | mvninstall | 36 | hadoop-ozone in the patch failed. | | -1 | compile | 21 | hadoop-hdds in the patch failed. | | -1 | compile | 16 | hadoop-ozone in the patch failed. | | -1 | javac | 21 | hadoop-hdds in the patch failed. 
| | -1 | javac | 17 | hadoop-ozone in the patch failed. | | +1 | checkstyle | 54 | the patch passed | | +1 | mvnsite | 0 | the patch passed | | +1 | whitespace | 0 | The patch has no whitespace issues. | | +1 | shadedclient | 779 | patch has no errors when building and testing our client artifacts. | | -1 | javadoc | 19 | hadoop-hdds in the patch failed. | | -1 | javadoc | 16 | hadoop-ozone in the patch failed. | | -1 | findbugs | 28 | hadoop-hdds in the patch failed. | | -1 | findbugs | 18 | hadoop-ozone in the patch failed. | ||| _ Other Tests _ | | -1 | unit | 24 | hadoop-hdds in the patch failed. | | -1 | unit | 23 | hadoop-ozone in the patch failed. | | +1 | asflicense | 29 | The patch does not generate ASF License warnings. | | | | 2550 | | | Subsystem | Report/Notes | |--:|:-| | Docker | Client=19.03.0 Server=19.03.0 base: https://builds.apache.org/job/hadoop-multibranch/job/PR-1517/7/artifact/out/Dockerfile | | GITHUB PR | https://github.com/apache/hadoop/pull/1517 | | JIRA Issue | HDDS-2169 | | Optional Tests | dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient findbugs checkstyle | | uname | Linux f83947a622e3 4.15.0-58-generic #64-Ubuntu SMP Tue Aug 6 11:12:41 UTC 2019 x86_64 x86_64 x86_64 GNU/Linux | | Build tool | maven | | Personality | personality/hadoop.sh | | git revision | trunk / b3275ab | | Default Java | 1.8.0_222 | | mvninstall | https://builds.apache.org/job/hadoop-multibranch/job/PR-1517/7/artifact/out/branch-mvninstall-hadoop-hdds.txt | | mvninstall | https://builds.apache.org/job/hadoop-multibranch/job/PR-1517/7/artifact/out/branch-mvninstall-hadoop-ozone.txt | | compile | https://builds.apache.org/job/hadoop-multibranch/job/PR-1517/7/artifact/out/branch-compile-hadoop-hdds.txt | | compile | https://builds.apache.org/job/hadoop-multibranch/job/PR-1517/7/artifact/out/branch-compile-hadoop-ozone.txt | | javadoc | 
https://builds.apache.org/job/hadoop-multibranch/job/PR-1517/7/artifact/out/branch-javadoc-hadoop-hdds.txt | | javadoc | https://builds.apache.org/job/hadoop-multibranch/job/PR-1517/7/artifact/out/branch-javadoc-hadoop-ozone.txt | | findbugs | https://builds.apache.org/job/hadoop-multibranch/job/PR-1517/7/artifact/out/branch-findbugs-hadoop-hdds.txt | | findbugs | https://builds.apache.org/job/hadoop-multibranch/job/PR-1517/7/artifact/out/branch-findbugs-hadoop-ozone.txt | | mvninstall | https://builds.apache.org/job/hadoop-multibranch/job/PR-1517/7/artifact/out/patch-mvninstall-hadoop-hdds.txt | | mvninstall | https://builds.apache.org/job/hadoop-multibranch/job/PR-1517/7/artifact/out/patch-mvninstall-hadoop-ozone.txt
[jira] [Commented] (HDDS-2169) Avoid buffer copies while submitting client requests in Ratis
[ https://issues.apache.org/jira/browse/HDDS-2169?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16941498#comment-16941498 ] Hadoop QA commented on HDDS-2169: - | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 1m 36s{color} | {color:blue} Docker mode activated. {color} | || || || || {color:brown} Prechecks {color} || | {color:green}+1{color} | {color:green} dupname {color} | {color:green} 0m 0s{color} | {color:green} No case conflicting files found. {color} | | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s{color} | {color:green} The patch does not contain any @author tags. {color} | | {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 0s{color} | {color:green} The patch appears to include 1 new or modified test files. {color} | || || || || {color:brown} trunk Compile Tests {color} || | {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 20s{color} | {color:blue} Maven dependency ordering for branch {color} | | {color:red}-1{color} | {color:red} mvninstall {color} | {color:red} 0m 31s{color} | {color:red} hadoop-hdds in trunk failed. {color} | | {color:red}-1{color} | {color:red} mvninstall {color} | {color:red} 0m 34s{color} | {color:red} hadoop-ozone in trunk failed. {color} | | {color:red}-1{color} | {color:red} compile {color} | {color:red} 0m 19s{color} | {color:red} hadoop-hdds in trunk failed. {color} | | {color:red}-1{color} | {color:red} compile {color} | {color:red} 0m 13s{color} | {color:red} hadoop-ozone in trunk failed. 
{color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 59s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 0s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 16m 11s{color} | {color:green} branch has no errors when building and testing our client artifacts. {color} | | {color:red}-1{color} | {color:red} javadoc {color} | {color:red} 0m 18s{color} | {color:red} hadoop-hdds in trunk failed. {color} | | {color:red}-1{color} | {color:red} javadoc {color} | {color:red} 0m 16s{color} | {color:red} hadoop-ozone in trunk failed. {color} | | {color:blue}0{color} | {color:blue} spotbugs {color} | {color:blue} 17m 38s{color} | {color:blue} Used deprecated FindBugs config; considering switching to SpotBugs. {color} | | {color:red}-1{color} | {color:red} findbugs {color} | {color:red} 0m 31s{color} | {color:red} hadoop-hdds in trunk failed. {color} | | {color:red}-1{color} | {color:red} findbugs {color} | {color:red} 0m 17s{color} | {color:red} hadoop-ozone in trunk failed. {color} | || || || || {color:brown} Patch Compile Tests {color} || | {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 14s{color} | {color:blue} Maven dependency ordering for patch {color} | | {color:red}-1{color} | {color:red} mvninstall {color} | {color:red} 0m 33s{color} | {color:red} hadoop-hdds in the patch failed. {color} | | {color:red}-1{color} | {color:red} mvninstall {color} | {color:red} 0m 36s{color} | {color:red} hadoop-ozone in the patch failed. {color} | | {color:red}-1{color} | {color:red} compile {color} | {color:red} 0m 21s{color} | {color:red} hadoop-hdds in the patch failed. {color} | | {color:red}-1{color} | {color:red} compile {color} | {color:red} 0m 16s{color} | {color:red} hadoop-ozone in the patch failed. 
{color} | | {color:red}-1{color} | {color:red} javac {color} | {color:red} 0m 21s{color} | {color:red} hadoop-hdds in the patch failed. {color} | | {color:red}-1{color} | {color:red} javac {color} | {color:red} 0m 17s{color} | {color:red} hadoop-ozone in the patch failed. {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 54s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 0s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s{color} | {color:green} The patch has no whitespace issues. {color} | | {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 12m 59s{color} | {color:green} patch has no errors when building and testing our client artifacts. {color} | | {color:red}-1{color} | {color:red} javadoc {color} | {color:red} 0m 19s{color} | {color:red} hadoop-hdds in the patch failed. {color} | | {color:red}-1{color} | {color:red} javadoc {color} | {color:red} 0m 16s{color} | {color:red} hadoop-ozone in the patch failed. {color} | | {color:red}-1{color} | {color:red} findbugs {color} | {color:red} 0m 28s{color} | {color:red} hadoop-hdds in the patch failed. {color} | |
[jira] [Created] (HDFS-14885) UI: Fix a typo in
Xieming Li created HDFS-14885: - Summary: UI: Fix a typo in Key: HDFS-14885 URL: https://issues.apache.org/jira/browse/HDFS-14885 Project: Hadoop HDFS Issue Type: Bug Components: datanode, ui Reporter: Xieming Li Assignee: Xieming Li Attachments: Screen Shot 2019-10-01 at 12.40.29.png A period ('.') should be added to the end of the following sentence on the WebUI of the DataNode: "No nodes are decommissioning"
[jira] [Commented] (HDFS-14885) UI: Fix a typo on WebUI of DataNode.
[ https://issues.apache.org/jira/browse/HDFS-14885?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16941542#comment-16941542 ] Hadoop QA commented on HDFS-14885: -- | (/) *{color:green}+1 overall{color}* | \\ \\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 1m 22s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s{color} | {color:green} The patch does not contain any @author tags. {color} |
|| || || || {color:brown} trunk Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 23m 54s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 36m 54s{color} | {color:green} branch has no errors when building and testing our client artifacts. {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 0m 59s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 13m 15s{color} | {color:green} patch has no errors when building and testing our client artifacts. {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 28s{color} | {color:green} The patch does not generate ASF License warnings. {color} |
| {color:black}{color} | {color:black} {color} | {color:black} 53m 42s{color} | {color:black} {color} | \\ \\
|| Subsystem || Report/Notes ||
| Docker | Client=18.09.7 Server=18.09.7 Image:yetus/hadoop:efed4450bf1 |
| JIRA Issue | HDFS-14885 |
| JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12981848/HDFS-14885.patch |
| Optional Tests | dupname asflicense shadedclient |
| uname | Linux 459dc065a713 4.15.0-58-generic #64-Ubuntu SMP Tue Aug 6 11:12:41 UTC 2019 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/patchprocess/precommit/personality/provided.sh |
| git revision | trunk / 137546a |
| maven | version: Apache Maven 3.3.9 |
| Max. process+thread count | 342 (vs. ulimit of 5500) |
| modules | C: hadoop-hdfs-project/hadoop-hdfs U: hadoop-hdfs-project/hadoop-hdfs |
| Console output | https://builds.apache.org/job/PreCommit-HDFS-Build/27990/console |
| Powered by | Apache Yetus 0.8.0 http://yetus.apache.org |
This message was automatically generated. > UI: Fix a typo on WebUI of DataNode. > > > Key: HDFS-14885 > URL: https://issues.apache.org/jira/browse/HDFS-14885 > Project: Hadoop HDFS > Issue Type: Bug > Components: datanode, ui >Reporter: Xieming Li >Assignee: Xieming Li >Priority: Trivial > Attachments: HDFS-14885.patch, Screen Shot 2019-10-01 at 12.40.29.png > > > A Period('.') should be added to the end of following sentence on WebUI of > DataNode. > "No nodes are decommissioning" > -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Updated] (HDDS-2213) Reduce key provider loading log level in OzoneFileSystem#getAdditionalTokenIssuers
[ https://issues.apache.org/jira/browse/HDDS-2213?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiaoyu Yao updated HDDS-2213: - Description: OzoneFileSystem#getAdditionalTokenIssuers logs an error when a secure client tries to collect an Ozone delegation token to run MR/Spark jobs but the Ozone file system does not have a KMS provider configured. In this case, we simply return a null provider, as in the code below. This is a benign error, so the log level should be reduced to debug.
{code:java}
KeyProvider keyProvider;
try {
  keyProvider = getKeyProvider();
} catch (IOException ioe) {
  LOG.error("Error retrieving KeyProvider.", ioe);
  return null;
}
{code}
was: OzoneFileSystem#getAdditionalTokenIssuers log an error when secure client tries to collect ozone delegation token to run MR/Spark jobs but ozone file system does not have a kms provider configured. In this case, we simply return null provider here in the code below. This is a benign error and we should reduce the log level to debug level. \{code} KeyProvider keyProvider; try { keyProvider = getKeyProvider(); } catch (IOException ioe) { LOG.error("Error retrieving KeyProvider.", ioe); return null; } {code}
> Reduce key provider loading log level in > OzoneFileSystem#getAdditionalTokenIssuers > -- > > Key: HDDS-2213 > URL: https://issues.apache.org/jira/browse/HDDS-2213 > Project: Hadoop Distributed Data Store > Issue Type: Improvement >Reporter: Vivek Ratnavel Subramanian >Priority: Minor > > OzoneFileSystem#getAdditionalTokenIssuers logs an error when a secure client > tries to collect an Ozone delegation token to run MR/Spark jobs but the Ozone > file system does not have a KMS provider configured. In this case, we simply > return a null provider, as in the code below. This is a benign error, so the > log level should be reduced to debug.
> {code:java} > KeyProvider keyProvider; > try { > keyProvider = getKeyProvider(); } > catch (IOException ioe) { > LOG.error("Error retrieving KeyProvider.", ioe); > return null; > } > {code} -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
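The proposed change amounts to demoting that one log call. The sketch below is a self-contained stand-in, not the real OzoneFileSystem code: the class and method names are hypothetical, and java.util.logging's FINE level stands in for the SLF4J debug level used in Hadoop.

```java
import java.io.IOException;
import java.util.logging.Level;
import java.util.logging.Logger;

// Hypothetical stand-in for the getAdditionalTokenIssuers key-provider lookup:
// the benign "no KMS provider configured" failure is logged at debug (FINE)
// instead of error, and null is returned as before.
public class KeyProviderLookup {
  private static final Logger LOG = Logger.getLogger("OzoneFileSystem");

  // Stand-in for getKeyProvider(); throws when no KMS provider is configured.
  static Object getKeyProvider() throws IOException {
    throw new IOException("No KeyProvider configured");
  }

  static Object keyProviderOrNull() {
    try {
      return getKeyProvider();
    } catch (IOException ioe) {
      // Was LOG.error(...); proposed: debug level, since this case is benign.
      LOG.log(Level.FINE, "Error retrieving KeyProvider.", ioe);
      return null;
    }
  }

  public static void main(String[] args) {
    // With the default logging configuration, FINE messages are suppressed.
    System.out.println(keyProviderOrNull() == null ? "no provider" : "provider found");
  }
}
```

With a default logging configuration, the FINE message is suppressed, so a client running without a KMS no longer sees an error in its logs, while the null return still tells the caller to skip KMS token collection.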
[jira] [Commented] (HDDS-2203) Race condition in ByteStringHelper.init()
[ https://issues.apache.org/jira/browse/HDDS-2203?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16941398#comment-16941398 ] Anu Engineer commented on HDDS-2203: Makes sense. Do you want this patch committed, or just to move to the new model? > Race condition in ByteStringHelper.init() > - > > Key: HDDS-2203 > URL: https://issues.apache.org/jira/browse/HDDS-2203 > Project: Hadoop Distributed Data Store > Issue Type: Bug > Components: Ozone Client, SCM >Reporter: Istvan Fajth >Assignee: Istvan Fajth >Priority: Critical > Labels: pull-request-available, pull-requests-available > Time Spent: 1h > Remaining Estimate: 0h > > The current init method: > {code} > public static void init(boolean isUnsafeByteOperation) { > final boolean set = INITIALIZED.compareAndSet(false, true); > if (set) { > ByteStringHelper.isUnsafeByteOperationsEnabled = >isUnsafeByteOperation; >} else { > // already initialized, check values > Preconditions.checkState(isUnsafeByteOperationsEnabled >== isUnsafeByteOperation); >} > } > {code} > In a scenario where two threads access this method, and the execution order > is the following, the second thread runs into an exception from > Preconditions.checkState() in the else branch. > In an uninitialized state: > - T1 arrives at the method with true as the parameter; class > initialization has left isUnsafeByteOperationsEnabled at false > - T1 sets INITIALIZED true > - T2 arrives at the method with true as the parameter > - T2 reads the INITIALIZED value and, as it is not false, goes to the else branch > - T2 checks whether the internal boolean property is the same true it > wanted to set, and as T1 has not yet set the value, checkState throws an > IllegalStateException.
> This happens in certain Hive query cases, as it came from that testing, the > exception we see there is the following: > {code} > Error: Error while processing statement: FAILED: Execution Error, return code > 2 from org.apache.hadoop.hive.ql.exec.tez.TezTask. Vertex failed, > vertexName=Map 2, vertexId=vertex_1569486223160_0334_1_02, > diagnostics=[Vertex vertex_1569486223160_0334_1_02 [Map 2] killed/failed > due to:ROOT_INPUT_INIT_FAILURE, Vertex Input: item initializer failed, > vertex=vertex_1569486223160_0334_1_02 [Map 2], java.io.IOException: Couldn't > create RpcClient protocol > at > org.apache.hadoop.ozone.client.OzoneClientFactory.getClientProtocol(OzoneClientFactory.java:263) > at > org.apache.hadoop.ozone.client.OzoneClientFactory.getClientProtocol(OzoneClientFactory.java:239) > at > org.apache.hadoop.ozone.client.OzoneClientFactory.getRpcClient(OzoneClientFactory.java:203) > at > org.apache.hadoop.ozone.client.OzoneClientFactory.getRpcClient(OzoneClientFactory.java:165) > at > org.apache.hadoop.fs.ozone.BasicOzoneClientAdapterImpl.(BasicOzoneClientAdapterImpl.java:158) > at > org.apache.hadoop.fs.ozone.OzoneClientAdapterImpl.(OzoneClientAdapterImpl.java:50) > at > org.apache.hadoop.fs.ozone.OzoneFileSystem.createAdapter(OzoneFileSystem.java:102) > at > org.apache.hadoop.fs.ozone.BasicOzoneFileSystem.initialize(BasicOzoneFileSystem.java:155) > at org.apache.hadoop.fs.FileSystem.createFileSystem(FileSystem.java:3315) > at org.apache.hadoop.fs.FileSystem.access$200(FileSystem.java:136) > at org.apache.hadoop.fs.FileSystem$Cache.getInternal(FileSystem.java:3364) > at org.apache.hadoop.fs.FileSystem$Cache.get(FileSystem.java:3332) > at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:491) > at org.apache.hadoop.fs.Path.getFileSystem(Path.java:361) > at > org.apache.hadoop.hive.ql.io.orc.OrcInputFormat.generateSplitsInfo(OrcInputFormat.java:1821) > at > org.apache.hadoop.hive.ql.io.orc.OrcInputFormat.getSplits(OrcInputFormat.java:2002) > at > 
org.apache.hadoop.hive.ql.io.HiveInputFormat.addSplitsForGroup(HiveInputFormat.java:524) > at > org.apache.hadoop.hive.ql.io.HiveInputFormat.getSplits(HiveInputFormat.java:781) > at > org.apache.hadoop.hive.ql.exec.tez.HiveSplitGenerator.initialize(HiveSplitGenerator.java:243) > at > org.apache.tez.dag.app.dag.RootInputInitializerManager$InputInitializerCallable$1.run(RootInputInitializerManager.java:278) > at > org.apache.tez.dag.app.dag.RootInputInitializerManager$InputInitializerCallable$1.run(RootInputInitializerManager.java:269) > at java.security.AccessController.doPrivileged(Native Method) > at javax.security.auth.Subject.doAs(Subject.java:422) > at > org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1876) > at > org.apache.tez.dag.app.dag.RootInputInitializerManager$InputInitializerCallable.call(RootInputInitializerManager.java:269) >
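For illustration, one way to close the race described above can be sketched as follows. This is a hypothetical variant, not the committed patch: the value is assigned *before* the initialized flag is set, and both are accessed under one lock, so a second caller can never observe the flag without the value. (Guava's Preconditions.checkState throws IllegalStateException, mirrored here directly.)

```java
// Illustrative race-free variant of ByteStringHelper.init() (not the actual
// patch): flag and value are updated together under the class lock.
public class SafeByteStringHelper {
  private static boolean isUnsafeByteOperationsEnabled;
  private static boolean initialized;

  public static synchronized void init(boolean isUnsafeByteOperation) {
    if (!initialized) {
      // Assign the value first, then publish the flag -- both under the lock,
      // so the T1/T2 interleaving from the description cannot occur.
      isUnsafeByteOperationsEnabled = isUnsafeByteOperation;
      initialized = true;
    } else if (isUnsafeByteOperationsEnabled != isUnsafeByteOperation) {
      // Mirrors Preconditions.checkState, which throws IllegalStateException.
      throw new IllegalStateException("init() called with a conflicting value");
    }
  }

  public static synchronized boolean isUnsafeEnabled() {
    return isUnsafeByteOperationsEnabled;
  }

  public static void main(String[] args) {
    init(true);
    init(true);          // same value again: fine
    try {
      init(false);       // conflicting value: rejected deterministically
    } catch (IllegalStateException e) {
      System.out.println("conflicting init rejected");
    }
  }
}
```

A re-init with the same value is idempotent, and a conflicting re-init now fails deterministically instead of depending on thread timing.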
[jira] [Work logged] (HDDS-1720) Add ability to configure RocksDB logs for Ozone Manager
[ https://issues.apache.org/jira/browse/HDDS-1720?focusedWorklogId=320994=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-320994 ] ASF GitHub Bot logged work on HDDS-1720: Author: ASF GitHub Bot Created on: 01/Oct/19 00:22 Start Date: 01/Oct/19 00:22 Worklog Time Spent: 10m Work Description: anuengineer commented on issue #1538: HDDS-1720 : Add ability to configure RocksDB logs for Ozone Manager. URL: https://github.com/apache/hadoop/pull/1538#issuecomment-536804802 Can you please add a test case that proves that RocksDB actually produces logs that we can see? There is a log listener class in Hadoop. You can use it, or create a test and then grep for some of the log statements. Otherwise the change looks quite good to me. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 320994) Time Spent: 1h 10m (was: 1h) > Add ability to configure RocksDB logs for Ozone Manager > --- > > Key: HDDS-1720 > URL: https://issues.apache.org/jira/browse/HDDS-1720 > Project: Hadoop Distributed Data Store > Issue Type: Task >Reporter: Aravindan Vijayan >Assignee: Aravindan Vijayan >Priority: Major > Labels: pull-request-available > Fix For: 0.5.0 > > Time Spent: 1h 10m > Remaining Estimate: 0h > > While doing performance testing, it was seen that there was no way to get > RocksDB logs for Ozone Manager. Along with RocksDB metrics, this may be a > useful mechanism to understand the health of RocksDB while investigating > large clusters. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Work logged] (HDDS-1984) Fix listBucket API
[ https://issues.apache.org/jira/browse/HDDS-1984?focusedWorklogId=321027=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-321027 ] ASF GitHub Bot logged work on HDDS-1984: Author: ASF GitHub Bot Created on: 01/Oct/19 02:14 Start Date: 01/Oct/19 02:14 Worklog Time Spent: 10m Work Description: bharatviswa504 commented on pull request #1555: HDDS-1984. Fix listBucket API. URL: https://github.com/apache/hadoop/pull/1555 This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 321027) Remaining Estimate: 0h Time Spent: 10m > Fix listBucket API > -- > > Key: HDDS-1984 > URL: https://issues.apache.org/jira/browse/HDDS-1984 > Project: Hadoop Distributed Data Store > Issue Type: Sub-task >Reporter: Bharat Viswanadham >Assignee: YiSheng Lien >Priority: Major > Labels: pull-request-available > Time Spent: 10m > Remaining Estimate: 0h > > This Jira is to fix the listBucket API in the HA code path. > In HA, we have an in-memory cache: we put the result into the in-memory cache > and return the response; later it is picked up by the double buffer thread and > flushed to disk. So now, when we do listBuckets, it should use both the > in-memory cache and the RocksDB bucket table to list buckets in a volume. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Updated] (HDDS-1984) Fix listBucket API
[ https://issues.apache.org/jira/browse/HDDS-1984?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated HDDS-1984: - Labels: pull-request-available (was: ) > Fix listBucket API > -- > > Key: HDDS-1984 > URL: https://issues.apache.org/jira/browse/HDDS-1984 > Project: Hadoop Distributed Data Store > Issue Type: Sub-task >Reporter: Bharat Viswanadham >Assignee: YiSheng Lien >Priority: Major > Labels: pull-request-available > > This Jira is to fix the listBucket API in the HA code path. > In HA, we have an in-memory cache: we put the result into the in-memory cache > and return the response; later it is picked up by the double buffer thread and > flushed to disk. So now, when we do listBuckets, it should use both the > in-memory cache and the RocksDB bucket table to list buckets in a volume. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
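The listing logic described in the issue can be sketched roughly as follows, with plain Java collections standing in for the OM's table cache and its RocksDB-backed bucket table. The names here are illustrative, not the actual OzoneManager code.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Illustrative sketch of the idea in the description: listing buckets must
// consult both the in-memory cache (entries the double buffer has not yet
// flushed) and the on-disk RocksDB bucket table.
public class ListBucketsSketch {
  static List<String> listBuckets(Map<String, String> cacheEntries,
                                  Map<String, String> bucketTable) {
    // Start from the on-disk table, then overlay the cache: a cached entry
    // is newer than whatever (if anything) is on disk for the same key.
    TreeMap<String, String> merged = new TreeMap<>(bucketTable);
    merged.putAll(cacheEntries);
    return new ArrayList<>(merged.keySet());  // sorted bucket names
  }

  public static void main(String[] args) {
    Map<String, String> table = Map.of("bucketA", "info", "bucketB", "info");
    Map<String, String> cache = Map.of("bucketC", "info"); // created, not yet flushed
    System.out.println(listBuckets(cache, table)); // [bucketA, bucketB, bucketC]
  }
}
```

Without the cache overlay, a bucket created moments earlier (bucketC above) would be invisible to listBuckets until the double buffer flushed it to RocksDB, which is exactly the bug this Jira addresses.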
[jira] [Commented] (HDDS-1984) Fix listBucket API
[ https://issues.apache.org/jira/browse/HDDS-1984?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16941466#comment-16941466 ] Bharat Viswanadham commented on HDDS-1984: -- Hi [~cxorm], thanks for taking this up. I missed that you had assigned this Jira to yourself; I have already started working on it and have a patch. Sorry about that. You can take up the other list APIs, which are similar to this Jira. > Fix listBucket API > -- > > Key: HDDS-1984 > URL: https://issues.apache.org/jira/browse/HDDS-1984 > Project: Hadoop Distributed Data Store > Issue Type: Sub-task >Reporter: Bharat Viswanadham >Assignee: YiSheng Lien >Priority: Major > > This Jira is to fix the listBucket API in the HA code path. > In HA, we have an in-memory cache: we put the result into the in-memory cache > and return the response; later it is picked up by the double buffer thread and > flushed to disk. So now, when we do listBuckets, it should use both the > in-memory cache and the RocksDB bucket table to list buckets in a volume. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Assigned] (HDDS-1984) Fix listBucket API
[ https://issues.apache.org/jira/browse/HDDS-1984?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] YiSheng Lien reassigned HDDS-1984: -- Assignee: Bharat Viswanadham > Fix listBucket API > -- > > Key: HDDS-1984 > URL: https://issues.apache.org/jira/browse/HDDS-1984 > Project: Hadoop Distributed Data Store > Issue Type: Sub-task >Reporter: Bharat Viswanadham >Assignee: Bharat Viswanadham >Priority: Major > Labels: pull-request-available > Time Spent: 10m > Remaining Estimate: 0h > > This Jira is to fix the listBucket API in the HA code path. > In HA, we have an in-memory cache: we put the result into the in-memory cache > and return the response; later it is picked up by the double buffer thread and > flushed to disk. So now, when we do listBuckets, it should use both the > in-memory cache and the RocksDB bucket table to list buckets in a volume. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDDS-1984) Fix listBucket API
[ https://issues.apache.org/jira/browse/HDDS-1984?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16941471#comment-16941471 ] Bharat Viswanadham commented on HDDS-1984: -- Just posted an initial starter patch; still thinking about how I can improve it (with the posted patch, the entire map is iterated every time). Will need a further look into how it can be improved. > Fix listBucket API > -- > > Key: HDDS-1984 > URL: https://issues.apache.org/jira/browse/HDDS-1984 > Project: Hadoop Distributed Data Store > Issue Type: Sub-task >Reporter: Bharat Viswanadham >Assignee: Bharat Viswanadham >Priority: Major > Labels: pull-request-available > Time Spent: 10m > Remaining Estimate: 0h > > This Jira is to fix the listBucket API in the HA code path. > In HA, we have an in-memory cache: we put the result into the in-memory cache > and return the response; later it is picked up by the double buffer thread and > flushed to disk. So now, when we do listBuckets, it should use both the > in-memory cache and the RocksDB bucket table to list buckets in a volume. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Assigned] (HDDS-1984) Fix listBucket API
[ https://issues.apache.org/jira/browse/HDDS-1984?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] YiSheng Lien reassigned HDDS-1984: -- Assignee: (was: YiSheng Lien) > Fix listBucket API > -- > > Key: HDDS-1984 > URL: https://issues.apache.org/jira/browse/HDDS-1984 > Project: Hadoop Distributed Data Store > Issue Type: Sub-task >Reporter: Bharat Viswanadham >Priority: Major > Labels: pull-request-available > Time Spent: 10m > Remaining Estimate: 0h > > This Jira is to fix the listBucket API in the HA code path. > In HA, we have an in-memory cache: we put the result into the in-memory cache > and return the response; later it is picked up by the double buffer thread and > flushed to disk. So now, when we do listBuckets, it should use both the > in-memory cache and the RocksDB bucket table to list buckets in a volume. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDDS-1984) Fix listBucket API
[ https://issues.apache.org/jira/browse/HDDS-1984?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16941469#comment-16941469 ] YiSheng Lien commented on HDDS-1984: Hello [~bharat], thanks for the comment. Never mind; I can learn more about it from your PR :) > Fix listBucket API > -- > > Key: HDDS-1984 > URL: https://issues.apache.org/jira/browse/HDDS-1984 > Project: Hadoop Distributed Data Store > Issue Type: Sub-task >Reporter: Bharat Viswanadham >Priority: Major > Labels: pull-request-available > Time Spent: 10m > Remaining Estimate: 0h > > This Jira is to fix the listBucket API in the HA code path. > In HA, we have an in-memory cache: we put the result into the in-memory cache > and return the response; later it is picked up by the double buffer thread and > flushed to disk. So now, when we do listBuckets, it should use both the > in-memory cache and the RocksDB bucket table to list buckets in a volume. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDDS-2175) Propagate System Exceptions from the OzoneManager
[ https://issues.apache.org/jira/browse/HDDS-2175?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16941476#comment-16941476 ] Anu Engineer commented on HDDS-2175: It is something that I disagree with, but if you feel strongly about this, please go ahead. > Propagate System Exceptions from the OzoneManager > - > > Key: HDDS-2175 > URL: https://issues.apache.org/jira/browse/HDDS-2175 > Project: Hadoop Distributed Data Store > Issue Type: Improvement > Components: Ozone Manager >Reporter: Supratim Deka >Assignee: Supratim Deka >Priority: Major > Labels: pull-request-available > Time Spent: 1h > Remaining Estimate: 0h > > Exceptions encountered while processing requests on the OM are categorized as > business exceptions and system exceptions. All of the business exceptions are > captured as OMException and have an associated status code which is returned > to the client. The handling of these is not going to be changed. > Currently system exceptions are returned as INTERNAL ERROR to the client with > a 1 line message string from the exception. The scope of this jira is to > capture system exceptions and propagate the related information (including the > complete stack trace) back to the client. > There are 3 sub-tasks required to achieve this: > 1. Separate capture and handling for OMException and the other > exceptions (IOException). For system exceptions, use the Hadoop IPC > ServiceException mechanism to send the stack trace to the client. > 2. Track and propagate exceptions inside the Ratis OzoneManagerStateMachine and > propagate up to the OzoneManager layer (on the leader). Currently, these > exceptions are not being tracked. > 3. Handle and propagate exceptions from Ratis. > Will raise a jira for each sub-task. > -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
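The business/system split described in the issue can be illustrated with a small sketch. The class and method names below are hypothetical stand-ins, not the actual OzoneManager classes: a business exception keeps its status code, while a system exception is serialized with its full stack trace instead of a one-line INTERNAL ERROR.

```java
import java.io.IOException;
import java.io.PrintWriter;
import java.io.StringWriter;

// Rough sketch of the categorization in the description (illustrative names).
public class ErrorPropagationSketch {
  // Stand-in for the OM business exception carrying a status code.
  static class OMException extends IOException {
    final String status;
    OMException(String message, String status) { super(message); this.status = status; }
  }

  static String toClientError(IOException e) {
    if (e instanceof OMException) {
      // Business path: status code returned to the client, unchanged behavior.
      return ((OMException) e).status + ": " + e.getMessage();
    }
    // System path: keep the complete stack trace for the client instead of
    // collapsing to a one-line message.
    StringWriter trace = new StringWriter();
    e.printStackTrace(new PrintWriter(trace));
    return "INTERNAL_ERROR\n" + trace;
  }

  public static void main(String[] args) {
    System.out.println(toClientError(new OMException("bucket not found", "BUCKET_NOT_FOUND")));
    System.out.println(toClientError(new IOException("disk failure")));
  }
}
```

In the real proposal the system-exception payload would travel over the Hadoop IPC ServiceException mechanism rather than a plain string, but the classification step looks essentially like this.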
[jira] [Work logged] (HDDS-2169) Avoid buffer copies while submitting client requests in Ratis
[ https://issues.apache.org/jira/browse/HDDS-2169?focusedWorklogId=321035=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-321035 ] ASF GitHub Bot logged work on HDDS-2169: Author: ASF GitHub Bot Created on: 01/Oct/19 02:50 Start Date: 01/Oct/19 02:50 Worklog Time Spent: 10m Work Description: szetszwo commented on issue #1517: HDDS-2169 URL: https://github.com/apache/hadoop/pull/1517#issuecomment-536836755 > Thanks @szetszwo for working on this. With the patch, while running the tests in TestDataValidateWithUnsafeByteOperations, the below issue is observed. > > `2019-09-30 21:58:02,745 [grpc-default-executor-2] ERROR segmented.SegmentedRaftLogWorker (SegmentedRaftLogWorker.java:(449)) - e4ab8454-30fe-420c-a1cf-40d223cb4898@group-D0335C23E8DA-SegmentedRaftLogWorker: writeStateMachineData failed for index 1, entry=(t:1, i:1), STATEMACHINELOGENTRY, client-6C45A0D09519, cid=8 java.lang.IndexOutOfBoundsException: End index: 135008824 >= 207 at org.apache.ratis.thirdparty.com.google.protobuf.ByteString.checkRange(ByteString.java:1233) at org.apache.ratis.thirdparty.com.google.protobuf.ByteString$LiteralByteString.substring(ByteString.java:1288) at org.apache.hadoop.hdds.ratis.ContainerCommandRequestMessage.toProto(ContainerCommandRequestMessage.java:66) at org.apache.hadoop.ozone.container.common.transport.server.ratis.ContainerStateMachine.getContainerCommandRequestProto(ContainerStateMachine.java:375) at org.apache.hadoop.ozone.container.common.transport.server.ratis.ContainerStateMachine.writeStateMachineData(ContainerStateMachine.java:494) at org.apache.ratis.server.raftlog.segmented.SegmentedRaftLogWorker$WriteLog.(SegmentedRaftLogWorker.java:447) at org.apache.ratis.server.raftlog.segmented.SegmentedRaftLogWorker.writeLogEntry(SegmentedRaftLogWorker.java:397) at org.apache.ratis.server.raftlog.segmented.SegmentedRaftLog.appendEntryImpl(SegmentedRaftLog.java:411) at 
org.apache.ratis.server.raftlog.RaftLog.lambda$appendEntry$10(RaftLog.java:359) at org.apache.ratis.server.raftlog.RaftLogSequentialOps$Runner.runSequentially(RaftLogSequentialOps.java:77) at org.apache.ratis.server.raftlog.RaftLog.appendEntry(RaftLog.java:359) at org.apache.ratis.server.raftlog.RaftLog.appendImpl(RaftLog.java:183) at org.apache.ratis.server.raftlog.RaftLog.lambda$append$2(RaftLog.java:159) at org.apache.ratis.server.raftlog.RaftLogSequentialOps$Runner.runSequentially(RaftLogSequentialOps.java:68) at org.apache.ratis.server.raftlog.RaftLog.append(RaftLog.java:159) at org.apache.ratis.server.impl.ServerState.appendLog(ServerState.java:282) at org.apache.ratis.server.impl.RaftServerImpl.appendTransaction(RaftServerImpl.java:505) at org.apache.ratis.server.impl.RaftServerImpl.submitClientRequestAsync(RaftServerImpl.java:576) at org.apache.ratis.server.impl.RaftServerProxy.lambda$submitClientRequestAsync$7(RaftServerProxy.java:333) at org.apache.ratis.server.impl.RaftServerProxy.lambda$null$5(RaftServerProxy.java:328) at org.apache.ratis.util.JavaUtils.callAsUnchecked(JavaUtils.java:109) at org.apache.ratis.server.impl.RaftServerProxy.lambda$submitRequest$6(RaftServerProxy.java:328) at java.util.concurrent.CompletableFuture.uniComposeStage(CompletableFuture.java:981) at java.util.concurrent.CompletableFuture.thenCompose(CompletableFuture.java:2124) at org.apache.ratis.server.impl.RaftServerProxy.submitRequest(RaftServerProxy.java:327) at org.apache.ratis.server.impl.RaftServerProxy.submitClientRequestAsync(RaftServerProxy.java:333) at org.apache.ratis.grpc.client.GrpcClientProtocolService$RequestStreamObserver.processClientRequest(GrpcClientProtocolService.java:220) at org.apache.ratis.grpc.client.GrpcClientProtocolService$OrderedRequestStreamObserver.processClientRequest(GrpcClientProtocolService.java:326) at org.apache.ratis.util.SlidingWindow$Server.processRequestsFromHead(SlidingWindow.java:429) at 
org.apache.ratis.util.SlidingWindow$Server.receivedRequest(SlidingWindow.java:421) at org.apache.ratis.grpc.client.GrpcClientProtocolService$OrderedRequestStreamObserver.processClientRequest(GrpcClientProtocolService.java:345) at org.apache.ratis.grpc.client.GrpcClientProtocolService$RequestStreamObserver.onNext(GrpcClientProtocolService.java:240) at org.apache.ratis.grpc.client.GrpcClientProtocolService$RequestStreamObserver.onNext(GrpcClientProtocolService.java:168) at org.apache.ratis.thirdparty.io.grpc.stub.ServerCalls$StreamingServerCallHandler$StreamingServerCallListener.onMessage(ServerCalls.java:248) at org.apache.ratis.thirdparty.io.grpc.internal.ServerCallImpl$ServerStreamListenerImpl.messagesAvailable(ServerCallImpl.java:263) at org.apache.ratis.thirdparty.io.grpc.internal.ServerImpl$JumpToApplicationThreadServerStreamListener$1MessagesAvailable.runInContext(ServerImpl.java:686) at
[jira] [Commented] (HDDS-2001) Update Ratis version to 0.4.0
[ https://issues.apache.org/jira/browse/HDDS-2001?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16941516#comment-16941516 ] Nanda kumar commented on HDDS-2001: --- The change in the ozone-0.4.1 branch was done as part of HDDS-2020. > Update Ratis version to 0.4.0 > - > > Key: HDDS-2001 > URL: https://issues.apache.org/jira/browse/HDDS-2001 > Project: Hadoop Distributed Data Store > Issue Type: Improvement >Reporter: Nanda kumar >Assignee: Nanda kumar >Priority: Major > Labels: pull-request-available > Fix For: 0.5.0 > > Time Spent: 2.5h > Remaining Estimate: 0h > > Update Ratis version to 0.4.0 -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Resolved] (HDDS-2001) Update Ratis version to 0.4.0
[ https://issues.apache.org/jira/browse/HDDS-2001?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nanda kumar resolved HDDS-2001. --- Resolution: Fixed > Update Ratis version to 0.4.0 > - > > Key: HDDS-2001 > URL: https://issues.apache.org/jira/browse/HDDS-2001 > Project: Hadoop Distributed Data Store > Issue Type: Improvement >Reporter: Nanda kumar >Assignee: Nanda kumar >Priority: Major > Labels: pull-request-available > Fix For: 0.5.0 > > Time Spent: 2.5h > Remaining Estimate: 0h > > Update Ratis version to 0.4.0 -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Updated] (HDFS-14885) UI: Fix a typo on WebUI of DataNode.
[ https://issues.apache.org/jira/browse/HDFS-14885?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xieming Li updated HDFS-14885: -- Summary: UI: Fix a typo on WebUI of DataNode. (was: UI: Fix a typo in ) > UI: Fix a typo on WebUI of DataNode. > > > Key: HDFS-14885 > URL: https://issues.apache.org/jira/browse/HDFS-14885 > Project: Hadoop HDFS > Issue Type: Bug > Components: datanode, ui >Reporter: Xieming Li >Assignee: Xieming Li >Priority: Trivial > Attachments: Screen Shot 2019-10-01 at 12.40.29.png > > > A Period('.') should be added to the end of following sentence on WebUI of > DataNode. > "No nodes are decommissioning" > -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Updated] (HDFS-14885) UI: Fix a typo on WebUI of DataNode.
[ https://issues.apache.org/jira/browse/HDFS-14885?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xieming Li updated HDFS-14885: -- Attachment: HDFS-14885.patch Status: Patch Available (was: Open) > UI: Fix a typo on WebUI of DataNode. > > > Key: HDFS-14885 > URL: https://issues.apache.org/jira/browse/HDFS-14885 > Project: Hadoop HDFS > Issue Type: Bug > Components: datanode, ui >Reporter: Xieming Li >Assignee: Xieming Li >Priority: Trivial > Attachments: HDFS-14885.patch, Screen Shot 2019-10-01 at 12.40.29.png > > > A Period('.') should be added to the end of following sentence on WebUI of > DataNode. > "No nodes are decommissioning" > -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Updated] (HDFS-14373) EC : Decoding is failing when block group last incomplete cell fall in to AlignedStripe
[ https://issues.apache.org/jira/browse/HDFS-14373?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Surendra Singh Lilhore updated HDFS-14373: -- Component/s: ec Priority: Critical (was: Major) > EC : Decoding is failing when block group last incomplete cell fall in to > AlignedStripe > --- > > Key: HDFS-14373 > URL: https://issues.apache.org/jira/browse/HDFS-14373 > Project: Hadoop HDFS > Issue Type: Bug > Components: ec, hdfs-client >Affects Versions: 3.1.1 >Reporter: Surendra Singh Lilhore >Assignee: Surendra Singh Lilhore >Priority: Critical > Attachments: HDFS-14373.001.patch > > -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Reopened] (HDFS-7134) Replication count for a block should not update till the blocks have settled on Datanodes
[ https://issues.apache.org/jira/browse/HDFS-7134?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wei-Chiu Chuang reopened HDFS-7134: --- > Replication count for a block should not update till the blocks have settled > on Datanodes > - > > Key: HDFS-7134 > URL: https://issues.apache.org/jira/browse/HDFS-7134 > Project: Hadoop HDFS > Issue Type: Bug > Components: datanode, hdfs >Affects Versions: 1.2.1, 2.6.0, 2.7.3 > Environment: Linux nn1.cluster1.com 2.6.32-431.20.3.el6.x86_64 #1 SMP > Thu Jun 19 21:14:45 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux > [hadoop@nn1 conf]$ cat /etc/redhat-release > CentOS release 6.5 (Final) >Reporter: gurmukh singh >Priority: Critical > Labels: HDFS > Fix For: 3.1.0 > > > The count for the number of replica's for a block should not change till the > blocks have settled on the datanodes. > Test Case: > Hadoop Cluster with 1 namenode and 3 datanodes. > nn1.cluster1.com(192.168.1.70) > dn1.cluster1.com(192.168.1.72) > dn2.cluster1.com(192.168.1.73) > dn3.cluster1.com(192.168.1.74) > Cluster up and running fine with replication set to "1" for parameter > "dfs.replication on all nodes" > > dfs.replication > 1 > > To reduce the wait time, have reduced the dfs.heartbeat and recheck > parameters. > on datanode2 (192.168.1.72) > [hadoop@dn2 ~]$ hadoop fs -Ddfs.replication=2 -put from_dn2 / > [hadoop@dn2 ~]$ hadoop fs -ls /from_dn2 > Found 1 items > -rw-r--r-- 2 hadoop supergroup 17 2014-09-23 13:33 /from_dn2 > On Namenode > === > As expected, copy was done from datanode2, one copy will go locally. > [hadoop@nn1 conf]$ hadoop fsck /from_dn2 -files -blocks -locations > FSCK started by hadoop from /192.168.1.70 for path /from_dn2 at Tue Sep 23 > 13:53:16 IST 2014 > /from_dn2 17 bytes, 1 block(s): OK > 0. blk_8132629811771280764_1175 len=17 repl=2 [192.168.1.74:50010, > 192.168.1.73:50010] > Can see the blocks on the data nodes disks as well under the "current" > directory. 
> Now, shut down datanode2(192.168.1.73) and, as expected, the block moves to another > datanode to maintain a replication of 2 > [hadoop@nn1 conf]$ hadoop fsck /from_dn2 -files -blocks -locations > FSCK started by hadoop from /192.168.1.70 for path /from_dn2 at Tue Sep 23 > 13:54:21 IST 2014 > /from_dn2 17 bytes, 1 block(s): OK > 0. blk_8132629811771280764_1175 len=17 repl=2 [192.168.1.74:50010, > 192.168.1.72:50010] > But now, if I bring back datanode2, although the namenode sees that > this block is in 3 places and fires an invalidate command for > datanode1(192.168.1.72), the replication count on the namenode is bumped to 3 > immediately. > [hadoop@nn1 conf]$ hadoop fsck /from_dn2 -files -blocks -locations > FSCK started by hadoop from /192.168.1.70 for path /from_dn2 at Tue Sep 23 > 13:56:12 IST 2014 > /from_dn2 17 bytes, 1 block(s): OK > 0. blk_8132629811771280764_1175 len=17 repl=3 [192.168.1.74:50010, > 192.168.1.72:50010, 192.168.1.73:50010] > On Datanode1 - the invalidate command has been fired immediately and the > block deleted. > = > 2014-09-23 13:54:17,483 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: > Receiving blk_8132629811771280764_1175 src: /192.168.1.74:38099 dest: > /192.168.1.72:50010 > 2014-09-23 13:54:17,502 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: > Received blk_8132629811771280764_1175 src: /192.168.1.74:38099 dest: > /192.168.1.72:50010 size 17 > 2014-09-23 13:55:28,720 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: > Scheduling blk_8132629811771280764_1175 file > /space/disk1/current/blk_8132629811771280764 for deletion > 2014-09-23 13:55:28,721 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: > Deleted blk_8132629811771280764_1175 at file > /space/disk1/current/blk_8132629811771280764 > The namenode still shows 3 replicas, even though one has been deleted, even > after more than 30 mins. 
> [hadoop@nn1 conf]$ hadoop fsck /from_dn2 -files -blocks -locations > FSCK started by hadoop from /192.168.1.70 for path /from_dn2 at Tue Sep 23 > 14:21:27 IST 2014 > /from_dn2 17 bytes, 1 block(s): OK > 0. blk_8132629811771280764_1175 len=17 repl=3 [192.168.1.74:50010, > 192.168.1.72:50010, 192.168.1.73:50010] > This could be dangerous if a replica is removed or the other 2 datanodes fail. > On Datanode 1 > = > Before datanode1 is brought back: > [hadoop@dn1 conf]$ ls -l /space/disk*/current > /space/disk1/current: > total 28 > -rw-rw-r-- 1 hadoop hadoop 13 Sep 21 09:09 blk_2278001646987517832 > -rw-rw-r-- 1 hadoop hadoop 11 Sep 21 09:09 blk_2278001646987517832_1171.meta > -rw-rw-r-- 1 hadoop hadoop 17 Sep 23 13:54 blk_8132629811771280764 > -rw-rw-r-- 1 hadoop hadoop 11 Sep 23 13:54 blk_8132629811771280764_1175.meta >
[jira] [Resolved] (HDFS-7134) Replication count for a block should not update till the blocks have settled on Datanodes
[ https://issues.apache.org/jira/browse/HDFS-7134?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wei-Chiu Chuang resolved HDFS-7134. --- Resolution: Cannot Reproduce Resolve as cannot reproduce. > Replication count for a block should not update till the blocks have settled > on Datanodes > - > > Key: HDFS-7134 > URL: https://issues.apache.org/jira/browse/HDFS-7134 > Project: Hadoop HDFS > Issue Type: Bug > Components: datanode, hdfs >Affects Versions: 1.2.1, 2.6.0, 2.7.3 > Environment: Linux nn1.cluster1.com 2.6.32-431.20.3.el6.x86_64 #1 SMP > Thu Jun 19 21:14:45 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux > [hadoop@nn1 conf]$ cat /etc/redhat-release > CentOS release 6.5 (Final) >Reporter: gurmukh singh >Priority: Critical > Labels: HDFS > Fix For: 3.1.0 > > > The count for the number of replica's for a block should not change till the > blocks have settled on the datanodes. > Test Case: > Hadoop Cluster with 1 namenode and 3 datanodes. > nn1.cluster1.com(192.168.1.70) > dn1.cluster1.com(192.168.1.72) > dn2.cluster1.com(192.168.1.73) > dn3.cluster1.com(192.168.1.74) > Cluster up and running fine with replication set to "1" for parameter > "dfs.replication on all nodes" > > dfs.replication > 1 > > To reduce the wait time, have reduced the dfs.heartbeat and recheck > parameters. > on datanode2 (192.168.1.72) > [hadoop@dn2 ~]$ hadoop fs -Ddfs.replication=2 -put from_dn2 / > [hadoop@dn2 ~]$ hadoop fs -ls /from_dn2 > Found 1 items > -rw-r--r-- 2 hadoop supergroup 17 2014-09-23 13:33 /from_dn2 > On Namenode > === > As expected, copy was done from datanode2, one copy will go locally. > [hadoop@nn1 conf]$ hadoop fsck /from_dn2 -files -blocks -locations > FSCK started by hadoop from /192.168.1.70 for path /from_dn2 at Tue Sep 23 > 13:53:16 IST 2014 > /from_dn2 17 bytes, 1 block(s): OK > 0. 
blk_8132629811771280764_1175 len=17 repl=2 [192.168.1.74:50010, > 192.168.1.73:50010] > Can see the blocks on the data nodes disks as well under the "current" > directory. > Now, shut down datanode2(192.168.1.73) and, as expected, the block moves to another > datanode to maintain a replication of 2 > [hadoop@nn1 conf]$ hadoop fsck /from_dn2 -files -blocks -locations > FSCK started by hadoop from /192.168.1.70 for path /from_dn2 at Tue Sep 23 > 13:54:21 IST 2014 > /from_dn2 17 bytes, 1 block(s): OK > 0. blk_8132629811771280764_1175 len=17 repl=2 [192.168.1.74:50010, > 192.168.1.72:50010] > But now, if I bring back datanode2, although the namenode sees that > this block is in 3 places and fires an invalidate command for > datanode1(192.168.1.72), the replication count on the namenode is bumped to 3 > immediately. > [hadoop@nn1 conf]$ hadoop fsck /from_dn2 -files -blocks -locations > FSCK started by hadoop from /192.168.1.70 for path /from_dn2 at Tue Sep 23 > 13:56:12 IST 2014 > /from_dn2 17 bytes, 1 block(s): OK > 0. blk_8132629811771280764_1175 len=17 repl=3 [192.168.1.74:50010, > 192.168.1.72:50010, 192.168.1.73:50010] > On Datanode1 - the invalidate command has been fired immediately and the > block deleted. > = > 2014-09-23 13:54:17,483 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: > Receiving blk_8132629811771280764_1175 src: /192.168.1.74:38099 dest: > /192.168.1.72:50010 > 2014-09-23 13:54:17,502 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: > Received blk_8132629811771280764_1175 src: /192.168.1.74:38099 dest: > /192.168.1.72:50010 size 17 > 2014-09-23 13:55:28,720 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: > Scheduling blk_8132629811771280764_1175 file > /space/disk1/current/blk_8132629811771280764 for deletion > 2014-09-23 13:55:28,721 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: > Deleted blk_8132629811771280764_1175 at file > /space/disk1/current/blk_8132629811771280764 > The namenode still shows 3 replicas, even though one has been deleted, even after more than 30 mins. 
> [hadoop@nn1 conf]$ hadoop fsck /from_dn2 -files -blocks -locations > FSCK started by hadoop from /192.168.1.70 for path /from_dn2 at Tue Sep 23 > 14:21:27 IST 2014 > /from_dn2 17 bytes, 1 block(s): OK > 0. blk_8132629811771280764_1175 len=17 repl=3 [192.168.1.74:50010, > 192.168.1.72:50010, 192.168.1.73:50010] > This could be dangerous if a replica is removed or the other 2 datanodes fail. > On Datanode 1 > = > Before datanode1 is brought back: > [hadoop@dn1 conf]$ ls -l /space/disk*/current > /space/disk1/current: > total 28 > -rw-rw-r-- 1 hadoop hadoop 13 Sep 21 09:09 blk_2278001646987517832 > -rw-rw-r-- 1 hadoop hadoop 11 Sep 21 09:09 blk_2278001646987517832_1171.meta > -rw-rw-r-- 1 hadoop hadoop 17 Sep 23 13:54 blk_8132629811771280764 > -rw-rw-r-- 1 hadoop hadoop
[jira] [Work logged] (HDDS-2205) checkstyle.sh reports wrong failure count
[ https://issues.apache.org/jira/browse/HDDS-2205?focusedWorklogId=320975=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-320975 ] ASF GitHub Bot logged work on HDDS-2205: Author: ASF GitHub Bot Created on: 30/Sep/19 23:37 Start Date: 30/Sep/19 23:37 Worklog Time Spent: 10m Work Description: anuengineer commented on pull request #1548: HDDS-2205. checkstyle.sh reports wrong failure count URL: https://github.com/apache/hadoop/pull/1548 This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 320975) Time Spent: 1h (was: 50m) > checkstyle.sh reports wrong failure count > - > > Key: HDDS-2205 > URL: https://issues.apache.org/jira/browse/HDDS-2205 > Project: Hadoop Distributed Data Store > Issue Type: Bug > Components: test >Reporter: Attila Doroszlai >Assignee: Attila Doroszlai >Priority: Trivial > Labels: pull-request-available > Time Spent: 1h > Remaining Estimate: 0h > > {{checkstyle.sh}} outputs files with checkstyle violations and the violations > themselves on separate lines. It then reports line count as number of > failures. > {code:title=target/checkstyle/summary.txt} > hadoop-ozone/common/src/main/java/org/apache/hadoop/ozone/OmUtils.java > 49: Unused import - org.apache.hadoop.ozone.om.OMMetadataManager. > {code} > {code:title=target/checkstyle/failures} > 2 > {code} -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Work logged] (HDDS-2205) checkstyle.sh reports wrong failure count
[ https://issues.apache.org/jira/browse/HDDS-2205?focusedWorklogId=320974=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-320974 ] ASF GitHub Bot logged work on HDDS-2205: Author: ASF GitHub Bot Created on: 30/Sep/19 23:37 Start Date: 30/Sep/19 23:37 Worklog Time Spent: 10m Work Description: anuengineer commented on issue #1548: HDDS-2205. checkstyle.sh reports wrong failure count URL: https://github.com/apache/hadoop/pull/1548#issuecomment-536795581 Thank you for the contribution. I have committed this patch to the trunk. @dineshchitlangia Thank you for the review. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 320974) Time Spent: 50m (was: 40m) > checkstyle.sh reports wrong failure count > - > > Key: HDDS-2205 > URL: https://issues.apache.org/jira/browse/HDDS-2205 > Project: Hadoop Distributed Data Store > Issue Type: Bug > Components: test >Reporter: Attila Doroszlai >Assignee: Attila Doroszlai >Priority: Trivial > Labels: pull-request-available > Time Spent: 50m > Remaining Estimate: 0h > > {{checkstyle.sh}} outputs files with checkstyle violations and the violations > themselves on separate lines. It then reports line count as number of > failures. > {code:title=target/checkstyle/summary.txt} > hadoop-ozone/common/src/main/java/org/apache/hadoop/ozone/OmUtils.java > 49: Unused import - org.apache.hadoop.ozone.om.OMMetadataManager. > {code} > {code:title=target/checkstyle/failures} > 2 > {code} -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Updated] (HDDS-2205) checkstyle.sh reports wrong failure count
[ https://issues.apache.org/jira/browse/HDDS-2205?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Anu Engineer updated HDDS-2205: --- Fix Version/s: 0.5.0 Resolution: Fixed Status: Resolved (was: Patch Available) > checkstyle.sh reports wrong failure count > - > > Key: HDDS-2205 > URL: https://issues.apache.org/jira/browse/HDDS-2205 > Project: Hadoop Distributed Data Store > Issue Type: Bug > Components: test >Reporter: Attila Doroszlai >Assignee: Attila Doroszlai >Priority: Trivial > Labels: pull-request-available > Fix For: 0.5.0 > > Time Spent: 1h > Remaining Estimate: 0h > > {{checkstyle.sh}} outputs files with checkstyle violations and the violations > themselves on separate lines. It then reports line count as number of > failures. > {code:title=target/checkstyle/summary.txt} > hadoop-ozone/common/src/main/java/org/apache/hadoop/ozone/OmUtils.java > 49: Unused import - org.apache.hadoop.ozone.om.OMMetadataManager. > {code} > {code:title=target/checkstyle/failures} > 2 > {code} -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
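The miscount above is easy to see in miniature: the summary file interleaves file-name lines with violation lines, so counting all lines over-reports the failures. The following is my own illustration, not the actual {{checkstyle.sh}} logic; the leading `digits:` pattern for violation lines is an assumption based on the sample summary quoted in the description.

```java
import java.util.List;

public class CheckstyleCount {
    // A violation line in the summary starts with "<line>:"; a file-name
    // line does not. (Format assumed from the example in the description.)
    static long violationCount(List<String> summaryLines) {
        return summaryLines.stream()
                .filter(l -> l.trim().matches("\\d+:.*"))
                .count();
    }

    public static void main(String[] args) {
        List<String> summary = List.of(
            "hadoop-ozone/common/src/main/java/org/apache/hadoop/ozone/OmUtils.java",
            "49: Unused import - org.apache.hadoop.ozone.om.OMMetadataManager.");
        // Naive line counting reports 2 "failures"; only 1 is a violation.
        assert summary.size() == 2;
        assert violationCount(summary) == 1;
    }
}
```

Counting only the lines that carry a violation reproduces the corrected behavior: 1 failure instead of 2.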
[jira] [Updated] (HDFS-14235) Handle ArrayIndexOutOfBoundsException in DataNodeDiskMetrics#slowDiskDetectionDaemon
[ https://issues.apache.org/jira/browse/HDFS-14235?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wei-Chiu Chuang updated HDFS-14235: --- Fix Version/s: 3.1.4 > Handle ArrayIndexOutOfBoundsException in > DataNodeDiskMetrics#slowDiskDetectionDaemon > - > > Key: HDFS-14235 > URL: https://issues.apache.org/jira/browse/HDFS-14235 > Project: Hadoop HDFS > Issue Type: Improvement >Reporter: Surendra Singh Lilhore >Assignee: Ranith Sardar >Priority: Major > Fix For: 3.3.0, 3.2.1, 3.1.4 > > Attachments: HDFS-14235.000.patch, HDFS-14235.001.patch, > HDFS-14235.002.patch, HDFS-14235.003.patch, NPE.png, exception.png > > > The code below throws an exception because {{volumeIterator.next()}} is called > twice without checking hasNext(). > {code:java} > while (volumeIterator.hasNext()) { > FsVolumeSpi volume = volumeIterator.next(); > DataNodeVolumeMetrics metrics = volumeIterator.next().getMetrics(); > String volumeName = volume.getBaseURI().getPath(); > metadataOpStats.put(volumeName, > metrics.getMetadataOperationMean()); > readIoStats.put(volumeName, metrics.getReadIoMean()); > writeIoStats.put(volumeName, metrics.getWriteIoMean()); > }{code} -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-14235) Handle ArrayIndexOutOfBoundsException in DataNodeDiskMetrics#slowDiskDetectionDaemon
[ https://issues.apache.org/jira/browse/HDFS-14235?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16941411#comment-16941411 ] Wei-Chiu Chuang commented on HDFS-14235: Commit applies cleanly in branch-3.1. Updated fix version. > Handle ArrayIndexOutOfBoundsException in > DataNodeDiskMetrics#slowDiskDetectionDaemon > - > > Key: HDFS-14235 > URL: https://issues.apache.org/jira/browse/HDFS-14235 > Project: Hadoop HDFS > Issue Type: Improvement >Reporter: Surendra Singh Lilhore >Assignee: Ranith Sardar >Priority: Major > Fix For: 3.3.0, 3.2.1, 3.1.4 > > Attachments: HDFS-14235.000.patch, HDFS-14235.001.patch, > HDFS-14235.002.patch, HDFS-14235.003.patch, NPE.png, exception.png > > > The code below throws an exception because {{volumeIterator.next()}} is called > twice without checking hasNext(). > {code:java} > while (volumeIterator.hasNext()) { > FsVolumeSpi volume = volumeIterator.next(); > DataNodeVolumeMetrics metrics = volumeIterator.next().getMetrics(); > String volumeName = volume.getBaseURI().getPath(); > metadataOpStats.put(volumeName, > metrics.getMetadataOperationMean()); > readIoStats.put(volumeName, metrics.getReadIoMean()); > writeIoStats.put(volumeName, metrics.getWriteIoMean()); > }{code} -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
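The double-{{next()}} pattern quoted above is the whole bug: each loop iteration consumes two elements while the guard covers only one. A self-contained sketch of the buggy and fixed loop shapes follows; plain Strings stand in for the real FsVolumeSpi/DataNodeVolumeMetrics types, and a standard iterator throws NoSuchElementException where the DataNode's volume iterator threw ArrayIndexOutOfBoundsException.

```java
import java.util.Iterator;
import java.util.List;
import java.util.NoSuchElementException;

public class IteratorDoubleNext {
    // Buggy shape from the description: next() called twice per hasNext().
    static int buggy(List<String> volumes) {
        int processed = 0;
        Iterator<String> it = volumes.iterator();
        while (it.hasNext()) {
            String volume = it.next();
            String metrics = it.next();   // second next() without a hasNext() check
            processed++;
        }
        return processed;
    }

    // Fixed shape: fetch the element once and derive everything from it.
    static int fixed(List<String> volumes) {
        int processed = 0;
        Iterator<String> it = volumes.iterator();
        while (it.hasNext()) {
            String volume = it.next();
            String metrics = volume;      // volume.getMetrics() in the real code
            processed++;
        }
        return processed;
    }

    public static void main(String[] args) {
        // Every volume is visited with the fixed loop.
        assert fixed(List.of("v1", "v2", "v3")) == 3;
        // The buggy loop blows up on an odd-sized volume list: the second
        // next() in the last iteration has nothing left to return.
        boolean threw = false;
        try {
            buggy(List.of("v1", "v2", "v3"));
        } catch (NoSuchElementException e) {
            threw = true;
        }
        assert threw;
    }
}
```

Note that even on an even-sized list the buggy loop silently skips every other volume, so the exception is only the visible half of the defect.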
[jira] [Commented] (HDFS-14305) Serial number in BlockTokenSecretManager could overlap between different namenodes
[ https://issues.apache.org/jira/browse/HDFS-14305?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16941420#comment-16941420 ] Hudson commented on HDFS-14305: --- SUCCESS: Integrated in Jenkins build Hadoop-trunk-Commit #17421 (See [https://builds.apache.org/job/Hadoop-trunk-Commit/17421/]) HDFS-14305. Fix serial number calculation in BlockTokenSecretManager to (shv: rev b3275ab1f2f4546ba4bdc0e48cfa60b5b05071b9) * (edit) hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/security/token/block/TestBlockToken.java * (edit) hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/namenode/ha/TestFailoverWithBlockTokensEnabled.java * (edit) hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/security/token/block/BlockTokenSecretManager.java > Serial number in BlockTokenSecretManager could overlap between different > namenodes > -- > > Key: HDFS-14305 > URL: https://issues.apache.org/jira/browse/HDFS-14305 > Project: Hadoop HDFS > Issue Type: Bug > Components: namenode, security >Reporter: Chao Sun >Assignee: Xiaoqiao He >Priority: Major > Labels: multi-sbnn, release-blocker > Fix For: 3.0.4, 3.3.0, 3.2.1, 3.1.3 > > Attachments: HDFS-14305-007.patch, HDFS-14305-008.patch, > HDFS-14305.001.patch, HDFS-14305.002.patch, HDFS-14305.003.patch, > HDFS-14305.004.patch, HDFS-14305.005.patch, HDFS-14305.006.patch > > > Currently, a {{BlockTokenSecretManager}} starts with a random integer as the > initial serial number, and then use this formula to rotate it: > {code:java} > this.intRange = Integer.MAX_VALUE / numNNs; > this.nnRangeStart = intRange * nnIndex; > this.serialNo = (this.serialNo % intRange) + (nnRangeStart); > {code} > while {{numNNs}} is the total number of NameNodes in the cluster, and > {{nnIndex}} is the index of the current NameNode specified in the > configuration {{dfs.ha.namenodes.}}. > However, with this approach, different NameNode could have overlapping ranges > for serial number. 
For simplicity, let's assume {{Integer.MAX_VALUE}} is 100, > and we have 2 NameNodes {{nn1}} and {{nn2}} in configuration. Then the ranges > for these two are: > {code} > nn1 -> [-49, 49] > nn2 -> [1, 99] > {code} > This is because the initial serial number could be any negative integer. > Moreover, when the keys are updated, the serial number will again be updated > with the formula: > {code} > this.serialNo = (this.serialNo % intRange) + (nnRangeStart); > {code} > which means the new serial number could be updated to a range that belongs to > a different NameNode, thus increasing the chance of collision again. > When the collision happens, DataNodes could overwrite an existing key which > will cause clients to fail because of {{InvalidToken}} error. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Updated] (HDFS-14305) Serial number in BlockTokenSecretManager could overlap between different namenodes
[ https://issues.apache.org/jira/browse/HDFS-14305?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Konstantin Shvachko updated HDFS-14305: --- Fix Version/s: (was: 3.1.3) (was: 3.2.1) (was: 3.0.4) 3.2.2 3.1.4 2.10.0 Hadoop Flags: Reviewed Assignee: Konstantin Shvachko (was: Xiaoqiao He) Resolution: Fixed Status: Resolved (was: Patch Available) [~vagarychen] you are absolutely correct, thanks for the review. I just committed this to trunk, branch-3.2, branch-3.1, and branch-2. Updated fix versions. > Serial number in BlockTokenSecretManager could overlap between different > namenodes > -- > > Key: HDFS-14305 > URL: https://issues.apache.org/jira/browse/HDFS-14305 > Project: Hadoop HDFS > Issue Type: Bug > Components: namenode, security >Reporter: Chao Sun >Assignee: Konstantin Shvachko >Priority: Major > Labels: multi-sbnn, release-blocker > Fix For: 2.10.0, 3.3.0, 3.1.4, 3.2.2 > > Attachments: HDFS-14305-007.patch, HDFS-14305-008.patch, > HDFS-14305.001.patch, HDFS-14305.002.patch, HDFS-14305.003.patch, > HDFS-14305.004.patch, HDFS-14305.005.patch, HDFS-14305.006.patch > > > Currently, a {{BlockTokenSecretManager}} starts with a random integer as the > initial serial number, and then use this formula to rotate it: > {code:java} > this.intRange = Integer.MAX_VALUE / numNNs; > this.nnRangeStart = intRange * nnIndex; > this.serialNo = (this.serialNo % intRange) + (nnRangeStart); > {code} > while {{numNNs}} is the total number of NameNodes in the cluster, and > {{nnIndex}} is the index of the current NameNode specified in the > configuration {{dfs.ha.namenodes.}}. > However, with this approach, different NameNode could have overlapping ranges > for serial number. For simplicity, let's assume {{Integer.MAX_VALUE}} is 100, > and we have 2 NameNodes {{nn1}} and {{nn2}} in configuration. Then the ranges > for these two are: > {code} > nn1 -> [-49, 49] > nn2 -> [1, 99] > {code} > This is because the initial serial number could be any negative integer. 
> Moreover, when the keys are updated, the serial number will again be updated > with the formula: > {code} > this.serialNo = (this.serialNo % intRange) + (nnRangeStart); > {code} > which means the new serial number could be updated to a range that belongs to > a different NameNode, thus increasing the chance of collision again. > When the collision happens, DataNodes could overwrite an existing key which > will cause clients to fail because of {{InvalidToken}} error. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
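The collision window described above can be checked numerically. This sketch applies the quoted rotation formula with 100 standing in for {{Integer.MAX_VALUE}}, exactly as in the description's example; note that Java's % keeps the sign of the dividend, which is why a negative random seed drags nn1's range below its nominal start.

```java
public class SerialRangeOverlap {
    // Small stand-in for Integer.MAX_VALUE, as in the issue description.
    static final int MAX = 100;

    // The rotation formula quoted in the description.
    static int rotate(int serialNo, int numNNs, int nnIndex) {
        int intRange = MAX / numNNs;           // 50 for two namenodes
        int nnRangeStart = intRange * nnIndex; // 0 for nn1, 50 for nn2
        return (serialNo % intRange) + nnRangeStart;
    }

    public static void main(String[] args) {
        // nn1 (index 0) seeded with a negative serial lands at -49.
        assert rotate(-49, 2, 0) == -49;
        // nn2 (index 1) with the same negative seed: -49 % 50 + 50 = 1.
        assert rotate(-49, 2, 1) == 1;
        // nn1 can also produce 1 (1 % 50 + 0), so [1, 49] is reachable
        // by both namenodes - the collision window.
        assert rotate(1, 2, 0) == 1;
    }
}
```

With the stand-in values, nn1's effective range is [-49, 49] and nn2's is [1, 99], matching the overlap the reporter describes.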
[jira] [Commented] (HDDS-2175) Propagate System Exceptions from the OzoneManager
[ https://issues.apache.org/jira/browse/HDDS-2175?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16941456#comment-16941456 ] Arpit Agarwal commented on HDDS-2175: - bq. it is hard to parse these exceptions even when they are part of normal log files. And yet these exceptions are a godsend. I would rather see one exception than 10 obscure log messages since it tells me exactly when something 'exceptional' happened and the code path leading to the occurrence. bq. If we add exceptions to those strings, the human readability of those error messages goes down. The readability goes up. You now actually get a sense for what actually went wrong instead of some generic message. bq. I had a chat with Supratim Deka and I said that I am all for increasing the fidelity of the error codes, that is we can add more error codes if we want to fine tune these messages. Lot more work with inferior results. Error codes are terrible in layered systems [since multiple layers will often wind up translating codes|https://twitter.com/Obdurodon/status/1161700056740876289]. The only way to maintain full fidelity is add a new error code for every single failure path, an impossible task. Instead just present the original exception as it happened. This is friendlier for your end users and painless for developers. bq. I prefer a clear, simple contract between the server and client, I think it makes it easier for future clients to be developed more easily. Exceptions as added here will make development of future clients super easy. Since the exception is stringified and propagated over the wire, all the client has to do is print the string without any interpretation. The fears seems unfounded to me. 
> Propagate System Exceptions from the OzoneManager > - > > Key: HDDS-2175 > URL: https://issues.apache.org/jira/browse/HDDS-2175 > Project: Hadoop Distributed Data Store > Issue Type: Improvement > Components: Ozone Manager >Reporter: Supratim Deka >Assignee: Supratim Deka >Priority: Major > Labels: pull-request-available > Time Spent: 1h > Remaining Estimate: 0h > > Exceptions encountered while processing requests on the OM are categorized as > business exceptions and system exceptions. All of the business exceptions are > captured as OMException and have an associated status code which is returned > to the client. The handling of these is not going to be changed. > Currently system exceptions are returned as INTERNAL ERROR to the client with > a 1 line message string from the exception. The scope of this jira is to > capture system exceptions and propagate the related information(including the > complete stack trace) back to the client. > There are 3 sub-tasks required to achieve this > 1. Separate capture and handling for OMException and the other > exceptions(IOException). For system exceptions, use Hadoop IPC > ServiceException mechanism to send the stack trace to the client. > 2. track and propagate exceptions inside Ratis OzoneManagerStateMachine and > propagate up to the OzoneManager layer (on the leader). Currently, these > exceptions are not being tracked. > 3. Handle and propagate exceptions from Ratis. > Will raise jira for each sub-task. > -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Created] (HDDS-2212) Genconf tool should generate config files for secure cluster setup
Dinesh Chitlangia created HDDS-2212: --- Summary: Genconf tool should generate config files for secure cluster setup Key: HDDS-2212 URL: https://issues.apache.org/jira/browse/HDDS-2212 Project: Hadoop Distributed Data Store Issue Type: Improvement Reporter: Dinesh Chitlangia The Ozone Genconf tool currently generates a minimal ozone-site.xml file. [~raje2411] was trying out a secure ozone setup over an existing HDP-2.x cluster and found the config setup was not straightforward. This jira proposes to extend the Genconf tool so we can generate the required template config files for a secure setup. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Updated] (HDDS-2212) Genconf tool should generate config files for secure cluster setup
[ https://issues.apache.org/jira/browse/HDDS-2212?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dinesh Chitlangia updated HDDS-2212: Component/s: Tools > Genconf tool should generate config files for secure cluster setup > -- > > Key: HDDS-2212 > URL: https://issues.apache.org/jira/browse/HDDS-2212 > Project: Hadoop Distributed Data Store > Issue Type: Improvement > Components: Tools >Reporter: Dinesh Chitlangia >Priority: Major > Labels: newbie > > The Ozone Genconf tool currently generates a minimal ozone-site.xml file. > [~raje2411] was trying out a secure ozone setup over an existing HDP-2.x cluster > and found the config setup was not straightforward. > This jira proposes to extend the Genconf tool so we can generate the required > template config files for a secure setup. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Work logged] (HDDS-1615) ManagedChannel references are being leaked in ReplicationSupervisor.java
[ https://issues.apache.org/jira/browse/HDDS-1615?focusedWorklogId=320976=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-320976 ] ASF GitHub Bot logged work on HDDS-1615: Author: ASF GitHub Bot Created on: 30/Sep/19 23:40 Start Date: 30/Sep/19 23:40 Worklog Time Spent: 10m Work Description: anuengineer commented on issue #1547: HDDS-1615. ManagedChannel references are being leaked in ReplicationS… URL: https://github.com/apache/hadoop/pull/1547#issuecomment-536795997 +1. LGTM. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 320976) Time Spent: 0.5h (was: 20m) > ManagedChannel references are being leaked in ReplicationSupervisor.java > > > Key: HDDS-1615 > URL: https://issues.apache.org/jira/browse/HDDS-1615 > Project: Hadoop Distributed Data Store > Issue Type: Bug > Components: Ozone Datanode >Affects Versions: 0.4.0 >Reporter: Mukul Kumar Singh >Assignee: Mukul Kumar Singh >Priority: Major > Labels: MiniOzoneChaosCluster, pull-request-available > Time Spent: 0.5h > Remaining Estimate: 0h > > ManagedChannel references are being leaked in ReplicationSupervisor.java > {code} > May 30, 2019 8:10:56 AM > org.apache.ratis.thirdparty.io.grpc.internal.ManagedChannelOrphanWrapper$ManagedChannelReference > cleanQueue > SEVERE: *~*~*~ Channel ManagedChannelImpl{logId=1495, > target=192.168.0.3:49868} was not shutdown properly!!! ~*~*~* > Make sure to call shutdown()/shutdownNow() and wait until > awaitTermination() returns true. 
> java.lang.RuntimeException: ManagedChannel allocation site > at > org.apache.ratis.thirdparty.io.grpc.internal.ManagedChannelOrphanWrapper$ManagedChannelReference.(ManagedChannelOrphanWrapper.java:103) > at > org.apache.ratis.thirdparty.io.grpc.internal.ManagedChannelOrphanWrapper.(ManagedChannelOrphanWrapper.java:53) > at > org.apache.ratis.thirdparty.io.grpc.internal.ManagedChannelOrphanWrapper.(ManagedChannelOrphanWrapper.java:44) > at > org.apache.ratis.thirdparty.io.grpc.internal.AbstractManagedChannelImplBuilder.build(AbstractManagedChannelImplBuilder.java:411) > at > org.apache.hadoop.ozone.container.replication.GrpcReplicationClient.(GrpcReplicationClient.java:65) > at > org.apache.hadoop.ozone.container.replication.SimpleContainerDownloader.getContainerDataFromReplicas(SimpleContainerDownloader.java:87) > at > org.apache.hadoop.ozone.container.replication.DownloadAndImportReplicator.replicate(DownloadAndImportReplicator.java:118) > at > org.apache.hadoop.ozone.container.replication.ReplicationSupervisor$TaskRunner.run(ReplicationSupervisor.java:115) > at > java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) > at java.util.concurrent.FutureTask.run(FutureTask.java:266) > at > java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) > at > java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) > at java.lang.Thread.run(Thread.java:748) > {code} -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
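The remedy the orphan warning asks for - shutdown() followed by awaitTermination() before the reference is dropped - follows the same contract as the JDK's ExecutorService, which is used below as a self-contained stand-in for the gRPC ManagedChannel so the sketch runs without gRPC on the classpath. This is an illustration of the shutdown discipline, not the actual GrpcReplicationClient fix.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

public class ShutdownDiscipline {
    // ManagedChannel exposes the same shutdown()/shutdownNow()/
    // awaitTermination() trio; ExecutorService stands in for it here.
    static boolean closeQuietly(ExecutorService channel) {
        channel.shutdown();                      // stop accepting new work
        try {
            if (!channel.awaitTermination(5, TimeUnit.SECONDS)) {
                channel.shutdownNow();           // force-stop stragglers
            }
        } catch (InterruptedException e) {
            channel.shutdownNow();
            Thread.currentThread().interrupt();  // preserve interrupt status
        }
        return channel.isShutdown();
    }

    public static void main(String[] args) {
        ExecutorService ch = Executors.newSingleThreadExecutor();
        // Closing before the reference goes out of scope is what the
        // ManagedChannelOrphanWrapper warning is checking for.
        assert closeQuietly(ch);
    }
}
```

Wrapping the replication call in try/finally (or making the client AutoCloseable and using try-with-resources) guarantees this runs on every path out of ReplicationSupervisor's task.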
[jira] [Work logged] (HDDS-1720) Add ability to configure RocksDB logs for Ozone Manager
[ https://issues.apache.org/jira/browse/HDDS-1720?focusedWorklogId=320977=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-320977 ] ASF GitHub Bot logged work on HDDS-1720: Author: ASF GitHub Bot Created on: 30/Sep/19 23:40 Start Date: 30/Sep/19 23:40 Worklog Time Spent: 10m Work Description: anuengineer commented on issue #1538: HDDS-1720 : Add ability to configure RocksDB logs for Ozone Manager. URL: https://github.com/apache/hadoop/pull/1538#issuecomment-536796152 /retest This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 320977) Time Spent: 50m (was: 40m) > Add ability to configure RocksDB logs for Ozone Manager > --- > > Key: HDDS-1720 > URL: https://issues.apache.org/jira/browse/HDDS-1720 > Project: Hadoop Distributed Data Store > Issue Type: Task >Reporter: Aravindan Vijayan >Assignee: Aravindan Vijayan >Priority: Major > Labels: pull-request-available > Fix For: 0.5.0 > > Time Spent: 50m > Remaining Estimate: 0h > > While doing performance testing, it was seen that there was no way to get > RocksDB logs for Ozone Manager. Along with Rocksdb metrics, this may be a > useful mechanism to understand the health of Rocksdb while investigating > large clusters. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Work logged] (HDDS-1720) Add ability to configure RocksDB logs for Ozone Manager
[ https://issues.apache.org/jira/browse/HDDS-1720?focusedWorklogId=320993=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-320993 ] ASF GitHub Bot logged work on HDDS-1720: Author: ASF GitHub Bot Created on: 01/Oct/19 00:21 Start Date: 01/Oct/19 00:21 Worklog Time Spent: 10m Work Description: anuengineer commented on issue #1538: HDDS-1720 : Add ability to configure RocksDB logs for Ozone Manager. URL: https://github.com/apache/hadoop/pull/1538#issuecomment-536804802 Can you please add a test case that proves that RocksDB actually produces logs that we can see? There is a log listener class in Hadoop; you can use it, or create a test and then grep for some of the log statements. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 320993) Time Spent: 1h (was: 50m) > Add ability to configure RocksDB logs for Ozone Manager > --- > > Key: HDDS-1720 > URL: https://issues.apache.org/jira/browse/HDDS-1720 > Project: Hadoop Distributed Data Store > Issue Type: Task >Reporter: Aravindan Vijayan >Assignee: Aravindan Vijayan >Priority: Major > Labels: pull-request-available > Fix For: 0.5.0 > > Time Spent: 1h > Remaining Estimate: 0h > > While doing performance testing, it was seen that there was no way to get > RocksDB logs for Ozone Manager. Along with Rocksdb metrics, this may be a > useful mechanism to understand the health of Rocksdb while investigating > large clusters. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
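The test the reviewer is asking for can follow the usual capture-and-grep pattern. The log listener class mentioned is presumably Hadoop's GenericTestUtils.LogCapturer; the sketch below imitates it with java.util.logging so it runs by itself, and the RocksDB log statement it greps for is a made-up placeholder:

```java
// Capture log output in memory, trigger the code under test, then grep the
// captured text for an expected statement. Hadoop ships a similar helper
// (GenericTestUtils.LogCapturer); this stand-alone version uses
// java.util.logging so the example is self-contained.
import java.util.logging.Handler;
import java.util.logging.LogRecord;
import java.util.logging.Logger;

public class LogGrepSketch {

  /** In-memory log handler playing the role of Hadoop's LogCapturer. */
  static class CapturingHandler extends Handler {
    private final StringBuilder captured = new StringBuilder();
    @Override public void publish(LogRecord record) {
      captured.append(record.getMessage()).append('\n');
    }
    @Override public void flush() { }
    @Override public void close() { }
    String getOutput() { return captured.toString(); }
  }

  public static void main(String[] args) {
    Logger log = Logger.getLogger("rocksdb-sketch");
    CapturingHandler capture = new CapturingHandler();
    log.addHandler(capture);

    // Stand-in for the code under test: something that should emit a
    // RocksDB log line once DB logging is enabled (hypothetical message).
    log.info("RocksDB version: 6.x (placeholder statement to grep for)");

    // The test then greps the captured output, as suggested in the comment.
    boolean found = capture.getOutput().contains("RocksDB");
    System.out.println("log statement found: " + found);
  }
}
```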
[jira] [Updated] (HDFS-14754) Erasure Coding : The number of Under-Replicated Blocks never reduced
[ https://issues.apache.org/jira/browse/HDFS-14754?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wei-Chiu Chuang updated HDFS-14754: --- Component/s: ec > Erasure Coding : The number of Under-Replicated Blocks never reduced > - > > Key: HDFS-14754 > URL: https://issues.apache.org/jira/browse/HDFS-14754 > Project: Hadoop HDFS > Issue Type: Bug > Components: ec >Reporter: hemanthboyina >Assignee: hemanthboyina >Priority: Critical > Fix For: 3.3.0 > > Attachments: HDFS-14754-addendum.001.patch, HDFS-14754.001.patch, > HDFS-14754.002.patch, HDFS-14754.003.patch, HDFS-14754.004.patch, > HDFS-14754.005.patch, HDFS-14754.006.patch, HDFS-14754.007.patch, > HDFS-14754.008.patch > > > Using EC RS-3-2, 6 DN > We came accross a scenario where in the EC 5 blocks , same block is > replicated thrice and two blocks got missing > Replicated block was not deleting and missing block is not able to ReConstruct -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-10648) Expose Balancer metrics through Metrics2
[ https://issues.apache.org/jira/browse/HDFS-10648?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16941389#comment-16941389 ] Wei-Chiu Chuang commented on HDFS-10648: Thanks for doing this. Without metrics, HDFS-13783 isn't very useful. > Expose Balancer metrics through Metrics2 > > > Key: HDFS-10648 > URL: https://issues.apache.org/jira/browse/HDFS-10648 > Project: Hadoop HDFS > Issue Type: New Feature > Components: balancer & mover, metrics >Reporter: Mark Wagner >Assignee: Chen Zhang >Priority: Major > Labels: metrics > > The Balancer currently prints progress information to the console. For > deployments that run the balancer frequently, it would be helpful to collect > those metrics for publishing to the available sinks. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDDS-2169) Avoid buffer copies while submitting client requests in Ratis
[ https://issues.apache.org/jira/browse/HDDS-2169?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16941365#comment-16941365 ] Tsz-wo Sze commented on HDDS-2169: -- Please ignore the previous Jenkins build. It was testing an old patch (o2169_20190923.patch, just removed). > Avoid buffer copies while submitting client requests in Ratis > - > > Key: HDDS-2169 > URL: https://issues.apache.org/jira/browse/HDDS-2169 > Project: Hadoop Distributed Data Store > Issue Type: Improvement >Reporter: Shashikant Banerjee >Assignee: Tsz-wo Sze >Priority: Major > Labels: pull-request-available > Time Spent: 1h 40m > Remaining Estimate: 0h > > Currently, while sending write requests to Ratis from ozone, the data is > encoded into a protobuf object, and the resultant protobuf is then > converted to a byteString, which internally copies the buffer embedded > inside the protobuf again so that it can be submitted to the Ratis client. > Similarly, while building up the appendRequestProto for the appendRequest, > the data may be copied yet again. The idea here is to let the client pass > the raw data (stateMachine data) separately to the Ratis client without the > copying overhead. > > {code:java} > private CompletableFuture<RaftClientReply> sendRequestAsync( > ContainerCommandRequestProto request) { > try (Scope scope = GlobalTracer.get() > .buildSpan("XceiverClientRatis." + request.getCmdType().name()) > .startActive(true)) { > ContainerCommandRequestProto finalPayload = > ContainerCommandRequestProto.newBuilder(request) > .setTraceID(TracingUtil.exportCurrentSpan()) > .build(); > boolean isReadOnlyRequest = HddsUtils.isReadOnly(finalPayload); > // finalPayload already has the byteString data embedded. > ByteString byteString = finalPayload.toByteString(); -> It involves a > copy again. > if (LOG.isDebugEnabled()) { > LOG.debug("sendCommandAsync {} {}", isReadOnlyRequest, > sanitizeForDebug(finalPayload)); > } > return isReadOnlyRequest ? 
> getClient().sendReadOnlyAsync(() -> byteString) : > getClient().sendAsync(() -> byteString); > } > } > {code} -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Updated] (HDDS-2169) Avoid buffer copies while submitting client requests in Ratis
[ https://issues.apache.org/jira/browse/HDDS-2169?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tsz-wo Sze updated HDDS-2169: - Attachment: (was: o2169_20190923.patch) > Avoid buffer copies while submitting client requests in Ratis > - > > Key: HDDS-2169 > URL: https://issues.apache.org/jira/browse/HDDS-2169 > Project: Hadoop Distributed Data Store > Issue Type: Improvement >Reporter: Shashikant Banerjee >Assignee: Tsz-wo Sze >Priority: Major > Labels: pull-request-available > Time Spent: 1h 40m > Remaining Estimate: 0h > > Currently, while sending write requests to Ratis from ozone, the data is > encoded into a protobuf object, and the resultant protobuf is then > converted to a byteString, which internally copies the buffer embedded > inside the protobuf again so that it can be submitted to the Ratis client. > Similarly, while building up the appendRequestProto for the appendRequest, > the data may be copied yet again. The idea here is to let the client pass > the raw data (stateMachine data) separately to the Ratis client without the > copying overhead. > > {code:java} > private CompletableFuture<RaftClientReply> sendRequestAsync( > ContainerCommandRequestProto request) { > try (Scope scope = GlobalTracer.get() > .buildSpan("XceiverClientRatis." + request.getCmdType().name()) > .startActive(true)) { > ContainerCommandRequestProto finalPayload = > ContainerCommandRequestProto.newBuilder(request) > .setTraceID(TracingUtil.exportCurrentSpan()) > .build(); > boolean isReadOnlyRequest = HddsUtils.isReadOnly(finalPayload); > // finalPayload already has the byteString data embedded. > ByteString byteString = finalPayload.toByteString(); -> It involves a > copy again. > if (LOG.isDebugEnabled()) { > LOG.debug("sendCommandAsync {} {}", isReadOnlyRequest, > sanitizeForDebug(finalPayload)); > } > return isReadOnlyRequest ? 
> getClient().sendReadOnlyAsync(() -> byteString) : > getClient().sendAsync(() -> byteString); > } > } > {code} -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDDS-2175) Propagate System Exceptions from the OzoneManager
[ https://issues.apache.org/jira/browse/HDDS-2175?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16941392#comment-16941392 ] Anu Engineer commented on HDDS-2175: bq. I feel that call stacks are invaluable when included in the bug report to the developer. I completely agree. As I mentioned in my comment in the Github, they are very useful tools for debugging. But we have to weigh the pros and cons of the approach. Here are some downsides, so I will list them out. 1. Code and Style Consistency - Generally, Errors are propagated via Error code and Message (Goland, C, etc) or Exceptions (Java, C++ etc). When we developed this interface, we choose to go with Error code and Message approach instead of Exceptions. So mixing these different approaches creates very inconsistent code flows. 2. Prevent Java server abstractions from leaking to client side - Java exceptions are very java specific; it is hard to parse these exceptions even when they are part of normal log files. It is difficult to read thru a printed stack to even understand the issue. This gets compounded when Exceptions stack. When we were writing this client interface, we wanted to make sure it is easy to write clients in other languages. A simple, Error code and a message is universal, that all languages understand and easy to write other language clients which can speak this protocol. 3. The current code experience - There are several parts of this code, where the clients print out these messages to the users. If we add exceptions to those strings, the human readability of those error messages goes down. 4. If we want to move to exceptions instead of error codes , it is possible (even though I think our future clients will suffer), but we need to move away from the error/message model. That is lot of work, with very little benefit, other than the fact that we will have a consistent experience and exceptions will flow to the client side. 
I had a chat with [~sdeka] and I said that I am all for increasing the fidelity of the error codes, that is we can add more error codes if we want to fine tune these messages. I am also all for logging more on the server side. So I am not against the patch, just wanted to avoid *server side Java exceptions crossing over to the client side*. I prefer a clear, simple contract between the server and client, I think it makes it easier for future clients to be developed more easily. > Propagate System Exceptions from the OzoneManager > - > > Key: HDDS-2175 > URL: https://issues.apache.org/jira/browse/HDDS-2175 > Project: Hadoop Distributed Data Store > Issue Type: Improvement > Components: Ozone Manager >Reporter: Supratim Deka >Assignee: Supratim Deka >Priority: Major > Labels: pull-request-available > Time Spent: 1h > Remaining Estimate: 0h > > Exceptions encountered while processing requests on the OM are categorized as > business exceptions and system exceptions. All of the business exceptions are > captured as OMException and have an associated status code which is returned > to the client. The handling of these is not going to be changed. > Currently system exceptions are returned as INTERNAL ERROR to the client with > a 1 line message string from the exception. The scope of this jira is to > capture system exceptions and propagate the related information(including the > complete stack trace) back to the client. > There are 3 sub-tasks required to achieve this > 1. Separate capture and handling for OMException and the other > exceptions(IOException). For system exceptions, use Hadoop IPC > ServiceException mechanism to send the stack trace to the client. > 2. track and propagate exceptions inside Ratis OzoneManagerStateMachine and > propagate up to the OzoneManager layer (on the leader). Currently, these > exceptions are not being tracked. > 3. Handle and propagate exceptions from Ratis. > Will raise jira for each sub-task. 
> -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
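The contract being defended in the comment above can be boiled down to a few lines: the server translates every exception into a status code plus a one-line message at the RPC boundary, so nothing Java-specific crosses the wire. The sketch below is illustrative only; the enum and class names are invented and are not the actual Ozone protobuf types:

```java
// Error-code-and-message contract at a server boundary: map internal
// exceptions to (status, message) so non-Java clients never have to parse
// a Java stack trace. All names here are hypothetical stand-ins.
public class ErrorContractSketch {

  enum Status { OK, KEY_NOT_FOUND, INTERNAL_ERROR }

  /** What crosses the wire: a code and a short message, never an exception. */
  static final class Response {
    final Status status;
    final String message;
    Response(Status status, String message) {
      this.status = status;
      this.message = message;
    }
  }

  /** Server-side boundary: translate exceptions into the wire contract. */
  static Response handle(Runnable request) {
    try {
      request.run();
      return new Response(Status.OK, "");
    } catch (IllegalArgumentException e) {
      // A "business" error mapped to a precise code, as OMException does.
      return new Response(Status.KEY_NOT_FOUND, e.getMessage());
    } catch (RuntimeException e) {
      // System error: log the full stack server-side, return only a summary.
      return new Response(Status.INTERNAL_ERROR, e.toString());
    }
  }

  public static void main(String[] args) {
    Response r = handle(() -> { throw new IllegalArgumentException("no such key: k1"); });
    System.out.println(r.status + ": " + r.message);
  }
}
```

Adding finer-grained status values, as suggested, improves fidelity without ever shipping a stack trace to the client.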
[jira] [Created] (HDDS-2213) Reduce key provider loading log level in OzoneFileSystem#getAdditionalTokenIssuers
Xiaoyu Yao created HDDS-2213: Summary: Reduce key provider loading log level in OzoneFileSystem#getAdditionalTokenIssuers Key: HDDS-2213 URL: https://issues.apache.org/jira/browse/HDDS-2213 Project: Hadoop Distributed Data Store Issue Type: Improvement Reporter: Vivek Ratnavel Subramanian OzoneFileSystem#getAdditionalTokenIssuers logs an error when a secure client tries to collect an ozone delegation token to run MR/Spark jobs but the ozone file system does not have a kms provider configured. In this case, we simply return a null provider here in the code below. This is a benign error and we should reduce the log level to debug level. {code:java} KeyProvider keyProvider; try { keyProvider = getKeyProvider(); } catch (IOException ioe) { LOG.error("Error retrieving KeyProvider.", ioe); return null; } {code} -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
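The proposed fix is a one-line log-level downgrade at the point where the key provider lookup gives up. The sketch below shows the shape of the change using java.util.logging (FINE standing in for SLF4J's debug level; the surrounding class is hypothetical): at the default level, the benign failure no longer surfaces as an error.

```java
// Demonstrates the effect of downgrading LOG.error to a debug-level call:
// at the default INFO level, the FINE message is suppressed, so a missing
// KMS provider no longer alarms users reading the logs.
import java.util.logging.Level;
import java.util.logging.Logger;

public class LogLevelSketch {
  private static final Logger LOG = Logger.getLogger("OzoneFileSystemSketch");

  /** Stand-in for getAdditionalTokenIssuers() when no KMS is configured. */
  static Object getAdditionalTokenIssuers() {
    try {
      throw new java.io.IOException("no KMS provider configured");
    } catch (java.io.IOException ioe) {
      // Proposed change: debug instead of error, since this is benign.
      LOG.log(Level.FINE, "Error retrieving KeyProvider.", ioe);
      return null;
    }
  }

  public static void main(String[] args) {
    Object issuers = getAdditionalTokenIssuers();
    System.out.println("issuers: " + issuers + ", benign failure stayed quiet");
  }
}
```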
[jira] [Comment Edited] (HDDS-2175) Propagate System Exceptions from the OzoneManager
[ https://issues.apache.org/jira/browse/HDDS-2175?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16941392#comment-16941392 ] Anu Engineer edited comment on HDDS-2175 at 9/30/19 10:54 PM: -- {quote}I feel that call stacks are invaluable when included in the bug report to the developer. {quote} I completely agree. As I mentioned in my comment in the Github, they are very useful tools for debugging. But we have to weigh the pros and cons of the approach. Here are some downsides, so I will list them out. 1. Code and Style Consistency - Generally, Errors are propagated via Error code and Message (Golang, C, etc) or Exceptions (Java, C++ etc). When we developed this interface, we choose to go with Error code and Message approach instead of Exceptions. So mixing these different approaches creates very inconsistent code flows. 2. Prevent Java server abstractions from leaking to client side - Java exceptions are very java specific; it is hard to parse these exceptions even when they are part of normal log files. It is difficult to read thru a printed stack to even understand the issue. This gets compounded when Exceptions stack. When we were writing this client interface, we wanted to make sure it is easy to write clients in other languages. A simple, Error code and a message is universal, that all languages understand and easy to write other language clients which can speak this protocol. 3. The current code experience - There are several parts of this code, where the clients print out these messages to the users. If we add exceptions to those strings, the human readability of those error messages goes down. 4. If we want to move to exceptions instead of error codes , it is possible (even though I think our future clients will suffer), but we need to move away from the error/message model. That is lot of work, with very little benefit, other than the fact that we will have a consistent experience and exceptions will flow to the client side. 
I had a chat with [~sdeka] and I said that I am all for increasing the fidelity of the error codes, that is we can add more error codes if we want to fine tune these messages. I am also all for logging more on the server side. So I am not against the patch, just wanted to avoid *server side Java exceptions crossing over to the client side*. I prefer a clear, simple contract between the server and client, I think it makes it easier for future clients to be developed more easily. was (Author: anu): bq. I feel that call stacks are invaluable when included in the bug report to the developer. I completely agree. As I mentioned in my comment in the Github, they are very useful tools for debugging. But we have to weigh the pros and cons of the approach. Here are some downsides, so I will list them out. 1. Code and Style Consistency - Generally, Errors are propagated via Error code and Message (Goland, C, etc) or Exceptions (Java, C++ etc). When we developed this interface, we choose to go with Error code and Message approach instead of Exceptions. So mixing these different approaches creates very inconsistent code flows. 2. Prevent Java server abstractions from leaking to client side - Java exceptions are very java specific; it is hard to parse these exceptions even when they are part of normal log files. It is difficult to read thru a printed stack to even understand the issue. This gets compounded when Exceptions stack. When we were writing this client interface, we wanted to make sure it is easy to write clients in other languages. A simple, Error code and a message is universal, that all languages understand and easy to write other language clients which can speak this protocol. 3. The current code experience - There are several parts of this code, where the clients print out these messages to the users. If we add exceptions to those strings, the human readability of those error messages goes down. 4. 
If we want to move to exceptions instead of error codes , it is possible (even though I think our future clients will suffer), but we need to move away from the error/message model. That is lot of work, with very little benefit, other than the fact that we will have a consistent experience and exceptions will flow to the client side. I had a chat with [~sdeka] and I said that I am all for increasing the fidelity of the error codes, that is we can add more error codes if we want to fine tune these messages. I am also all for logging more on the server side. So I am not against the patch, just wanted to avoid *server side Java exceptions crossing over to the client side*. I prefer a clear, simple contract between the server and client, I think it makes it easier for future clients to be developed more easily. > Propagate System Exceptions from the OzoneManager > - > >
[jira] [Assigned] (HDDS-2213) Reduce key provider loading log level in OzoneFileSystem#getAdditionalTokenIssuers
[ https://issues.apache.org/jira/browse/HDDS-2213?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Aravindan Vijayan reassigned HDDS-2213: --- Assignee: Shweta > Reduce key provider loading log level in > OzoneFileSystem#getAdditionalTokenIssuers > -- > > Key: HDDS-2213 > URL: https://issues.apache.org/jira/browse/HDDS-2213 > Project: Hadoop Distributed Data Store > Issue Type: Improvement >Reporter: Vivek Ratnavel Subramanian >Assignee: Shweta >Priority: Minor > > OzoneFileSystem#getAdditionalTokenIssuers log an error when secure client > tries to collect ozone delegation token to run MR/Spark jobs but ozone file > system does not have a kms provider configured. In this case, we simply > return null provider here in the code below. This is a benign error and we > should reduce the log level to debug level. > {code:java} > KeyProvider keyProvider; > try { > keyProvider = getKeyProvider(); } > catch (IOException ioe) { > LOG.error("Error retrieving KeyProvider.", ioe); > return null; > } > {code} -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-14814) RBF: RouterQuotaUpdateService supports inherited rule.
[ https://issues.apache.org/jira/browse/HDFS-14814?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16941451#comment-16941451 ] Jinglun commented on HDFS-14814: Thanks [~elgoiri] for your fast reply! Agree with your comments, especially the first one about setQuota(), it's very reasonable! Only one question: {quote}I think that in the loop in getGlobalQuota, you could just do the ifs, and not do the if with the break, you will get the same number of comparissons. {quote} Do you mean the code below? {code:java} Entry<String, QuotaUsage> entry = pts.lastEntry(); while (entry != null) { String ppath = entry.getKey(); QuotaUsage quota = entry.getValue(); if (nQuota == HdfsConstants.QUOTA_RESET) { nQuota = quota.getQuota(); } if (sQuota == HdfsConstants.QUOTA_RESET) { sQuota = quota.getSpaceQuota(); } entry = pts.lowerEntry(ppath); }{code} In my understanding, if I don't break, I'll search all the entries even if I already got the values for nQuota and sQuota. So I want to break to save some pts.lowerEntry(ppath) calls. Correct me if I'm wrong. Thanks! > RBF: RouterQuotaUpdateService supports inherited rule. > -- > > Key: HDFS-14814 > URL: https://issues.apache.org/jira/browse/HDFS-14814 > Project: Hadoop HDFS > Issue Type: Improvement >Reporter: Jinglun >Assignee: Jinglun >Priority: Major > Attachments: HDFS-14814.001.patch, HDFS-14814.002.patch, > HDFS-14814.003.patch, HDFS-14814.004.patch, HDFS-14814.005.patch, > HDFS-14814.006.patch, HDFS-14814.007.patch, HDFS-14814.008.patch, > HDFS-14814.009.patch, HDFS-14814.010.patch > > > I want to add a rule *'The quota should be set the same as the nearest > parent'* to Global Quota. Supposing we have the mount table below. 
> M1: /dir-a ns0->/dir-a \{nquota=10,squota=20} > M2: /dir-a/dir-b ns1->/dir-b \{nquota=-1,squota=30} > M3: /dir-a/dir-b/dir-c ns2->/dir-c \{nquota=-1,squota=-1} > M4: /dir-d ns3->/dir-d \{nquota=-1,squota=-1} > > The quota for the remote locations on the namespaces should be: > ns0->/dir-a \{nquota=10,squota=20} > ns1->/dir-b \{nquota=10,squota=30} > ns2->/dir-c \{nquota=10,squota=30} > ns3->/dir-d \{nquota=-1,squota=-1} > > The quota of the remote location is set the same as the corresponding > MountTable, and if there is no quota of the MountTable then the quota is set > to the nearest parent MountTable with quota. > > It's easy to implement it. In RouterQuotaUpdateService each time we compute > the currentQuotaUsage, we can get the quota info for each MountTable. We can > do a > check and fix all the MountTable which's quota doesn't match the rule above. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
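For reference, the early-break version of the loop being discussed can be sketched as follows, with a TreeMap of parent paths standing in for the mount-table state and QuotaUsage reduced to a pair of longs (a simplified model, not the actual RouterQuotaUpdateService code):

```java
// Walks parent entries from the deepest path upward, inheriting each quota
// from the nearest ancestor that sets it, and breaks once both values are
// resolved -- saving the remaining lowerEntry() lookups, which is Jinglun's
// point in the comment above.
import java.util.Map;
import java.util.TreeMap;

public class QuotaWalkSketch {
  static final long QUOTA_RESET = -1; // stand-in for HdfsConstants.QUOTA_RESET

  // quota arrays hold {nameQuota, spaceQuota}
  static long[] getGlobalQuota(TreeMap<String, long[]> pts) {
    long nQuota = QUOTA_RESET;
    long sQuota = QUOTA_RESET;
    Map.Entry<String, long[]> entry = pts.lastEntry();
    while (entry != null) {
      long[] quota = entry.getValue();
      if (nQuota == QUOTA_RESET) {
        nQuota = quota[0];
      }
      if (sQuota == QUOTA_RESET) {
        sQuota = quota[1];
      }
      if (nQuota != QUOTA_RESET && sQuota != QUOTA_RESET) {
        break; // both resolved: skip the remaining lowerEntry() lookups
      }
      entry = pts.lowerEntry(entry.getKey());
    }
    return new long[] {nQuota, sQuota};
  }

  public static void main(String[] args) {
    // Mirrors M1..M3 from the description: /dir-a sets both quotas,
    // /dir-a/dir-b only the space quota, /dir-a/dir-b/dir-c neither.
    TreeMap<String, long[]> pts = new TreeMap<>();
    pts.put("/dir-a", new long[] {10, 20});
    pts.put("/dir-a/dir-b", new long[] {QUOTA_RESET, 30});
    pts.put("/dir-a/dir-b/dir-c", new long[] {QUOTA_RESET, QUOTA_RESET});
    long[] q = getGlobalQuota(pts);
    System.out.println("nQuota=" + q[0] + " sQuota=" + q[1]);
  }
}
```

For the dir-c mount this yields nquota=10 inherited from /dir-a and squota=30 from /dir-a/dir-b, matching the expected quotas in the description.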
[jira] [Updated] (HDDS-2207) Update Ratis to latest snapshot
[ https://issues.apache.org/jira/browse/HDDS-2207?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Marton Elek updated HDDS-2207: -- Summary: Update Ratis to latest snapshot (was: Update Ratis to latest snnapshot) > Update Ratis to latest snapshot > --- > > Key: HDDS-2207 > URL: https://issues.apache.org/jira/browse/HDDS-2207 > Project: Hadoop Distributed Data Store > Issue Type: Bug > Components: Ozone Client >Reporter: Shashikant Banerjee >Assignee: Shashikant Banerjee >Priority: Major > Labels: pull-request-available > Fix For: 0.5.0 > > Time Spent: 20m > Remaining Estimate: 0h > > This Jira aims to update ozone with the latest ratis snapshot, which has a > critical fix for the retry behaviour on getting a not-leader exception in > the client. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Work logged] (HDDS-2183) Container and pipline subcommands of scmcli should be grouped
[ https://issues.apache.org/jira/browse/HDDS-2183?focusedWorklogId=320483=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-320483 ] ASF GitHub Bot logged work on HDDS-2183: Author: ASF GitHub Bot Created on: 30/Sep/19 13:00 Start Date: 30/Sep/19 13:00 Worklog Time Spent: 10m Work Description: elek commented on pull request #1532: HDDS-2183. Container and pipline subcommands of scmcli should be grouped. URL: https://github.com/apache/hadoop/pull/1532 This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 320483) Time Spent: 50m (was: 40m) > Container and pipline subcommands of scmcli should be grouped > - > > Key: HDDS-2183 > URL: https://issues.apache.org/jira/browse/HDDS-2183 > Project: Hadoop Distributed Data Store > Issue Type: Improvement >Reporter: Marton Elek >Assignee: YiSheng Lien >Priority: Major > Labels: pull-request-available > Time Spent: 50m > Remaining Estimate: 0h > > Once upon an time when we had only a few subcommands under `ozone scmcli` to > manage containers. > > Now we have many admin commands some of them are grouped to a subcommand (eg. > safemode, replicationmanager) some of are not. > > I propose to group the container and pipeline related commands: > > Instead of "ozone scmcli info" use "ozone scmcli container info" > Instead of "ozone scmcli list" use "ozone scmcli container list" > Instead of "ozone scmcli listPipelines" use "ozone scmcli pipeline list" > > And so on... -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Updated] (HDDS-2183) Container and pipline subcommands of scmcli should be grouped
[ https://issues.apache.org/jira/browse/HDDS-2183?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Marton Elek updated HDDS-2183: -- Fix Version/s: 0.5.0 Resolution: Fixed Status: Resolved (was: Patch Available) > Container and pipline subcommands of scmcli should be grouped > - > > Key: HDDS-2183 > URL: https://issues.apache.org/jira/browse/HDDS-2183 > Project: Hadoop Distributed Data Store > Issue Type: Improvement >Reporter: Marton Elek >Assignee: YiSheng Lien >Priority: Major > Labels: pull-request-available > Fix For: 0.5.0 > > Time Spent: 50m > Remaining Estimate: 0h > > Once upon an time when we had only a few subcommands under `ozone scmcli` to > manage containers. > > Now we have many admin commands some of them are grouped to a subcommand (eg. > safemode, replicationmanager) some of are not. > > I propose to group the container and pipeline related commands: > > Instead of "ozone scmcli info" use "ozone scmcli container info" > Instead of "ozone scmcli list" use "ozone scmcli container list" > Instead of "ozone scmcli listPipelines" use "ozone scmcli pipeline list" > > And so on... -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDDS-2183) Container and pipline subcommands of scmcli should be grouped
[ https://issues.apache.org/jira/browse/HDDS-2183?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16940948#comment-16940948 ] Hudson commented on HDDS-2183: -- SUCCESS: Integrated in Jenkins build Hadoop-trunk-Commit #17415 (See [https://builds.apache.org/job/Hadoop-trunk-Commit/17415/]) HDDS-2183. Container and pipline subcommands of scmcli should be grouped (elek: rev d6b0a8da77916ed814c0b04bd5f3a46e8c59268f) * (edit) hadoop-hdds/tools/src/main/java/org/apache/hadoop/hdds/scm/cli/pipeline/ClosePipelineSubcommand.java * (edit) hadoop-hdds/tools/src/main/java/org/apache/hadoop/hdds/scm/cli/container/ListSubcommand.java * (edit) hadoop-hdds/tools/src/main/java/org/apache/hadoop/hdds/scm/cli/pipeline/ListPipelinesSubcommand.java * (edit) hadoop-hdds/tools/src/main/java/org/apache/hadoop/hdds/scm/cli/container/InfoSubcommand.java * (edit) hadoop-hdds/tools/src/main/java/org/apache/hadoop/hdds/scm/cli/pipeline/DeactivatePipelineSubcommand.java * (edit) hadoop-hdds/tools/src/main/java/org/apache/hadoop/hdds/scm/cli/container/DeleteSubcommand.java * (add) hadoop-hdds/tools/src/main/java/org/apache/hadoop/hdds/scm/cli/pipeline/PipelineCommands.java * (edit) hadoop-hdds/tools/src/main/java/org/apache/hadoop/hdds/scm/cli/pipeline/ActivatePipelineSubcommand.java * (add) hadoop-hdds/tools/src/main/java/org/apache/hadoop/hdds/scm/cli/container/ContainerCommands.java * (edit) hadoop-hdds/tools/src/main/java/org/apache/hadoop/hdds/scm/cli/container/CreateSubcommand.java * (edit) hadoop-hdds/tools/src/main/java/org/apache/hadoop/hdds/scm/cli/SCMCLI.java * (edit) hadoop-hdds/tools/src/main/java/org/apache/hadoop/hdds/scm/cli/container/CloseSubcommand.java > Container and pipline subcommands of scmcli should be grouped > - > > Key: HDDS-2183 > URL: https://issues.apache.org/jira/browse/HDDS-2183 > Project: Hadoop Distributed Data Store > Issue Type: Improvement >Reporter: Marton Elek >Assignee: YiSheng Lien >Priority: Major > Labels: pull-request-available > 
Fix For: 0.5.0 > > Time Spent: 50m > Remaining Estimate: 0h > > Once upon an time when we had only a few subcommands under `ozone scmcli` to > manage containers. > > Now we have many admin commands some of them are grouped to a subcommand (eg. > safemode, replicationmanager) some of are not. > > I propose to group the container and pipeline related commands: > > Instead of "ozone scmcli info" use "ozone scmcli container info" > Instead of "ozone scmcli list" use "ozone scmcli container list" > Instead of "ozone scmcli listPipelines" use "ozone scmcli pipeline list" > > And so on... -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Resolved] (HDDS-2209) Checkstyle issue in OmUtils on trunk
[ https://issues.apache.org/jira/browse/HDDS-2209?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dinesh Chitlangia resolved HDDS-2209. - Resolution: Duplicate > Checkstyle issue in OmUtils on trunk > - > > Key: HDDS-2209 > URL: https://issues.apache.org/jira/browse/HDDS-2209 > Project: Hadoop Distributed Data Store > Issue Type: Bug >Reporter: Marton Elek >Priority: Trivial > Labels: newbie, newbie++ > > HDDS-2174 introduced a new checkstyle error: > {code:java} > hadoop-ozone/common/src/main/java/org/apache/hadoop/ozone/OmUtils.java > 49: Unused import - org.apache.hadoop.ozone.om.OMMetadataManager. {code} > -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDDS-2209) Checkstyle issue in OmUtils on trunk
[ https://issues.apache.org/jira/browse/HDDS-2209?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16940959#comment-16940959 ] Dinesh Chitlangia commented on HDDS-2209: - This is addressed by HDDS-2202. Closing this as a duplicate. > Checkstyle issue in OmUtils on trunk > - > > Key: HDDS-2209 > URL: https://issues.apache.org/jira/browse/HDDS-2209 > Project: Hadoop Distributed Data Store > Issue Type: Bug >Reporter: Marton Elek >Priority: Trivial > Labels: newbie, newbie++ > > HDDS-2174 introduced a new checkstyle error: > {code:java} > hadoop-ozone/common/src/main/java/org/apache/hadoop/ozone/OmUtils.java > 49: Unused import - org.apache.hadoop.ozone.om.OMMetadataManager. {code} > -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Updated] (HDDS-2199) In SCMNodeManager dnsToUuidMap cannot track multiple DNs on the same host
[ https://issues.apache.org/jira/browse/HDDS-2199?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Stephen O'Donnell updated HDDS-2199: Status: Patch Available (was: Open) > In SCMNodeManager dnsToUuidMap cannot track multiple DNs on the same host > - > > Key: HDDS-2199 > URL: https://issues.apache.org/jira/browse/HDDS-2199 > Project: Hadoop Distributed Data Store > Issue Type: Bug >Affects Versions: 0.5.0 >Reporter: Stephen O'Donnell >Assignee: Stephen O'Donnell >Priority: Major > Labels: pull-request-available > Time Spent: 20m > Remaining Estimate: 0h > > Often in test clusters and tests, we start multiple datanodes on the same > host. > In SCMNodeManager.register() there is a map of hostname -> datanode UUID > called dnsToUuidMap. > If several DNs register from the same host, the entry in the map will be > overwritten and the last DN to register will 'win'. > This means that the method getNodeByAddress() does not return the correct > DatanodeDetails object when multiple DNs are registered from the same address. > This method is only used in SCMBlockProtocolServer.sortDatanodes() to allow > it to see if one of the nodes matches the client, but it needs to be used by > the Decommission code. > Perhaps we could change the getNodeByAddress() method to return a list of > DNs? In normal production clusters, there should only be one returned, but in > test clusters, there may be many. Any code looking for a specific DN entry > would need to iterate the list and match on the port number too, as host:port > would be the unique definition of a datanode. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
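The proposed change can be sketched in a few lines: map each hostname to a *list* of datanode UUIDs, so a second registration from the same host appends instead of overwriting. This is a simplified illustration with made-up types, not the actual SCMNodeManager code, which tracks DatanodeDetails rather than bare UUIDs.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.UUID;

// Sketch of the proposed dnsToUuidMap fix: one host may own several DNs.
class DnsToUuidSketch {
    private final Map<String, List<UUID>> dnsToUuids = new HashMap<>();

    // Called on datanode registration; appends rather than overwrites,
    // so the last DN to register no longer 'wins'.
    void register(String host, UUID dnUuid) {
        dnsToUuids.computeIfAbsent(host, h -> new ArrayList<>()).add(dnUuid);
    }

    // Returns every datanode registered from the given address. On a
    // production cluster this usually has one entry; in test clusters it
    // may have several, and callers match on port to pick a specific DN.
    List<UUID> getNodesByAddress(String host) {
        return dnsToUuids.getOrDefault(host, new ArrayList<>());
    }
}
```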
[jira] [Work logged] (HDDS-2199) In SCMNodeManager dnsToUuidMap cannot track multiple DNs on the same host
[ https://issues.apache.org/jira/browse/HDDS-2199?focusedWorklogId=320540=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-320540 ] ASF GitHub Bot logged work on HDDS-2199: Author: ASF GitHub Bot Created on: 30/Sep/19 14:54 Start Date: 30/Sep/19 14:54 Worklog Time Spent: 10m Work Description: sodonnel commented on issue #1551: HDDS-2199 In SCMNodeManager dnsToUuidMap cannot track multiple DNs on the same host URL: https://github.com/apache/hadoop/pull/1551#issuecomment-536599549 /label ozone This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 320540) Time Spent: 20m (was: 10m) > In SCMNodeManager dnsToUuidMap cannot track multiple DNs on the same host > - > > Key: HDDS-2199 > URL: https://issues.apache.org/jira/browse/HDDS-2199 > Project: Hadoop Distributed Data Store > Issue Type: Bug >Affects Versions: 0.5.0 >Reporter: Stephen O'Donnell >Assignee: Stephen O'Donnell >Priority: Major > Labels: pull-request-available > Time Spent: 20m > Remaining Estimate: 0h > > Often in test clusters and tests, we start multiple datanodes on the same > host. > In SCMNodeManager.register() there is a map of hostname -> datanode UUID > called dnsToUuidMap. > If several DNs register from the same host, the entry in the map will be > overwritten and the last DN to register will 'win'. > This means that the method getNodeByAddress() does not return the correct > DatanodeDetails object when multiple DNs are registered from the same address. > This method is only used in SCMBlockProtocolServer.sortDatanodes() to allow > it to see if one of the nodes matches the client, but it needs to be used by > the Decommission code. > Perhaps we could change the getNodeByAddress() method to return a list of > DNs? 
In normal production clusters, there should only be one returned, but in > test clusters, there may be many. Any code looking for a specific DN entry > would need to iterate the list and match on the port number too, as host:port > would be the unique definition of a datanode. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Work logged] (HDDS-2034) Async RATIS pipeline creation and destroy through heartbeat commands
[ https://issues.apache.org/jira/browse/HDDS-2034?focusedWorklogId=320541=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-320541 ] ASF GitHub Bot logged work on HDDS-2034: Author: ASF GitHub Bot Created on: 30/Sep/19 14:54 Start Date: 30/Sep/19 14:54 Worklog Time Spent: 10m Work Description: lokeshj1703 commented on issue #1469: HDDS-2034. Async RATIS pipeline creation and destroy through heartbea… URL: https://github.com/apache/hadoop/pull/1469#issuecomment-536592541 > I think the purpose of safemode is to guarantee that Ozone cluster is ready to provide service to Ozone client once safemode is exited. @ChenSammi I agree with that. I think the problem occurs with OneReplicaPipelineSafeModeRule. This rule makes sure that at least one datanode in the old pipeline is reported so that reads for OPEN containers can go through. Here I think that old pipelines need to be tracked separately. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 320541) Time Spent: 11h 20m (was: 11h 10m) > Async RATIS pipeline creation and destroy through heartbeat commands > > > Key: HDDS-2034 > URL: https://issues.apache.org/jira/browse/HDDS-2034 > Project: Hadoop Distributed Data Store > Issue Type: Sub-task >Reporter: Sammi Chen >Assignee: Sammi Chen >Priority: Major > Labels: pull-request-available > Time Spent: 11h 20m > Remaining Estimate: 0h > > Currently, pipeline creation and destroy are synchronous operations. SCM > directly connects to each datanode of the pipeline through a gRPC channel to > create or destroy the pipeline. > This task is to remove the gRPC channel and send pipeline creation and destroy > actions through heartbeat commands to each datanode. 
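The asynchronous pattern described in the issue can be sketched as a per-datanode command queue: SCM enqueues the pipeline action instead of opening a gRPC channel, and the datanode drains the queue when it next heartbeats. The class, enum, and method names below are illustrative, not the actual HDDS command-dispatch types.

```java
import java.util.ArrayDeque;
import java.util.Queue;

// Sketch of heartbeat-carried pipeline commands, per datanode.
class HeartbeatCommandSketch {
    enum PipelineAction { CREATE, DESTROY }

    // Command queue held on the SCM side for one datanode.
    private final Queue<PipelineAction> pending = new ArrayDeque<>();

    // SCM side: schedule work without contacting the datanode directly.
    void schedule(PipelineAction action) {
        pending.add(action);
    }

    // Datanode side: the heartbeat response carries the next queued
    // command, or null when there is nothing to do.
    PipelineAction nextCommandOnHeartbeat() {
        return pending.poll();
    }
}
```

The trade-off this models: pipeline creation now completes no sooner than the next heartbeat interval, which is why the surrounding discussion worries about safemode rules that expect pipelines to be reported promptly.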
-- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Resolved] (HDDS-2207) Update Ratis to latest snapshot
[ https://issues.apache.org/jira/browse/HDDS-2207?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Mukul Kumar Singh resolved HDDS-2207. - Resolution: Fixed Thanks for working on this [~shashikant]. I have committed this to trunk. > Update Ratis to latest snapshot > --- > > Key: HDDS-2207 > URL: https://issues.apache.org/jira/browse/HDDS-2207 > Project: Hadoop Distributed Data Store > Issue Type: Bug > Components: Ozone Client >Reporter: Shashikant Banerjee >Assignee: Shashikant Banerjee >Priority: Major > Labels: pull-request-available > Fix For: 0.5.0 > > Time Spent: 40m > Remaining Estimate: 0h > > This Jira aims to update Ozone with the latest Ratis snapshot, which has a critical > fix for retry behaviour on getting a not-leader exception in the client. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Work logged] (HDDS-2187) ozone-mr test fails with No FileSystem for scheme "o3fs"
[ https://issues.apache.org/jira/browse/HDDS-2187?focusedWorklogId=320565=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-320565 ] ASF GitHub Bot logged work on HDDS-2187: Author: ASF GitHub Bot Created on: 30/Sep/19 15:16 Start Date: 30/Sep/19 15:16 Worklog Time Spent: 10m Work Description: hadoop-yetus commented on issue #1537: HDDS-2187. ozone-mr test fails with No FileSystem for scheme o3fs URL: https://github.com/apache/hadoop/pull/1537#issuecomment-536609986 :broken_heart: **-1 overall** | Vote | Subsystem | Runtime | Comment | |::|--:|:|:| | 0 | reexec | 44 | Docker mode activated. | ||| _ Prechecks _ | | +1 | dupname | 1 | No case conflicting files found. | | 0 | yamllint | 0 | yamllint was not available. | | 0 | shelldocs | 0 | Shelldocs was not available. | | +1 | @author | 0 | The patch does not contain any @author tags. | | -1 | test4tests | 0 | The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. | ||| _ trunk Compile Tests _ | | 0 | mvndep | 16 | Maven dependency ordering for branch | | -1 | mvninstall | 37 | hadoop-hdds in trunk failed. | | -1 | mvninstall | 40 | hadoop-ozone in trunk failed. | | -1 | compile | 23 | hadoop-hdds in trunk failed. | | -1 | compile | 20 | hadoop-ozone in trunk failed. | | +1 | mvnsite | 0 | trunk passed | | +1 | shadedclient | 817 | branch has no errors when building and testing our client artifacts. | | -1 | javadoc | 25 | hadoop-hdds in trunk failed. | | -1 | javadoc | 24 | hadoop-ozone in trunk failed. | ||| _ Patch Compile Tests _ | | 0 | mvndep | 22 | Maven dependency ordering for patch | | -1 | mvninstall | 40 | hadoop-hdds in the patch failed. | | -1 | mvninstall | 41 | hadoop-ozone in the patch failed. | | -1 | compile | 30 | hadoop-hdds in the patch failed. | | -1 | compile | 24 | hadoop-ozone in the patch failed. 
| | -1 | javac | 30 | hadoop-hdds in the patch failed. | | -1 | javac | 24 | hadoop-ozone in the patch failed. | | +1 | mvnsite | 0 | the patch passed | | +1 | shellcheck | 0 | There were no new shellcheck issues. | | +1 | whitespace | 0 | The patch has no whitespace issues. | | +1 | shadedclient | 761 | patch has no errors when building and testing our client artifacts. | | -1 | javadoc | 27 | hadoop-hdds in the patch failed. | | -1 | javadoc | 23 | hadoop-ozone in the patch failed. | ||| _ Other Tests _ | | -1 | unit | 30 | hadoop-hdds in the patch failed. | | -1 | unit | 33 | hadoop-ozone in the patch failed. | | +1 | asflicense | 39 | The patch does not generate ASF License warnings. | | | | 2291 | | | Subsystem | Report/Notes | |--:|:-| | Docker | Client=19.03.1 Server=19.03.1 base: https://builds.apache.org/job/hadoop-multibranch/job/PR-1537/3/artifact/out/Dockerfile | | GITHUB PR | https://github.com/apache/hadoop/pull/1537 | | Optional Tests | dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient yamllint shellcheck shelldocs | | uname | Linux 96c3e820363b 4.15.0-58-generic #64-Ubuntu SMP Tue Aug 6 11:12:41 UTC 2019 x86_64 x86_64 x86_64 GNU/Linux | | Build tool | maven | | Personality | personality/hadoop.sh | | git revision | trunk / b46d823 | | Default Java | 1.8.0_222 | | mvninstall | https://builds.apache.org/job/hadoop-multibranch/job/PR-1537/3/artifact/out/branch-mvninstall-hadoop-hdds.txt | | mvninstall | https://builds.apache.org/job/hadoop-multibranch/job/PR-1537/3/artifact/out/branch-mvninstall-hadoop-ozone.txt | | compile | https://builds.apache.org/job/hadoop-multibranch/job/PR-1537/3/artifact/out/branch-compile-hadoop-hdds.txt | | compile | https://builds.apache.org/job/hadoop-multibranch/job/PR-1537/3/artifact/out/branch-compile-hadoop-ozone.txt | | javadoc | https://builds.apache.org/job/hadoop-multibranch/job/PR-1537/3/artifact/out/branch-javadoc-hadoop-hdds.txt | | javadoc | 
https://builds.apache.org/job/hadoop-multibranch/job/PR-1537/3/artifact/out/branch-javadoc-hadoop-ozone.txt | | mvninstall | https://builds.apache.org/job/hadoop-multibranch/job/PR-1537/3/artifact/out/patch-mvninstall-hadoop-hdds.txt | | mvninstall | https://builds.apache.org/job/hadoop-multibranch/job/PR-1537/3/artifact/out/patch-mvninstall-hadoop-ozone.txt | | compile | https://builds.apache.org/job/hadoop-multibranch/job/PR-1537/3/artifact/out/patch-compile-hadoop-hdds.txt | | compile | https://builds.apache.org/job/hadoop-multibranch/job/PR-1537/3/artifact/out/patch-compile-hadoop-ozone.txt | | javac |
[jira] [Commented] (HDDS-2207) Update Ratis to latest snapshot
[ https://issues.apache.org/jira/browse/HDDS-2207?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16941051#comment-16941051 ] Hudson commented on HDDS-2207: -- SUCCESS: Integrated in Jenkins build Hadoop-trunk-Commit #17418 (See [https://builds.apache.org/job/Hadoop-trunk-Commit/17418/]) HDDS-2207. Update Ratis to latest snapshot. Contributed by Shashikant (msingh: rev 98ca07ebed2ae3d7e41e5029b5bba6d089d41d43) * (edit) hadoop-hdds/pom.xml * (edit) hadoop-hdds/container-service/src/main/java/org/apache/hadoop/ozone/container/common/transport/server/ratis/XceiverServerRatis.java * (edit) hadoop-ozone/pom.xml > Update Ratis to latest snapshot > --- > > Key: HDDS-2207 > URL: https://issues.apache.org/jira/browse/HDDS-2207 > Project: Hadoop Distributed Data Store > Issue Type: Bug > Components: Ozone Client >Reporter: Shashikant Banerjee >Assignee: Shashikant Banerjee >Priority: Major > Labels: pull-request-available > Fix For: 0.5.0 > > Time Spent: 40m > Remaining Estimate: 0h > > This Jira aims to update Ozone with the latest Ratis snapshot, which has a critical > fix for retry behaviour on getting a not-leader exception in the client. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Assigned] (HDDS-2152) Ozone client fails with OOM while writing a large (~300MB) key.
[ https://issues.apache.org/jira/browse/HDDS-2152?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Shashikant Banerjee reassigned HDDS-2152: - Assignee: Shashikant Banerjee > Ozone client fails with OOM while writing a large (~300MB) key. > --- > > Key: HDDS-2152 > URL: https://issues.apache.org/jira/browse/HDDS-2152 > Project: Hadoop Distributed Data Store > Issue Type: Bug > Components: Ozone Client >Reporter: Aravindan Vijayan >Assignee: Shashikant Banerjee >Priority: Major > Attachments: largekey.png > > > {code} > dd if=/dev/zero of=testfile bs=1024 count=307200 > ozone sh key put /vol1/bucket1/key testfile > {code} > {code} > Exception in thread "main" java.lang.OutOfMemoryError: Java heap space at > java.nio.HeapByteBuffer.(HeapByteBuffer.java:57) at > java.nio.ByteBuffer.allocate(ByteBuffer.java:335) at > org.apache.hadoop.hdds.scm.storage.BufferPool.allocateBufferIfNeeded(BufferPool.java:66) > at > org.apache.hadoop.hdds.scm.storage.BlockOutputStream.write(BlockOutputStream.java:234) > at > org.apache.hadoop.ozone.client.io.BlockOutputStreamEntry.write(BlockOutputStreamEntry.java:129) > at > org.apache.hadoop.ozone.client.io.KeyOutputStream.handleWrite(KeyOutputStream.java:211) > at > org.apache.hadoop.ozone.client.io.KeyOutputStream.write(KeyOutputStream.java:193) > at > org.apache.hadoop.ozone.client.io.OzoneOutputStream.write(OzoneOutputStream.java:49) > at org.apache.hadoop.io.IOUtils.copyBytes(IOUtils.java:96) at > org.apache.hadoop.ozone.web.ozShell.keys.PutKeyHandler.call(PutKeyHandler.java:117) > at > org.apache.hadoop.ozone.web.ozShell.keys.PutKeyHandler.call(PutKeyHandler.java:55) > at picocli.CommandLine.execute(CommandLine.java:1173) at > picocli.CommandLine.access$800(CommandLine.java:141) > {code} -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
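The stack trace above shows BufferPool.allocateBufferIfNeeded() allocating fresh heap buffers as the 300 MB key streams through, until the client heap is exhausted. One way to avoid that failure mode is to bound the pool: stop allocating at a fixed cap and make the caller flush before writing more. The sketch below is an illustrative simplification with a made-up class name, not the actual Ozone fix for this Jira.

```java
import java.nio.ByteBuffer;
import java.util.ArrayList;
import java.util.List;

// Sketch of a bounded buffer pool: total allocation is capped at
// bufferSize * maxBuffers instead of growing with the key size.
class BoundedBufferPoolSketch {
    private final int bufferSize;
    private final int maxBuffers;
    private final List<ByteBuffer> buffers = new ArrayList<>();

    BoundedBufferPoolSketch(int bufferSize, int maxBuffers) {
        this.bufferSize = bufferSize;
        this.maxBuffers = maxBuffers;
    }

    // Returns an existing buffer with space, a new buffer while under
    // the cap, or null to signal the caller to flush before writing more.
    ByteBuffer allocateBufferIfNeeded() {
        for (ByteBuffer b : buffers) {
            if (b.hasRemaining()) {
                return b;
            }
        }
        if (buffers.size() < maxBuffers) {
            ByteBuffer b = ByteBuffer.allocate(bufferSize);
            buffers.add(b);
            return b;
        }
        return null; // pool full: flush and release before allocating again
    }

    int allocatedBytes() {
        return buffers.size() * bufferSize;
    }
}
```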
[jira] [Work logged] (HDDS-2207) Update Ratis to latest snapshot
[ https://issues.apache.org/jira/browse/HDDS-2207?focusedWorklogId=320549=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-320549 ] ASF GitHub Bot logged work on HDDS-2207: Author: ASF GitHub Bot Created on: 30/Sep/19 14:58 Start Date: 30/Sep/19 14:58 Worklog Time Spent: 10m Work Description: lokeshj1703 commented on issue #1550: HDDS-2207. Update Ratis to latest snapshot. URL: https://github.com/apache/hadoop/pull/1550#issuecomment-536601731 @bshashikant Thanks for working on this! The changes look good to me. +1. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 320549) Time Spent: 0.5h (was: 20m) > Update Ratis to latest snapshot > --- > > Key: HDDS-2207 > URL: https://issues.apache.org/jira/browse/HDDS-2207 > Project: Hadoop Distributed Data Store > Issue Type: Bug > Components: Ozone Client >Reporter: Shashikant Banerjee >Assignee: Shashikant Banerjee >Priority: Major > Labels: pull-request-available > Time Spent: 0.5h > Remaining Estimate: 0h > > This Jira aims to update Ozone with the latest Ratis snapshot, which has a critical > fix for retry behaviour on getting a not-leader exception in the client. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Work logged] (HDDS-2207) Update Ratis to latest snapshot
[ https://issues.apache.org/jira/browse/HDDS-2207?focusedWorklogId=320555=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-320555 ] ASF GitHub Bot logged work on HDDS-2207: Author: ASF GitHub Bot Created on: 30/Sep/19 15:06 Start Date: 30/Sep/19 15:06 Worklog Time Spent: 10m Work Description: mukul1987 commented on pull request #1550: HDDS-2207. Update Ratis to latest snapshot. URL: https://github.com/apache/hadoop/pull/1550 This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 320555) Time Spent: 40m (was: 0.5h) > Update Ratis to latest snapshot > --- > > Key: HDDS-2207 > URL: https://issues.apache.org/jira/browse/HDDS-2207 > Project: Hadoop Distributed Data Store > Issue Type: Bug > Components: Ozone Client >Reporter: Shashikant Banerjee >Assignee: Shashikant Banerjee >Priority: Major > Labels: pull-request-available > Time Spent: 40m > Remaining Estimate: 0h > > This Jira aims to update Ozone with the latest Ratis snapshot, which has a critical > fix for retry behaviour on getting a not-leader exception in the client. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Work logged] (HDDS-2199) In SCMNodeManager dnsToUuidMap cannot track multiple DNs on the same host
[ https://issues.apache.org/jira/browse/HDDS-2199?focusedWorklogId=320588=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-320588 ] ASF GitHub Bot logged work on HDDS-2199: Author: ASF GitHub Bot Created on: 30/Sep/19 15:51 Start Date: 30/Sep/19 15:51 Worklog Time Spent: 10m Work Description: hadoop-yetus commented on issue #1551: HDDS-2199 In SCMNodeManager dnsToUuidMap cannot track multiple DNs on the same host URL: https://github.com/apache/hadoop/pull/1551#issuecomment-536626155 :broken_heart: **-1 overall** | Vote | Subsystem | Runtime | Comment | |::|--:|:|:| | 0 | reexec | 46 | Docker mode activated. | ||| _ Prechecks _ | | +1 | dupname | 1 | No case conflicting files found. | | +1 | @author | 0 | The patch does not contain any @author tags. | | +1 | test4tests | 0 | The patch appears to include 3 new or modified test files. | ||| _ trunk Compile Tests _ | | -1 | mvninstall | 48 | hadoop-hdds in trunk failed. | | -1 | mvninstall | 44 | hadoop-ozone in trunk failed. | | -1 | compile | 21 | hadoop-hdds in trunk failed. | | -1 | compile | 15 | hadoop-ozone in trunk failed. | | +1 | checkstyle | 61 | trunk passed | | +1 | mvnsite | 0 | trunk passed | | +1 | shadedclient | 866 | branch has no errors when building and testing our client artifacts. | | -1 | javadoc | 24 | hadoop-hdds in trunk failed. | | -1 | javadoc | 19 | hadoop-ozone in trunk failed. | | 0 | spotbugs | 978 | Used deprecated FindBugs config; considering switching to SpotBugs. | | -1 | findbugs | 46 | hadoop-hdds in trunk failed. | | -1 | findbugs | 18 | hadoop-ozone in trunk failed. | ||| _ Patch Compile Tests _ | | -1 | mvninstall | 32 | hadoop-hdds in the patch failed. | | -1 | mvninstall | 38 | hadoop-ozone in the patch failed. | | -1 | compile | 24 | hadoop-hdds in the patch failed. | | -1 | compile | 17 | hadoop-ozone in the patch failed. | | -1 | javac | 24 | hadoop-hdds in the patch failed. | | -1 | javac | 17 | hadoop-ozone in the patch failed. 
| | -0 | checkstyle | 30 | hadoop-hdds: The patch generated 1 new + 0 unchanged - 0 fixed = 1 total (was 0) | | +1 | mvnsite | 0 | the patch passed | | +1 | whitespace | 0 | The patch has no whitespace issues. | | +1 | shadedclient | 713 | patch has no errors when building and testing our client artifacts. | | -1 | javadoc | 21 | hadoop-hdds in the patch failed. | | -1 | javadoc | 21 | hadoop-ozone in the patch failed. | | -1 | findbugs | 30 | hadoop-hdds in the patch failed. | | -1 | findbugs | 22 | hadoop-ozone in the patch failed. | ||| _ Other Tests _ | | -1 | unit | 25 | hadoop-hdds in the patch failed. | | -1 | unit | 29 | hadoop-ozone in the patch failed. | | +1 | asflicense | 35 | The patch does not generate ASF License warnings. | | | | 2401 | | | Subsystem | Report/Notes | |--:|:-| | Docker | Client=19.03.1 Server=19.03.1 base: https://builds.apache.org/job/hadoop-multibranch/job/PR-1551/1/artifact/out/Dockerfile | | GITHUB PR | https://github.com/apache/hadoop/pull/1551 | | Optional Tests | dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient findbugs checkstyle | | uname | Linux e049700d492e 4.15.0-60-generic #67-Ubuntu SMP Thu Aug 22 16:55:30 UTC 2019 x86_64 x86_64 x86_64 GNU/Linux | | Build tool | maven | | Personality | personality/hadoop.sh | | git revision | trunk / 98ca07e | | Default Java | 1.8.0_222 | | mvninstall | https://builds.apache.org/job/hadoop-multibranch/job/PR-1551/1/artifact/out/branch-mvninstall-hadoop-hdds.txt | | mvninstall | https://builds.apache.org/job/hadoop-multibranch/job/PR-1551/1/artifact/out/branch-mvninstall-hadoop-ozone.txt | | compile | https://builds.apache.org/job/hadoop-multibranch/job/PR-1551/1/artifact/out/branch-compile-hadoop-hdds.txt | | compile | https://builds.apache.org/job/hadoop-multibranch/job/PR-1551/1/artifact/out/branch-compile-hadoop-ozone.txt | | javadoc | https://builds.apache.org/job/hadoop-multibranch/job/PR-1551/1/artifact/out/branch-javadoc-hadoop-hdds.txt | | javadoc | 
https://builds.apache.org/job/hadoop-multibranch/job/PR-1551/1/artifact/out/branch-javadoc-hadoop-ozone.txt | | findbugs | https://builds.apache.org/job/hadoop-multibranch/job/PR-1551/1/artifact/out/branch-findbugs-hadoop-hdds.txt | | findbugs | https://builds.apache.org/job/hadoop-multibranch/job/PR-1551/1/artifact/out/branch-findbugs-hadoop-ozone.txt | | mvninstall | https://builds.apache.org/job/hadoop-multibranch/job/PR-1551/1/artifact/out/patch-mvninstall-hadoop-hdds.txt | | mvninstall | https://builds.apache.org/job/hadoop-multibranch/job/PR-1551/1/artifact/out/patch-mvninstall-hadoop-ozone.txt | |
[jira] [Work logged] (HDDS-2169) Avoid buffer copies while submitting client requests in Ratis
[ https://issues.apache.org/jira/browse/HDDS-2169?focusedWorklogId=320635=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-320635 ] ASF GitHub Bot logged work on HDDS-2169: Author: ASF GitHub Bot Created on: 30/Sep/19 16:33 Start Date: 30/Sep/19 16:33 Worklog Time Spent: 10m Work Description: bshashikant commented on issue #1517: HDDS-2169 URL: https://github.com/apache/hadoop/pull/1517#issuecomment-536642924 Thanks @szetszwo for working on this. With the patch, while running the tests in TestDataValidateWithUnsafeByteOperations, the below issue is observed. `2019-09-30 21:58:02,745 [grpc-default-executor-2] ERROR segmented.SegmentedRaftLogWorker (SegmentedRaftLogWorker.java:(449)) - e4ab8454-30fe-420c-a1cf-40d223cb4898@group-D0335C23E8DA-SegmentedRaftLogWorker: writeStateMachineData failed for index 1, entry=(t:1, i:1), STATEMACHINELOGENTRY, client-6C45A0D09519, cid=8 java.lang.IndexOutOfBoundsException: End index: 135008824 >= 207 at org.apache.ratis.thirdparty.com.google.protobuf.ByteString.checkRange(ByteString.java:1233) at org.apache.ratis.thirdparty.com.google.protobuf.ByteString$LiteralByteString.substring(ByteString.java:1288) at org.apache.hadoop.hdds.ratis.ContainerCommandRequestMessage.toProto(ContainerCommandRequestMessage.java:66) at org.apache.hadoop.ozone.container.common.transport.server.ratis.ContainerStateMachine.getContainerCommandRequestProto(ContainerStateMachine.java:375) at org.apache.hadoop.ozone.container.common.transport.server.ratis.ContainerStateMachine.writeStateMachineData(ContainerStateMachine.java:494) at org.apache.ratis.server.raftlog.segmented.SegmentedRaftLogWorker$WriteLog.(SegmentedRaftLogWorker.java:447) at org.apache.ratis.server.raftlog.segmented.SegmentedRaftLogWorker.writeLogEntry(SegmentedRaftLogWorker.java:397) at org.apache.ratis.server.raftlog.segmented.SegmentedRaftLog.appendEntryImpl(SegmentedRaftLog.java:411) at org.apache.ratis.server.raftlog.RaftLog.lambda$appendEntry$10(RaftLog.java:359) 
at org.apache.ratis.server.raftlog.RaftLogSequentialOps$Runner.runSequentially(RaftLogSequentialOps.java:77) at org.apache.ratis.server.raftlog.RaftLog.appendEntry(RaftLog.java:359) at org.apache.ratis.server.raftlog.RaftLog.appendImpl(RaftLog.java:183) at org.apache.ratis.server.raftlog.RaftLog.lambda$append$2(RaftLog.java:159) at org.apache.ratis.server.raftlog.RaftLogSequentialOps$Runner.runSequentially(RaftLogSequentialOps.java:68) at org.apache.ratis.server.raftlog.RaftLog.append(RaftLog.java:159) at org.apache.ratis.server.impl.ServerState.appendLog(ServerState.java:282) at org.apache.ratis.server.impl.RaftServerImpl.appendTransaction(RaftServerImpl.java:505) at org.apache.ratis.server.impl.RaftServerImpl.submitClientRequestAsync(RaftServerImpl.java:576) at org.apache.ratis.server.impl.RaftServerProxy.lambda$submitClientRequestAsync$7(RaftServerProxy.java:333) at org.apache.ratis.server.impl.RaftServerProxy.lambda$null$5(RaftServerProxy.java:328) at org.apache.ratis.util.JavaUtils.callAsUnchecked(JavaUtils.java:109) at org.apache.ratis.server.impl.RaftServerProxy.lambda$submitRequest$6(RaftServerProxy.java:328) at java.util.concurrent.CompletableFuture.uniComposeStage(CompletableFuture.java:981) at java.util.concurrent.CompletableFuture.thenCompose(CompletableFuture.java:2124) at org.apache.ratis.server.impl.RaftServerProxy.submitRequest(RaftServerProxy.java:327) at org.apache.ratis.server.impl.RaftServerProxy.submitClientRequestAsync(RaftServerProxy.java:333) at org.apache.ratis.grpc.client.GrpcClientProtocolService$RequestStreamObserver.processClientRequest(GrpcClientProtocolService.java:220) at org.apache.ratis.grpc.client.GrpcClientProtocolService$OrderedRequestStreamObserver.processClientRequest(GrpcClientProtocolService.java:326) at org.apache.ratis.util.SlidingWindow$Server.processRequestsFromHead(SlidingWindow.java:429) at org.apache.ratis.util.SlidingWindow$Server.receivedRequest(SlidingWindow.java:421) at 
org.apache.ratis.grpc.client.GrpcClientProtocolService$OrderedRequestStreamObserver.processClientRequest(GrpcClientProtocolService.java:345) at org.apache.ratis.grpc.client.GrpcClientProtocolService$RequestStreamObserver.onNext(GrpcClientProtocolService.java:240) at org.apache.ratis.grpc.client.GrpcClientProtocolService$RequestStreamObserver.onNext(GrpcClientProtocolService.java:168) at org.apache.ratis.thirdparty.io.grpc.stub.ServerCalls$StreamingServerCallHandler$StreamingServerCallListener.onMessage(ServerCalls.java:248) at
[jira] [Work logged] (HDDS-2169) Avoid buffer copies while submitting client requests in Ratis
[ https://issues.apache.org/jira/browse/HDDS-2169?focusedWorklogId=320632=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-320632 ] ASF GitHub Bot logged work on HDDS-2169: Author: ASF GitHub Bot Created on: 30/Sep/19 16:32 Start Date: 30/Sep/19 16:32 Worklog Time Spent: 10m Work Description: bshashikant commented on issue #1517: HDDS-2169 URL: https://github.com/apache/hadoop/pull/1517#issuecomment-536642924 Thanks @szetszwo for working on this. With the patch, while running the tests in TestDataValidateWithUnsafeByteOperations, the below issue is observed. `2019-09-30 21:58:02,745 [grpc-default-executor-2] ERROR segmented.SegmentedRaftLogWorker (SegmentedRaftLogWorker.java:(449)) - e4ab8454-30fe-420c-a1cf-40d223cb4898@group-D0335C23E8DA-SegmentedRaftLogWorker: writeStateMachineData failed for index 1, entry=(t:1, i:1), STATEMACHINELOGENTRY, client-6C45A0D09519, cid=8 java.lang.IndexOutOfBoundsException: End index: 135008824 >= 207 at org.apache.ratis.thirdparty.com.google.protobuf.ByteString.checkRange(ByteString.java:1233) at org.apache.ratis.thirdparty.com.google.protobuf.ByteString$LiteralByteString.substring(ByteString.java:1288) at org.apache.hadoop.hdds.ratis.ContainerCommandRequestMessage.toProto(ContainerCommandRequestMessage.java:66) at org.apache.hadoop.ozone.container.common.transport.server.ratis.ContainerStateMachine.getContainerCommandRequestProto(ContainerStateMachine.java:375) at org.apache.hadoop.ozone.container.common.transport.server.ratis.ContainerStateMachine.writeStateMachineData(ContainerStateMachine.java:494) at org.apache.ratis.server.raftlog.segmented.SegmentedRaftLogWorker$WriteLog.(SegmentedRaftLogWorker.java:447) at org.apache.ratis.server.raftlog.segmented.SegmentedRaftLogWorker.writeLogEntry(SegmentedRaftLogWorker.java:397) at org.apache.ratis.server.raftlog.segmented.SegmentedRaftLog.appendEntryImpl(SegmentedRaftLog.java:411) at org.apache.ratis.server.raftlog.RaftLog.lambda$appendEntry$10(RaftLog.java:359) 
at org.apache.ratis.server.raftlog.RaftLogSequentialOps$Runner.runSequentially(RaftLogSequentialOps.java:77) at org.apache.ratis.server.raftlog.RaftLog.appendEntry(RaftLog.java:359) at org.apache.ratis.server.raftlog.RaftLog.appendImpl(RaftLog.java:183) at org.apache.ratis.server.raftlog.RaftLog.lambda$append$2(RaftLog.java:159) at org.apache.ratis.server.raftlog.RaftLogSequentialOps$Runner.runSequentially(RaftLogSequentialOps.java:68) at org.apache.ratis.server.raftlog.RaftLog.append(RaftLog.java:159) at org.apache.ratis.server.impl.ServerState.appendLog(ServerState.java:282) at org.apache.ratis.server.impl.RaftServerImpl.appendTransaction(RaftServerImpl.java:505) at org.apache.ratis.server.impl.RaftServerImpl.submitClientRequestAsync(RaftServerImpl.java:576) at org.apache.ratis.server.impl.RaftServerProxy.lambda$submitClientRequestAsync$7(RaftServerProxy.java:333) at org.apache.ratis.server.impl.RaftServerProxy.lambda$null$5(RaftServerProxy.java:328) at org.apache.ratis.util.JavaUtils.callAsUnchecked(JavaUtils.java:109) at org.apache.ratis.server.impl.RaftServerProxy.lambda$submitRequest$6(RaftServerProxy.java:328) at java.util.concurrent.CompletableFuture.uniComposeStage(CompletableFuture.java:981) at java.util.concurrent.CompletableFuture.thenCompose(CompletableFuture.java:2124) at org.apache.ratis.server.impl.RaftServerProxy.submitRequest(RaftServerProxy.java:327) at org.apache.ratis.server.impl.RaftServerProxy.submitClientRequestAsync(RaftServerProxy.java:333) at org.apache.ratis.grpc.client.GrpcClientProtocolService$RequestStreamObserver.processClientRequest(GrpcClientProtocolService.java:220) at org.apache.ratis.grpc.client.GrpcClientProtocolService$OrderedRequestStreamObserver.processClientRequest(GrpcClientProtocolService.java:326) at org.apache.ratis.util.SlidingWindow$Server.processRequestsFromHead(SlidingWindow.java:429) at org.apache.ratis.util.SlidingWindow$Server.receivedRequest(SlidingWindow.java:421) at 
org.apache.ratis.grpc.client.GrpcClientProtocolService$OrderedRequestStreamObserver.processClientRequest(GrpcClientProtocolService.java:345) at org.apache.ratis.grpc.client.GrpcClientProtocolService$RequestStreamObserver.onNext(GrpcClientProtocolService.java:240) at org.apache.ratis.grpc.client.GrpcClientProtocolService$RequestStreamObserver.onNext(GrpcClientProtocolService.java:168) at org.apache.ratis.thirdparty.io.grpc.stub.ServerCalls$StreamingServerCallHandler$StreamingServerCallListener.onMessage(ServerCalls.java:248) at
[jira] [Created] (HDFS-14883) NPE when the second SNN is starting
Ranith Sardar created HDFS-14883: Summary: NPE when the second SNN is starting Key: HDFS-14883 URL: https://issues.apache.org/jira/browse/HDFS-14883 Project: Hadoop HDFS Issue Type: Bug Reporter: Ranith Sardar Assignee: Ranith Sardar {{2019-09-25 22:41:31,889 | WARN | qtp79782883-47 | /imagetransfer | ServletHandler.java:632 java.io.IOException: PutImage failed. java.lang.NullPointerException at org.apache.hadoop.hdfs.server.namenode.ImageServlet.validateRequest(ImageServlet.java:198) at org.apache.hadoop.hdfs.server.namenode.ImageServlet.doPut(ImageServlet.java:485) at javax.servlet.http.HttpServlet.service(HttpServlet.java:710) at javax.servlet.http.HttpServlet.service(HttpServlet.java:790) at org.eclipse.jetty.servlet.ServletHolder.handle(ServletHolder.java:848) at org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1772) at org.apache.hadoop.security.authentication.server.AuthenticationFilter.doFilter(AuthenticationFilter.java:644)}} -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Updated] (HDFS-14883) NPE when the second SNN is starting
[ https://issues.apache.org/jira/browse/HDFS-14883?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ranith Sardar updated HDFS-14883: - Description: {{| WARN | qtp79782883-47 | /imagetransfer | ServletHandler.java:632 java.io.IOException: PutImage failed. java.lang.NullPointerException at org.apache.hadoop.hdfs.server.namenode.ImageServlet.validateRequest(ImageServlet.java:198) at org.apache.hadoop.hdfs.server.namenode.ImageServlet.doPut(ImageServlet.java:485) at javax.servlet.http.HttpServlet.service(HttpServlet.java:710) at javax.servlet.http.HttpServlet.service(HttpServlet.java:790) at org.eclipse.jetty.servlet.ServletHolder.handle(ServletHolder.java:848) at org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1772) at org.apache.hadoop.security.authentication.server.AuthenticationFilter.doFilter(AuthenticationFilter.java:644)}} was: {{2019-09-25 22:41:31,889 | WARN | qtp79782883-47 | /imagetransfer | ServletHandler.java:632 java.io.IOException: PutImage failed. java.lang.NullPointerException at org.apache.hadoop.hdfs.server.namenode.ImageServlet.validateRequest(ImageServlet.java:198) at org.apache.hadoop.hdfs.server.namenode.ImageServlet.doPut(ImageServlet.java:485) at javax.servlet.http.HttpServlet.service(HttpServlet.java:710) at javax.servlet.http.HttpServlet.service(HttpServlet.java:790) at org.eclipse.jetty.servlet.ServletHolder.handle(ServletHolder.java:848) at org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1772) at org.apache.hadoop.security.authentication.server.AuthenticationFilter.doFilter(AuthenticationFilter.java:644)}} > NPE when the second SNN is starting > --- > > Key: HDFS-14883 > URL: https://issues.apache.org/jira/browse/HDFS-14883 > Project: Hadoop HDFS > Issue Type: Bug >Reporter: Ranith Sardar >Assignee: Ranith Sardar >Priority: Major > > > {{| WARN | qtp79782883-47 | /imagetransfer | ServletHandler.java:632 > java.io.IOException: PutImage failed. 
java.lang.NullPointerException > at > org.apache.hadoop.hdfs.server.namenode.ImageServlet.validateRequest(ImageServlet.java:198) > at > org.apache.hadoop.hdfs.server.namenode.ImageServlet.doPut(ImageServlet.java:485) > at javax.servlet.http.HttpServlet.service(HttpServlet.java:710) > at javax.servlet.http.HttpServlet.service(HttpServlet.java:790) > at org.eclipse.jetty.servlet.ServletHolder.handle(ServletHolder.java:848) > at > org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1772) > at > org.apache.hadoop.security.authentication.server.AuthenticationFilter.doFilter(AuthenticationFilter.java:644)}} -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-14882) Consider DataNode load when #getBlockLocation
[ https://issues.apache.org/jira/browse/HDFS-14882?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16940961#comment-16940961 ] Hadoop QA commented on HDFS-14882: -- | (/) *{color:green}+1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 56s{color} | {color:blue} Docker mode activated. {color} | || || || || {color:brown} Prechecks {color} || | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s{color} | {color:green} The patch does not contain any @author tags. {color} | | {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 0s{color} | {color:green} The patch appears to include 1 new or modified test files. {color} | || || || || {color:brown} trunk Compile Tests {color} || | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 19m 40s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 1m 4s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 44s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 1m 15s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 14m 13s{color} | {color:green} branch has no errors when building and testing our client artifacts. 
{color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 2m 22s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 13s{color} | {color:green} trunk passed {color} | || || || || {color:brown} Patch Compile Tests {color} || | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 1m 2s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 54s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 54s{color} | {color:green} the patch passed {color} | | {color:orange}-0{color} | {color:orange} checkstyle {color} | {color:orange} 0m 38s{color} | {color:orange} hadoop-hdfs-project/hadoop-hdfs: The patch generated 3 new + 69 unchanged - 0 fixed = 72 total (was 69) {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 1m 1s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s{color} | {color:green} The patch has no whitespace issues. {color} | | {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 13m 8s{color} | {color:green} patch has no errors when building and testing our client artifacts. {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 2m 23s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 10s{color} | {color:green} the patch passed {color} | || || || || {color:brown} Other Tests {color} || | {color:green}+1{color} | {color:green} unit {color} | {color:green} 96m 23s{color} | {color:green} hadoop-hdfs in the patch passed. 
{color} | | {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 35s{color} | {color:green} The patch does not generate ASF License warnings. {color} | | {color:black}{color} | {color:black} {color} | {color:black}158m 22s{color} | {color:black} {color} | \\ \\ || Subsystem || Report/Notes || | Docker | Client=18.09.7 Server=18.09.7 Image:yetus/hadoop:efed4450bf1 | | JIRA Issue | HDFS-14882 | | JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12981779/HDFS-14882.001.patch | | Optional Tests | dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient findbugs checkstyle | | uname | Linux 301c67b6e3da 4.15.0-58-generic #64-Ubuntu SMP Tue Aug 6 11:12:41 UTC 2019 x86_64 x86_64 x86_64 GNU/Linux | | Build tool | maven | | Personality | /testptch/patchprocess/precommit/personality/provided.sh | | git revision | trunk / 760b523 | | maven | version: Apache Maven 3.3.9 | | Default Java | 1.8.0_222 | | findbugs | v3.1.0-RC1 | | checkstyle | https://builds.apache.org/job/PreCommit-HDFS-Build/27988/artifact/out/diff-checkstyle-hadoop-hdfs-project_hadoop-hdfs.txt | | Test Results | https://builds.apache.org/job/PreCommit-HDFS-Build/27988/testReport/ | | Max. process+thread count | 2761 (vs. ulimit of 5500) | | modules | C: hadoop-hdfs-project/hadoop-hdfs U: hadoop-hdfs-project/hadoop-hdfs | | Console output | https://builds.apache.org/job/PreCommit-HDFS-Build/27988/console | | Powered by | Apache Yetus 0.8.0 http://yetus.apache.org |
[jira] [Commented] (HDDS-2153) Add a config to tune max pending requests in Ratis leader
[ https://issues.apache.org/jira/browse/HDDS-2153?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16940969#comment-16940969 ] Hudson commented on HDDS-2153: -- SUCCESS: Integrated in Jenkins build Hadoop-trunk-Commit #17416 (See [https://builds.apache.org/job/Hadoop-trunk-Commit/17416/]) HDDS-2153. Add a config to tune max pending requests in Ratis leader (elek: rev a530ac3f50d71c608235168acefe2f8eb1753131) * (edit) hadoop-hdds/common/src/main/resources/ozone-default.xml * (edit) hadoop-hdds/common/src/main/java/org/apache/hadoop/hdds/scm/ScmConfigKeys.java * (edit) hadoop-hdds/container-service/src/main/java/org/apache/hadoop/ozone/container/common/transport/server/ratis/XceiverServerRatis.java * (edit) hadoop-hdds/common/src/main/java/org/apache/hadoop/ozone/OzoneConfigKeys.java > Add a config to tune max pending requests in Ratis leader > - > > Key: HDDS-2153 > URL: https://issues.apache.org/jira/browse/HDDS-2153 > Project: Hadoop Distributed Data Store > Issue Type: Bug > Components: Ozone Datanode >Affects Versions: 0.5.0 >Reporter: Shashikant Banerjee >Assignee: Shashikant Banerjee >Priority: Major > Labels: pull-request-available > Fix For: 0.5.0 > > Time Spent: 50m > Remaining Estimate: 0h > -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Work logged] (HDDS-2202) Remove unused import in OmUtils
[ https://issues.apache.org/jira/browse/HDDS-2202?focusedWorklogId=320511=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-320511 ] ASF GitHub Bot logged work on HDDS-2202: Author: ASF GitHub Bot Created on: 30/Sep/19 14:02 Start Date: 30/Sep/19 14:02 Worklog Time Spent: 10m Work Description: elek commented on pull request #1543: HDDS-2202. Remove unused import in OmUtils URL: https://github.com/apache/hadoop/pull/1543 This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 320511) Time Spent: 50m (was: 40m) > Remove unused import in OmUtils > --- > > Key: HDDS-2202 > URL: https://issues.apache.org/jira/browse/HDDS-2202 > Project: Hadoop Distributed Data Store > Issue Type: Bug >Reporter: Dinesh Chitlangia >Assignee: Dinesh Chitlangia >Priority: Minor > Labels: newbie, pull-request-available > Fix For: 0.5.0 > > Time Spent: 50m > Remaining Estimate: 0h > > Fix hadoop-ozone/common/src/main/java/org/apache/hadoop/ozone/OmUtils.java > Remove L49: Unused import - org.apache.hadoop.ozone.om.OMMetadataManager; > -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Updated] (HDDS-2202) Remove unused import in OmUtils
[ https://issues.apache.org/jira/browse/HDDS-2202?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Marton Elek updated HDDS-2202: -- Fix Version/s: 0.5.0 Resolution: Fixed Status: Resolved (was: Patch Available) > Remove unused import in OmUtils > --- > > Key: HDDS-2202 > URL: https://issues.apache.org/jira/browse/HDDS-2202 > Project: Hadoop Distributed Data Store > Issue Type: Bug >Reporter: Dinesh Chitlangia >Assignee: Dinesh Chitlangia >Priority: Minor > Labels: newbie, pull-request-available > Fix For: 0.5.0 > > Time Spent: 50m > Remaining Estimate: 0h > > Fix hadoop-ozone/common/src/main/java/org/apache/hadoop/ozone/OmUtils.java > Remove L49: Unused import - org.apache.hadoop.ozone.om.OMMetadataManager; > -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Updated] (HDDS-2199) In SCMNodeManager dnsToUuidMap cannot track multiple DNs on the same host
[ https://issues.apache.org/jira/browse/HDDS-2199?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated HDDS-2199: - Labels: pull-request-available (was: ) > In SCMNodeManager dnsToUuidMap cannot track multiple DNs on the same host > - > > Key: HDDS-2199 > URL: https://issues.apache.org/jira/browse/HDDS-2199 > Project: Hadoop Distributed Data Store > Issue Type: Bug >Affects Versions: 0.5.0 >Reporter: Stephen O'Donnell >Assignee: Stephen O'Donnell >Priority: Major > Labels: pull-request-available > > Often in test clusters and tests, we start multiple datanodes on the same > host. > In SCMNodeManager.register() there is a map of hostname -> datanode UUID > called dnsToUuidMap. > If several DNs register from the same host, the entry in the map will be > overwritten and the last DN to register will 'win'. > This means that the method getNodeByAddress() does not return the correct > DatanodeDetails object when many DNs are registered from the same address. > This method is only used in SCMBlockProtocolServer.sortDatanodes() to allow > it to see if one of the nodes matches the client, but it needs to be used by > the Decommission code. > Perhaps we could change the getNodeByAddress() method to return a list of > DNs? In normal production clusters, there should only be one returned, but in > test clusters, there may be many. Any code looking for a specific DN entry > would need to iterate the list and match on the port number too, as host:port > would be the unique definition of a datanode. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Work logged] (HDDS-2199) In SCMNodeManager dnsToUuidMap cannot track multiple DNs on the same host
[ https://issues.apache.org/jira/browse/HDDS-2199?focusedWorklogId=320538=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-320538 ] ASF GitHub Bot logged work on HDDS-2199: Author: ASF GitHub Bot Created on: 30/Sep/19 14:53 Start Date: 30/Sep/19 14:53 Worklog Time Spent: 10m Work Description: sodonnel commented on pull request #1551: HDDS-2199 In SCMNodeManager dnsToUuidMap cannot track multiple DNs on the same host URL: https://github.com/apache/hadoop/pull/1551 Often in test clusters and tests, we start multiple datanodes on the same host. In SCMNodeManager.register() there is a map of hostname -> datanode UUID called dnsToUuidMap. If several DNs register from the same host, the entry in the map will be overwritten and the last DN to register will 'win'. This means that the method getNodeByAddress() does not return the correct DatanodeDetails object when many DNs are registered from the same address. This method is only used in SCMBlockProtocolServer.sortDatanodes() to allow it to see if one of the nodes matches the client, but it needs to be used by the Decommission code. Perhaps we could change the getNodeByAddress() method to return a list of DNs? In normal production clusters, there should only be one returned, but in test clusters, there may be many. Any code looking for a specific DN entry would need to iterate the list and match on the port number too, as host:port would be the unique definition of a datanode. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. 
For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 320538) Remaining Estimate: 0h Time Spent: 10m > In SCMNodeManager dnsToUuidMap cannot track multiple DNs on the same host > - > > Key: HDDS-2199 > URL: https://issues.apache.org/jira/browse/HDDS-2199 > Project: Hadoop Distributed Data Store > Issue Type: Bug >Affects Versions: 0.5.0 >Reporter: Stephen O'Donnell >Assignee: Stephen O'Donnell >Priority: Major > Labels: pull-request-available > Time Spent: 10m > Remaining Estimate: 0h > > Often in test clusters and tests, we start multiple datanodes on the same > host. > In SCMNodeManager.register() there is a map of hostname -> datanode UUID > called dnsToUuidMap. > If several DNs register from the same host, the entry in the map will be > overwritten and the last DN to register will 'win'. > This means that the method getNodeByAddress() does not return the correct > DatanodeDetails object when many DNs are registered from the same address. > This method is only used in SCMBlockProtocolServer.sortDatanodes() to allow > it to see if one of the nodes matches the client, but it needs to be used by > the Decommission code. > Perhaps we could change the getNodeByAddress() method to return a list of > DNs? In normal production clusters, there should only be one returned, but in > test clusters, there may be many. Any code looking for a specific DN entry > would need to iterate the list and match on the port number too, as host:port > would be the unique definition of a datanode. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
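The overwrite problem described above (one hostname slot, last registration wins) and the proposed list-returning getNodeByAddress() can be sketched as follows. This is a hedged illustration only: the class name DnsToUuidSketch and the use of plain String UUIDs are hypothetical, not the actual SCMNodeManager API.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical sketch: map each hostname to a *list* of datanode UUIDs so
// that several DNs registering from the same host are all retained, instead
// of each registration overwriting the previous one.
public class DnsToUuidSketch {
  private final Map<String, List<String>> dnsToUuidMap = new ConcurrentHashMap<>();

  public void register(String host, String uuid) {
    // computeIfAbsent keeps earlier registrations rather than replacing them
    dnsToUuidMap.computeIfAbsent(host, h -> new ArrayList<>()).add(uuid);
  }

  public List<String> getNodesByAddress(String host) {
    return dnsToUuidMap.getOrDefault(host, List.of());
  }

  public static void main(String[] args) {
    DnsToUuidSketch mgr = new DnsToUuidSketch();
    mgr.register("host1", "uuid-a");
    mgr.register("host1", "uuid-b"); // second DN on the same host is retained
    System.out.println(mgr.getNodesByAddress("host1").size()); // prints 2
  }
}
```

As the report notes, callers that need one specific DN would then filter the returned list by port, since host:port uniquely identifies a datanode.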
[jira] [Work logged] (HDDS-2205) checkstyle.sh reports wrong failure count
[ https://issues.apache.org/jira/browse/HDDS-2205?focusedWorklogId=320485=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-320485 ] ASF GitHub Bot logged work on HDDS-2205: Author: ASF GitHub Bot Created on: 30/Sep/19 13:14 Start Date: 30/Sep/19 13:14 Worklog Time Spent: 10m Work Description: dineshchitlangia commented on issue #1548: HDDS-2205. checkstyle.sh reports wrong failure count URL: https://github.com/apache/hadoop/pull/1548#issuecomment-536555139 +1 LGTM This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 320485) Time Spent: 40m (was: 0.5h) > checkstyle.sh reports wrong failure count > - > > Key: HDDS-2205 > URL: https://issues.apache.org/jira/browse/HDDS-2205 > Project: Hadoop Distributed Data Store > Issue Type: Bug > Components: test >Reporter: Attila Doroszlai >Assignee: Attila Doroszlai >Priority: Trivial > Labels: pull-request-available > Time Spent: 40m > Remaining Estimate: 0h > > {{checkstyle.sh}} outputs files with checkstyle violations and the violations > themselves on separate lines. It then reports line count as number of > failures. > {code:title=target/checkstyle/summary.txt} > hadoop-ozone/common/src/main/java/org/apache/hadoop/ozone/OmUtils.java > 49: Unused import - org.apache.hadoop.ozone.om.OMMetadataManager. > {code} > {code:title=target/checkstyle/failures} > 2 > {code} -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
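The miscount described above — file-name lines and violation lines lumped into a single line count — can be illustrated with a small sketch. The summary format (a file path line followed by "line: message" entries) is assumed from the snippet quoted in the issue, and ViolationCount is a hypothetical name, not part of the Ozone build scripts.

```java
import java.util.List;

public class ViolationCount {
  // Count only "line: message" entries, skipping the interleaved file-name
  // lines. Counting every line of the summary (as checkstyle.sh effectively
  // does) reports 2 for a single violation in a single file.
  static long countViolations(List<String> summaryLines) {
    return summaryLines.stream()
        .filter(l -> l.trim().matches("\\d+:.*"))
        .count();
  }

  public static void main(String[] args) {
    List<String> summary = List.of(
        "hadoop-ozone/common/src/main/java/org/apache/hadoop/ozone/OmUtils.java",
        " 49: Unused import - org.apache.hadoop.ozone.om.OMMetadataManager.");
    System.out.println(summary.size());           // what the script reported: 2
    System.out.println(countViolations(summary)); // actual violations: 1
  }
}
```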
[jira] [Created] (HDDS-2209) Checkstyle issue in OmUtils on trunk
Marton Elek created HDDS-2209: - Summary: Checkstyle issue in OmUtils on trunk Key: HDDS-2209 URL: https://issues.apache.org/jira/browse/HDDS-2209 Project: Hadoop Distributed Data Store Issue Type: Bug Reporter: Marton Elek HDDS-2174 introduced a new checkstyle error: {code:java} hadoop-ozone/common/src/main/java/org/apache/hadoop/ozone/OmUtils.java 49: Unused import - org.apache.hadoop.ozone.om.OMMetadataManager. {code} -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Work logged] (HDDS-2153) Add a config to tune max pending requests in Ratis leader
[ https://issues.apache.org/jira/browse/HDDS-2153?focusedWorklogId=320501=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-320501 ] ASF GitHub Bot logged work on HDDS-2153: Author: ASF GitHub Bot Created on: 30/Sep/19 13:41 Start Date: 30/Sep/19 13:41 Worklog Time Spent: 10m Work Description: elek commented on pull request #1474: HDDS-2153. Add a config to tune max pending requests in Ratis leader. URL: https://github.com/apache/hadoop/pull/1474 This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 320501) Time Spent: 50m (was: 40m) > Add a config to tune max pending requests in Ratis leader > - > > Key: HDDS-2153 > URL: https://issues.apache.org/jira/browse/HDDS-2153 > Project: Hadoop Distributed Data Store > Issue Type: Bug > Components: Ozone Datanode >Affects Versions: 0.5.0 >Reporter: Shashikant Banerjee >Assignee: Shashikant Banerjee >Priority: Major > Labels: pull-request-available > Fix For: 0.5.0 > > Time Spent: 50m > Remaining Estimate: 0h > -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Updated] (HDDS-2153) Add a config to tune max pending requests in Ratis leader
[ https://issues.apache.org/jira/browse/HDDS-2153?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Marton Elek updated HDDS-2153: -- Resolution: Fixed Status: Resolved (was: Patch Available) > Add a config to tune max pending requests in Ratis leader > - > > Key: HDDS-2153 > URL: https://issues.apache.org/jira/browse/HDDS-2153 > Project: Hadoop Distributed Data Store > Issue Type: Bug > Components: Ozone Datanode >Affects Versions: 0.5.0 >Reporter: Shashikant Banerjee >Assignee: Shashikant Banerjee >Priority: Major > Labels: pull-request-available > Fix For: 0.5.0 > > Time Spent: 50m > Remaining Estimate: 0h > -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDDS-2209) Checkstyle issue in OmUtils on trunk
[ https://issues.apache.org/jira/browse/HDDS-2209?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16940962#comment-16940962 ] Marton Elek commented on HDDS-2209: --- Thanks, I missed it. I searched for checkstyle only in the titles. Let me commit that one in this case... > Checkstyle issue in OmUtils on trunk > - > > Key: HDDS-2209 > URL: https://issues.apache.org/jira/browse/HDDS-2209 > Project: Hadoop Distributed Data Store > Issue Type: Bug >Reporter: Marton Elek >Priority: Trivial > Labels: newbie, newbie++ > > HDDS-2174 introduced a new checkstyle error: > {code:java} > hadoop-ozone/common/src/main/java/org/apache/hadoop/ozone/OmUtils.java > 49: Unused import - org.apache.hadoop.ozone.om.OMMetadataManager. {code} > -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDDS-2202) Remove unused import in OmUtils
[ https://issues.apache.org/jira/browse/HDDS-2202?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16940981#comment-16940981 ] Hudson commented on HDDS-2202: -- SUCCESS: Integrated in Jenkins build Hadoop-trunk-Commit #17417 (See [https://builds.apache.org/job/Hadoop-trunk-Commit/17417/]) HDDS-2202. Remove unused import in OmUtils (elek: rev b46d82339f73534efa35c60f74eec1cdce9fd4b3) * (edit) hadoop-ozone/common/src/main/java/org/apache/hadoop/ozone/OmUtils.java > Remove unused import in OmUtils > --- > > Key: HDDS-2202 > URL: https://issues.apache.org/jira/browse/HDDS-2202 > Project: Hadoop Distributed Data Store > Issue Type: Bug >Reporter: Dinesh Chitlangia >Assignee: Dinesh Chitlangia >Priority: Minor > Labels: newbie, pull-request-available > Fix For: 0.5.0 > > Time Spent: 1h > Remaining Estimate: 0h > > Fix hadoop-ozone/common/src/main/java/org/apache/hadoop/ozone/OmUtils.java > Remove L49: Unused import - org.apache.hadoop.ozone.om.OMMetadataManager; > -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Work logged] (HDDS-2203) Race condition in ByteStringHelper.init()
[ https://issues.apache.org/jira/browse/HDDS-2203?focusedWorklogId=320502=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-320502 ] ASF GitHub Bot logged work on HDDS-2203: Author: ASF GitHub Bot Created on: 30/Sep/19 13:42 Start Date: 30/Sep/19 13:42 Worklog Time Spent: 10m Work Description: bshashikant commented on issue #1544: HDDS-2203 Race condition in ByteStringHelper.init() URL: https://github.com/apache/hadoop/pull/1544#issuecomment-536566435 The changes look good to me. I am +1 on this. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 320502) Time Spent: 1h (was: 50m) > Race condition in ByteStringHelper.init() > - > > Key: HDDS-2203 > URL: https://issues.apache.org/jira/browse/HDDS-2203 > Project: Hadoop Distributed Data Store > Issue Type: Bug > Components: Ozone Client, SCM >Reporter: Istvan Fajth >Assignee: Istvan Fajth >Priority: Critical > Labels: pull-request-available, pull-requests-available > Time Spent: 1h > Remaining Estimate: 0h > > The current init method: > {code} > public static void init(boolean isUnsafeByteOperation) { > final boolean set = INITIALIZED.compareAndSet(false, true); > if (set) { > ByteStringHelper.isUnsafeByteOperationsEnabled = >isUnsafeByteOperation; >} else { > // already initialized, check values > Preconditions.checkState(isUnsafeByteOperationsEnabled >== isUnsafeByteOperation); >} > } > {code} > In a scenario when two threads access this method, and the execution order > is the following, then the second thread runs into an exception from > Preconditions.checkState() in the else branch. 
> In an uninitialized state: > - T1 thread arrives to the method with true as the > parameter; the class has initialised isUnsafeByteOperationsEnabled to its > default value, false > - T1 sets INITIALIZED true > - T2 arrives to the method with true as the parameter > - T2 reads the INITIALIZED value and, as it is no longer false, goes to the else branch > - T2 tries to check if the internal boolean property is the same true as it > wanted to set, and as T1 has not yet set the value, the checkState throws an > IllegalStateException. > This happens in certain Hive query cases, as it came from that testing, the > exception we see there is the following: > {code} > Error: Error while processing statement: FAILED: Execution Error, return code > 2 from org.apache.hadoop.hive.ql.exec.tez.TezTask. Vertex failed, > vertexName=Map 2, vertexId=vertex_1569486223160_0334_1_02, > diagnostics=[Vertex vertex_1569486223160_0334_1_02 [Map 2] killed/failed > due to:ROOT_INPUT_INIT_FAILURE, Vertex Input: item initializer failed, > vertex=vertex_1569486223160_0334_1_02 [Map 2], java.io.IOException: Couldn't > create RpcClient protocol > at > org.apache.hadoop.ozone.client.OzoneClientFactory.getClientProtocol(OzoneClientFactory.java:263) > at > org.apache.hadoop.ozone.client.OzoneClientFactory.getClientProtocol(OzoneClientFactory.java:239) > at > org.apache.hadoop.ozone.client.OzoneClientFactory.getRpcClient(OzoneClientFactory.java:203) > at > org.apache.hadoop.ozone.client.OzoneClientFactory.getRpcClient(OzoneClientFactory.java:165) > at > org.apache.hadoop.fs.ozone.BasicOzoneClientAdapterImpl.(BasicOzoneClientAdapterImpl.java:158) > at > org.apache.hadoop.fs.ozone.OzoneClientAdapterImpl.(OzoneClientAdapterImpl.java:50) > at > org.apache.hadoop.fs.ozone.OzoneFileSystem.createAdapter(OzoneFileSystem.java:102) > at > org.apache.hadoop.fs.ozone.BasicOzoneFileSystem.initialize(BasicOzoneFileSystem.java:155) > at org.apache.hadoop.fs.FileSystem.createFileSystem(FileSystem.java:3315) > at 
org.apache.hadoop.fs.FileSystem.access$200(FileSystem.java:136) > at org.apache.hadoop.fs.FileSystem$Cache.getInternal(FileSystem.java:3364) > at org.apache.hadoop.fs.FileSystem$Cache.get(FileSystem.java:3332) > at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:491) > at org.apache.hadoop.fs.Path.getFileSystem(Path.java:361) > at > org.apache.hadoop.hive.ql.io.orc.OrcInputFormat.generateSplitsInfo(OrcInputFormat.java:1821) > at > org.apache.hadoop.hive.ql.io.orc.OrcInputFormat.getSplits(OrcInputFormat.java:2002) > at > org.apache.hadoop.hive.ql.io.HiveInputFormat.addSplitsForGroup(HiveInputFormat.java:524) > at >
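The race comes from publishing INITIALIZED before the value it guards is written. One way to close the window, sketched below as an illustration only (class and method names are hypothetical, and this is not the actual HDDS-2203 patch), is to make the flag and the value a single synchronized unit:

```java
// Illustrative only -- not the actual HDDS-2203 patch. The quoted init()
// publishes INITIALIZED before writing isUnsafeByteOperationsEnabled, so a
// second caller can observe the flag without the value. Synchronizing both
// under one lock removes the window.
public final class ByteStringHelperSketch {
  private static boolean initialized = false;
  private static boolean isUnsafeByteOperationsEnabled;

  public static synchronized void init(boolean isUnsafeByteOperation) {
    if (!initialized) {
      // the value is written before (and under the same lock as) the flag
      isUnsafeByteOperationsEnabled = isUnsafeByteOperation;
      initialized = true;
    } else if (isUnsafeByteOperationsEnabled != isUnsafeByteOperation) {
      // mirrors Preconditions.checkState, which throws IllegalStateException
      throw new IllegalStateException("init() called with conflicting value");
    }
  }

  public static synchronized boolean isUnsafeEnabled() {
    return isUnsafeByteOperationsEnabled;
  }
}
```

With this shape, a second caller either sees the fully published state or blocks until it is published, so the checkState-style comparison can no longer race.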
[jira] [Comment Edited] (HDDS-2209) Checkstyle issue in OmUtils on trunk
[ https://issues.apache.org/jira/browse/HDDS-2209?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16940959#comment-16940959 ] Dinesh Chitlangia edited comment on HDDS-2209 at 9/30/19 1:34 PM: -- [~elek] This is addressed by HDDS-2202. Closing this as duplicate. was (Author: dineshchitlangia): [~elek] This is addressed by HDDS-2202. Closing this a duplicate. > Checkstyle issue in OmUtils on trunk > - > > Key: HDDS-2209 > URL: https://issues.apache.org/jira/browse/HDDS-2209 > Project: Hadoop Distributed Data Store > Issue Type: Bug >Reporter: Marton Elek >Priority: Trivial > Labels: newbie, newbie++ > > HDDS-2174 introduced a new checkstyle error: > {code:java} > hadoop-ozone/common/src/main/java/org/apache/hadoop/ozone/OmUtils.java > 49: Unused import - org.apache.hadoop.ozone.om.OMMetadataManager. {code} > -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Comment Edited] (HDDS-2209) Checkstyle issue in OmUtils on trunk
[ https://issues.apache.org/jira/browse/HDDS-2209?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16940959#comment-16940959 ] Dinesh Chitlangia edited comment on HDDS-2209 at 9/30/19 1:34 PM: -- [~elek] This is addressed by HDDS-2202. Closing this a duplicate. was (Author: dineshchitlangia): This is addressed by HDDS-2202. Closing this a duplicate. > Checkstyle issue in OmUtils on trunk > - > > Key: HDDS-2209 > URL: https://issues.apache.org/jira/browse/HDDS-2209 > Project: Hadoop Distributed Data Store > Issue Type: Bug >Reporter: Marton Elek >Priority: Trivial > Labels: newbie, newbie++ > > HDDS-2174 introduced a new checkstyle error: > {code:java} > hadoop-ozone/common/src/main/java/org/apache/hadoop/ozone/OmUtils.java > 49: Unused import - org.apache.hadoop.ozone.om.OMMetadataManager. {code} > -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Work logged] (HDDS-2202) Remove unused import in OmUtils
[ https://issues.apache.org/jira/browse/HDDS-2202?focusedWorklogId=320515=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-320515 ] ASF GitHub Bot logged work on HDDS-2202: Author: ASF GitHub Bot Created on: 30/Sep/19 14:12 Start Date: 30/Sep/19 14:12 Worklog Time Spent: 10m Work Description: dineshchitlangia commented on issue #1543: HDDS-2202. Remove unused import in OmUtils URL: https://github.com/apache/hadoop/pull/1543#issuecomment-536579854 Thanks @adoroszlai & @elek This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 320515) Time Spent: 1h (was: 50m) > Remove unused import in OmUtils > --- > > Key: HDDS-2202 > URL: https://issues.apache.org/jira/browse/HDDS-2202 > Project: Hadoop Distributed Data Store > Issue Type: Bug >Reporter: Dinesh Chitlangia >Assignee: Dinesh Chitlangia >Priority: Minor > Labels: newbie, pull-request-available > Fix For: 0.5.0 > > Time Spent: 1h > Remaining Estimate: 0h > > Fix hadoop-ozone/common/src/main/java/org/apache/hadoop/ozone/OmUtils.java > Remove L49: Unused import - org.apache.hadoop.ozone.om.OMMetadataManager; > -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Work logged] (HDDS-2034) Async RATIS pipeline creation and destroy through heartbeat commands
[ https://issues.apache.org/jira/browse/HDDS-2034?focusedWorklogId=320526=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-320526 ] ASF GitHub Bot logged work on HDDS-2034: Author: ASF GitHub Bot Created on: 30/Sep/19 14:38 Start Date: 30/Sep/19 14:38 Worklog Time Spent: 10m Work Description: lokeshj1703 commented on issue #1469: HDDS-2034. Async RATIS pipeline creation and destroy through heartbea… URL: https://github.com/apache/hadoop/pull/1469#issuecomment-536592541 > I think the purpose of safemode is to guarantee that Ozone cluster is ready to provide service to Ozone client once safemode is exited. @ChenSammi I agree with that. I think the problem occurs with OneReplicaPipelineSafeModeRule. This rule makes sure that at least one datanode in the old pipeline is reported so that reads for OPEN containers can go through. Here I think that old pipelines need to be tracked separately. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 320526) Time Spent: 11h 10m (was: 11h) > Async RATIS pipeline creation and destroy through heartbeat commands > > > Key: HDDS-2034 > URL: https://issues.apache.org/jira/browse/HDDS-2034 > Project: Hadoop Distributed Data Store > Issue Type: Sub-task >Reporter: Sammi Chen >Assignee: Sammi Chen >Priority: Major > Labels: pull-request-available > Time Spent: 11h 10m > Remaining Estimate: 0h > > Currently, pipeline creation and destroy are synchronous operations. SCM > directly connects to each datanode of the pipeline through a gRPC channel to create or destroy the pipeline. > This task is to remove the gRPC channel and send pipeline creation and destroy > actions through heartbeat commands to each datanode. 
-- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Work logged] (HDDS-2158) Fix Json Injection in JsonUtils
[ https://issues.apache.org/jira/browse/HDDS-2158?focusedWorklogId=320660=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-320660 ] ASF GitHub Bot logged work on HDDS-2158: Author: ASF GitHub Bot Created on: 30/Sep/19 16:57 Start Date: 30/Sep/19 16:57 Worklog Time Spent: 10m Work Description: hadoop-yetus commented on issue #1486: HDDS-2158. Fixing Json Injection Issue in JsonUtils. URL: https://github.com/apache/hadoop/pull/1486#issuecomment-536652675 :broken_heart: **-1 overall** | Vote | Subsystem | Runtime | Comment | |::|--:|:|:| | 0 | reexec | 0 | Docker mode activated. | | -1 | patch | 11 | https://github.com/apache/hadoop/pull/1486 does not apply to trunk. Rebase required? Wrong Branch? See https://wiki.apache.org/hadoop/HowToContribute for help. | | Subsystem | Report/Notes | |--:|:-| | GITHUB PR | https://github.com/apache/hadoop/pull/1486 | | Console output | https://builds.apache.org/job/hadoop-multibranch/job/PR-1486/3/console | | versions | git=2.17.1 | | Powered by | Apache Yetus 0.10.0 http://yetus.apache.org | This message was automatically generated. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 320660) Time Spent: 1h (was: 50m) > Fix Json Injection in JsonUtils > --- > > Key: HDDS-2158 > URL: https://issues.apache.org/jira/browse/HDDS-2158 > Project: Hadoop Distributed Data Store > Issue Type: Bug >Reporter: Hanisha Koneru >Assignee: Hanisha Koneru >Priority: Major > Labels: pull-request-available > Time Spent: 1h > Remaining Estimate: 0h > > JsonUtils#toJsonStringWithDefaultPrettyPrinter() does not validate the Json > String before serializing it which could result in Json Injection. 
-- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-14882) Consider DataNode load when #getBlockLocation
[ https://issues.apache.org/jira/browse/HDFS-14882?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16941143#comment-16941143 ] Lukas Majercak commented on HDFS-14882: --- Overall looks okay to me, seems like an improvement of dfs.namenode.avoid.read.highload.datanode (+ .threshold). I only wish we could also use some sort of an estimate of the load that we've already scheduled on each DN, not just the xceivers reported by them. > Consider DataNode load when #getBlockLocation > - > > Key: HDFS-14882 > URL: https://issues.apache.org/jira/browse/HDFS-14882 > Project: Hadoop HDFS > Issue Type: Improvement > Components: namenode >Reporter: Xiaoqiao He >Assignee: Xiaoqiao He >Priority: Major > Attachments: HDFS-14882.001.patch > > > Currently, we consider the load of a datanode in #chooseTarget for the writer, > however we do not consider it for the reader. Thus, the process slots of a datanode could > be occupied by #BlockSender for readers, disk/network will be under heavy > workload, and then we meet slow-node exceptions. IIRC the same case has been reported > before. Based on this, I propose to consider load for the reader the same as > #chooseTarget does for the writer. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Assigned] (HDDS-1984) Fix listBucket API
[ https://issues.apache.org/jira/browse/HDDS-1984?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] YiSheng Lien reassigned HDDS-1984: -- Assignee: (was: YiSheng Lien) > Fix listBucket API > -- > > Key: HDDS-1984 > URL: https://issues.apache.org/jira/browse/HDDS-1984 > Project: Hadoop Distributed Data Store > Issue Type: Sub-task >Reporter: Bharat Viswanadham >Priority: Major > > This Jira is to fix listBucket API in HA code path. > In HA, we have an in-memory cache, where we put the result to in-memory cache > and return the response, later it will be picked by double buffer thread and > it will flush to disk. So, now when do listBuckets, it should use both > in-memory cache and rocksdb bucket table to list buckets in a volume. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Assigned] (HDDS-1984) Fix listBucket API
[ https://issues.apache.org/jira/browse/HDDS-1984?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] YiSheng Lien reassigned HDDS-1984: -- Assignee: YiSheng Lien > Fix listBucket API > -- > > Key: HDDS-1984 > URL: https://issues.apache.org/jira/browse/HDDS-1984 > Project: Hadoop Distributed Data Store > Issue Type: Sub-task >Reporter: Bharat Viswanadham >Assignee: YiSheng Lien >Priority: Major > > This Jira is to fix listBucket API in HA code path. > In HA, we have an in-memory cache, where we put the result to in-memory cache > and return the response, later it will be picked by double buffer thread and > it will flush to disk. So, now when do listBuckets, it should use both > in-memory cache and rocksdb bucket table to list buckets in a volume. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-14305) Serial number in BlockTokenSecretManager could overlap between different namenodes
[ https://issues.apache.org/jira/browse/HDFS-14305?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16941217#comment-16941217 ] Konstantin Shvachko commented on HDFS-14305: Hey [~hexiaoqiao], I don't think I understand what you mean. The original bug was that the ranges are not disjoint, so they could cause collision of block tokens issued by different NameNodes. Both v06 and v07 patches solve this problem. We can still have a collision if we add new NameNodes to the cluster and restart them in arbitrary order. As I suggested we should try to solve this problem in a follow up jira. v06 patch introduced smaller ranges, so upgrading to this version will create collisions even if one keeps the number of NameNodes unchanged. v07 patch just fixes the arithmetic bug, and keeps the ranges as they were before. Hope this makes sense. > Serial number in BlockTokenSecretManager could overlap between different > namenodes > -- > > Key: HDFS-14305 > URL: https://issues.apache.org/jira/browse/HDFS-14305 > Project: Hadoop HDFS > Issue Type: Bug > Components: namenode, security >Reporter: Chao Sun >Assignee: Xiaoqiao He >Priority: Major > Fix For: 3.0.4, 3.3.0, 3.2.1, 3.1.3 > > Attachments: HDFS-14305-007.patch, HDFS-14305.001.patch, > HDFS-14305.002.patch, HDFS-14305.003.patch, HDFS-14305.004.patch, > HDFS-14305.005.patch, HDFS-14305.006.patch > > > Currently, a {{BlockTokenSecretManager}} starts with a random integer as the > initial serial number, and then use this formula to rotate it: > {code:java} > this.intRange = Integer.MAX_VALUE / numNNs; > this.nnRangeStart = intRange * nnIndex; > this.serialNo = (this.serialNo % intRange) + (nnRangeStart); > {code} > while {{numNNs}} is the total number of NameNodes in the cluster, and > {{nnIndex}} is the index of the current NameNode specified in the > configuration {{dfs.ha.namenodes.}}. > However, with this approach, different NameNode could have overlapping ranges > for serial number. 
For simplicity, let's assume {{Integer.MAX_VALUE}} is 100, > and we have 2 NameNodes {{nn1}} and {{nn2}} in configuration. Then the ranges > for these two are: > {code} > nn1 -> [-49, 49] > nn2 -> [1, 99] > {code} > This is because the initial serial number could be any negative integer. > Moreover, when the keys are updated, the serial number will again be updated > with the formula: > {code} > this.serialNo = (this.serialNo % intRange) + (nnRangeStart); > {code} > which means the new serial number could be updated to a range that belongs to > a different NameNode, thus increasing the chance of collision again. > When the collision happens, DataNodes could overwrite an existing key which > will cause clients to fail because of {{InvalidToken}} error. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
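The overlap in the simplified example above can be checked directly. The sketch below just replays the quoted formula with MAX = 100 and two NameNodes; the names are illustrative, not the real BlockTokenSecretManager fields:

```java
// Replays the quoted rotation formula with the simplified numbers from the
// description (Integer.MAX_VALUE taken as 100, two NameNodes).
public final class SerialRangeDemo {
  static int rotate(int serialNo, int max, int numNNs, int nnIndex) {
    int intRange = max / numNNs;           // 50
    int nnRangeStart = intRange * nnIndex; // 0 for nn1, 50 for nn2
    return (serialNo % intRange) + nnRangeStart;
  }

  public static void main(String[] args) {
    // Java's % keeps the sign of the dividend, so a negative initial
    // serialNo pulls nn2's result down into nn1's range:
    System.out.println("nn1: " + rotate(49, 100, 2, 0));  // 49, in [-49, 49]
    System.out.println("nn2: " + rotate(-49, 100, 2, 1)); // 1, also inside nn1's range
  }
}
```

Both results land in the shared [1, 49] region, which is exactly the collision the v07 patch's disjoint-range arithmetic has to avoid.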
[jira] [Created] (HDDS-2210) ContainerStateMachine should not be marked unhealthy if applyTransaction fails with closed container exception
Shashikant Banerjee created HDDS-2210: - Summary: ContainerStateMachine should not be marked unhealthy if applyTransaction fails with closed container exception Key: HDDS-2210 URL: https://issues.apache.org/jira/browse/HDDS-2210 Project: Hadoop Distributed Data Store Issue Type: Bug Components: Ozone Datanode Reporter: Shashikant Banerjee Assignee: Shashikant Banerjee Fix For: 0.5.0 Currently, if applyTransaction fails, the stateMachine is marked unhealthy and the next snapshot creation will fail. As a result, the raftServer will close down, leading to pipeline failure. A ClosedContainer exception should be ignored when deciding whether to mark the stateMachine unhealthy. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Work logged] (HDDS-1720) Add ability to configure RocksDB logs for Ozone Manager
[ https://issues.apache.org/jira/browse/HDDS-1720?focusedWorklogId=320708=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-320708 ] ASF GitHub Bot logged work on HDDS-1720: Author: ASF GitHub Bot Created on: 30/Sep/19 17:52 Start Date: 30/Sep/19 17:52 Worklog Time Spent: 10m Work Description: avijayanhwx commented on issue #1538: HDDS-1720 : Add ability to configure RocksDB logs for Ozone Manager. URL: https://github.com/apache/hadoop/pull/1538#issuecomment-536674819 /retest This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 320708) Time Spent: 40m (was: 0.5h) > Add ability to configure RocksDB logs for Ozone Manager > --- > > Key: HDDS-1720 > URL: https://issues.apache.org/jira/browse/HDDS-1720 > Project: Hadoop Distributed Data Store > Issue Type: Task >Reporter: Aravindan Vijayan >Assignee: Aravindan Vijayan >Priority: Major > Labels: pull-request-available > Fix For: 0.5.0 > > Time Spent: 40m > Remaining Estimate: 0h > > While doing performance testing, it was seen that there was no way to get > RocksDB logs for Ozone Manager. Along with Rocksdb metrics, this may be a > useful mechanism to understand the health of Rocksdb while investigating > large clusters. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Work logged] (HDDS-2187) ozone-mr test fails with No FileSystem for scheme "o3fs"
[ https://issues.apache.org/jira/browse/HDDS-2187?focusedWorklogId=320715=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-320715 ] ASF GitHub Bot logged work on HDDS-2187: Author: ASF GitHub Bot Created on: 30/Sep/19 17:58 Start Date: 30/Sep/19 17:58 Worklog Time Spent: 10m Work Description: elek commented on issue #1537: HDDS-2187. ozone-mr test fails with No FileSystem for scheme o3fs URL: https://github.com/apache/hadoop/pull/1537#issuecomment-536677464 Thanks for the update @adoroszlai. Looks good. Good to have `ozone fs` fixed. One problem: datanodes in ozonesecure-mr are not started AFAIK https://github.com/elek/ozone-ci-q4/blob/master/pr/pr-hdds-2187-2nl4x/acceptance/output.log This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 320715) Time Spent: 1h 20m (was: 1h 10m) > ozone-mr test fails with No FileSystem for scheme "o3fs" > > > Key: HDDS-2187 > URL: https://issues.apache.org/jira/browse/HDDS-2187 > Project: Hadoop Distributed Data Store > Issue Type: Bug > Components: test >Reporter: Attila Doroszlai >Assignee: Attila Doroszlai >Priority: Major > Labels: pull-request-available > Time Spent: 1h 20m > Remaining Estimate: 0h > > HDDS-2101 changed how Ozone filesystem provider is configured. {{ozone-mr}} > tests [started > failing|https://github.com/elek/ozone-ci/blob/2f2c99652af6b26a95f08eece9e545f0d72ccf45/pr/pr-hdds-2101-rtz55/acceptance/output.log#L255-L263], > but it [wasn't > noticed|https://github.com/elek/ozone-ci/blob/master/pr/pr-hdds-2101-rtz55/acceptance/result] > due to HDDS-2185. 
> {code} > Running command 'ozone fs -mkdir /user' > ${output} = mkdir: No FileSystem for scheme "o3fs" > {code} -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-14775) Add Timestamp for longest FSN write/read lock held log
[ https://issues.apache.org/jira/browse/HDFS-14775?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16941132#comment-16941132 ] Íñigo Goiri commented on HDFS-14775: For the hashCode and the equals, I usually prefer using HashCodeBuilder and EqualsBuilder as they take care of it. I wonder if by moving from ReadLockHeldInfo to LockHeldInfo we are losing some of the semantics here. > Add Timestamp for longest FSN write/read lock held log > -- > > Key: HDFS-14775 > URL: https://issues.apache.org/jira/browse/HDFS-14775 > Project: Hadoop HDFS > Issue Type: Improvement >Reporter: Chen Zhang >Assignee: Chen Zhang >Priority: Major > Attachments: HDFS-14775.001.patch, HDFS-14775.002.patch, > HDFS-14775.003.patch > > > HDFS-13946 improved the log for the longest read/write lock held time; it's a very > useful improvement. > In some conditions, we need to locate the detailed call information (user, ip, > path, etc.) for the longest lock holder, but the default throttle interval (10s) > is too long to find the corresponding audit log. I think we should add the > timestamp for the {{longestWriteLockHeldStackTrace}} -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
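The hashCode/equals point raised above can be sketched with the JDK's java.util.Objects helpers, in the same spirit as commons-lang's HashCodeBuilder/EqualsBuilder. This LockHeldInfo shape is hypothetical (fields chosen to carry the proposed timestamp); the actual patch may differ:

```java
import java.util.Objects;

// Hypothetical LockHeldInfo carrying the timestamp the comment asks for;
// equals/hashCode delegate to java.util.Objects so the two stay consistent.
public final class LockHeldInfo {
  private final long startTimeMs;    // timestamp to correlate with audit logs
  private final long heldIntervalMs; // how long the lock was held
  private final String stackTrace;   // longest-holder stack trace

  public LockHeldInfo(long startTimeMs, long heldIntervalMs, String stackTrace) {
    this.startTimeMs = startTimeMs;
    this.heldIntervalMs = heldIntervalMs;
    this.stackTrace = stackTrace;
  }

  @Override
  public boolean equals(Object o) {
    if (this == o) {
      return true;
    }
    if (!(o instanceof LockHeldInfo)) {
      return false;
    }
    LockHeldInfo other = (LockHeldInfo) o;
    return startTimeMs == other.startTimeMs
        && heldIntervalMs == other.heldIntervalMs
        && Objects.equals(stackTrace, other.stackTrace);
  }

  @Override
  public int hashCode() {
    return Objects.hash(startTimeMs, heldIntervalMs, stackTrace);
  }
}
```

Delegating both methods to the same field list avoids the classic bug of equals and hashCode drifting apart, which is also what the builder classes guard against.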
[jira] [Work logged] (HDDS-2001) Update Ratis version to 0.4.0
[ https://issues.apache.org/jira/browse/HDDS-2001?focusedWorklogId=320671=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-320671 ] ASF GitHub Bot logged work on HDDS-2001: Author: ASF GitHub Bot Created on: 30/Sep/19 17:00 Start Date: 30/Sep/19 17:00 Worklog Time Spent: 10m Work Description: nandakumar131 commented on issue #1497: HDDS-2001. Update Ratis version to 0.4.0. URL: https://github.com/apache/hadoop/pull/1497#issuecomment-536654104 /retest This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 320671) Time Spent: 2h 10m (was: 2h) > Update Ratis version to 0.4.0 > - > > Key: HDDS-2001 > URL: https://issues.apache.org/jira/browse/HDDS-2001 > Project: Hadoop Distributed Data Store > Issue Type: Improvement >Reporter: Nanda kumar >Assignee: Nanda kumar >Priority: Major > Labels: pull-request-available > Fix For: 0.5.0 > > Time Spent: 2h 10m > Remaining Estimate: 0h > > Update Ratis version to 0.4.0 -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org