[jira] [Commented] (HDFS-14885) UI: Fix a typo on WebUI of DataNode.
[ https://issues.apache.org/jira/browse/HDFS-14885?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16941542#comment-16941542 ] Hadoop QA commented on HDFS-14885: -- | (/) *{color:green}+1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 1m 22s{color} | {color:blue} Docker mode activated. {color} | || || || || {color:brown} Prechecks {color} || | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s{color} | {color:green} The patch does not contain any @author tags. {color} | || || || || {color:brown} trunk Compile Tests {color} || | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 23m 54s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 36m 54s{color} | {color:green} branch has no errors when building and testing our client artifacts. {color} | || || || || {color:brown} Patch Compile Tests {color} || | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 0m 59s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s{color} | {color:green} The patch has no whitespace issues. {color} | | {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 13m 15s{color} | {color:green} patch has no errors when building and testing our client artifacts. {color} | || || || || {color:brown} Other Tests {color} || | {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 28s{color} | {color:green} The patch does not generate ASF License warnings. {color} | | {color:black}{color} | {color:black} {color} | {color:black} 53m 42s{color} | {color:black} {color} | \\ \\ || Subsystem || Report/Notes || | Docker | Client=18.09.7 Server=18.09.7 Image:yetus/hadoop:efed4450bf1 | | JIRA Issue | HDFS-14885 | | JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12981848/HDFS-14885.patch | | Optional Tests | dupname asflicense shadedclient | | uname | Linux 459dc065a713 4.15.0-58-generic #64-Ubuntu SMP Tue Aug 6 11:12:41 UTC 2019 x86_64 x86_64 x86_64 GNU/Linux | | Build tool | maven | | Personality | /testptch/patchprocess/precommit/personality/provided.sh | | git revision | trunk / 137546a | | maven | version: Apache Maven 3.3.9 | | Max. process+thread count | 342 (vs. ulimit of 5500) | | modules | C: hadoop-hdfs-project/hadoop-hdfs U: hadoop-hdfs-project/hadoop-hdfs | | Console output | https://builds.apache.org/job/PreCommit-HDFS-Build/27990/console | | Powered by | Apache Yetus 0.8.0 http://yetus.apache.org | This message was automatically generated. > UI: Fix a typo on WebUI of DataNode. > > > Key: HDFS-14885 > URL: https://issues.apache.org/jira/browse/HDFS-14885 > Project: Hadoop HDFS > Issue Type: Bug > Components: datanode, ui >Reporter: Xieming Li >Assignee: Xieming Li >Priority: Trivial > Attachments: HDFS-14885.patch, Screen Shot 2019-10-01 at 12.40.29.png > > > A Period('.') should be added to the end of following sentence on WebUI of > DataNode. > "No nodes are decommissioning" > -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Updated] (HDFS-14373) EC : Decoding is failing when block group last incomplete cell fall in to AlignedStripe
[ https://issues.apache.org/jira/browse/HDFS-14373?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Surendra Singh Lilhore updated HDFS-14373: -- Component/s: ec Priority: Critical (was: Major) > EC : Decoding is failing when block group last incomplete cell fall in to > AlignedStripe > --- > > Key: HDFS-14373 > URL: https://issues.apache.org/jira/browse/HDFS-14373 > Project: Hadoop HDFS > Issue Type: Bug > Components: ec, hdfs-client >Affects Versions: 3.1.1 >Reporter: Surendra Singh Lilhore >Assignee: Surendra Singh Lilhore >Priority: Critical > Attachments: HDFS-14373.001.patch > > -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Updated] (HDFS-14885) UI: Fix a typo on WebUI of DataNode.
[ https://issues.apache.org/jira/browse/HDFS-14885?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xieming Li updated HDFS-14885: -- Attachment: HDFS-14885.patch Status: Patch Available (was: Open) > UI: Fix a typo on WebUI of DataNode. > > > Key: HDFS-14885 > URL: https://issues.apache.org/jira/browse/HDFS-14885 > Project: Hadoop HDFS > Issue Type: Bug > Components: datanode, ui >Reporter: Xieming Li >Assignee: Xieming Li >Priority: Trivial > Attachments: HDFS-14885.patch, Screen Shot 2019-10-01 at 12.40.29.png > > > A Period('.') should be added to the end of following sentence on WebUI of > DataNode. > "No nodes are decommissioning" > -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDDS-2001) Update Ratis version to 0.4.0
[ https://issues.apache.org/jira/browse/HDDS-2001?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16941516#comment-16941516 ] Nanda kumar commented on HDDS-2001: --- The change in ozone-0.4.1 branch is done as part of HDDS-2020 > Update Ratis version to 0.4.0 > - > > Key: HDDS-2001 > URL: https://issues.apache.org/jira/browse/HDDS-2001 > Project: Hadoop Distributed Data Store > Issue Type: Improvement >Reporter: Nanda kumar >Assignee: Nanda kumar >Priority: Major > Labels: pull-request-available > Fix For: 0.5.0 > > Time Spent: 2.5h > Remaining Estimate: 0h > > Update Ratis version to 0.4.0 -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Resolved] (HDDS-2001) Update Ratis version to 0.4.0
[ https://issues.apache.org/jira/browse/HDDS-2001?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nanda kumar resolved HDDS-2001. --- Resolution: Fixed > Update Ratis version to 0.4.0 > - > > Key: HDDS-2001 > URL: https://issues.apache.org/jira/browse/HDDS-2001 > Project: Hadoop Distributed Data Store > Issue Type: Improvement >Reporter: Nanda kumar >Assignee: Nanda kumar >Priority: Major > Labels: pull-request-available > Fix For: 0.5.0 > > Time Spent: 2.5h > Remaining Estimate: 0h > > Update Ratis version to 0.4.0 -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Updated] (HDFS-14885) UI: Fix a typo on WebUI of DataNode.
[ https://issues.apache.org/jira/browse/HDFS-14885?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xieming Li updated HDFS-14885: -- Summary: UI: Fix a typo on WebUI of DataNode. (was: UI: Fix a typo in ) > UI: Fix a typo on WebUI of DataNode. > > > Key: HDFS-14885 > URL: https://issues.apache.org/jira/browse/HDFS-14885 > Project: Hadoop HDFS > Issue Type: Bug > Components: datanode, ui >Reporter: Xieming Li >Assignee: Xieming Li >Priority: Trivial > Attachments: Screen Shot 2019-10-01 at 12.40.29.png > > > A Period('.') should be added to the end of following sentence on WebUI of > DataNode. > "No nodes are decommissioning" > -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Created] (HDFS-14885) UI: Fix a typo in
Xieming Li created HDFS-14885: - Summary: UI: Fix a typo in Key: HDFS-14885 URL: https://issues.apache.org/jira/browse/HDFS-14885 Project: Hadoop HDFS Issue Type: Bug Components: datanode, ui Reporter: Xieming Li Assignee: Xieming Li Attachments: Screen Shot 2019-10-01 at 12.40.29.png A Period('.') should be added to the end of following sentence on WebUI of DataNode. "No nodes are decommissioning" -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDDS-2169) Avoid buffer copies while submitting client requests in Ratis
[ https://issues.apache.org/jira/browse/HDDS-2169?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16941498#comment-16941498 ] Hadoop QA commented on HDDS-2169: - | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 1m 36s{color} | {color:blue} Docker mode activated. {color} | || || || || {color:brown} Prechecks {color} || | {color:green}+1{color} | {color:green} dupname {color} | {color:green} 0m 0s{color} | {color:green} No case conflicting files found. {color} | | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s{color} | {color:green} The patch does not contain any @author tags. {color} | | {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 0s{color} | {color:green} The patch appears to include 1 new or modified test files. {color} | || || || || {color:brown} trunk Compile Tests {color} || | {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 20s{color} | {color:blue} Maven dependency ordering for branch {color} | | {color:red}-1{color} | {color:red} mvninstall {color} | {color:red} 0m 31s{color} | {color:red} hadoop-hdds in trunk failed. {color} | | {color:red}-1{color} | {color:red} mvninstall {color} | {color:red} 0m 34s{color} | {color:red} hadoop-ozone in trunk failed. {color} | | {color:red}-1{color} | {color:red} compile {color} | {color:red} 0m 19s{color} | {color:red} hadoop-hdds in trunk failed. {color} | | {color:red}-1{color} | {color:red} compile {color} | {color:red} 0m 13s{color} | {color:red} hadoop-ozone in trunk failed. {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 59s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 0s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 16m 11s{color} | {color:green} branch has no errors when building and testing our client artifacts. {color} | | {color:red}-1{color} | {color:red} javadoc {color} | {color:red} 0m 18s{color} | {color:red} hadoop-hdds in trunk failed. {color} | | {color:red}-1{color} | {color:red} javadoc {color} | {color:red} 0m 16s{color} | {color:red} hadoop-ozone in trunk failed. {color} | | {color:blue}0{color} | {color:blue} spotbugs {color} | {color:blue} 17m 38s{color} | {color:blue} Used deprecated FindBugs config; considering switching to SpotBugs. {color} | | {color:red}-1{color} | {color:red} findbugs {color} | {color:red} 0m 31s{color} | {color:red} hadoop-hdds in trunk failed. {color} | | {color:red}-1{color} | {color:red} findbugs {color} | {color:red} 0m 17s{color} | {color:red} hadoop-ozone in trunk failed. {color} | || || || || {color:brown} Patch Compile Tests {color} || | {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 14s{color} | {color:blue} Maven dependency ordering for patch {color} | | {color:red}-1{color} | {color:red} mvninstall {color} | {color:red} 0m 33s{color} | {color:red} hadoop-hdds in the patch failed. {color} | | {color:red}-1{color} | {color:red} mvninstall {color} | {color:red} 0m 36s{color} | {color:red} hadoop-ozone in the patch failed. {color} | | {color:red}-1{color} | {color:red} compile {color} | {color:red} 0m 21s{color} | {color:red} hadoop-hdds in the patch failed. 
{color} | | {color:red}-1{color} | {color:red} compile {color} | {color:red} 0m 16s{color} | {color:red} hadoop-ozone in the patch failed. {color} | | {color:red}-1{color} | {color:red} javac {color} | {color:red} 0m 21s{color} | {color:red} hadoop-hdds in the patch failed. {color} | | {color:red}-1{color} | {color:red} javac {color} | {color:red} 0m 17s{color} | {color:red} hadoop-ozone in the patch failed. {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 54s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 0s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s{color} | {color:green} The patch has no whitespace issues. {color} | | {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 12m 59s{color} | {color:green} patch has no errors when building and testing our client artifacts. {color} | | {color:red}-1{color} | {color:red} javadoc {color} | {color:red} 0m 19s{color} | {color:red} hadoop-hdds in the patch failed. {color} | | {color:red}-1{color} | {color:red} javadoc {color} | {color:red} 0m 16s{color} | {color:red} hadoop-ozone in the patch failed. {color} | | {color:red}-1{color} | {color:red} findbugs {color} | {color:red} 0m 28s{color} | {color:red} hadoop-hdds in the patch failed. {color} | |
[jira] [Work logged] (HDDS-2169) Avoid buffer copies while submitting client requests in Ratis
[ https://issues.apache.org/jira/browse/HDDS-2169?focusedWorklogId=321044=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-321044 ] ASF GitHub Bot logged work on HDDS-2169: Author: ASF GitHub Bot Created on: 01/Oct/19 03:31 Start Date: 01/Oct/19 03:31 Worklog Time Spent: 10m Work Description: hadoop-yetus commented on issue #1517: HDDS-2169 URL: https://github.com/apache/hadoop/pull/1517#issuecomment-536845581 :broken_heart: **-1 overall** | Vote | Subsystem | Runtime | Comment | |::|--:|:|:| | 0 | reexec | 96 | Docker mode activated. | ||| _ Prechecks _ | | +1 | dupname | 0 | No case conflicting files found. | | +1 | @author | 0 | The patch does not contain any @author tags. | | +1 | test4tests | 0 | The patch appears to include 1 new or modified test files. | ||| _ trunk Compile Tests _ | | 0 | mvndep | 20 | Maven dependency ordering for branch | | -1 | mvninstall | 31 | hadoop-hdds in trunk failed. | | -1 | mvninstall | 34 | hadoop-ozone in trunk failed. | | -1 | compile | 19 | hadoop-hdds in trunk failed. | | -1 | compile | 13 | hadoop-ozone in trunk failed. | | +1 | checkstyle | 59 | trunk passed | | +1 | mvnsite | 0 | trunk passed | | +1 | shadedclient | 971 | branch has no errors when building and testing our client artifacts. | | -1 | javadoc | 18 | hadoop-hdds in trunk failed. | | -1 | javadoc | 16 | hadoop-ozone in trunk failed. | | 0 | spotbugs | 1058 | Used deprecated FindBugs config; considering switching to SpotBugs. | | -1 | findbugs | 31 | hadoop-hdds in trunk failed. | | -1 | findbugs | 17 | hadoop-ozone in trunk failed. | ||| _ Patch Compile Tests _ | | 0 | mvndep | 14 | Maven dependency ordering for patch | | -1 | mvninstall | 33 | hadoop-hdds in the patch failed. | | -1 | mvninstall | 36 | hadoop-ozone in the patch failed. | | -1 | compile | 21 | hadoop-hdds in the patch failed. | | -1 | compile | 16 | hadoop-ozone in the patch failed. | | -1 | javac | 21 | hadoop-hdds in the patch failed. | | -1 | javac | 17 | hadoop-ozone in the patch failed. | | +1 | checkstyle | 54 | the patch passed | | +1 | mvnsite | 0 | the patch passed | | +1 | whitespace | 0 | The patch has no whitespace issues. | | +1 | shadedclient | 779 | patch has no errors when building and testing our client artifacts. | | -1 | javadoc | 19 | hadoop-hdds in the patch failed. | | -1 | javadoc | 16 | hadoop-ozone in the patch failed. | | -1 | findbugs | 28 | hadoop-hdds in the patch failed. | | -1 | findbugs | 18 | hadoop-ozone in the patch failed. | ||| _ Other Tests _ | | -1 | unit | 24 | hadoop-hdds in the patch failed. | | -1 | unit | 23 | hadoop-ozone in the patch failed. | | +1 | asflicense | 29 | The patch does not generate ASF License warnings. 
| | | | 2550 | | | Subsystem | Report/Notes | |--:|:-| | Docker | Client=19.03.0 Server=19.03.0 base: https://builds.apache.org/job/hadoop-multibranch/job/PR-1517/7/artifact/out/Dockerfile | | GITHUB PR | https://github.com/apache/hadoop/pull/1517 | | JIRA Issue | HDDS-2169 | | Optional Tests | dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient findbugs checkstyle | | uname | Linux f83947a622e3 4.15.0-58-generic #64-Ubuntu SMP Tue Aug 6 11:12:41 UTC 2019 x86_64 x86_64 x86_64 GNU/Linux | | Build tool | maven | | Personality | personality/hadoop.sh | | git revision | trunk / b3275ab | | Default Java | 1.8.0_222 | | mvninstall | https://builds.apache.org/job/hadoop-multibranch/job/PR-1517/7/artifact/out/branch-mvninstall-hadoop-hdds.txt | | mvninstall | https://builds.apache.org/job/hadoop-multibranch/job/PR-1517/7/artifact/out/branch-mvninstall-hadoop-ozone.txt | | compile | https://builds.apache.org/job/hadoop-multibranch/job/PR-1517/7/artifact/out/branch-compile-hadoop-hdds.txt | | compile | https://builds.apache.org/job/hadoop-multibranch/job/PR-1517/7/artifact/out/branch-compile-hadoop-ozone.txt | | javadoc | https://builds.apache.org/job/hadoop-multibranch/job/PR-1517/7/artifact/out/branch-javadoc-hadoop-hdds.txt | | javadoc | https://builds.apache.org/job/hadoop-multibranch/job/PR-1517/7/artifact/out/branch-javadoc-hadoop-ozone.txt | | findbugs | https://builds.apache.org/job/hadoop-multibranch/job/PR-1517/7/artifact/out/branch-findbugs-hadoop-hdds.txt | | findbugs | https://builds.apache.org/job/hadoop-multibranch/job/PR-1517/7/artifact/out/branch-findbugs-hadoop-ozone.txt | | mvninstall | https://builds.apache.org/job/hadoop-multibranch/job/PR-1517/7/artifact/out/patch-mvninstall-hadoop-hdds.txt | | mvninstall | https://builds.apache.org/job/hadoop-multibranch/job/PR-1517/7/artifact/out/patch-mvninstall-hadoop-ozone.txt
[jira] [Work logged] (HDDS-1984) Fix listBucket API
[ https://issues.apache.org/jira/browse/HDDS-1984?focusedWorklogId=321037=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-321037 ] ASF GitHub Bot logged work on HDDS-1984: Author: ASF GitHub Bot Created on: 01/Oct/19 02:56 Start Date: 01/Oct/19 02:56 Worklog Time Spent: 10m Work Description: hadoop-yetus commented on issue #1555: HDDS-1984. Fix listBucket API. URL: https://github.com/apache/hadoop/pull/1555#issuecomment-536837875 :broken_heart: **-1 overall** | Vote | Subsystem | Runtime | Comment | |::|--:|:|:| | 0 | reexec | 40 | Docker mode activated. | ||| _ Prechecks _ | | +1 | dupname | 0 | No case conflicting files found. | | +1 | @author | 0 | The patch does not contain any @author tags. | | +1 | test4tests | 0 | The patch appears to include 1 new or modified test files. | ||| _ trunk Compile Tests _ | | 0 | mvndep | 25 | Maven dependency ordering for branch | | -1 | mvninstall | 31 | hadoop-hdds in trunk failed. | | -1 | mvninstall | 36 | hadoop-ozone in trunk failed. | | -1 | compile | 20 | hadoop-hdds in trunk failed. | | -1 | compile | 15 | hadoop-ozone in trunk failed. | | +1 | checkstyle | 51 | trunk passed | | +1 | mvnsite | 0 | trunk passed | | +1 | shadedclient | 818 | branch has no errors when building and testing our client artifacts. | | -1 | javadoc | 25 | hadoop-hdds in trunk failed. | | -1 | javadoc | 16 | hadoop-ozone in trunk failed. | | 0 | spotbugs | 919 | Used deprecated FindBugs config; considering switching to SpotBugs. | | -1 | findbugs | 34 | hadoop-hdds in trunk failed. | | -1 | findbugs | 22 | hadoop-ozone in trunk failed. | ||| _ Patch Compile Tests _ | | 0 | mvndep | 28 | Maven dependency ordering for patch | | -1 | mvninstall | 37 | hadoop-hdds in the patch failed. | | -1 | mvninstall | 39 | hadoop-ozone in the patch failed. | | -1 | compile | 25 | hadoop-hdds in the patch failed. | | -1 | compile | 16 | hadoop-ozone in the patch failed. | | -1 | javac | 25 | hadoop-hdds in the patch failed. | | -1 | javac | 16 | hadoop-ozone in the patch failed. | | -0 | checkstyle | 25 | hadoop-hdds: The patch generated 1 new + 0 unchanged - 0 fixed = 1 total (was 0) | | +1 | mvnsite | 0 | the patch passed | | +1 | whitespace | 0 | The patch has no whitespace issues. | | +1 | shadedclient | 707 | patch has no errors when building and testing our client artifacts. | | -1 | javadoc | 21 | hadoop-hdds in the patch failed. | | -1 | javadoc | 20 | hadoop-ozone in the patch failed. | | -1 | findbugs | 32 | hadoop-hdds in the patch failed. | | -1 | findbugs | 20 | hadoop-ozone in the patch failed. | ||| _ Other Tests _ | | -1 | unit | 30 | hadoop-hdds in the patch failed. | | -1 | unit | 28 | hadoop-ozone in the patch failed. | | +1 | asflicense | 34 | The patch does not generate ASF License warnings. 
| | | | 2360 | | | Subsystem | Report/Notes | |--:|:-| | Docker | Client=19.03.1 Server=19.03.1 base: https://builds.apache.org/job/hadoop-multibranch/job/PR-1555/2/artifact/out/Dockerfile | | GITHUB PR | https://github.com/apache/hadoop/pull/1555 | | Optional Tests | dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient findbugs checkstyle | | uname | Linux 883182c3dde4 4.15.0-58-generic #64-Ubuntu SMP Tue Aug 6 11:12:41 UTC 2019 x86_64 x86_64 x86_64 GNU/Linux | | Build tool | maven | | Personality | personality/hadoop.sh | | git revision | trunk / b3275ab | | Default Java | 1.8.0_222 | | mvninstall | https://builds.apache.org/job/hadoop-multibranch/job/PR-1555/2/artifact/out/branch-mvninstall-hadoop-hdds.txt | | mvninstall | https://builds.apache.org/job/hadoop-multibranch/job/PR-1555/2/artifact/out/branch-mvninstall-hadoop-ozone.txt | | compile | https://builds.apache.org/job/hadoop-multibranch/job/PR-1555/2/artifact/out/branch-compile-hadoop-hdds.txt | | compile | https://builds.apache.org/job/hadoop-multibranch/job/PR-1555/2/artifact/out/branch-compile-hadoop-ozone.txt | | javadoc | https://builds.apache.org/job/hadoop-multibranch/job/PR-1555/2/artifact/out/branch-javadoc-hadoop-hdds.txt | | javadoc | https://builds.apache.org/job/hadoop-multibranch/job/PR-1555/2/artifact/out/branch-javadoc-hadoop-ozone.txt | | findbugs | https://builds.apache.org/job/hadoop-multibranch/job/PR-1555/2/artifact/out/branch-findbugs-hadoop-hdds.txt | | findbugs | https://builds.apache.org/job/hadoop-multibranch/job/PR-1555/2/artifact/out/branch-findbugs-hadoop-ozone.txt | | mvninstall | https://builds.apache.org/job/hadoop-multibranch/job/PR-1555/2/artifact/out/patch-mvninstall-hadoop-hdds.txt | | mvninstall |
[jira] [Work logged] (HDDS-1984) Fix listBucket API
[ https://issues.apache.org/jira/browse/HDDS-1984?focusedWorklogId=321036=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-321036 ] ASF GitHub Bot logged work on HDDS-1984: Author: ASF GitHub Bot Created on: 01/Oct/19 02:56 Start Date: 01/Oct/19 02:56 Worklog Time Spent: 10m Work Description: hadoop-yetus commented on issue #1555: HDDS-1984. Fix listBucket API. URL: https://github.com/apache/hadoop/pull/1555#issuecomment-536837796 :broken_heart: **-1 overall** | Vote | Subsystem | Runtime | Comment | |::|--:|:|:| | 0 | reexec | 42 | Docker mode activated. | ||| _ Prechecks _ | | +1 | dupname | 1 | No case conflicting files found. | | +1 | @author | 0 | The patch does not contain any @author tags. | | +1 | test4tests | 0 | The patch appears to include 1 new or modified test files. | ||| _ trunk Compile Tests _ | | 0 | mvndep | 68 | Maven dependency ordering for branch | | -1 | mvninstall | 44 | hadoop-hdds in trunk failed. | | -1 | mvninstall | 39 | hadoop-ozone in trunk failed. | | -1 | compile | 21 | hadoop-hdds in trunk failed. | | -1 | compile | 14 | hadoop-ozone in trunk failed. | | +1 | checkstyle | 62 | trunk passed | | +1 | mvnsite | 0 | trunk passed | | +1 | shadedclient | 851 | branch has no errors when building and testing our client artifacts. | | -1 | javadoc | 21 | hadoop-hdds in trunk failed. | | -1 | javadoc | 18 | hadoop-ozone in trunk failed. | | 0 | spotbugs | 955 | Used deprecated FindBugs config; considering switching to SpotBugs. | | -1 | findbugs | 38 | hadoop-hdds in trunk failed. | | -1 | findbugs | 22 | hadoop-ozone in trunk failed. | ||| _ Patch Compile Tests _ | | 0 | mvndep | 30 | Maven dependency ordering for patch | | -1 | mvninstall | 36 | hadoop-hdds in the patch failed. | | -1 | mvninstall | 40 | hadoop-ozone in the patch failed. | | -1 | compile | 26 | hadoop-hdds in the patch failed. | | -1 | compile | 18 | hadoop-ozone in the patch failed. | | -1 | javac | 26 | hadoop-hdds in the patch failed. | | -1 | javac | 18 | hadoop-ozone in the patch failed. | | -0 | checkstyle | 28 | hadoop-hdds: The patch generated 1 new + 0 unchanged - 0 fixed = 1 total (was 0) | | +1 | mvnsite | 0 | the patch passed | | +1 | whitespace | 0 | The patch has no whitespace issues. | | +1 | shadedclient | 715 | patch has no errors when building and testing our client artifacts. | | -1 | javadoc | 24 | hadoop-hdds in the patch failed. | | -1 | javadoc | 18 | hadoop-ozone in the patch failed. | | -1 | findbugs | 32 | hadoop-hdds in the patch failed. | | -1 | findbugs | 20 | hadoop-ozone in the patch failed. | ||| _ Other Tests _ | | -1 | unit | 28 | hadoop-hdds in the patch failed. | | -1 | unit | 29 | hadoop-ozone in the patch failed. | | +1 | asflicense | 35 | The patch does not generate ASF License warnings. 
| | | | 2467 | | | Subsystem | Report/Notes | |--:|:-| | Docker | Client=19.03.1 Server=19.03.1 base: https://builds.apache.org/job/hadoop-multibranch/job/PR-1555/1/artifact/out/Dockerfile | | GITHUB PR | https://github.com/apache/hadoop/pull/1555 | | Optional Tests | dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient findbugs checkstyle | | uname | Linux 08929fca86df 4.15.0-58-generic #64-Ubuntu SMP Tue Aug 6 11:12:41 UTC 2019 x86_64 x86_64 x86_64 GNU/Linux | | Build tool | maven | | Personality | personality/hadoop.sh | | git revision | trunk / b3275ab | | Default Java | 1.8.0_222 | | mvninstall | https://builds.apache.org/job/hadoop-multibranch/job/PR-1555/1/artifact/out/branch-mvninstall-hadoop-hdds.txt | | mvninstall | https://builds.apache.org/job/hadoop-multibranch/job/PR-1555/1/artifact/out/branch-mvninstall-hadoop-ozone.txt | | compile | https://builds.apache.org/job/hadoop-multibranch/job/PR-1555/1/artifact/out/branch-compile-hadoop-hdds.txt | | compile | https://builds.apache.org/job/hadoop-multibranch/job/PR-1555/1/artifact/out/branch-compile-hadoop-ozone.txt | | javadoc | https://builds.apache.org/job/hadoop-multibranch/job/PR-1555/1/artifact/out/branch-javadoc-hadoop-hdds.txt | | javadoc | https://builds.apache.org/job/hadoop-multibranch/job/PR-1555/1/artifact/out/branch-javadoc-hadoop-ozone.txt | | findbugs | https://builds.apache.org/job/hadoop-multibranch/job/PR-1555/1/artifact/out/branch-findbugs-hadoop-hdds.txt | | findbugs | https://builds.apache.org/job/hadoop-multibranch/job/PR-1555/1/artifact/out/branch-findbugs-hadoop-ozone.txt | | mvninstall | https://builds.apache.org/job/hadoop-multibranch/job/PR-1555/1/artifact/out/patch-mvninstall-hadoop-hdds.txt | | mvninstall |
[jira] [Work logged] (HDDS-2169) Avoid buffer copies while submitting client requests in Ratis
[ https://issues.apache.org/jira/browse/HDDS-2169?focusedWorklogId=321035=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-321035 ] ASF GitHub Bot logged work on HDDS-2169: Author: ASF GitHub Bot Created on: 01/Oct/19 02:50 Start Date: 01/Oct/19 02:50 Worklog Time Spent: 10m Work Description: szetszwo commented on issue #1517: HDDS-2169 URL: https://github.com/apache/hadoop/pull/1517#issuecomment-536836755 > Thanks @szetszwo for working on this. With the patch, while running the tests in TestDataValidateWithUnsafeByteOperations, the below issue is observed. > > `2019-09-30 21:58:02,745 [grpc-default-executor-2] ERROR segmented.SegmentedRaftLogWorker (SegmentedRaftLogWorker.java:(449)) - e4ab8454-30fe-420c-a1cf-40d223cb4898@group-D0335C23E8DA-SegmentedRaftLogWorker: writeStateMachineData failed for index 1, entry=(t:1, i:1), STATEMACHINELOGENTRY, client-6C45A0D09519, cid=8 java.lang.IndexOutOfBoundsException: End index: 135008824 >= 207 at org.apache.ratis.thirdparty.com.google.protobuf.ByteString.checkRange(ByteString.java:1233) at org.apache.ratis.thirdparty.com.google.protobuf.ByteString$LiteralByteString.substring(ByteString.java:1288) at org.apache.hadoop.hdds.ratis.ContainerCommandRequestMessage.toProto(ContainerCommandRequestMessage.java:66) at org.apache.hadoop.ozone.container.common.transport.server.ratis.ContainerStateMachine.getContainerCommandRequestProto(ContainerStateMachine.java:375) at org.apache.hadoop.ozone.container.common.transport.server.ratis.ContainerStateMachine.writeStateMachineData(ContainerStateMachine.java:494) at org.apache.ratis.server.raftlog.segmented.SegmentedRaftLogWorker$WriteLog.(SegmentedRaftLogWorker.java:447) at org.apache.ratis.server.raftlog.segmented.SegmentedRaftLogWorker.writeLogEntry(SegmentedRaftLogWorker.java:397) at org.apache.ratis.server.raftlog.segmented.SegmentedRaftLog.appendEntryImpl(SegmentedRaftLog.java:411) at org.apache.ratis.server.raftlog.RaftLog.lambda$appendEntry$10(RaftLog.java:359) at org.apache.ratis.server.raftlog.RaftLogSequentialOps$Runner.runSequentially(RaftLogSequentialOps.java:77) at org.apache.ratis.server.raftlog.RaftLog.appendEntry(RaftLog.java:359) at org.apache.ratis.server.raftlog.RaftLog.appendImpl(RaftLog.java:183) at org.apache.ratis.server.raftlog.RaftLog.lambda$append$2(RaftLog.java:159) at org.apache.ratis.server.raftlog.RaftLogSequentialOps$Runner.runSequentially(RaftLogSequentialOps.java:68) at org.apache.ratis.server.raftlog.RaftLog.append(RaftLog.java:159) at org.apache.ratis.server.impl.ServerState.appendLog(ServerState.java:282) at org.apache.ratis.server.impl.RaftServerImpl.appendTransaction(RaftServerImpl.java:505) at org.apache.ratis.server.impl.RaftServerImpl.submitClientRequestAsync(RaftServerImpl.java:576) at org.apache.ratis.server.impl.RaftServerProxy.lambda$submitClientRequestAsync$7(RaftServerProxy.java:333) at org.apache.ratis.server.impl.RaftServerProxy.lambda$null$5(RaftServerProxy.java:328) at org.apache.ratis.util.JavaUtils.callAsUnchecked(JavaUtils.java:109) at org.apache.ratis.server.impl.RaftServerProxy.lambda$submitRequest$6(RaftServerProxy.java:328) at java.util.concurrent.CompletableFuture.uniComposeStage(CompletableFuture.java:981) at java.util.concurrent.CompletableFuture.thenCompose(CompletableFuture.java:2124) at org.apache.ratis.server.impl.RaftServerProxy.submitRequest(RaftServerProxy.java:327) at org.apache.ratis.server.impl.RaftServerProxy.submitClientRequestAsync(RaftServerProxy.java:333) at 
org.apache.ratis.grpc.client.GrpcClientProtocolService$RequestStreamObserver.processClientRequest(GrpcClientProtocolService.java:220) at org.apache.ratis.grpc.client.GrpcClientProtocolService$OrderedRequestStreamObserver.processClientRequest(GrpcClientProtocolService.java:326) at org.apache.ratis.util.SlidingWindow$Server.processRequestsFromHead(SlidingWindow.java:429) at org.apache.ratis.util.SlidingWindow$Server.receivedRequest(SlidingWindow.java:421) at org.apache.ratis.grpc.client.GrpcClientProtocolService$OrderedRequestStreamObserver.processClientRequest(GrpcClientProtocolService.java:345) at org.apache.ratis.grpc.client.GrpcClientProtocolService$RequestStreamObserver.onNext(GrpcClientProtocolService.java:240) at org.apache.ratis.grpc.client.GrpcClientProtocolService$RequestStreamObserver.onNext(GrpcClientProtocolService.java:168) at org.apache.ratis.thirdparty.io.grpc.stub.ServerCalls$StreamingServerCallHandler$StreamingServerCallListener.onMessage(ServerCalls.java:248) at org.apache.ratis.thirdparty.io.grpc.internal.ServerCallImpl$ServerStreamListenerImpl.messagesAvailable(ServerCallImpl.java:263) at org.apache.ratis.thirdparty.io.grpc.internal.ServerImpl$JumpToApplicationThreadServerStreamListener$1MessagesAvailable.runInContext(ServerImpl.java:686) at
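A minimal, illustrative demo of the ByteString.substring contract that the trace above trips over — assuming only the protobuf ByteString API (here via the Ratis thirdparty relocation); this is a sketch, not the patch itself:
{code:java}
import org.apache.ratis.thirdparty.com.google.protobuf.ByteString;

// Illustrative only: ByteString.substring(begin, end) throws
// IndexOutOfBoundsException from checkRange when end > size(), which is
// the "End index: 135008824 >= 207" failure in the trace above.
public class ByteStringBoundsDemo {
  public static void main(String[] args) {
    ByteString data = ByteString.copyFromUtf8("hello world"); // 11 bytes
    System.out.println(data.substring(0, 5).toStringUtf8());  // "hello", in range
    try {
      data.substring(0, 100); // end index 100 >= size 11: checkRange rejects it
    } catch (IndexOutOfBoundsException e) {
      System.out.println("out of range: " + e.getMessage());
    }
  }
}
{code}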
[jira] [Commented] (HDDS-2175) Propagate System Exceptions from the OzoneManager
[ https://issues.apache.org/jira/browse/HDDS-2175?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16941476#comment-16941476 ] Anu Engineer commented on HDDS-2175: It is something that I disagree with. But if you feel strongly about this; please go ahead. > Propagate System Exceptions from the OzoneManager > - > > Key: HDDS-2175 > URL: https://issues.apache.org/jira/browse/HDDS-2175 > Project: Hadoop Distributed Data Store > Issue Type: Improvement > Components: Ozone Manager >Reporter: Supratim Deka >Assignee: Supratim Deka >Priority: Major > Labels: pull-request-available > Time Spent: 1h > Remaining Estimate: 0h > > Exceptions encountered while processing requests on the OM are categorized as > business exceptions and system exceptions. All of the business exceptions are > captured as OMException and have an associated status code which is returned > to the client. The handling of these is not going to be changed. > Currently system exceptions are returned as INTERNAL ERROR to the client with > a 1 line message string from the exception. The scope of this jira is to > capture system exceptions and propagate the related information(including the > complete stack trace) back to the client. > There are 3 sub-tasks required to achieve this > 1. Separate capture and handling for OMException and the other > exceptions(IOException). For system exceptions, use Hadoop IPC > ServiceException mechanism to send the stack trace to the client. > 2. track and propagate exceptions inside Ratis OzoneManagerStateMachine and > propagate up to the OzoneManager layer (on the leader). Currently, these > exceptions are not being tracked. > 3. Handle and propagate exceptions from Ratis. > Will raise jira for each sub-task. > -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDDS-1984) Fix listBucket API
[ https://issues.apache.org/jira/browse/HDDS-1984?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16941471#comment-16941471 ] Bharat Viswanadham commented on HDDS-1984: -- Just posted an initial starter PR; I am still thinking about how to improve it, since with the posted patch every call iterates the entire map. (Will need a further look into how this can be improved.) > Fix listBucket API > -- > > Key: HDDS-1984 > URL: https://issues.apache.org/jira/browse/HDDS-1984 > Project: Hadoop Distributed Data Store > Issue Type: Sub-task >Reporter: Bharat Viswanadham >Assignee: Bharat Viswanadham >Priority: Major > Labels: pull-request-available > Time Spent: 10m > Remaining Estimate: 0h > > This Jira is to fix listBucket API in HA code path. > In HA, we have an in-memory cache, where we put the result to in-memory cache > and return the response, later it will be picked by double buffer thread and > it will flush to disk. So, now when do listBuckets, it should use both > in-memory cache and rocksdb bucket table to list buckets in a volume. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
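A minimal sketch of the cache/table merge the description calls for, with hypothetical stand-in types rather than the actual OM classes: overlay the un-flushed in-memory cache on the persisted bucket table, then list in sorted order.
{code:java}
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical sketch: String keys stand in for bucket table keys and the
// cache map stands in for the OM table cache; a null cache value models an
// entry deleted in the cache but not yet flushed by the double buffer.
public class ListBucketsSketch {
  static List<String> listBuckets(TreeMap<String, String> dbTable,
      Map<String, String> cache, String volumePrefix, int maxBuckets) {
    TreeMap<String, String> merged = new TreeMap<>(dbTable);
    cache.forEach((key, value) -> {
      if (value == null) {
        merged.remove(key);     // deleted in cache, not yet flushed
      } else {
        merged.put(key, value); // created/updated in cache
      }
    });
    List<String> result = new ArrayList<>();
    for (String key : merged.tailMap(volumePrefix).keySet()) {
      if (!key.startsWith(volumePrefix) || result.size() >= maxBuckets) {
        break;
      }
      result.add(key);
    }
    return result;
  }
}
{code}
As Bharat notes above, the naive version walks the entire cache on every call; a sorted cache view or per-volume index would avoid that.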
[jira] [Assigned] (HDDS-1984) Fix listBucket API
[ https://issues.apache.org/jira/browse/HDDS-1984?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] YiSheng Lien reassigned HDDS-1984: -- Assignee: Bharat Viswanadham > Fix listBucket API > -- > > Key: HDDS-1984 > URL: https://issues.apache.org/jira/browse/HDDS-1984 > Project: Hadoop Distributed Data Store > Issue Type: Sub-task >Reporter: Bharat Viswanadham >Assignee: Bharat Viswanadham >Priority: Major > Labels: pull-request-available > Time Spent: 10m > Remaining Estimate: 0h > > This Jira is to fix listBucket API in HA code path. > In HA, we have an in-memory cache, where we put the result to in-memory cache > and return the response, later it will be picked by double buffer thread and > it will flush to disk. So, now when do listBuckets, it should use both > in-memory cache and rocksdb bucket table to list buckets in a volume. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDDS-1984) Fix listBucket API
[ https://issues.apache.org/jira/browse/HDDS-1984?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16941469#comment-16941469 ] YiSheng Lien commented on HDDS-1984: Hello [~bharat] Thanks for the comment, Never mind about it, I can learn more about it with your PR :) > Fix listBucket API > -- > > Key: HDDS-1984 > URL: https://issues.apache.org/jira/browse/HDDS-1984 > Project: Hadoop Distributed Data Store > Issue Type: Sub-task >Reporter: Bharat Viswanadham >Priority: Major > Labels: pull-request-available > Time Spent: 10m > Remaining Estimate: 0h > > This Jira is to fix listBucket API in HA code path. > In HA, we have an in-memory cache, where we put the result to in-memory cache > and return the response, later it will be picked by double buffer thread and > it will flush to disk. So, now when do listBuckets, it should use both > in-memory cache and rocksdb bucket table to list buckets in a volume. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Assigned] (HDDS-1984) Fix listBucket API
[ https://issues.apache.org/jira/browse/HDDS-1984?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] YiSheng Lien reassigned HDDS-1984: -- Assignee: (was: YiSheng Lien) > Fix listBucket API > -- > > Key: HDDS-1984 > URL: https://issues.apache.org/jira/browse/HDDS-1984 > Project: Hadoop Distributed Data Store > Issue Type: Sub-task >Reporter: Bharat Viswanadham >Priority: Major > Labels: pull-request-available > Time Spent: 10m > Remaining Estimate: 0h > > This Jira is to fix listBucket API in HA code path. > In HA, we have an in-memory cache, where we put the result to in-memory cache > and return the response, later it will be picked by double buffer thread and > it will flush to disk. So, now when do listBuckets, it should use both > in-memory cache and rocksdb bucket table to list buckets in a volume. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Work logged] (HDDS-1984) Fix listBucket API
[ https://issues.apache.org/jira/browse/HDDS-1984?focusedWorklogId=321027=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-321027 ] ASF GitHub Bot logged work on HDDS-1984: Author: ASF GitHub Bot Created on: 01/Oct/19 02:14 Start Date: 01/Oct/19 02:14 Worklog Time Spent: 10m Work Description: bharatviswa504 commented on pull request #1555: HDDS-1984. Fix listBucket API. URL: https://github.com/apache/hadoop/pull/1555 This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 321027) Remaining Estimate: 0h Time Spent: 10m > Fix listBucket API > -- > > Key: HDDS-1984 > URL: https://issues.apache.org/jira/browse/HDDS-1984 > Project: Hadoop Distributed Data Store > Issue Type: Sub-task >Reporter: Bharat Viswanadham >Assignee: YiSheng Lien >Priority: Major > Labels: pull-request-available > Time Spent: 10m > Remaining Estimate: 0h > > This Jira is to fix listBucket API in HA code path. > In HA, we have an in-memory cache, where we put the result to in-memory cache > and return the response, later it will be picked by double buffer thread and > it will flush to disk. So, now when do listBuckets, it should use both > in-memory cache and rocksdb bucket table to list buckets in a volume. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Updated] (HDDS-1984) Fix listBucket API
[ https://issues.apache.org/jira/browse/HDDS-1984?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated HDDS-1984: - Labels: pull-request-available (was: ) > Fix listBucket API > -- > > Key: HDDS-1984 > URL: https://issues.apache.org/jira/browse/HDDS-1984 > Project: Hadoop Distributed Data Store > Issue Type: Sub-task >Reporter: Bharat Viswanadham >Assignee: YiSheng Lien >Priority: Major > Labels: pull-request-available > > This Jira is to fix listBucket API in HA code path. > In HA, we have an in-memory cache, where we put the result to in-memory cache > and return the response, later it will be picked by double buffer thread and > it will flush to disk. So, now when do listBuckets, it should use both > in-memory cache and rocksdb bucket table to list buckets in a volume. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDDS-1984) Fix listBucket API
[ https://issues.apache.org/jira/browse/HDDS-1984?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16941466#comment-16941466 ] Bharat Viswanadham commented on HDDS-1984: -- Hi [~cxorm], thanks for taking this up. I missed that you had assigned this Jira to yourself; I had already started working on it and have a patch for it. Sorry about that. You can take up the other list APIs, which are similar to this Jira. > Fix listBucket API > -- > > Key: HDDS-1984 > URL: https://issues.apache.org/jira/browse/HDDS-1984 > Project: Hadoop Distributed Data Store > Issue Type: Sub-task >Reporter: Bharat Viswanadham >Assignee: YiSheng Lien >Priority: Major > > This Jira is to fix listBucket API in HA code path. > In HA, we have an in-memory cache, where we put the result to in-memory cache > and return the response, later it will be picked by double buffer thread and > it will flush to disk. So, now when do listBuckets, it should use both > in-memory cache and rocksdb bucket table to list buckets in a volume. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDDS-2175) Propagate System Exceptions from the OzoneManager
[ https://issues.apache.org/jira/browse/HDDS-2175?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16941456#comment-16941456 ] Arpit Agarwal commented on HDDS-2175: -
bq. it is hard to parse these exceptions even when they are part of normal log files.
And yet these exceptions are a godsend. I would rather see one exception than 10 obscure log messages since it tells me exactly when something 'exceptional' happened and the code path leading to the occurrence.
bq. If we add exceptions to those strings, the human readability of those error messages goes down.
The readability goes up. You now actually get a sense of what went wrong instead of some generic message.
bq. I had a chat with Supratim Deka and I said that I am all for increasing the fidelity of the error codes, that is we can add more error codes if we want to fine tune these messages.
A lot more work with inferior results. Error codes are terrible in layered systems [since multiple layers will often wind up translating codes|https://twitter.com/Obdurodon/status/1161700056740876289]. The only way to maintain full fidelity is to add a new error code for every single failure path, an impossible task. Instead just present the original exception as it happened. This is friendlier for your end users and painless for developers.
bq. I prefer a clear, simple contract between the server and client, I think it makes it easier for future clients to be developed more easily.
Exceptions as added here will make development of future clients super easy. Since the exception is stringified and propagated over the wire, all the client has to do is print the string without any interpretation. The fears seem unfounded to me.
> Propagate System Exceptions from the OzoneManager > - > > Key: HDDS-2175 > URL: https://issues.apache.org/jira/browse/HDDS-2175 > Project: Hadoop Distributed Data Store > Issue Type: Improvement > Components: Ozone Manager >Reporter: Supratim Deka >Assignee: Supratim Deka >Priority: Major > Labels: pull-request-available > Time Spent: 1h > Remaining Estimate: 0h > > Exceptions encountered while processing requests on the OM are categorized as > business exceptions and system exceptions. All of the business exceptions are > captured as OMException and have an associated status code which is returned > to the client. The handling of these is not going to be changed. > Currently system exceptions are returned as INTERNAL ERROR to the client with > a 1 line message string from the exception. The scope of this jira is to > capture system exceptions and propagate the related information(including the > complete stack trace) back to the client. > There are 3 sub-tasks required to achieve this > 1. Separate capture and handling for OMException and the other > exceptions(IOException). For system exceptions, use Hadoop IPC > ServiceException mechanism to send the stack trace to the client. > 2. track and propagate exceptions inside Ratis OzoneManagerStateMachine and > propagate up to the OzoneManager layer (on the leader). Currently, these > exceptions are not being tracked. > 3. Handle and propagate exceptions from Ratis. > Will raise jira for each sub-task. > -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
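A minimal sketch of the mechanism sub-task 1 describes — hypothetical helper names, not the actual OM code: business exceptions (OMException) keep their status-code handling, while unexpected system exceptions are wrapped in a ServiceException so Hadoop IPC carries the stringified stack trace back to the client.
{code:java}
import java.io.IOException;
import com.google.protobuf.ServiceException;

// Hypothetical sketch of the server-side dispatch; handle(), errorResponse()
// and OMExceptionStub are stand-ins for the real OM request path.
public class ExceptionPropagationSketch {
  Object submitRequest(Object request) throws ServiceException {
    try {
      return handle(request);        // normal request processing
    } catch (OMExceptionStub e) {
      return errorResponse(e);       // business error: mapped to a status code
    } catch (IOException e) {
      throw new ServiceException(e); // system error: IPC propagates the stack trace
    }
  }

  static class OMExceptionStub extends IOException { }
  Object handle(Object request) throws IOException { return request; }
  Object errorResponse(IOException e) { return e.getMessage(); }
}
{code}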
[jira] [Commented] (HDFS-14814) RBF: RouterQuotaUpdateService supports inherited rule.
[ https://issues.apache.org/jira/browse/HDFS-14814?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16941451#comment-16941451 ] Jinglun commented on HDFS-14814: Thanks [~elgoiri] for your fast reply! Agree with your comments, especially the first one about setQuota(), it's very reasonable! Only one question:
{quote}I think that in the loop in getGlobalQuota, you could just do the ifs, and not do the if with the break, you will get the same number of comparissons. {quote}
Do you mean the code below?
{code:java}
Entry<String, QuotaUsage> entry = pts.lastEntry();
while (entry != null) {
  String ppath = entry.getKey();
  QuotaUsage quota = entry.getValue();
  if (nQuota == HdfsConstants.QUOTA_RESET) {
    nQuota = quota.getQuota();
  }
  if (sQuota == HdfsConstants.QUOTA_RESET) {
    sQuota = quota.getSpaceQuota();
  }
  entry = pts.lowerEntry(ppath);
}{code}
In my understanding, if I don't break I'll search all the entries even after I have already got the values for nQuota and sQuota. So I want to break to save some pts.lowerEntry(ppath) calls. Correct me if I'm wrong. Thanks! > RBF: RouterQuotaUpdateService supports inherited rule. > -- > > Key: HDFS-14814 > URL: https://issues.apache.org/jira/browse/HDFS-14814 > Project: Hadoop HDFS > Issue Type: Improvement >Reporter: Jinglun >Assignee: Jinglun >Priority: Major > Attachments: HDFS-14814.001.patch, HDFS-14814.002.patch, > HDFS-14814.003.patch, HDFS-14814.004.patch, HDFS-14814.005.patch, > HDFS-14814.006.patch, HDFS-14814.007.patch, HDFS-14814.008.patch, > HDFS-14814.009.patch, HDFS-14814.010.patch > > > I want to add a rule *'The quota should be set the same as the nearest > parent'* to Global Quota. Supposing we have the mount table below. > M1: /dir-a ns0->/dir-a \{nquota=10,squota=20} > M2: /dir-a/dir-b ns1->/dir-b \{nquota=-1,squota=30} > M3: /dir-a/dir-b/dir-c ns2->/dir-c \{nquota=-1,squota=-1} > M4: /dir-d ns3->/dir-d \{nquota=-1,squota=-1} > > The quota for the remote locations on the namespaces should be: > ns0->/dir-a \{nquota=10,squota=20} > ns1->/dir-b \{nquota=10,squota=30} > ns2->/dir-c \{nquota=10,squota=30} > ns3->/dir-d \{nquota=-1,squota=-1} > > The quota of the remote location is set the same as the corresponding > MountTable, and if there is no quota of the MountTable then the quota is set > to the nearest parent MountTable with quota. > > It's easy to implement it. In RouterQuotaUpdateService each time we compute > the currentQuotaUsage, we can get the quota info for each MountTable. We can > do a > check and fix all the MountTables whose quota doesn't match the rule above. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
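For reference, the early-exit variant Jinglun argues for would look roughly like the sketch below (stand-ins: QUOTA_RESET replaces HdfsConstants.QUOTA_RESET and a long[] pair replaces QuotaUsage); once both values are resolved, the break saves the remaining pts.lowerEntry() lookups.
{code:java}
import java.util.Map.Entry;
import java.util.TreeMap;

// Sketch only: walk ancestors from the nearest parent upward and stop as
// soon as both the name quota and the space quota have been inherited.
public class GlobalQuotaSketch {
  static final long QUOTA_RESET = -1; // stand-in for HdfsConstants.QUOTA_RESET

  static long[] getGlobalQuota(TreeMap<String, long[]> pts) {
    long nQuota = QUOTA_RESET;
    long sQuota = QUOTA_RESET;
    Entry<String, long[]> entry = pts.lastEntry();
    while (entry != null) {
      long[] quota = entry.getValue(); // {nameQuota, spaceQuota}
      if (nQuota == QUOTA_RESET) {
        nQuota = quota[0];
      }
      if (sQuota == QUOTA_RESET) {
        sQuota = quota[1];
      }
      if (nQuota != QUOTA_RESET && sQuota != QUOTA_RESET) {
        break; // both resolved: no need to keep calling lowerEntry()
      }
      entry = pts.lowerEntry(entry.getKey());
    }
    return new long[] {nQuota, sQuota};
  }
}
{code}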
[jira] [Updated] (HDFS-14305) Serial number in BlockTokenSecretManager could overlap between different namenodes
[ https://issues.apache.org/jira/browse/HDFS-14305?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Konstantin Shvachko updated HDFS-14305: --- Fix Version/s: (was: 3.1.3) (was: 3.2.1) (was: 3.0.4) 3.2.2 3.1.4 2.10.0 Hadoop Flags: Reviewed Assignee: Konstantin Shvachko (was: Xiaoqiao He) Resolution: Fixed Status: Resolved (was: Patch Available) [~vagarychen] you are absolutely correct, thanks for the review. I just committed this to trunk, branch-3.2, branch-3.1, and branch-2. Updated fix versions. > Serial number in BlockTokenSecretManager could overlap between different > namenodes > -- > > Key: HDFS-14305 > URL: https://issues.apache.org/jira/browse/HDFS-14305 > Project: Hadoop HDFS > Issue Type: Bug > Components: namenode, security >Reporter: Chao Sun >Assignee: Konstantin Shvachko >Priority: Major > Labels: multi-sbnn, release-blocker > Fix For: 2.10.0, 3.3.0, 3.1.4, 3.2.2 > > Attachments: HDFS-14305-007.patch, HDFS-14305-008.patch, > HDFS-14305.001.patch, HDFS-14305.002.patch, HDFS-14305.003.patch, > HDFS-14305.004.patch, HDFS-14305.005.patch, HDFS-14305.006.patch > > > Currently, a {{BlockTokenSecretManager}} starts with a random integer as the > initial serial number, and then use this formula to rotate it: > {code:java} > this.intRange = Integer.MAX_VALUE / numNNs; > this.nnRangeStart = intRange * nnIndex; > this.serialNo = (this.serialNo % intRange) + (nnRangeStart); > {code} > while {{numNNs}} is the total number of NameNodes in the cluster, and > {{nnIndex}} is the index of the current NameNode specified in the > configuration {{dfs.ha.namenodes.}}. > However, with this approach, different NameNode could have overlapping ranges > for serial number. For simplicity, let's assume {{Integer.MAX_VALUE}} is 100, > and we have 2 NameNodes {{nn1}} and {{nn2}} in configuration. Then the ranges > for these two are: > {code} > nn1 -> [-49, 49] > nn2 -> [1, 99] > {code} > This is because the initial serial number could be any negative integer. > Moreover, when the keys are updated, the serial number will again be updated > with the formula: > {code} > this.serialNo = (this.serialNo % intRange) + (nnRangeStart); > {code} > which means the new serial number could be updated to a range that belongs to > a different NameNode, thus increasing the chance of collision again. > When the collision happens, DataNodes could overwrite an existing key which > will cause clients to fail because of {{InvalidToken}} error. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
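The range overlap described in this issue is easy to reproduce with the description's simplified numbers; the standalone demo below (not the patched code) prints the two ranges and shows they share [1, 49].
{code:java}
// Standalone arithmetic demo of the range overlap, using MAX = 100 as in
// the description. Java's % keeps the sign of the dividend, so for a random
// (possibly negative) serialNo, serialNo % intRange lies in (-intRange, intRange).
public class SerialRangeDemo {
  public static void main(String[] args) {
    int max = 100;               // stand-in for Integer.MAX_VALUE
    int numNNs = 2;
    int intRange = max / numNNs; // 50
    for (int nnIndex = 0; nnIndex < numNNs; nnIndex++) {
      int nnRangeStart = intRange * nnIndex;
      int lo = -(intRange - 1) + nnRangeStart;
      int hi = (intRange - 1) + nnRangeStart;
      System.out.println("nn" + (nnIndex + 1) + " -> [" + lo + ", " + hi + "]");
    }
    // Prints: nn1 -> [-49, 49], nn2 -> [1, 99] -- overlapping on [1, 49].
  }
}
{code}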
[jira] [Work logged] (HDDS-1720) Add ability to configure RocksDB logs for Ozone Manager
[ https://issues.apache.org/jira/browse/HDDS-1720?focusedWorklogId=320994=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-320994 ] ASF GitHub Bot logged work on HDDS-1720: Author: ASF GitHub Bot Created on: 01/Oct/19 00:22 Start Date: 01/Oct/19 00:22 Worklog Time Spent: 10m Work Description: anuengineer commented on issue #1538: HDDS-1720 : Add ability to configure RocksDB logs for Ozone Manager. URL: https://github.com/apache/hadoop/pull/1538#issuecomment-536804802 Can you please add a test case that proves that RocksDB actually produces logs that we can see. There is a log listener class in the Hadoop. You can use them, or create a test and then grep for some of the log statements. Otherwise the change looks quite good to me. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 320994) Time Spent: 1h 10m (was: 1h) > Add ability to configure RocksDB logs for Ozone Manager > --- > > Key: HDDS-1720 > URL: https://issues.apache.org/jira/browse/HDDS-1720 > Project: Hadoop Distributed Data Store > Issue Type: Task >Reporter: Aravindan Vijayan >Assignee: Aravindan Vijayan >Priority: Major > Labels: pull-request-available > Fix For: 0.5.0 > > Time Spent: 1h 10m > Remaining Estimate: 0h > > While doing performance testing, it was seen that there was no way to get > RocksDB logs for Ozone Manager. Along with Rocksdb metrics, this may be a > useful mechanism to understand the health of Rocksdb while investigating > large clusters. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Work logged] (HDDS-1720) Add ability to configure RocksDB logs for Ozone Manager
[ https://issues.apache.org/jira/browse/HDDS-1720?focusedWorklogId=320993=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-320993 ] ASF GitHub Bot logged work on HDDS-1720: Author: ASF GitHub Bot Created on: 01/Oct/19 00:21 Start Date: 01/Oct/19 00:21 Worklog Time Spent: 10m Work Description: anuengineer commented on issue #1538: HDDS-1720 : Add ability to configure RocksDB logs for Ozone Manager. URL: https://github.com/apache/hadoop/pull/1538#issuecomment-536804802 Can you please add a test case that proves that RocksDB actually produces logs that we can see. There is a log listener class in the Hadoop. You can use them, or create a test and then grep for some of the log statements. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 320993) Time Spent: 1h (was: 50m) > Add ability to configure RocksDB logs for Ozone Manager > --- > > Key: HDDS-1720 > URL: https://issues.apache.org/jira/browse/HDDS-1720 > Project: Hadoop Distributed Data Store > Issue Type: Task >Reporter: Aravindan Vijayan >Assignee: Aravindan Vijayan >Priority: Major > Labels: pull-request-available > Fix For: 0.5.0 > > Time Spent: 1h > Remaining Estimate: 0h > > While doing performance testing, it was seen that there was no way to get > RocksDB logs for Ozone Manager. Along with Rocksdb metrics, this may be a > useful mechanism to understand the health of Rocksdb while investigating > large clusters. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
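A rough sketch of the kind of test requested above, assuming Hadoop's GenericTestUtils.LogCapturer; the logger name and the log line grepped for are placeholders, not the actual patch:
{code:java}
import org.apache.hadoop.test.GenericTestUtils.LogCapturer;
import org.junit.Test;
import org.slf4j.Logger;
import org.slf4j.LoggerFactory;
import static org.junit.Assert.assertTrue;

// Hypothetical test sketch: capture whatever logger the RocksDB messages are
// routed to and grep the captured output for a statement RocksDB emits.
public class TestOmRocksDbLoggingSketch {
  @Test
  public void testRocksDbProducesLogs() throws Exception {
    Logger rocksLogger = LoggerFactory.getLogger("OMRocksDbLogger"); // placeholder name
    LogCapturer capturer = LogCapturer.captureLogs(rocksLogger);
    try {
      // ... start the OM / open its RocksDB with the new log config enabled ...
      assertTrue("expected RocksDB log output to be captured",
          capturer.getOutput().contains("RocksDB"));
    } finally {
      capturer.stopCapturing();
    }
  }
}
{code}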
[jira] [Commented] (HDFS-14305) Serial number in BlockTokenSecretManager could overlap between different namenodes
[ https://issues.apache.org/jira/browse/HDFS-14305?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16941420#comment-16941420 ] Hudson commented on HDFS-14305: --- SUCCESS: Integrated in Jenkins build Hadoop-trunk-Commit #17421 (See [https://builds.apache.org/job/Hadoop-trunk-Commit/17421/]) HDFS-14305. Fix serial number calculation in BlockTokenSecretManager to (shv: rev b3275ab1f2f4546ba4bdc0e48cfa60b5b05071b9) * (edit) hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/security/token/block/TestBlockToken.java * (edit) hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/namenode/ha/TestFailoverWithBlockTokensEnabled.java * (edit) hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/security/token/block/BlockTokenSecretManager.java > Serial number in BlockTokenSecretManager could overlap between different > namenodes > -- > > Key: HDFS-14305 > URL: https://issues.apache.org/jira/browse/HDFS-14305 > Project: Hadoop HDFS > Issue Type: Bug > Components: namenode, security >Reporter: Chao Sun >Assignee: Xiaoqiao He >Priority: Major > Labels: multi-sbnn, release-blocker > Fix For: 3.0.4, 3.3.0, 3.2.1, 3.1.3 > > Attachments: HDFS-14305-007.patch, HDFS-14305-008.patch, > HDFS-14305.001.patch, HDFS-14305.002.patch, HDFS-14305.003.patch, > HDFS-14305.004.patch, HDFS-14305.005.patch, HDFS-14305.006.patch > > > Currently, a {{BlockTokenSecretManager}} starts with a random integer as the > initial serial number, and then use this formula to rotate it: > {code:java} > this.intRange = Integer.MAX_VALUE / numNNs; > this.nnRangeStart = intRange * nnIndex; > this.serialNo = (this.serialNo % intRange) + (nnRangeStart); > {code} > while {{numNNs}} is the total number of NameNodes in the cluster, and > {{nnIndex}} is the index of the current NameNode specified in the > configuration {{dfs.ha.namenodes.}}. > However, with this approach, different NameNode could have overlapping ranges > for serial number. For simplicity, let's assume {{Integer.MAX_VALUE}} is 100, > and we have 2 NameNodes {{nn1}} and {{nn2}} in configuration. Then the ranges > for these two are: > {code} > nn1 -> [-49, 49] > nn2 -> [1, 99] > {code} > This is because the initial serial number could be any negative integer. > Moreover, when the keys are updated, the serial number will again be updated > with the formula: > {code} > this.serialNo = (this.serialNo % intRange) + (nnRangeStart); > {code} > which means the new serial number could be updated to a range that belongs to > a different NameNode, thus increasing the chance of collision again. > When the collision happens, DataNodes could overwrite an existing key which > will cause clients to fail because of {{InvalidToken}} error. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-14235) Handle ArrayIndexOutOfBoundsException in DataNodeDiskMetrics#slowDiskDetectionDaemon
[ https://issues.apache.org/jira/browse/HDFS-14235?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16941411#comment-16941411 ] Wei-Chiu Chuang commented on HDFS-14235: Commit applies cleanly in branch-3.1. Updated fix version. > Handle ArrayIndexOutOfBoundsException in > DataNodeDiskMetrics#slowDiskDetectionDaemon > - > > Key: HDFS-14235 > URL: https://issues.apache.org/jira/browse/HDFS-14235 > Project: Hadoop HDFS > Issue Type: Improvement >Reporter: Surendra Singh Lilhore >Assignee: Ranith Sardar >Priority: Major > Fix For: 3.3.0, 3.2.1, 3.1.4 > > Attachments: HDFS-14235.000.patch, HDFS-14235.001.patch, > HDFS-14235.002.patch, HDFS-14235.003.patch, NPE.png, exception.png > > > The code below throws an exception because {{volumeIterator.next()}} is called a > second time without checking {{hasNext()}}. > {code:java} > while (volumeIterator.hasNext()) { > FsVolumeSpi volume = volumeIterator.next(); > DataNodeVolumeMetrics metrics = volumeIterator.next().getMetrics(); > String volumeName = volume.getBaseURI().getPath(); > metadataOpStats.put(volumeName, > metrics.getMetadataOperationMean()); > readIoStats.put(volumeName, metrics.getReadIoMean()); > writeIoStats.put(volumeName, metrics.getWriteIoMean()); > }{code} -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
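The shape of the fix is mechanical; a sketch of the corrected loop, advancing the iterator exactly once per pass and reusing the returned element (the committed patch may differ in detail):

{code:java}
while (volumeIterator.hasNext()) {
  FsVolumeSpi volume = volumeIterator.next();          // advance exactly once
  DataNodeVolumeMetrics metrics = volume.getMetrics(); // reuse the same volume
  String volumeName = volume.getBaseURI().getPath();
  metadataOpStats.put(volumeName, metrics.getMetadataOperationMean());
  readIoStats.put(volumeName, metrics.getReadIoMean());
  writeIoStats.put(volumeName, metrics.getWriteIoMean());
}
{code}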
[jira] [Updated] (HDFS-14235) Handle ArrayIndexOutOfBoundsException in DataNodeDiskMetrics#slowDiskDetectionDaemon
[ https://issues.apache.org/jira/browse/HDFS-14235?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wei-Chiu Chuang updated HDFS-14235: --- Fix Version/s: 3.1.4 > Handle ArrayIndexOutOfBoundsException in > DataNodeDiskMetrics#slowDiskDetectionDaemon > - > > Key: HDFS-14235 > URL: https://issues.apache.org/jira/browse/HDFS-14235 > Project: Hadoop HDFS > Issue Type: Improvement >Reporter: Surendra Singh Lilhore >Assignee: Ranith Sardar >Priority: Major > Fix For: 3.3.0, 3.2.1, 3.1.4 > > Attachments: HDFS-14235.000.patch, HDFS-14235.001.patch, > HDFS-14235.002.patch, HDFS-14235.003.patch, NPE.png, exception.png > > > The code below throws an exception because {{volumeIterator.next()}} is called a > second time without checking {{hasNext()}}. > {code:java} > while (volumeIterator.hasNext()) { > FsVolumeSpi volume = volumeIterator.next(); > DataNodeVolumeMetrics metrics = volumeIterator.next().getMetrics(); > String volumeName = volume.getBaseURI().getPath(); > metadataOpStats.put(volumeName, > metrics.getMetadataOperationMean()); > readIoStats.put(volumeName, metrics.getReadIoMean()); > writeIoStats.put(volumeName, metrics.getWriteIoMean()); > }{code} -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDDS-2205) checkstyle.sh reports wrong failure count
[ https://issues.apache.org/jira/browse/HDDS-2205?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16941409#comment-16941409 ] Hudson commented on HDDS-2205: -- SUCCESS: Integrated in Jenkins build Hadoop-trunk-Commit #17420 (See [https://builds.apache.org/job/Hadoop-trunk-Commit/17420/]) HDDS-2205. checkstyle.sh reports wrong failure count (aengineer: rev e5bba592a84a94e0545479b668e6925eb4b8858c) * (edit) hadoop-ozone/dev-support/checks/checkstyle.sh > checkstyle.sh reports wrong failure count > - > > Key: HDDS-2205 > URL: https://issues.apache.org/jira/browse/HDDS-2205 > Project: Hadoop Distributed Data Store > Issue Type: Bug > Components: test >Reporter: Attila Doroszlai >Assignee: Attila Doroszlai >Priority: Trivial > Labels: pull-request-available > Fix For: 0.5.0 > > Time Spent: 1h > Remaining Estimate: 0h > > {{checkstyle.sh}} outputs files with checkstyle violations and the violations > themselves on separate lines. It then reports line count as number of > failures. > {code:title=target/checkstyle/summary.txt} > hadoop-ozone/common/src/main/java/org/apache/hadoop/ozone/OmUtils.java > 49: Unused import - org.apache.hadoop.ozone.om.OMMetadataManager. > {code} > {code:title=target/checkstyle/failures} > 2 > {code} -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
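A plausible shape of the fix, sketched under the assumption (suggested by the summary.txt excerpt above) that violation lines are indented while file names start in column one; the committed patch may count differently:

{code:bash}
#!/usr/bin/env bash
# count only the indented violation lines, not the file-name header lines
grep -c '^ ' target/checkstyle/summary.txt > target/checkstyle/failures
{code}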
[jira] [Work logged] (HDDS-1615) ManagedChannel references are being leaked in ReplicationSupervisor.java
[ https://issues.apache.org/jira/browse/HDDS-1615?focusedWorklogId=320976=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-320976 ] ASF GitHub Bot logged work on HDDS-1615: Author: ASF GitHub Bot Created on: 30/Sep/19 23:40 Start Date: 30/Sep/19 23:40 Worklog Time Spent: 10m Work Description: anuengineer commented on issue #1547: HDDS-1615. ManagedChannel references are being leaked in ReplicationS… URL: https://github.com/apache/hadoop/pull/1547#issuecomment-536795997 +1. LGTM. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 320976) Time Spent: 0.5h (was: 20m) > ManagedChannel references are being leaked in ReplicationSupervisor.java > > > Key: HDDS-1615 > URL: https://issues.apache.org/jira/browse/HDDS-1615 > Project: Hadoop Distributed Data Store > Issue Type: Bug > Components: Ozone Datanode >Affects Versions: 0.4.0 >Reporter: Mukul Kumar Singh >Assignee: Mukul Kumar Singh >Priority: Major > Labels: MiniOzoneChaosCluster, pull-request-available > Time Spent: 0.5h > Remaining Estimate: 0h > > ManagedChannel references are being leaked in ReplicationSupervisor.java > {code} > May 30, 2019 8:10:56 AM > org.apache.ratis.thirdparty.io.grpc.internal.ManagedChannelOrphanWrapper$ManagedChannelReference > cleanQueue > SEVERE: *~*~*~ Channel ManagedChannelImpl{logId=1495, > target=192.168.0.3:49868} was not shutdown properly!!! ~*~*~* > Make sure to call shutdown()/shutdownNow() and wait until > awaitTermination() returns true. > java.lang.RuntimeException: ManagedChannel allocation site > at > org.apache.ratis.thirdparty.io.grpc.internal.ManagedChannelOrphanWrapper$ManagedChannelReference.(ManagedChannelOrphanWrapper.java:103) > at > org.apache.ratis.thirdparty.io.grpc.internal.ManagedChannelOrphanWrapper.(ManagedChannelOrphanWrapper.java:53) > at > org.apache.ratis.thirdparty.io.grpc.internal.ManagedChannelOrphanWrapper.(ManagedChannelOrphanWrapper.java:44) > at > org.apache.ratis.thirdparty.io.grpc.internal.AbstractManagedChannelImplBuilder.build(AbstractManagedChannelImplBuilder.java:411) > at > org.apache.hadoop.ozone.container.replication.GrpcReplicationClient.(GrpcReplicationClient.java:65) > at > org.apache.hadoop.ozone.container.replication.SimpleContainerDownloader.getContainerDataFromReplicas(SimpleContainerDownloader.java:87) > at > org.apache.hadoop.ozone.container.replication.DownloadAndImportReplicator.replicate(DownloadAndImportReplicator.java:118) > at > org.apache.hadoop.ozone.container.replication.ReplicationSupervisor$TaskRunner.run(ReplicationSupervisor.java:115) > at > java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) > at java.util.concurrent.FutureTask.run(FutureTask.java:266) > at > java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) > at > java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) > at java.lang.Thread.run(Thread.java:748) > {code} -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
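For context, the remedy the SEVERE message itself prescribes looks roughly like the sketch below (illustrative only, not the committed patch; the shaded ratis.thirdparty gRPC package is assumed, as in the stack trace above):

{code:java}
import java.util.concurrent.TimeUnit;
import org.apache.ratis.thirdparty.io.grpc.ManagedChannel;

// shut the channel down cleanly, escalating to shutdownNow() if it stalls
static void closeChannel(ManagedChannel channel) throws InterruptedException {
  channel.shutdown();
  if (!channel.awaitTermination(5, TimeUnit.SECONDS)) {
    channel.shutdownNow();   // force-cancel any straggling RPCs
    channel.awaitTermination(5, TimeUnit.SECONDS);
  }
}
{code}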
[jira] [Work logged] (HDDS-1720) Add ability to configure RocksDB logs for Ozone Manager
[ https://issues.apache.org/jira/browse/HDDS-1720?focusedWorklogId=320977=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-320977 ] ASF GitHub Bot logged work on HDDS-1720: Author: ASF GitHub Bot Created on: 30/Sep/19 23:40 Start Date: 30/Sep/19 23:40 Worklog Time Spent: 10m Work Description: anuengineer commented on issue #1538: HDDS-1720 : Add ability to configure RocksDB logs for Ozone Manager. URL: https://github.com/apache/hadoop/pull/1538#issuecomment-536796152 /retest This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 320977) Time Spent: 50m (was: 40m) > Add ability to configure RocksDB logs for Ozone Manager > --- > > Key: HDDS-1720 > URL: https://issues.apache.org/jira/browse/HDDS-1720 > Project: Hadoop Distributed Data Store > Issue Type: Task >Reporter: Aravindan Vijayan >Assignee: Aravindan Vijayan >Priority: Major > Labels: pull-request-available > Fix For: 0.5.0 > > Time Spent: 50m > Remaining Estimate: 0h > > While doing performance testing, it was seen that there was no way to get > RocksDB logs for Ozone Manager. Along with Rocksdb metrics, this may be a > useful mechanism to understand the health of Rocksdb while investigating > large clusters. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Updated] (HDDS-2205) checkstyle.sh reports wrong failure count
[ https://issues.apache.org/jira/browse/HDDS-2205?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Anu Engineer updated HDDS-2205: --- Fix Version/s: 0.5.0 Resolution: Fixed Status: Resolved (was: Patch Available) > checkstyle.sh reports wrong failure count > - > > Key: HDDS-2205 > URL: https://issues.apache.org/jira/browse/HDDS-2205 > Project: Hadoop Distributed Data Store > Issue Type: Bug > Components: test >Reporter: Attila Doroszlai >Assignee: Attila Doroszlai >Priority: Trivial > Labels: pull-request-available > Fix For: 0.5.0 > > Time Spent: 1h > Remaining Estimate: 0h > > {{checkstyle.sh}} outputs files with checkstyle violations and the violations > themselves on separate lines. It then reports line count as number of > failures. > {code:title=target/checkstyle/summary.txt} > hadoop-ozone/common/src/main/java/org/apache/hadoop/ozone/OmUtils.java > 49: Unused import - org.apache.hadoop.ozone.om.OMMetadataManager. > {code} > {code:title=target/checkstyle/failures} > 2 > {code} -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Work logged] (HDDS-2205) checkstyle.sh reports wrong failure count
[ https://issues.apache.org/jira/browse/HDDS-2205?focusedWorklogId=320975=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-320975 ] ASF GitHub Bot logged work on HDDS-2205: Author: ASF GitHub Bot Created on: 30/Sep/19 23:37 Start Date: 30/Sep/19 23:37 Worklog Time Spent: 10m Work Description: anuengineer commented on pull request #1548: HDDS-2205. checkstyle.sh reports wrong failure count URL: https://github.com/apache/hadoop/pull/1548 This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 320975) Time Spent: 1h (was: 50m) > checkstyle.sh reports wrong failure count > - > > Key: HDDS-2205 > URL: https://issues.apache.org/jira/browse/HDDS-2205 > Project: Hadoop Distributed Data Store > Issue Type: Bug > Components: test >Reporter: Attila Doroszlai >Assignee: Attila Doroszlai >Priority: Trivial > Labels: pull-request-available > Time Spent: 1h > Remaining Estimate: 0h > > {{checkstyle.sh}} outputs files with checkstyle violations and the violations > themselves on separate lines. It then reports line count as number of > failures. > {code:title=target/checkstyle/summary.txt} > hadoop-ozone/common/src/main/java/org/apache/hadoop/ozone/OmUtils.java > 49: Unused import - org.apache.hadoop.ozone.om.OMMetadataManager. > {code} > {code:title=target/checkstyle/failures} > 2 > {code} -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Work logged] (HDDS-2205) checkstyle.sh reports wrong failure count
[ https://issues.apache.org/jira/browse/HDDS-2205?focusedWorklogId=320974=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-320974 ] ASF GitHub Bot logged work on HDDS-2205: Author: ASF GitHub Bot Created on: 30/Sep/19 23:37 Start Date: 30/Sep/19 23:37 Worklog Time Spent: 10m Work Description: anuengineer commented on issue #1548: HDDS-2205. checkstyle.sh reports wrong failure count URL: https://github.com/apache/hadoop/pull/1548#issuecomment-536795581 Thank you for the contribution. I have committed this patch to the trunk. @dineshchitlangia Thank you for the review. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 320974) Time Spent: 50m (was: 40m) > checkstyle.sh reports wrong failure count > - > > Key: HDDS-2205 > URL: https://issues.apache.org/jira/browse/HDDS-2205 > Project: Hadoop Distributed Data Store > Issue Type: Bug > Components: test >Reporter: Attila Doroszlai >Assignee: Attila Doroszlai >Priority: Trivial > Labels: pull-request-available > Time Spent: 50m > Remaining Estimate: 0h > > {{checkstyle.sh}} outputs files with checkstyle violations and the violations > themselves on separate lines. It then reports line count as number of > failures. > {code:title=target/checkstyle/summary.txt} > hadoop-ozone/common/src/main/java/org/apache/hadoop/ozone/OmUtils.java > 49: Unused import - org.apache.hadoop.ozone.om.OMMetadataManager. > {code} > {code:title=target/checkstyle/failures} > 2 > {code} -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDDS-2203) Race condition in ByteStringHelper.init()
[ https://issues.apache.org/jira/browse/HDDS-2203?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16941398#comment-16941398 ] Anu Engineer commented on HDDS-2203: Makes sense. Do you want this patch committed, or should we just move to the new model? > Race condition in ByteStringHelper.init() > - > > Key: HDDS-2203 > URL: https://issues.apache.org/jira/browse/HDDS-2203 > Project: Hadoop Distributed Data Store > Issue Type: Bug > Components: Ozone Client, SCM >Reporter: Istvan Fajth >Assignee: Istvan Fajth >Priority: Critical > Labels: pull-request-available, pull-requests-available > Time Spent: 1h > Remaining Estimate: 0h > > The current init method: > {code} > public static void init(boolean isUnsafeByteOperation) { > final boolean set = INITIALIZED.compareAndSet(false, true); > if (set) { > ByteStringHelper.isUnsafeByteOperationsEnabled = > isUnsafeByteOperation; > } else { > // already initialized, check values > Preconditions.checkState(isUnsafeByteOperationsEnabled > == isUnsafeByteOperation); > } > } > {code} > In a scenario where two threads access this method in the following order, the > second thread runs into an exception from Preconditions.checkState() in the else > branch. > In an uninitialized state: > - Thread T1 enters the method with true as the parameter; class initialization > has left isUnsafeByteOperationsEnabled at its default value, false > - T1 sets INITIALIZED to true > - T2 enters the method with true as the parameter > - T2 reads the INITIALIZED value and, as it is no longer false, goes to the else branch > - T2 checks whether the internal boolean property is the same true it wanted to > set, and as T1 has not yet set the value, checkState throws an > IllegalStateException. > This happens in certain Hive query cases, as it surfaced during that testing; the > exception we see there is the following: > {code} > Error: Error while processing statement: FAILED: Execution Error, return code > 2 from org.apache.hadoop.hive.ql.exec.tez.TezTask.
Vertex failed, > vertexName=Map 2, vertexId=vertex_1569486223160_0334_1_02, > diagnostics=[Vertex vertex_1569486223160_0334_1_02 [Map 2] killed/failed > due to:ROOT_INPUT_INIT_FAILURE, Vertex Input: item initializer failed, > vertex=vertex_1569486223160_0334_1_02 [Map 2], java.io.IOException: Couldn't > create RpcClient protocol > at > org.apache.hadoop.ozone.client.OzoneClientFactory.getClientProtocol(OzoneClientFactory.java:263) > at > org.apache.hadoop.ozone.client.OzoneClientFactory.getClientProtocol(OzoneClientFactory.java:239) > at > org.apache.hadoop.ozone.client.OzoneClientFactory.getRpcClient(OzoneClientFactory.java:203) > at > org.apache.hadoop.ozone.client.OzoneClientFactory.getRpcClient(OzoneClientFactory.java:165) > at > org.apache.hadoop.fs.ozone.BasicOzoneClientAdapterImpl.(BasicOzoneClientAdapterImpl.java:158) > at > org.apache.hadoop.fs.ozone.OzoneClientAdapterImpl.(OzoneClientAdapterImpl.java:50) > at > org.apache.hadoop.fs.ozone.OzoneFileSystem.createAdapter(OzoneFileSystem.java:102) > at > org.apache.hadoop.fs.ozone.BasicOzoneFileSystem.initialize(BasicOzoneFileSystem.java:155) > at org.apache.hadoop.fs.FileSystem.createFileSystem(FileSystem.java:3315) > at org.apache.hadoop.fs.FileSystem.access$200(FileSystem.java:136) > at org.apache.hadoop.fs.FileSystem$Cache.getInternal(FileSystem.java:3364) > at org.apache.hadoop.fs.FileSystem$Cache.get(FileSystem.java:3332) > at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:491) > at org.apache.hadoop.fs.Path.getFileSystem(Path.java:361) > at > org.apache.hadoop.hive.ql.io.orc.OrcInputFormat.generateSplitsInfo(OrcInputFormat.java:1821) > at > org.apache.hadoop.hive.ql.io.orc.OrcInputFormat.getSplits(OrcInputFormat.java:2002) > at > org.apache.hadoop.hive.ql.io.HiveInputFormat.addSplitsForGroup(HiveInputFormat.java:524) > at > org.apache.hadoop.hive.ql.io.HiveInputFormat.getSplits(HiveInputFormat.java:781) > at > org.apache.hadoop.hive.ql.exec.tez.HiveSplitGenerator.initialize(HiveSplitGenerator.java:243) > at > org.apache.tez.dag.app.dag.RootInputInitializerManager$InputInitializerCallable$1.run(RootInputInitializerManager.java:278) > at > org.apache.tez.dag.app.dag.RootInputInitializerManager$InputInitializerCallable$1.run(RootInputInitializerManager.java:269) > at java.security.AccessController.doPrivileged(Native Method) > at javax.security.auth.Subject.doAs(Subject.java:422) > at > org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1876) > at > org.apache.tez.dag.app.dag.RootInputInitializerManager$InputInitializerCallable.call(RootInputInitializerManager.java:269) >
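One way to close the window described above, sketched under the assumption that the init()/flag model is kept (the comment thread also floats moving to a new model entirely): make the flag flip and the field write atomic with respect to each other, for example by synchronizing the method.

{code:java}
// minimal sketch, not the committed fix: with the method synchronized,
// T2 blocks until T1 has both flipped the flag and written the field
public static synchronized void init(boolean isUnsafeByteOperation) {
  if (INITIALIZED.compareAndSet(false, true)) {
    ByteStringHelper.isUnsafeByteOperationsEnabled = isUnsafeByteOperation;
  } else {
    // already initialized; a second caller now always sees the final value
    Preconditions.checkState(isUnsafeByteOperationsEnabled
        == isUnsafeByteOperation);
  }
}
{code}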
[jira] [Assigned] (HDDS-2213) Reduce key provider loading log level in OzoneFileSystem#getAdditionalTokenIssuers
[ https://issues.apache.org/jira/browse/HDDS-2213?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Aravindan Vijayan reassigned HDDS-2213: --- Assignee: Shweta > Reduce key provider loading log level in > OzoneFileSystem#getAdditionalTokenIssuers > -- > > Key: HDDS-2213 > URL: https://issues.apache.org/jira/browse/HDDS-2213 > Project: Hadoop Distributed Data Store > Issue Type: Improvement >Reporter: Vivek Ratnavel Subramanian >Assignee: Shweta >Priority: Minor > > OzoneFileSystem#getAdditionalTokenIssuers logs an error when a secure client > tries to collect an Ozone delegation token to run MR/Spark jobs but the Ozone file > system does not have a KMS provider configured. In this case, we simply > return a null provider in the code below. This is a benign error and we > should reduce the log level to debug. > {code:java} > KeyProvider keyProvider; > try { > keyProvider = getKeyProvider(); } > catch (IOException ioe) { > LOG.error("Error retrieving KeyProvider.", ioe); > return null; > } > {code} -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
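The proposed change is a one-liner; a sketch of the catch block from the snippet above with the level lowered:

{code:java}
KeyProvider keyProvider;
try {
  keyProvider = getKeyProvider();
} catch (IOException ioe) {
  // benign when no KMS provider is configured, so debug is enough
  LOG.debug("Error retrieving KeyProvider.", ioe);
  return null;
}
{code}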
[jira] [Commented] (HDFS-14305) Serial number in BlockTokenSecretManager could overlap between different namenodes
[ https://issues.apache.org/jira/browse/HDFS-14305?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16941393#comment-16941393 ] Chen Liang commented on HDFS-14305: --- Looks like the key idea of the v8 patch is to call {{nextInt(int bound)}}, which gives a non-negative value, instead of {{nextInt()}}, which can return a negative value. That way the range start is never negative, and we avoid the overlapping ranges. Assuming we will address the potential conflict issue separately, +1 for the v08 patch. > Serial number in BlockTokenSecretManager could overlap between different > namenodes > -- > > Key: HDFS-14305 > URL: https://issues.apache.org/jira/browse/HDFS-14305 > Project: Hadoop HDFS > Issue Type: Bug > Components: namenode, security >Reporter: Chao Sun >Assignee: Xiaoqiao He >Priority: Major > Labels: multi-sbnn, release-blocker > Fix For: 3.0.4, 3.3.0, 3.2.1, 3.1.3 > > Attachments: HDFS-14305-007.patch, HDFS-14305-008.patch, > HDFS-14305.001.patch, HDFS-14305.002.patch, HDFS-14305.003.patch, > HDFS-14305.004.patch, HDFS-14305.005.patch, HDFS-14305.006.patch > > > Currently, a {{BlockTokenSecretManager}} starts with a random integer as the > initial serial number, and then uses this formula to rotate it: > {code:java} > this.intRange = Integer.MAX_VALUE / numNNs; > this.nnRangeStart = intRange * nnIndex; > this.serialNo = (this.serialNo % intRange) + (nnRangeStart); > {code} > where {{numNNs}} is the total number of NameNodes in the cluster, and > {{nnIndex}} is the index of the current NameNode specified in the > configuration {{dfs.ha.namenodes.}}. > However, with this approach, different NameNodes could have overlapping ranges > for the serial number. For simplicity, let's assume {{Integer.MAX_VALUE}} is 100, > and we have 2 NameNodes {{nn1}} and {{nn2}} in the configuration. Then the ranges > for these two are: > {code} > nn1 -> [-49, 49] > nn2 -> [1, 99] > {code} > This is because the initial serial number could be any negative integer. > Moreover, when the keys are updated, the serial number will again be updated > with the formula: > {code} > this.serialNo = (this.serialNo % intRange) + (nnRangeStart); > {code} > which means the new serial number could be updated to a range that belongs to > a different NameNode, thus increasing the chance of collision again. > When a collision happens, DataNodes could overwrite an existing key, which > will cause clients to fail because of an {{InvalidToken}} error. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
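For illustration, the initialization described in the comment would look roughly like this (a sketch of the idea as stated, not the v08 patch itself; the surrounding fields are those from the snippet in the description):

{code:java}
this.intRange = Integer.MAX_VALUE / numNNs;
this.nnRangeStart = intRange * nnIndex;
// nextInt(bound) returns a value in [0, intRange), never negative, so the
// initial serialNo always lands inside this NameNode's own band
this.serialNo = new java.util.Random().nextInt(intRange) + nnRangeStart;
{code}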
[jira] [Updated] (HDDS-2213) Reduce key provider loading log level in OzoneFileSystem#getAdditionalTokenIssuers
[ https://issues.apache.org/jira/browse/HDDS-2213?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiaoyu Yao updated HDDS-2213: - Description: OzoneFileSystem#getAdditionalTokenIssuers logs an error when a secure client tries to collect an Ozone delegation token to run MR/Spark jobs but the Ozone file system does not have a KMS provider configured. In this case, we simply return a null provider in the code below. This is a benign error and we should reduce the log level to debug. {code:java} KeyProvider keyProvider; try { keyProvider = getKeyProvider(); } catch (IOException ioe) { LOG.error("Error retrieving KeyProvider.", ioe); return null; } {code} was: OzoneFileSystem#getAdditionalTokenIssuers log an error when secure client tries to collect ozone delegation token to run MR/Spark jobs but ozone file system does not have a kms provider configured. In this case, we simply return null provider here in the code below. This is a benign error and we should reduce the log level to debug level. \{code} KeyProvider keyProvider; try { keyProvider = getKeyProvider(); } catch (IOException ioe) { LOG.error("Error retrieving KeyProvider.", ioe); return null; } {code} > Reduce key provider loading log level in > OzoneFileSystem#getAdditionalTokenIssuers > -- > > Key: HDDS-2213 > URL: https://issues.apache.org/jira/browse/HDDS-2213 > Project: Hadoop Distributed Data Store > Issue Type: Improvement >Reporter: Vivek Ratnavel Subramanian >Priority: Minor > > OzoneFileSystem#getAdditionalTokenIssuers logs an error when a secure client > tries to collect an Ozone delegation token to run MR/Spark jobs but the Ozone file > system does not have a KMS provider configured. In this case, we simply > return a null provider in the code below. This is a benign error and we > should reduce the log level to debug. > {code:java} > KeyProvider keyProvider; > try { > keyProvider = getKeyProvider(); } > catch (IOException ioe) { > LOG.error("Error retrieving KeyProvider.", ioe); > return null; > } > {code} -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDDS-2175) Propagate System Exceptions from the OzoneManager
[ https://issues.apache.org/jira/browse/HDDS-2175?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16941392#comment-16941392 ] Anu Engineer commented on HDDS-2175: bq. I feel that call stacks are invaluable when included in the bug report to the developer. I completely agree. As I mentioned in my comment on GitHub, they are very useful tools for debugging. But we have to weigh the pros and cons of the approach. Here are some downsides, so I will list them out. 1. Code and Style Consistency - Generally, errors are propagated via Error code and Message (Golang, C, etc) or Exceptions (Java, C++, etc). When we developed this interface, we chose to go with the Error code and Message approach instead of Exceptions. So mixing these different approaches creates very inconsistent code flows. 2. Prevent Java server abstractions from leaking to the client side - Java exceptions are very Java specific; it is hard to parse these exceptions even when they are part of normal log files. It is difficult to read through a printed stack to even understand the issue. This gets compounded when Exceptions stack. When we were writing this client interface, we wanted to make sure it is easy to write clients in other languages. A simple error code and message is universal: all languages understand it, and it is easy to write clients in other languages which can speak this protocol. 3. The current code experience - There are several parts of this code where the clients print out these messages to the users. If we add exceptions to those strings, the human readability of those error messages goes down. 4. If we want to move to exceptions instead of error codes, it is possible (even though I think our future clients will suffer), but we need to move away from the error/message model. That is a lot of work, with very little benefit, other than the fact that we will have a consistent experience and exceptions will flow to the client side. I had a chat with [~sdeka] and I said that I am all for increasing the fidelity of the error codes, that is, we can add more error codes if we want to fine-tune these messages. I am also all for logging more on the server side. So I am not against the patch, just wanted to avoid *server side Java exceptions crossing over to the client side*. I prefer a clear, simple contract between the server and client; I think it makes it easier for future clients to be developed. > Propagate System Exceptions from the OzoneManager > - > > Key: HDDS-2175 > URL: https://issues.apache.org/jira/browse/HDDS-2175 > Project: Hadoop Distributed Data Store > Issue Type: Improvement > Components: Ozone Manager >Reporter: Supratim Deka >Assignee: Supratim Deka >Priority: Major > Labels: pull-request-available > Time Spent: 1h > Remaining Estimate: 0h > > Exceptions encountered while processing requests on the OM are categorized as > business exceptions and system exceptions. All of the business exceptions are > captured as OMException and have an associated status code which is returned > to the client. The handling of these is not going to be changed. > Currently, system exceptions are returned as INTERNAL ERROR to the client with > a 1-line message string from the exception. The scope of this jira is to > capture system exceptions and propagate the related information (including the > complete stack trace) back to the client. > There are 3 sub-tasks required to achieve this: > 1. Separate capture and handling for OMException and the other > exceptions (IOException).
For system exceptions, use the Hadoop IPC > ServiceException mechanism to send the stack trace to the client. > 2. Track and propagate exceptions inside the Ratis OzoneManagerStateMachine and > propagate them up to the OzoneManager layer (on the leader). Currently, these > exceptions are not being tracked. > 3. Handle and propagate exceptions from Ratis. > Will raise a jira for each sub-task. > -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Comment Edited] (HDDS-2175) Propagate System Exceptions from the OzoneManager
[ https://issues.apache.org/jira/browse/HDDS-2175?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16941392#comment-16941392 ] Anu Engineer edited comment on HDDS-2175 at 9/30/19 10:54 PM: -- {quote}I feel that call stacks are invaluable when included in the bug report to the developer. {quote} I completely agree. As I mentioned in my comment on GitHub, they are very useful tools for debugging. But we have to weigh the pros and cons of the approach. Here are some downsides, so I will list them out. 1. Code and Style Consistency - Generally, errors are propagated via Error code and Message (Golang, C, etc) or Exceptions (Java, C++, etc). When we developed this interface, we chose to go with the Error code and Message approach instead of Exceptions. So mixing these different approaches creates very inconsistent code flows. 2. Prevent Java server abstractions from leaking to the client side - Java exceptions are very Java specific; it is hard to parse these exceptions even when they are part of normal log files. It is difficult to read through a printed stack to even understand the issue. This gets compounded when Exceptions stack. When we were writing this client interface, we wanted to make sure it is easy to write clients in other languages. A simple error code and message is universal: all languages understand it, and it is easy to write clients in other languages which can speak this protocol. 3. The current code experience - There are several parts of this code where the clients print out these messages to the users. If we add exceptions to those strings, the human readability of those error messages goes down. 4. If we want to move to exceptions instead of error codes, it is possible (even though I think our future clients will suffer), but we need to move away from the error/message model. That is a lot of work, with very little benefit, other than the fact that we will have a consistent experience and exceptions will flow to the client side. I had a chat with [~sdeka] and I said that I am all for increasing the fidelity of the error codes, that is, we can add more error codes if we want to fine-tune these messages. I am also all for logging more on the server side. So I am not against the patch, just wanted to avoid *server side Java exceptions crossing over to the client side*. I prefer a clear, simple contract between the server and client; I think it makes it easier for future clients to be developed. was (Author: anu): bq. I feel that call stacks are invaluable when included in the bug report to the developer. I completely agree. As I mentioned in my comment in the Github, they are very useful tools for debugging. But we have to weigh the pros and cons of the approach. Here are some downsides, so I will list them out. 1. Code and Style Consistency - Generally, Errors are propagated via Error code and Message (Goland, C, etc) or Exceptions (Java, C++ etc). When we developed this interface, we choose to go with Error code and Message approach instead of Exceptions. So mixing these different approaches creates very inconsistent code flows. 2. Prevent Java server abstractions from leaking to client side - Java exceptions are very java specific; it is hard to parse these exceptions even when they are part of normal log files. It is difficult to read thru a printed stack to even understand the issue. This gets compounded when Exceptions stack. When we were writing this client interface, we wanted to make sure it is easy to write clients in other languages.
A simple, Error code and a message is universal, that all languages understand and easy to write other language clients which can speak this protocol. 3. The current code experience - There are several parts of this code, where the clients print out these messages to the users. If we add exceptions to those strings, the human readability of those error messages goes down. 4. If we want to move to exceptions instead of error codes , it is possible (even though I think our future clients will suffer), but we need to move away from the error/message model. That is lot of work, with very little benefit, other than the fact that we will have a consistent experience and exceptions will flow to the client side. I had a chat with [~sdeka] and I said that I am all for increasing the fidelity of the error codes, that is we can add more error codes if we want to fine tune these messages. I am also all for logging more on the server side. So I am not against the patch, just wanted to avoid *server side Java exceptions crossing over to the client side*. I prefer a clear, simple contract between the server and client, I think it makes it easier for future clients to be developed more easily. > Propagate System Exceptions from the OzoneManager > - > >
[jira] [Created] (HDDS-2213) Reduce key provider loading log level in OzoneFileSystem#getAdditionalTokenIssuers
Xiaoyu Yao created HDDS-2213: Summary: Reduce key provider loading log level in OzoneFileSystem#getAdditionalTokenIssuers Key: HDDS-2213 URL: https://issues.apache.org/jira/browse/HDDS-2213 Project: Hadoop Distributed Data Store Issue Type: Improvement Reporter: Vivek Ratnavel Subramanian OzoneFileSystem#getAdditionalTokenIssuers logs an error when a secure client tries to collect an Ozone delegation token to run MR/Spark jobs but the Ozone file system does not have a KMS provider configured. In this case, we simply return a null provider in the code below. This is a benign error and we should reduce the log level to debug. \{code} KeyProvider keyProvider; try { keyProvider = getKeyProvider(); } catch (IOException ioe) { LOG.error("Error retrieving KeyProvider.", ioe); return null; } {code} -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-10648) Expose Balancer metrics through Metrics2
[ https://issues.apache.org/jira/browse/HDFS-10648?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16941389#comment-16941389 ] Wei-Chiu Chuang commented on HDFS-10648: Thanks for doing this. Without metrics, HDFS-13783 isn't very useful. > Expose Balancer metrics through Metrics2 > > > Key: HDFS-10648 > URL: https://issues.apache.org/jira/browse/HDFS-10648 > Project: Hadoop HDFS > Issue Type: New Feature > Components: balancer mover, metrics >Reporter: Mark Wagner >Assignee: Chen Zhang >Priority: Major > Labels: metrics > > The Balancer currently prints progress information to the console. For > deployments that run the balancer frequently, it would be helpful to collect > those metrics for publishing to the available sinks. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-14305) Serial number in BlockTokenSecretManager could overlap between different namenodes
[ https://issues.apache.org/jira/browse/HDFS-14305?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16941381#comment-16941381 ] Hadoop QA commented on HDFS-14305: -- | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 2m 4s{color} | {color:blue} Docker mode activated. {color} | || || || || {color:brown} Prechecks {color} || | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s{color} | {color:green} The patch does not contain any @author tags. {color} | | {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 0s{color} | {color:green} The patch appears to include 2 new or modified test files. {color} | || || || || {color:brown} trunk Compile Tests {color} || | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 22m 29s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 1m 19s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 54s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 1m 27s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 14m 47s{color} | {color:green} branch has no errors when building and testing our client artifacts. {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 2m 13s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 16s{color} | {color:green} trunk passed {color} | || || || || {color:brown} Patch Compile Tests {color} || | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 1m 0s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 55s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 55s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 39s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 1m 2s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s{color} | {color:green} The patch has no whitespace issues. {color} | | {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 13m 5s{color} | {color:green} patch has no errors when building and testing our client artifacts. {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 2m 21s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 11s{color} | {color:green} the patch passed {color} | || || || || {color:brown} Other Tests {color} || | {color:red}-1{color} | {color:red} unit {color} | {color:red}107m 47s{color} | {color:red} hadoop-hdfs in the patch failed. {color} | | {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 33s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} | | {color:black}{color} | {color:black} {color} | {color:black}174m 33s{color} | {color:black} {color} | \\ \\ || Reason || Tests || | Failed junit tests | hadoop.hdfs.server.blockmanagement.TestBlockStatsMXBean | | | hadoop.hdfs.tools.TestDFSZKFailoverController | \\ \\ || Subsystem || Report/Notes || | Docker | Client=18.09.7 Server=18.09.7 Image:yetus/hadoop:efed4450bf1 | | JIRA Issue | HDFS-14305 | | JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12981828/HDFS-14305-008.patch | | Optional Tests | dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient findbugs checkstyle | | uname | Linux bc13cb7fa98b 4.15.0-58-generic #64-Ubuntu SMP Tue Aug 6 11:12:41 UTC 2019 x86_64 x86_64 x86_64 GNU/Linux | | Build tool | maven | | Personality | /testptch/patchprocess/precommit/personality/provided.sh | | git revision | trunk / 4d3c580 | | maven | version: Apache Maven 3.3.9 | | Default Java | 1.8.0_222 | | findbugs | v3.1.0-RC1 | | unit | https://builds.apache.org/job/PreCommit-HDFS-Build/27989/artifact/out/patch-unit-hadoop-hdfs-project_hadoop-hdfs.txt | | Test Results | https://builds.apache.org/job/PreCommit-HDFS-Build/27989/testReport/ | | Max. process+thread count | 2864 (vs. ulimit of 5500) | | modules | C: hadoop-hdfs-project/hadoop-hdfs U: hadoop-hdfs-project/hadoop-hdfs | | Console output | https://builds.apache.org/job/PreCommit-HDFS-Build/27989/console | | Powered
[jira] [Updated] (HDFS-14808) EC: Improper size values for corrupt ec block in LOG
[ https://issues.apache.org/jira/browse/HDFS-14808?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wei-Chiu Chuang updated HDFS-14808: --- Component/s: ec > EC: Improper size values for corrupt ec block in LOG > - > > Key: HDFS-14808 > URL: https://issues.apache.org/jira/browse/HDFS-14808 > Project: Hadoop HDFS > Issue Type: Bug > Components: ec >Reporter: Harshakiran Reddy >Assignee: Ayush Saxena >Priority: Major > Fix For: 3.3.0 > > Attachments: HDFS-14808-01.patch > > > If the block corruption reason is a size mismatch, the values shown and > compared in the log are ambiguous. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Reopened] (HDFS-7134) Replication count for a block should not update till the blocks have settled on Datanodes
[ https://issues.apache.org/jira/browse/HDFS-7134?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wei-Chiu Chuang reopened HDFS-7134: --- > Replication count for a block should not update till the blocks have settled > on Datanodes > - > > Key: HDFS-7134 > URL: https://issues.apache.org/jira/browse/HDFS-7134 > Project: Hadoop HDFS > Issue Type: Bug > Components: datanode, hdfs >Affects Versions: 1.2.1, 2.6.0, 2.7.3 > Environment: Linux nn1.cluster1.com 2.6.32-431.20.3.el6.x86_64 #1 SMP > Thu Jun 19 21:14:45 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux > [hadoop@nn1 conf]$ cat /etc/redhat-release > CentOS release 6.5 (Final) >Reporter: gurmukh singh >Priority: Critical > Labels: HDFS > Fix For: 3.1.0 > > > The count of replicas for a block should not change till the > blocks have settled on the datanodes. > Test Case: > Hadoop Cluster with 1 namenode and 3 datanodes. > nn1.cluster1.com(192.168.1.70) > dn1.cluster1.com(192.168.1.72) > dn2.cluster1.com(192.168.1.73) > dn3.cluster1.com(192.168.1.74) > Cluster up and running fine with replication set to "1" for the parameter > "dfs.replication" on all nodes > > dfs.replication > 1 > > To reduce the wait time, the dfs.heartbeat and recheck > parameters have been reduced. > on datanode2 (192.168.1.72) > [hadoop@dn2 ~]$ hadoop fs -Ddfs.replication=2 -put from_dn2 / > [hadoop@dn2 ~]$ hadoop fs -ls /from_dn2 > Found 1 items > -rw-r--r-- 2 hadoop supergroup 17 2014-09-23 13:33 /from_dn2 > On Namenode > === > As expected, since the copy was done from datanode2, one copy goes locally. > [hadoop@nn1 conf]$ hadoop fsck /from_dn2 -files -blocks -locations > FSCK started by hadoop from /192.168.1.70 for path /from_dn2 at Tue Sep 23 > 13:53:16 IST 2014 > /from_dn2 17 bytes, 1 block(s): OK > 0. blk_8132629811771280764_1175 len=17 repl=2 [192.168.1.74:50010, > 192.168.1.73:50010] > Can see the blocks on the data nodes' disks as well under the "current" > directory. > Now, shut down datanode2 (192.168.1.73); as expected, the block moves to another > datanode to maintain a replication of 2 > [hadoop@nn1 conf]$ hadoop fsck /from_dn2 -files -blocks -locations > FSCK started by hadoop from /192.168.1.70 for path /from_dn2 at Tue Sep 23 > 13:54:21 IST 2014 > /from_dn2 17 bytes, 1 block(s): OK > 0. blk_8132629811771280764_1175 len=17 repl=2 [192.168.1.74:50010, > 192.168.1.72:50010] > But now, if I bring back datanode2, the namenode sees that > this block is in 3 places and fires an invalidate command for > datanode1 (192.168.1.72), yet the replication count on the namenode is bumped to 3 > immediately. > [hadoop@nn1 conf]$ hadoop fsck /from_dn2 -files -blocks -locations > FSCK started by hadoop from /192.168.1.70 for path /from_dn2 at Tue Sep 23 > 13:56:12 IST 2014 > /from_dn2 17 bytes, 1 block(s): OK > 0. blk_8132629811771280764_1175 len=17 repl=3 [192.168.1.74:50010, > 192.168.1.72:50010, 192.168.1.73:50010] > On datanode1, the invalidate command was fired immediately and the > block deleted.
> = > 2014-09-23 13:54:17,483 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: > Receiving blk_8132629811771280764_1175 src: /192.168.1.74:38099 dest: > /192.168.1.72:50010 > 2014-09-23 13:54:17,502 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: > Received blk_8132629811771280764_1175 src: /192.168.1.74:38099 dest: > /192.168.1.72:50010 size 17 > 2014-09-23 13:55:28,720 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: > Scheduling blk_8132629811771280764_1175 file > /space/disk1/current/blk_8132629811771280764 for deletion > 2014-09-23 13:55:28,721 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: > Deleted blk_8132629811771280764_1175 at file > /space/disk1/current/blk_8132629811771280764 > The namenode still shows 3 replicas, even though one has been deleted, even > after more than 30 mins. > [hadoop@nn1 conf]$ hadoop fsck /from_dn2 -files -blocks -locations > FSCK started by hadoop from /192.168.1.70 for path /from_dn2 at Tue Sep 23 > 14:21:27 IST 2014 > /from_dn2 17 bytes, 1 block(s): OK > 0. blk_8132629811771280764_1175 len=17 repl=3 [192.168.1.74:50010, > 192.168.1.72:50010, 192.168.1.73:50010] > This could be dangerous if someone removes a replica or the other 2 datanodes fail. > On Datanode 1 > = > Before datanode1 is brought back: > [hadoop@dn1 conf]$ ls -l /space/disk*/current > /space/disk1/current: > total 28 > -rw-rw-r-- 1 hadoop hadoop 13 Sep 21 09:09 blk_2278001646987517832 > -rw-rw-r-- 1 hadoop hadoop 11 Sep 21 09:09 blk_2278001646987517832_1171.meta > -rw-rw-r-- 1 hadoop hadoop 17 Sep 23 13:54 blk_8132629811771280764 > -rw-rw-r-- 1 hadoop hadoop 11 Sep 23 13:54 blk_8132629811771280764_1175.meta >
[jira] [Resolved] (HDFS-7134) Replication count for a block should not update till the blocks have settled on Datanodes
[ https://issues.apache.org/jira/browse/HDFS-7134?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wei-Chiu Chuang resolved HDFS-7134. --- Resolution: Cannot Reproduce Resolving as cannot reproduce. > Replication count for a block should not update till the blocks have settled > on Datanodes > - > > Key: HDFS-7134 > URL: https://issues.apache.org/jira/browse/HDFS-7134 > Project: Hadoop HDFS > Issue Type: Bug > Components: datanode, hdfs >Affects Versions: 1.2.1, 2.6.0, 2.7.3 > Environment: Linux nn1.cluster1.com 2.6.32-431.20.3.el6.x86_64 #1 SMP > Thu Jun 19 21:14:45 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux > [hadoop@nn1 conf]$ cat /etc/redhat-release > CentOS release 6.5 (Final) >Reporter: gurmukh singh >Priority: Critical > Labels: HDFS > Fix For: 3.1.0 > > > The count of replicas for a block should not change till the > blocks have settled on the datanodes. > Test Case: > Hadoop Cluster with 1 namenode and 3 datanodes. > nn1.cluster1.com(192.168.1.70) > dn1.cluster1.com(192.168.1.72) > dn2.cluster1.com(192.168.1.73) > dn3.cluster1.com(192.168.1.74) > Cluster up and running fine with replication set to "1" for the parameter > "dfs.replication" on all nodes > > dfs.replication > 1 > > To reduce the wait time, the dfs.heartbeat and recheck > parameters have been reduced. > on datanode2 (192.168.1.72) > [hadoop@dn2 ~]$ hadoop fs -Ddfs.replication=2 -put from_dn2 / > [hadoop@dn2 ~]$ hadoop fs -ls /from_dn2 > Found 1 items > -rw-r--r-- 2 hadoop supergroup 17 2014-09-23 13:33 /from_dn2 > On Namenode > === > As expected, since the copy was done from datanode2, one copy goes locally. > [hadoop@nn1 conf]$ hadoop fsck /from_dn2 -files -blocks -locations > FSCK started by hadoop from /192.168.1.70 for path /from_dn2 at Tue Sep 23 > 13:53:16 IST 2014 > /from_dn2 17 bytes, 1 block(s): OK > 0. blk_8132629811771280764_1175 len=17 repl=2 [192.168.1.74:50010, > 192.168.1.73:50010] > Can see the blocks on the data nodes' disks as well under the "current" > directory. > Now, shut down datanode2 (192.168.1.73); as expected, the block moves to another > datanode to maintain a replication of 2 > [hadoop@nn1 conf]$ hadoop fsck /from_dn2 -files -blocks -locations > FSCK started by hadoop from /192.168.1.70 for path /from_dn2 at Tue Sep 23 > 13:54:21 IST 2014 > /from_dn2 17 bytes, 1 block(s): OK > 0. blk_8132629811771280764_1175 len=17 repl=2 [192.168.1.74:50010, > 192.168.1.72:50010] > But now, if I bring back datanode2, the namenode sees that > this block is in 3 places and fires an invalidate command for > datanode1 (192.168.1.72), yet the replication count on the namenode is bumped to 3 > immediately. > [hadoop@nn1 conf]$ hadoop fsck /from_dn2 -files -blocks -locations > FSCK started by hadoop from /192.168.1.70 for path /from_dn2 at Tue Sep 23 > 13:56:12 IST 2014 > /from_dn2 17 bytes, 1 block(s): OK > 0. blk_8132629811771280764_1175 len=17 repl=3 [192.168.1.74:50010, > 192.168.1.72:50010, 192.168.1.73:50010] > On datanode1, the invalidate command was fired immediately and the > block deleted.
> = > 2014-09-23 13:54:17,483 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: > Receiving blk_8132629811771280764_1175 src: /192.168.1.74:38099 dest: > /192.168.1.72:50010 > 2014-09-23 13:54:17,502 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: > Received blk_8132629811771280764_1175 src: /192.168.1.74:38099 dest: > /192.168.1.72:50010 size 17 > 2014-09-23 13:55:28,720 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: > Scheduling blk_8132629811771280764_1175 file > /space/disk1/current/blk_8132629811771280764 for deletion > 2014-09-23 13:55:28,721 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: > Deleted blk_8132629811771280764_1175 at file > /space/disk1/current/blk_8132629811771280764 > The namenode still shows 3 replicas, even though one has been deleted, even > after more than 30 mins. > [hadoop@nn1 conf]$ hadoop fsck /from_dn2 -files -blocks -locations > FSCK started by hadoop from /192.168.1.70 for path /from_dn2 at Tue Sep 23 > 14:21:27 IST 2014 > /from_dn2 17 bytes, 1 block(s): OK > 0. blk_8132629811771280764_1175 len=17 repl=3 [192.168.1.74:50010, > 192.168.1.72:50010, 192.168.1.73:50010] > This could be dangerous if someone removes a replica or the other 2 datanodes fail. > On Datanode 1 > = > Before datanode1 is brought back: > [hadoop@dn1 conf]$ ls -l /space/disk*/current > /space/disk1/current: > total 28 > -rw-rw-r-- 1 hadoop hadoop 13 Sep 21 09:09 blk_2278001646987517832 > -rw-rw-r-- 1 hadoop hadoop 11 Sep 21 09:09 blk_2278001646987517832_1171.meta > -rw-rw-r-- 1 hadoop hadoop 17 Sep 23 13:54 blk_8132629811771280764 > -rw-rw-r-- 1 hadoop hadoop
[jira] [Commented] (HDFS-14754) Erasure Coding : The number of Under-Replicated Blocks never reduced
[ https://issues.apache.org/jira/browse/HDFS-14754?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16941376#comment-16941376 ] Wei-Chiu Chuang commented on HDFS-14754: Too bad this one didn't land in 3.2.1 and 3.1.3. Let's make sure it gets added to lower releases. > Erasure Coding : The number of Under-Replicated Blocks never reduced > - > > Key: HDFS-14754 > URL: https://issues.apache.org/jira/browse/HDFS-14754 > Project: Hadoop HDFS > Issue Type: Bug > Components: ec >Reporter: hemanthboyina >Assignee: hemanthboyina >Priority: Critical > Fix For: 3.3.0 > > Attachments: HDFS-14754-addendum.001.patch, HDFS-14754.001.patch, > HDFS-14754.002.patch, HDFS-14754.003.patch, HDFS-14754.004.patch, > HDFS-14754.005.patch, HDFS-14754.006.patch, HDFS-14754.007.patch, > HDFS-14754.008.patch > > > Using EC RS-3-2, 6 DN > We came across a scenario where, among the 5 EC blocks, the same block was > replicated thrice and two blocks went missing > The replicated block was not being deleted and the missing block could not be reconstructed -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Comment Edited] (HDFS-14754) Erasure Coding : The number of Under-Replicated Blocks never reduced
[ https://issues.apache.org/jira/browse/HDFS-14754?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16941376#comment-16941376 ] Wei-Chiu Chuang edited comment on HDFS-14754 at 9/30/19 10:20 PM: -- Too bad this one didn't land in 3.2.1 and 3.1.3. Let's make sure it gets added to lower releases. was (Author: jojochuang): Too bad this one didn't land in 3.2.1 and 3.1.3. Let's make sure they get added to lower releases. > Erasure Coding : The number of Under-Replicated Blocks never reduced > - > > Key: HDFS-14754 > URL: https://issues.apache.org/jira/browse/HDFS-14754 > Project: Hadoop HDFS > Issue Type: Bug > Components: ec >Reporter: hemanthboyina >Assignee: hemanthboyina >Priority: Critical > Fix For: 3.3.0 > > Attachments: HDFS-14754-addendum.001.patch, HDFS-14754.001.patch, > HDFS-14754.002.patch, HDFS-14754.003.patch, HDFS-14754.004.patch, > HDFS-14754.005.patch, HDFS-14754.006.patch, HDFS-14754.007.patch, > HDFS-14754.008.patch > > > Using EC RS-3-2, 6 DN > We came across a scenario where, among the 5 EC blocks, the same block was > replicated thrice and two blocks went missing > The replicated block was not being deleted and the missing block could not be reconstructed -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Updated] (HDFS-14754) Erasure Coding : The number of Under-Replicated Blocks never reduced
[ https://issues.apache.org/jira/browse/HDFS-14754?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wei-Chiu Chuang updated HDFS-14754: --- Component/s: ec > Erasure Coding : The number of Under-Replicated Blocks never reduced > - > > Key: HDFS-14754 > URL: https://issues.apache.org/jira/browse/HDFS-14754 > Project: Hadoop HDFS > Issue Type: Bug > Components: ec >Reporter: hemanthboyina >Assignee: hemanthboyina >Priority: Critical > Fix For: 3.3.0 > > Attachments: HDFS-14754-addendum.001.patch, HDFS-14754.001.patch, > HDFS-14754.002.patch, HDFS-14754.003.patch, HDFS-14754.004.patch, > HDFS-14754.005.patch, HDFS-14754.006.patch, HDFS-14754.007.patch, > HDFS-14754.008.patch > > > Using EC RS-3-2, 6 DN > We came across a scenario where, among the 5 EC blocks, the same block was > replicated thrice and two blocks went missing > The replicated block was not being deleted and the missing block could not be reconstructed -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDDS-2169) Avoid buffer copies while submitting client requests in Ratis
[ https://issues.apache.org/jira/browse/HDDS-2169?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16941365#comment-16941365 ] Tsz-wo Sze commented on HDDS-2169: -- Please ignore the previous Jenkins build. It was testing an old patch (o2169_20190923.patch, just removed). > Avoid buffer copies while submitting client requests in Ratis > - > > Key: HDDS-2169 > URL: https://issues.apache.org/jira/browse/HDDS-2169 > Project: Hadoop Distributed Data Store > Issue Type: Improvement >Reporter: Shashikant Banerjee >Assignee: Tsz-wo Sze >Priority: Major > Labels: pull-request-available > Time Spent: 1h 40m > Remaining Estimate: 0h > > Currently, while sending write requests to Ratis from Ozone, a protobuf > object containing the data is encoded, and the resultant protobuf is then > converted to a ByteString, which internally copies the buffer embedded > inside the protobuf again so that it can be submitted to the Ratis client. > Likewise, while sending the appendRequest and building up the > appendRequestProto, the data might be copied yet again. The idea here is to > let the client pass the raw data (state machine data) separately to the Ratis > client without the copying overhead. > > {code:java} > private CompletableFuture sendRequestAsync( > ContainerCommandRequestProto request) { > try (Scope scope = GlobalTracer.get() > .buildSpan("XceiverClientRatis." + request.getCmdType().name()) > .startActive(true)) { > ContainerCommandRequestProto finalPayload = > ContainerCommandRequestProto.newBuilder(request) > .setTraceID(TracingUtil.exportCurrentSpan()) > .build(); > boolean isReadOnlyRequest = HddsUtils.isReadOnly(finalPayload); > // finalPayload already has the byteString data embedded. > ByteString byteString = finalPayload.toByteString(); -> It involves a > copy again. > if (LOG.isDebugEnabled()) { > LOG.debug("sendCommandAsync {} {}", isReadOnlyRequest, > sanitizeForDebug(finalPayload)); > } > return isReadOnlyRequest ? > getClient().sendReadOnlyAsync(() -> byteString) : > getClient().sendAsync(() -> byteString); > } > } > {code} -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
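Conceptually, the change being requested looks something like the sketch below (purely illustrative, not the eventual Ratis API; {{headerProto}} and {{chunkData}} are hypothetical names, and {{Message}} is org.apache.ratis.protocol.Message). The point is that {{ByteString.concat}} builds a rope over the existing buffers instead of copying the bytes:

{code:java}
// serialize only the small header proto; hand the bulk data over untouched
ByteString header = headerProto.toByteString();  // cheap: the header is small
ByteString data = chunkData;                     // already a ByteString
// concat creates a rope over both pieces rather than copying the bytes
Message message = Message.valueOf(header.concat(data));
return getClient().sendAsync(message);
{code}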
[jira] [Updated] (HDDS-2169) Avoid buffer copies while submitting client requests in Ratis
[ https://issues.apache.org/jira/browse/HDDS-2169?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tsz-wo Sze updated HDDS-2169: - Attachment: (was: o2169_20190923.patch) > Avoid buffer copies while submitting client requests in Ratis > - > > Key: HDDS-2169 > URL: https://issues.apache.org/jira/browse/HDDS-2169 > Project: Hadoop Distributed Data Store > Issue Type: Improvement >Reporter: Shashikant Banerjee >Assignee: Tsz-wo Sze >Priority: Major > Labels: pull-request-available > Time Spent: 1h 40m > Remaining Estimate: 0h > > Currently, while sending write requests from Ozone to Ratis, the data is first > encoded into a protobuf object, and the resulting protobuf is then > converted to a ByteString, which internally copies the buffer embedded > inside the protobuf once more so that it can be submitted to the Ratis client. > Similarly, while building up the appendRequestProto for the > appendRequest, the data may be copied yet again. The idea here is to > let the client pass the raw data (stateMachine data) separately to the Ratis > client, avoiding the copying overhead. > > {code:java} > private CompletableFuture<RaftClientReply> sendRequestAsync( > ContainerCommandRequestProto request) { > try (Scope scope = GlobalTracer.get() > .buildSpan("XceiverClientRatis." + request.getCmdType().name()) > .startActive(true)) { > ContainerCommandRequestProto finalPayload = > ContainerCommandRequestProto.newBuilder(request) > .setTraceID(TracingUtil.exportCurrentSpan()) > .build(); > boolean isReadOnlyRequest = HddsUtils.isReadOnly(finalPayload); > // finalPayload already has the byteString data embedded. > ByteString byteString = finalPayload.toByteString(); // NOTE: this involves a > copy again. > if (LOG.isDebugEnabled()) { > LOG.debug("sendCommandAsync {} {}", isReadOnlyRequest, > sanitizeForDebug(finalPayload)); > } > return isReadOnlyRequest ? > getClient().sendReadOnlyAsync(() -> byteString) : > getClient().sendAsync(() -> byteString); > } > } > {code} -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Updated] (HDDS-2212) Genconf tool should generate config files for secure cluster setup
[ https://issues.apache.org/jira/browse/HDDS-2212?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dinesh Chitlangia updated HDDS-2212: Component/s: Tools > Genconf tool should generate config files for secure cluster setup > -- > > Key: HDDS-2212 > URL: https://issues.apache.org/jira/browse/HDDS-2212 > Project: Hadoop Distributed Data Store > Issue Type: Improvement > Components: Tools >Reporter: Dinesh Chitlangia >Priority: Major > Labels: newbie > > Ozone Genconf tool currently generates a minimal ozone-site.xml file. > [~raje2411] was trying out a secure ozone setup over an existing HDP-2.x cluster > and found that the configuration setup was not straightforward. > This jira proposes to extend the Genconf tool so we can generate the required > template config files for a secure setup. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Created] (HDDS-2212) Genconf tool should generate config files for secure cluster setup
Dinesh Chitlangia created HDDS-2212: --- Summary: Genconf tool should generate config files for secure cluster setup Key: HDDS-2212 URL: https://issues.apache.org/jira/browse/HDDS-2212 Project: Hadoop Distributed Data Store Issue Type: Improvement Reporter: Dinesh Chitlangia Ozone Genconf tool currently generates a minimal ozone-site.xml file. [~raje2411] was trying out a secure ozone setup over an existing HDP-2.x cluster and found that the configuration setup was not straightforward. This jira proposes to extend the Genconf tool so we can generate the required template config files for a secure setup. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Work logged] (HDDS-2210) ContainerStateMachine should not be marked unhealthy if applyTransaction fails with closed container exception
[ https://issues.apache.org/jira/browse/HDDS-2210?focusedWorklogId=320844=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-320844 ] ASF GitHub Bot logged work on HDDS-2210: Author: ASF GitHub Bot Created on: 30/Sep/19 20:35 Start Date: 30/Sep/19 20:35 Worklog Time Spent: 10m Work Description: hadoop-yetus commented on issue #1552: HDDS-2210. ContainerStateMachine should not be marked unhealthy if ap… URL: https://github.com/apache/hadoop/pull/1552#issuecomment-536740211 :broken_heart: **-1 overall** | Vote | Subsystem | Runtime | Comment | |::|--:|:|:| | 0 | reexec | 1773 | Docker mode activated. | ||| _ Prechecks _ | | +1 | dupname | 0 | No case conflicting files found. | | +1 | @author | 0 | The patch does not contain any @author tags. | | +1 | test4tests | 0 | The patch appears to include 1 new or modified test files. | ||| _ trunk Compile Tests _ | | 0 | mvndep | 25 | Maven dependency ordering for branch | | -1 | mvninstall | 44 | hadoop-hdds in trunk failed. | | -1 | mvninstall | 38 | hadoop-ozone in trunk failed. | | -1 | compile | 19 | hadoop-hdds in trunk failed. | | -1 | compile | 12 | hadoop-ozone in trunk failed. | | +1 | checkstyle | 58 | trunk passed | | +1 | mvnsite | 0 | trunk passed | | +1 | shadedclient | 937 | branch has no errors when building and testing our client artifacts. | | -1 | javadoc | 19 | hadoop-hdds in trunk failed. | | -1 | javadoc | 17 | hadoop-ozone in trunk failed. | | 0 | spotbugs | 1027 | Used deprecated FindBugs config; considering switching to SpotBugs. | | -1 | findbugs | 33 | hadoop-hdds in trunk failed. | | -1 | findbugs | 17 | hadoop-ozone in trunk failed. | ||| _ Patch Compile Tests _ | | 0 | mvndep | 26 | Maven dependency ordering for patch | | -1 | mvninstall | 33 | hadoop-hdds in the patch failed. | | -1 | mvninstall | 35 | hadoop-ozone in the patch failed. | | -1 | compile | 21 | hadoop-hdds in the patch failed. | | -1 | compile | 15 | hadoop-ozone in the patch failed. | | -1 | javac | 21 | hadoop-hdds in the patch failed. | | -1 | javac | 15 | hadoop-ozone in the patch failed. | | +1 | checkstyle | 53 | the patch passed | | +1 | mvnsite | 0 | the patch passed | | -1 | whitespace | 0 | The patch has 1 line(s) that end in whitespace. Use git apply --whitespace=fix <>. Refer https://git-scm.com/docs/git-apply | | +1 | shadedclient | 800 | patch has no errors when building and testing our client artifacts. | | -1 | javadoc | 19 | hadoop-hdds in the patch failed. | | -1 | javadoc | 16 | hadoop-ozone in the patch failed. | | -1 | findbugs | 30 | hadoop-hdds in the patch failed. | | -1 | findbugs | 17 | hadoop-ozone in the patch failed. | ||| _ Other Tests _ | | -1 | unit | 24 | hadoop-hdds in the patch failed. | | -1 | unit | 24 | hadoop-ozone in the patch failed. | | +1 | asflicense | 29 | The patch does not generate ASF License warnings. 
| | | | 4251 | | | Subsystem | Report/Notes | |--:|:-| | Docker | Client=19.03.1 Server=19.03.1 base: https://builds.apache.org/job/hadoop-multibranch/job/PR-1552/1/artifact/out/Dockerfile | | GITHUB PR | https://github.com/apache/hadoop/pull/1552 | | Optional Tests | dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient findbugs checkstyle | | uname | Linux 1dedef292033 4.15.0-60-generic #67-Ubuntu SMP Thu Aug 22 16:55:30 UTC 2019 x86_64 x86_64 x86_64 GNU/Linux | | Build tool | maven | | Personality | personality/hadoop.sh | | git revision | trunk / 4d3c580 | | Default Java | 1.8.0_222 | | mvninstall | https://builds.apache.org/job/hadoop-multibranch/job/PR-1552/1/artifact/out/branch-mvninstall-hadoop-hdds.txt | | mvninstall | https://builds.apache.org/job/hadoop-multibranch/job/PR-1552/1/artifact/out/branch-mvninstall-hadoop-ozone.txt | | compile | https://builds.apache.org/job/hadoop-multibranch/job/PR-1552/1/artifact/out/branch-compile-hadoop-hdds.txt | | compile | https://builds.apache.org/job/hadoop-multibranch/job/PR-1552/1/artifact/out/branch-compile-hadoop-ozone.txt | | javadoc | https://builds.apache.org/job/hadoop-multibranch/job/PR-1552/1/artifact/out/branch-javadoc-hadoop-hdds.txt | | javadoc | https://builds.apache.org/job/hadoop-multibranch/job/PR-1552/1/artifact/out/branch-javadoc-hadoop-ozone.txt | | findbugs | https://builds.apache.org/job/hadoop-multibranch/job/PR-1552/1/artifact/out/branch-findbugs-hadoop-hdds.txt | | findbugs | https://builds.apache.org/job/hadoop-multibranch/job/PR-1552/1/artifact/out/branch-findbugs-hadoop-ozone.txt | | mvninstall | https://builds.apache.org/job/hadoop-multibranch/job/PR-1552/1/artifact/out/patch-mvninstall-hadoop-hdds.txt | |
[jira] [Work logged] (HDDS-2210) ContainerStateMachine should not be marked unhealthy if applyTransaction fails with closed container exception
[ https://issues.apache.org/jira/browse/HDDS-2210?focusedWorklogId=320843=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-320843 ] ASF GitHub Bot logged work on HDDS-2210: Author: ASF GitHub Bot Created on: 30/Sep/19 20:35 Start Date: 30/Sep/19 20:35 Worklog Time Spent: 10m Work Description: hadoop-yetus commented on pull request #1552: HDDS-2210. ContainerStateMachine should not be marked unhealthy if ap… URL: https://github.com/apache/hadoop/pull/1552#discussion_r329774113 ## File path: hadoop-hdds/container-service/src/main/java/org/apache/hadoop/ozone/container/common/transport/server/ratis/ContainerStateMachine.java ## @@ -270,12 +270,16 @@ public void persistContainerSet(OutputStream out) throws IOException { // container happens outside of Ratis. IOUtils.write(builder.build().toByteArray(), out); } + Review comment: whitespace:end of line This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 320843) Time Spent: 20m (was: 10m) > ContainerStateMachine should not be marked unhealthy if applyTransaction > fails with closed container exception > -- > > Key: HDDS-2210 > URL: https://issues.apache.org/jira/browse/HDDS-2210 > Project: Hadoop Distributed Data Store > Issue Type: Bug > Components: Ozone Datanode >Reporter: Shashikant Banerjee >Assignee: Shashikant Banerjee >Priority: Major > Labels: pull-request-available > Fix For: 0.5.0 > > Time Spent: 20m > Remaining Estimate: 0h > > Currently, if applyTransaction fails, the stateMachine is marked unhealthy > and the next snapshot creation will fail. As a result, the raftServer > will close down, leading to pipeline failure. The ClosedContainer exception should > be ignored when marking the stateMachine unhealthy. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
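As a minimal sketch of the behavior the issue asks for (the types below are simplified stand-ins, not the actual ContainerStateMachine code):

{code:java}
import java.util.concurrent.atomic.AtomicBoolean;

// Illustrative stand-in for the Ozone closed-container exception type.
class ContainerNotOpenException extends Exception {
}

public class ApplyFailureHandlingSketch {
  private final AtomicBoolean stateMachineHealthy = new AtomicBoolean(true);

  // A transaction that races with container close is an expected condition,
  // so it should not poison the state machine (and thereby fail the next
  // snapshot and tear down the pipeline).
  void handleApplyFailure(Throwable cause) {
    if (cause instanceof ContainerNotOpenException) {
      return;
    }
    stateMachineHealthy.set(false); // a genuine failure: block snapshots
  }
}
{code}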
[jira] [Work logged] (HDDS-2211) Collect docker logs if env fails to start
Time Spent: 0.5h > Remaining Estimate: 0h > > Occasionally some acceptance test docker environment fails to start up > properly. We need docker logs for analysis, but they are not being collected. > https://github.com/elek/ozone-ci-q4/blob/master/trunk/trunk-nightly-extra-20190930-74rp4/acceptance/output.log#L3765-L3768 -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Updated] (HDFS-14305) Serial number in BlockTokenSecretManager could overlap between different namenodes
[ https://issues.apache.org/jira/browse/HDFS-14305?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jonathan Hung updated HDFS-14305: - Target Version/s: 2.10.0 Labels: multi-sbnn release-blocker (was: multi-sbnn) > Serial number in BlockTokenSecretManager could overlap between different > namenodes > -- > > Key: HDFS-14305 > URL: https://issues.apache.org/jira/browse/HDFS-14305 > Project: Hadoop HDFS > Issue Type: Bug > Components: namenode, security >Reporter: Chao Sun >Assignee: Xiaoqiao He >Priority: Major > Labels: multi-sbnn, release-blocker > Fix For: 3.0.4, 3.3.0, 3.2.1, 3.1.3 > > Attachments: HDFS-14305-007.patch, HDFS-14305-008.patch, > HDFS-14305.001.patch, HDFS-14305.002.patch, HDFS-14305.003.patch, > HDFS-14305.004.patch, HDFS-14305.005.patch, HDFS-14305.006.patch > > > Currently, a {{BlockTokenSecretManager}} starts with a random integer as the > initial serial number, and then uses this formula to rotate it: > {code:java} > this.intRange = Integer.MAX_VALUE / numNNs; > this.nnRangeStart = intRange * nnIndex; > this.serialNo = (this.serialNo % intRange) + (nnRangeStart); > {code} > where {{numNNs}} is the total number of NameNodes in the cluster, and > {{nnIndex}} is the index of the current NameNode specified in the > configuration {{dfs.ha.namenodes.}}. > However, with this approach, different NameNodes could have overlapping ranges > for the serial number. For simplicity, let's assume {{Integer.MAX_VALUE}} is 100, > and we have 2 NameNodes {{nn1}} and {{nn2}} in the configuration. Then the ranges > for these two are: > {code} > nn1 -> [-49, 49] > nn2 -> [1, 99] > {code} > This is because the initial serial number could be any negative integer. > Moreover, when the keys are updated, the serial number will again be updated > with the formula: > {code} > this.serialNo = (this.serialNo % intRange) + (nnRangeStart); > {code} > which means the new serial number could be updated to a range that belongs to > a different NameNode, thus increasing the chance of collision again. > When the collision happens, DataNodes could overwrite an existing key, which > will cause clients to fail because of an {{InvalidToken}} error. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
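To make the overlap concrete, here is a small self-contained demo of the arithmetic quoted above, using the issue's simplification of {{Integer.MAX_VALUE}} = 100 and two NameNodes; it is purely illustrative.

{code:java}
import java.util.Random;

public class SerialRangeOverlapDemo {
  public static void main(String[] args) {
    final int maxValue = 100; // stand-in for Integer.MAX_VALUE
    final int numNNs = 2;
    final int intRange = maxValue / numNNs; // 50
    Random rand = new Random();
    for (int nnIndex = 0; nnIndex < numNNs; nnIndex++) {
      int nnRangeStart = intRange * nnIndex;
      // the initial serialNo may be any int, including negative values
      int serialNo = rand.nextInt();
      serialNo = (serialNo % intRange) + nnRangeStart;
      System.out.printf("nn%d serialNo=%d, range=[%d, %d]%n",
          nnIndex + 1, serialNo,
          nnRangeStart - (intRange - 1), nnRangeStart + (intRange - 1));
    }
    // Java's % keeps the sign of the dividend, so nn1 lands in [-49, 49]
    // while nn2 lands in [1, 99]; the interval [1, 49] is shared by both.
  }
}
{code}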
[jira] [Commented] (HDFS-14814) RBF: RouterQuotaUpdateService supports inherited rule.
[ https://issues.apache.org/jira/browse/HDFS-14814?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16941301#comment-16941301 ] Íñigo Goiri commented on HDFS-14814: Regarding {{setQuota()}}, keep in mind we should keep the signature from ClientProtocol: {code} @Idempotent void setQuota(String path, long namespaceQuota, long storagespaceQuota, StorageType type) throws IOException; {code} We cannot add {{@Override}} because this is not a full implementation. So let's keep the old method and add the extra one just for RouterQuotaUpdateService. We could potentially also make it package protected instead of public. (Same for all the aux methods.) I think that in the loop in getGlobalQuota, you could just do the ifs and not the if with the break; you will get the same number of comparisons. In addition, I don't think getGlobalQuota() needs the check for operation as it's not supposed to be invoked through RPC directly. As I mentioned before, let's make all these aux methods package private to distinguish them from the public API ones. Maybe even adding some annotation? For the log in RouterQuotaUpdateService#fixGlobalQuota we should fully leverage the logger. {code} 162 LOG.info("[Fix Quota] src={} dst={} oldQuota={}/{} newQuota={}/{}", 163 location.getSrc(), location, 164 remoteQuota.getQuota(), remoteQuota.getSpaceQuota(), 165 gQuota.getQuota(), gQuota.getSpaceQuota()); {code} > RBF: RouterQuotaUpdateService supports inherited rule. > -- > > Key: HDFS-14814 > URL: https://issues.apache.org/jira/browse/HDFS-14814 > Project: Hadoop HDFS > Issue Type: Improvement >Reporter: Jinglun >Assignee: Jinglun >Priority: Major > Attachments: HDFS-14814.001.patch, HDFS-14814.002.patch, > HDFS-14814.003.patch, HDFS-14814.004.patch, HDFS-14814.005.patch, > HDFS-14814.006.patch, HDFS-14814.007.patch, HDFS-14814.008.patch, > HDFS-14814.009.patch, HDFS-14814.010.patch > > > I want to add a rule *'The quota should be set the same as the nearest > parent'* to Global Quota. Supposing we have the mount table below. > M1: /dir-a ns0->/dir-a \{nquota=10,squota=20} > M2: /dir-a/dir-b ns1->/dir-b \{nquota=-1,squota=30} > M3: /dir-a/dir-b/dir-c ns2->/dir-c \{nquota=-1,squota=-1} > M4: /dir-d ns3->/dir-d \{nquota=-1,squota=-1} > > The quota for the remote locations on the namespaces should be: > ns0->/dir-a \{nquota=10,squota=20} > ns1->/dir-b \{nquota=10,squota=30} > ns2->/dir-c \{nquota=10,squota=30} > ns3->/dir-d \{nquota=-1,squota=-1} > > The quota of the remote location is set the same as the corresponding > MountTable, and if the MountTable has no quota then the quota is set > to that of the nearest parent MountTable with a quota. > > It's easy to implement. In RouterQuotaUpdateService, each time we compute > the currentQuotaUsage, we can get the quota info for each MountTable. We can > do a check and fix every MountTable whose quota doesn't match the rule above. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
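For readers following the inherited rule in the description above, a hedged sketch of the "nearest parent" resolution (the mount table and quota types are simplified stand-ins, not the RBF classes):

{code:java}
import java.util.HashMap;
import java.util.Map;

public class NearestParentQuotaSketch {
  static final long UNSET = -1;

  // Walk from the mount path up toward the root and return the first
  // explicitly set quota; UNSET if no ancestor sets one.
  static long resolve(String path, Map<String, Long> mountQuota) {
    for (String p = path; !p.isEmpty(); p = parent(p)) {
      Long q = mountQuota.get(p);
      if (q != null && q != UNSET) {
        return q;
      }
    }
    return UNSET;
  }

  static String parent(String path) {
    int idx = path.lastIndexOf('/');
    return idx <= 0 ? "" : path.substring(0, idx);
  }

  public static void main(String[] args) {
    Map<String, Long> nquota = new HashMap<>();
    nquota.put("/dir-a", 10L);
    nquota.put("/dir-a/dir-b", UNSET);
    nquota.put("/dir-a/dir-b/dir-c", UNSET);
    // Prints 10: /dir-a/dir-b/dir-c inherits nquota from /dir-a,
    // matching the ns2->/dir-c {nquota=10} expectation above.
    System.out.println(resolve("/dir-a/dir-b/dir-c", nquota));
  }
}
{code}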
[jira] [Work logged] (HDDS-2199) In SCMNodeManager dnsToUuidMap cannot track multiple DNs on the same host
[ https://issues.apache.org/jira/browse/HDDS-2199?focusedWorklogId=320821=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-320821 ] ASF GitHub Bot logged work on HDDS-2199: Author: ASF GitHub Bot Created on: 30/Sep/19 20:05 Start Date: 30/Sep/19 20:05 Worklog Time Spent: 10m Work Description: hadoop-yetus commented on issue #1551: HDDS-2199 In SCMNodeManager dnsToUuidMap cannot track multiple DNs on the same host URL: https://github.com/apache/hadoop/pull/1551#issuecomment-536728763 :broken_heart: **-1 overall** | Vote | Subsystem | Runtime | Comment | |::|--:|:|:| | 0 | reexec | 39 | Docker mode activated. | ||| _ Prechecks _ | | +1 | dupname | 0 | No case conflicting files found. | | +1 | @author | 0 | The patch does not contain any @author tags. | | +1 | test4tests | 0 | The patch appears to include 3 new or modified test files. | ||| _ trunk Compile Tests _ | | -1 | mvninstall | 31 | hadoop-hdds in trunk failed. | | -1 | mvninstall | 33 | hadoop-ozone in trunk failed. | | -1 | compile | 21 | hadoop-hdds in trunk failed. | | -1 | compile | 15 | hadoop-ozone in trunk failed. | | +1 | checkstyle | 50 | trunk passed | | +1 | mvnsite | 0 | trunk passed | | +1 | shadedclient | 843 | branch has no errors when building and testing our client artifacts. | | -1 | javadoc | 22 | hadoop-hdds in trunk failed. | | -1 | javadoc | 21 | hadoop-ozone in trunk failed. | | 0 | spotbugs | 943 | Used deprecated FindBugs config; considering switching to SpotBugs. | | -1 | findbugs | 33 | hadoop-hdds in trunk failed. | | -1 | findbugs | 20 | hadoop-ozone in trunk failed. | ||| _ Patch Compile Tests _ | | -1 | mvninstall | 36 | hadoop-hdds in the patch failed. | | -1 | mvninstall | 35 | hadoop-ozone in the patch failed. | | -1 | compile | 23 | hadoop-hdds in the patch failed. | | -1 | compile | 18 | hadoop-ozone in the patch failed. | | -1 | javac | 23 | hadoop-hdds in the patch failed. | | -1 | javac | 18 | hadoop-ozone in the patch failed. | | +1 | checkstyle | 55 | the patch passed | | +1 | mvnsite | 0 | the patch passed | | +1 | whitespace | 0 | The patch has no whitespace issues. | | +1 | shadedclient | 703 | patch has no errors when building and testing our client artifacts. | | -1 | javadoc | 22 | hadoop-hdds in the patch failed. | | -1 | javadoc | 20 | hadoop-ozone in the patch failed. | | -1 | findbugs | 31 | hadoop-hdds in the patch failed. | | -1 | findbugs | 21 | hadoop-ozone in the patch failed. | ||| _ Other Tests _ | | -1 | unit | 29 | hadoop-hdds in the patch failed. | | -1 | unit | 27 | hadoop-ozone in the patch failed. | | +1 | asflicense | 34 | The patch does not generate ASF License warnings. 
| | | | 2320 | | | Subsystem | Report/Notes | |--:|:-| | Docker | Client=19.03.1 Server=19.03.1 base: https://builds.apache.org/job/hadoop-multibranch/job/PR-1551/2/artifact/out/Dockerfile | | GITHUB PR | https://github.com/apache/hadoop/pull/1551 | | Optional Tests | dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient findbugs checkstyle | | uname | Linux 59d2934bbdb9 4.15.0-60-generic #67-Ubuntu SMP Thu Aug 22 16:55:30 UTC 2019 x86_64 x86_64 x86_64 GNU/Linux | | Build tool | maven | | Personality | personality/hadoop.sh | | git revision | trunk / 4d3c580 | | Default Java | 1.8.0_222 | | mvninstall | https://builds.apache.org/job/hadoop-multibranch/job/PR-1551/2/artifact/out/branch-mvninstall-hadoop-hdds.txt | | mvninstall | https://builds.apache.org/job/hadoop-multibranch/job/PR-1551/2/artifact/out/branch-mvninstall-hadoop-ozone.txt | | compile | https://builds.apache.org/job/hadoop-multibranch/job/PR-1551/2/artifact/out/branch-compile-hadoop-hdds.txt | | compile | https://builds.apache.org/job/hadoop-multibranch/job/PR-1551/2/artifact/out/branch-compile-hadoop-ozone.txt | | javadoc | https://builds.apache.org/job/hadoop-multibranch/job/PR-1551/2/artifact/out/branch-javadoc-hadoop-hdds.txt | | javadoc | https://builds.apache.org/job/hadoop-multibranch/job/PR-1551/2/artifact/out/branch-javadoc-hadoop-ozone.txt | | findbugs | https://builds.apache.org/job/hadoop-multibranch/job/PR-1551/2/artifact/out/branch-findbugs-hadoop-hdds.txt | | findbugs | https://builds.apache.org/job/hadoop-multibranch/job/PR-1551/2/artifact/out/branch-findbugs-hadoop-ozone.txt | | mvninstall | https://builds.apache.org/job/hadoop-multibranch/job/PR-1551/2/artifact/out/patch-mvninstall-hadoop-hdds.txt | | mvninstall | https://builds.apache.org/job/hadoop-multibranch/job/PR-1551/2/artifact/out/patch-mvninstall-hadoop-ozone.txt | | compile |
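The fix implied by the issue title can be sketched as follows: since one DNS name may host several datanodes, the map value needs to be a collection of UUIDs rather than a single UUID. Names here are illustrative, not the actual SCMNodeManager code.

{code:java}
import java.util.Collections;
import java.util.List;
import java.util.Map;
import java.util.UUID;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.CopyOnWriteArrayList;

public class DnsToUuidMapSketch {
  // host name -> all datanode UUIDs registered from that host
  private final Map<String, List<UUID>> dnsToUuidMap =
      new ConcurrentHashMap<>();

  void register(String dnsName, UUID datanodeUuid) {
    dnsToUuidMap
        .computeIfAbsent(dnsName, k -> new CopyOnWriteArrayList<>())
        .add(datanodeUuid);
  }

  List<UUID> lookup(String dnsName) {
    return dnsToUuidMap.getOrDefault(dnsName, Collections.emptyList());
  }
}
{code}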
[jira] [Commented] (HDFS-13270) RBF: Router audit logger
[ https://issues.apache.org/jira/browse/HDFS-13270?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16941294#comment-16941294 ] Íñigo Goiri commented on HDFS-13270: Given the architecture with RBF, I think the most important thing is not to mimic the behavior of the NN fully. I think what should be clear is that we executed an operation on a particular NN. We should focus on making it easy to correlate NN audit logs with Router audit logs. > RBF: Router audit logger > > > Key: HDFS-13270 > URL: https://issues.apache.org/jira/browse/HDFS-13270 > Project: Hadoop HDFS > Issue Type: Sub-task > Components: hdfs >Affects Versions: 3.2.0 >Reporter: maobaolong >Assignee: hemanthboyina >Priority: Major > Attachments: HDFS-13270.001.patch, HDFS-13270.002.patch, > HDFS-13270.003.patch > > > We can use a Router audit logger to log the client info and cmd, because the > FSNamesystem#AuditLogger records all clients as coming from the Router. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
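One minimal illustration of the correlation point, assuming an SLF4J-style audit logger; the field layout is hypothetical, not the format from the attached patches:

{code:java}
import org.slf4j.Logger;
import org.slf4j.LoggerFactory;

public class RouterAuditSketch {
  private static final Logger AUDIT = LoggerFactory.getLogger("RouterAudit");

  // Recording which NameNode actually served the call lets an operator
  // join a Router audit line with the matching NameNode audit line.
  static void logAuditEvent(boolean allowed, String user, String clientIp,
      String cmd, String src, String targetNameNode) {
    AUDIT.info("allowed={} ugi={} ip={} cmd={} src={} targetNN={}",
        allowed, user, clientIp, cmd, src, targetNameNode);
  }
}
{code}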
[jira] [Commented] (HDFS-14495) RBF: Duplicate FederationRPCMetrics
[ https://issues.apache.org/jira/browse/HDFS-14495?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16941292#comment-16941292 ] Íñigo Goiri commented on HDFS-14495: The only problem I would see is with backwards compatibility. Not sure what the guideline is for JMX, etc. I would be concerned about removing a JMX bean, for example. > RBF: Duplicate FederationRPCMetrics > --- > > Key: HDFS-14495 > URL: https://issues.apache.org/jira/browse/HDFS-14495 > Project: Hadoop HDFS > Issue Type: Sub-task > Components: metrics >Reporter: Akira Ajisaka >Assignee: hemanthboyina >Priority: Major > > There are two FederationRPCMetrics displayed in the Web UI (http://<hostname>:<port>/jmx) and most of the metrics are the same. > * FederationRPCMetrics via {{@Metrics}} and {{@Metric}} annotations > * FederationRPCMetrics via registering FederationRPCMBean > Can we remove {{@Metrics}} and {{@Metric}} annotations to remove duplication? -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
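For context, a sketch of how the two registration paths listed above can coexist and double-publish, assuming Hadoop's metrics2 annotations and the {{MBeans}} helper; this is a simplification, not the actual FederationRPCMetrics source:

{code:java}
import org.apache.hadoop.metrics2.annotation.Metric;
import org.apache.hadoop.metrics2.annotation.Metrics;
import org.apache.hadoop.metrics2.lib.MutableCounterLong;
import org.apache.hadoop.metrics2.util.MBeans;

// JMX requires a compliant MBean interface; the name is illustrative.
interface DuplicateMetricsSketchMBean {
  long getProxyOps();
}

// Path 1: annotation-driven metrics2 publication (takes effect once the
// instance is registered with the MetricsSystem).
@Metrics(about = "Router RPC metrics", context = "dfs")
public class DuplicateMetricsSketch implements DuplicateMetricsSketchMBean {
  @Metric("Number of proxied operations")
  private MutableCounterLong proxyOps;

  DuplicateMetricsSketch() {
    // Path 2: explicit JMX registration of the same counters. Together
    // with the annotation path this publishes two near-identical beans
    // under /jmx, which is the duplication described above.
    MBeans.register("Router", "FederationRPC", this);
  }

  @Override
  public long getProxyOps() {
    return proxyOps == null ? 0 : proxyOps.value();
  }
}
{code}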
[jira] [Work logged] (HDDS-2211) Collect docker logs if env fails to start
[ https://issues.apache.org/jira/browse/HDDS-2211?focusedWorklogId=320818=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-320818 ] ASF GitHub Bot logged work on HDDS-2211: Author: ASF GitHub Bot Created on: 30/Sep/19 19:51 Start Date: 30/Sep/19 19:51 Worklog Time Spent: 10m Work Description: adoroszlai commented on issue #1553: HDDS-2211. Collect docker logs if env fails to start URL: https://github.com/apache/hadoop/pull/1553#issuecomment-536723744 /label ozone This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 320818) Time Spent: 20m (was: 10m) > Collect docker logs if env fails to start > - > > Key: HDDS-2211 > URL: https://issues.apache.org/jira/browse/HDDS-2211 > Project: Hadoop Distributed Data Store > Issue Type: Improvement > Components: test >Reporter: Attila Doroszlai >Assignee: Attila Doroszlai >Priority: Major > Labels: pull-request-available > Time Spent: 20m > Remaining Estimate: 0h > > Occasionally some acceptance test docker environment fails to start up > properly. We need docker logs for analysis, but they are not being collected. > https://github.com/elek/ozone-ci-q4/blob/master/trunk/trunk-nightly-extra-20190930-74rp4/acceptance/output.log#L3765-L3768 -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Updated] (HDDS-2211) Collect docker logs if env fails to start
[ https://issues.apache.org/jira/browse/HDDS-2211?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated HDDS-2211: - Labels: pull-request-available (was: ) > Collect docker logs if env fails to start > - > > Key: HDDS-2211 > URL: https://issues.apache.org/jira/browse/HDDS-2211 > Project: Hadoop Distributed Data Store > Issue Type: Improvement > Components: test >Reporter: Attila Doroszlai >Assignee: Attila Doroszlai >Priority: Major > Labels: pull-request-available > > Occasionally some acceptance test docker environment fails to start up > properly. We need docker logs for analysis, but they are not being collected. > https://github.com/elek/ozone-ci-q4/blob/master/trunk/trunk-nightly-extra-20190930-74rp4/acceptance/output.log#L3765-L3768 -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Work logged] (HDDS-2211) Collect docker logs if env fails to start
[ https://issues.apache.org/jira/browse/HDDS-2211?focusedWorklogId=320816=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-320816 ] ASF GitHub Bot logged work on HDDS-2211: Author: ASF GitHub Bot Created on: 30/Sep/19 19:50 Start Date: 30/Sep/19 19:50 Worklog Time Spent: 10m Work Description: adoroszlai commented on pull request #1553: HDDS-2211. Collect docker logs if env fails to start URL: https://github.com/apache/hadoop/pull/1553 ## What changes were proposed in this pull request? 1. Collect docker logs if environment fails to start up (previously they were collected only after the actual test run) 2. Copy docker logs to aggregate result directory 3. Fail fast if datanodes cannot be started 4. Avoid [ANSI codes in output](https://github.com/elek/ozone-ci-q4/blob/f700ebb2254527527960593496f15008245094d6/trunk/trunk-nightly-extra-20190930-74rp4/acceptance/output.log#L3774) 5. Sort test cases for consistent run ordering (Robot summary table is already sorted) https://issues.apache.org/jira/browse/HDDS-2211 ## How was this patch tested? Ran `acceptance.sh` locally. Also simulated failure during datanode startup, verified that the docker log is saved. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 320816) Remaining Estimate: 0h Time Spent: 10m > Collect docker logs if env fails to start > - > > Key: HDDS-2211 > URL: https://issues.apache.org/jira/browse/HDDS-2211 > Project: Hadoop Distributed Data Store > Issue Type: Improvement > Components: test >Reporter: Attila Doroszlai >Assignee: Attila Doroszlai >Priority: Major > Labels: pull-request-available > Time Spent: 10m > Remaining Estimate: 0h > > Occasionally some acceptance test docker environment fails to start up > properly. We need docker logs for analysis, but they are not being collected. > https://github.com/elek/ozone-ci-q4/blob/master/trunk/trunk-nightly-extra-20190930-74rp4/acceptance/output.log#L3765-L3768 -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Work logged] (HDDS-2019) Handle Set DtService of token in S3Gateway for OM HA
[ https://issues.apache.org/jira/browse/HDDS-2019?focusedWorklogId=320795=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-320795 ] ASF GitHub Bot logged work on HDDS-2019: Author: ASF GitHub Bot Created on: 30/Sep/19 19:29 Start Date: 30/Sep/19 19:29 Worklog Time Spent: 10m Work Description: xiaoyuyao commented on pull request #1489: HDDS-2019. Handle Set DtService of token in S3Gateway for OM HA. URL: https://github.com/apache/hadoop/pull/1489#discussion_r329748308 ## File path: hadoop-ozone/s3gateway/src/main/java/org/apache/hadoop/ozone/s3/OzoneServiceProvider.java ## @@ -20,33 +20,75 @@ import org.apache.hadoop.hdds.conf.OzoneConfiguration; import org.apache.hadoop.io.Text; import org.apache.hadoop.ozone.OmUtils; +import org.apache.hadoop.ozone.s3.util.OzoneS3Util; import org.apache.hadoop.security.SecurityUtil; import javax.annotation.PostConstruct; import javax.enterprise.context.ApplicationScoped; import javax.enterprise.inject.Produces; import javax.inject.Inject; + +import java.util.Arrays; +import java.util.Collection; + +import static org.apache.hadoop.ozone.om.OMConfigKeys.OZONE_OM_NODES_KEY; +import static org.apache.hadoop.ozone.om.OMConfigKeys.OZONE_OM_SERVICE_IDS_KEY; + /** * This class creates the OM service . */ @ApplicationScoped public class OzoneServiceProvider { - private Text omServiceAdd; + private Text omServiceAddr; + + private String omserviceID; @Inject private OzoneConfiguration conf; @PostConstruct public void init() { -omServiceAdd = SecurityUtil.buildTokenService(OmUtils. -getOmAddressForClients(conf)); +Collection<String> serviceIdList = +conf.getTrimmedStringCollection(OZONE_OM_SERVICE_IDS_KEY); +if (serviceIdList.size() == 0) { + // Non-HA cluster + omServiceAddr = SecurityUtil.buildTokenService(OmUtils. + getOmAddressForClients(conf)); +} else { + // HA cluster. + //For now if multiple service id's are configured we throw exception. Review comment: This is very similar to what HDFS HA does. Since OM supports service discovery, can we get the internal service id via service discovery? This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 320795) Time Spent: 3h 20m (was: 3h 10m) > Handle Set DtService of token in S3Gateway for OM HA > > > Key: HDDS-2019 > URL: https://issues.apache.org/jira/browse/HDDS-2019 > Project: Hadoop Distributed Data Store > Issue Type: Sub-task >Reporter: Bharat Viswanadham >Assignee: Bharat Viswanadham >Priority: Critical > Labels: pull-request-available > Time Spent: 3h 20m > Remaining Estimate: 0h > > When OM HA is enabled, when tokens are generated, the service name should be > set with the addresses of all OMs. > > Currently, without HA, it is set to the OM RpcAddress string. This Jira is to > handle: > # Set dtService with all OM addresses. Right now in OMClientProducer, the UGI is > created with the S3 token, and the token's serviceName is set to the OM address; for the HA > case, this should be set to all OM RPC addresses. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
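A rough sketch of the service-name construction the issue calls for: joining every OM node's RPC address into a single token service string so any OM can validate the token. The method name and the comma-separated format follow the test quoted below; everything else is illustrative.

{code:java}
import java.util.Collection;

public class TokenServiceNameSketch {
  // e.g. ["om1:9862", "om2:9862", "om3:9862"] -> "om1:9862,om2:9862,om3:9862"
  static String buildServiceNameForToken(Collection<String> omRpcAddresses) {
    return String.join(",", omRpcAddresses);
  }
}
{code}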
[jira] [Work logged] (HDDS-2019) Handle Set DtService of token in S3Gateway for OM HA
[ https://issues.apache.org/jira/browse/HDDS-2019?focusedWorklogId=320788=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-320788 ] ASF GitHub Bot logged work on HDDS-2019: Author: ASF GitHub Bot Created on: 30/Sep/19 19:25 Start Date: 30/Sep/19 19:25 Worklog Time Spent: 10m Work Description: xiaoyuyao commented on pull request #1489: HDDS-2019. Handle Set DtService of token in S3Gateway for OM HA. URL: https://github.com/apache/hadoop/pull/1489#discussion_r329746422 ## File path: hadoop-ozone/s3gateway/src/test/java/org/apache/hadoop/ozone/s3/util/TestOzoneS3Util.java ## @@ -0,0 +1,129 @@ +/** + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ + +package org.apache.hadoop.ozone.s3.util; + +import org.apache.hadoop.hdds.conf.OzoneConfiguration; +import org.apache.hadoop.ozone.OmUtils; +import org.apache.hadoop.ozone.om.OMConfigKeys; +import org.apache.hadoop.security.SecurityUtil; +import org.apache.hadoop.test.GenericTestUtils; +import org.junit.Assert; +import org.junit.Before; +import org.junit.Test; + +import java.util.Collection; + +import static org.apache.hadoop.fs.CommonConfigurationKeys.HADOOP_SECURITY_TOKEN_SERVICE_USE_IP; +import static org.junit.Assert.fail; + +/** + * Class used to test OzoneS3Util. 
+ */ +public class TestOzoneS3Util { + + + private OzoneConfiguration configuration; + + @Before + public void setConf() { +configuration = new OzoneConfiguration(); +String serviceID = "omService"; +String nodeIDs = "om1,om2,om3"; +configuration.set(OMConfigKeys.OZONE_OM_SERVICE_IDS_KEY, serviceID); +configuration.set(OMConfigKeys.OZONE_OM_NODE_ID_KEY, nodeIDs); +configuration.setBoolean(HADOOP_SECURITY_TOKEN_SERVICE_USE_IP, false); + } + + @Test + public void testBuildServiceNameForToken() { +String serviceID = "omService"; +String nodeIDs = "om1,om2,om3"; +configuration.set(OMConfigKeys.OZONE_OM_SERVICE_IDS_KEY, serviceID); +configuration.set(OMConfigKeys.OZONE_OM_NODE_ID_KEY, nodeIDs); +configuration.setBoolean(HADOOP_SECURITY_TOKEN_SERVICE_USE_IP, false); + +Collection<String> nodeIDList = configuration.getStringCollection( +OMConfigKeys.OZONE_OM_NODE_ID_KEY); + +String expectedOmServiceAddress = buildOMNodeAddresses(nodeIDList, +serviceID); + +SecurityUtil.setConfiguration(configuration); +String omserviceAddr = OzoneS3Util.buildServiceNameForToken(configuration, +serviceID, nodeIDList); + +Assert.assertEquals(expectedOmServiceAddress, omserviceAddr); + } + + @Test + public void testBuildServiceNameForTokenIncorrectConfig() { +String serviceID = "omService"; +String nodeIDs = "om1,om2,om3"; +configuration.set(OMConfigKeys.OZONE_OM_SERVICE_IDS_KEY, serviceID); +configuration.set(OMConfigKeys.OZONE_OM_NODE_ID_KEY, nodeIDs); +configuration.setBoolean(HADOOP_SECURITY_TOKEN_SERVICE_USE_IP, false); + +Collection<String> nodeIDList = configuration.getStringCollection( +OMConfigKeys.OZONE_OM_NODE_ID_KEY); + +// Don't set om3 node rpc address. +configuration.set(OmUtils.addKeySuffixes(OMConfigKeys.OZONE_OM_ADDRESS_KEY, +serviceID, "om1"), "om1:9862"); +configuration.set(OmUtils.addKeySuffixes(OMConfigKeys.OZONE_OM_ADDRESS_KEY, +serviceID, "om2"), "om2:9862"); + + +SecurityUtil.setConfiguration(configuration); + +try { + OzoneS3Util.buildServiceNameForToken(configuration, + serviceID, nodeIDList); + fail("testBuildServiceNameForTokenIncorrectConfig failed"); +} catch (IllegalArgumentException ex) { + GenericTestUtils.assertExceptionContains("Could not find rpcAddress " + + "for", ex); +} + + + } + + + private String buildOMNodeAddresses(Collection<String> nodeIDList, + String serviceID) { +StringBuilder omServiceAddrBuilder = new StringBuilder(); +int port = 9862; +int nodesLength = nodeIDList.size(); +int counter = 0; +for (String nodeID : nodeIDList) { + counter++; + String addr = nodeID + ":" + port++; + configuration.set(OmUtils.addKeySuffixes(
[jira] [Work logged] (HDDS-2019) Handle Set DtService of token in S3Gateway for OM HA
[ https://issues.apache.org/jira/browse/HDDS-2019?focusedWorklogId=320787=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-320787 ] ASF GitHub Bot logged work on HDDS-2019: Author: ASF GitHub Bot Created on: 30/Sep/19 19:24 Start Date: 30/Sep/19 19:24 Worklog Time Spent: 10m Work Description: xiaoyuyao commented on pull request #1489: HDDS-2019. Handle Set DtService of token in S3Gateway for OM HA. URL: https://github.com/apache/hadoop/pull/1489#discussion_r329746422 ## File path: hadoop-ozone/s3gateway/src/test/java/org/apache/hadoop/ozone/s3/util/TestOzoneS3Util.java ## @@ -0,0 +1,129 @@ +/** + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ + +package org.apache.hadoop.ozone.s3.util; + +import org.apache.hadoop.hdds.conf.OzoneConfiguration; +import org.apache.hadoop.ozone.OmUtils; +import org.apache.hadoop.ozone.om.OMConfigKeys; +import org.apache.hadoop.security.SecurityUtil; +import org.apache.hadoop.test.GenericTestUtils; +import org.junit.Assert; +import org.junit.Before; +import org.junit.Test; + +import java.util.Collection; + +import static org.apache.hadoop.fs.CommonConfigurationKeys.HADOOP_SECURITY_TOKEN_SERVICE_USE_IP; +import static org.junit.Assert.fail; + +/** + * Class used to test OzoneS3Util. 
+ */ +public class TestOzoneS3Util { + + + private OzoneConfiguration configuration; + + @Before + public void setConf() { +configuration = new OzoneConfiguration(); +String serviceID = "omService"; +String nodeIDs = "om1,om2,om3"; +configuration.set(OMConfigKeys.OZONE_OM_SERVICE_IDS_KEY, serviceID); +configuration.set(OMConfigKeys.OZONE_OM_NODE_ID_KEY, nodeIDs); +configuration.setBoolean(HADOOP_SECURITY_TOKEN_SERVICE_USE_IP, false); + } + + @Test + public void testBuildServiceNameForToken() { +String serviceID = "omService"; +String nodeIDs = "om1,om2,om3"; +configuration.set(OMConfigKeys.OZONE_OM_SERVICE_IDS_KEY, serviceID); +configuration.set(OMConfigKeys.OZONE_OM_NODE_ID_KEY, nodeIDs); +configuration.setBoolean(HADOOP_SECURITY_TOKEN_SERVICE_USE_IP, false); + +Collection<String> nodeIDList = configuration.getStringCollection( +OMConfigKeys.OZONE_OM_NODE_ID_KEY); + +String expectedOmServiceAddress = buildOMNodeAddresses(nodeIDList, +serviceID); + +SecurityUtil.setConfiguration(configuration); +String omserviceAddr = OzoneS3Util.buildServiceNameForToken(configuration, +serviceID, nodeIDList); + +Assert.assertEquals(expectedOmServiceAddress, omserviceAddr); + } + + @Test + public void testBuildServiceNameForTokenIncorrectConfig() { +String serviceID = "omService"; +String nodeIDs = "om1,om2,om3"; +configuration.set(OMConfigKeys.OZONE_OM_SERVICE_IDS_KEY, serviceID); +configuration.set(OMConfigKeys.OZONE_OM_NODE_ID_KEY, nodeIDs); +configuration.setBoolean(HADOOP_SECURITY_TOKEN_SERVICE_USE_IP, false); + +Collection<String> nodeIDList = configuration.getStringCollection( +OMConfigKeys.OZONE_OM_NODE_ID_KEY); + +// Don't set om3 node rpc address. +configuration.set(OmUtils.addKeySuffixes(OMConfigKeys.OZONE_OM_ADDRESS_KEY, +serviceID, "om1"), "om1:9862"); +configuration.set(OmUtils.addKeySuffixes(OMConfigKeys.OZONE_OM_ADDRESS_KEY, +serviceID, "om2"), "om2:9862"); + + +SecurityUtil.setConfiguration(configuration); + +try { + OzoneS3Util.buildServiceNameForToken(configuration, + serviceID, nodeIDList); + fail("testBuildServiceNameForTokenIncorrectConfig failed"); +} catch (IllegalArgumentException ex) { + GenericTestUtils.assertExceptionContains("Could not find rpcAddress " + + "for", ex); +} + + + } + + + private String buildOMNodeAddresses(Collection<String> nodeIDList, + String serviceID) { +StringBuilder omServiceAddrBuilder = new StringBuilder(); +int port = 9862; +int nodesLength = nodeIDList.size(); +int counter = 0; +for (String nodeID : nodeIDList) { + counter++; + String addr = nodeID + ":" + port++; + configuration.set(OmUtils.addKeySuffixes(
[jira] [Commented] (HDFS-14305) Serial number in BlockTokenSecretManager could overlap between different namenodes
[ https://issues.apache.org/jira/browse/HDFS-14305?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16941272#comment-16941272 ] Konstantin Shvachko commented on HDFS-14305: Attached v08, which fixes {{TestFailoverWithBlockTokensEnabled}} and a findbugs warning. [~jojochuang], correct, this problem is in 2.10 as well. Trying to fix it before the release. > Serial number in BlockTokenSecretManager could overlap between different > namenodes > -- > > Key: HDFS-14305 > URL: https://issues.apache.org/jira/browse/HDFS-14305 > Project: Hadoop HDFS > Issue Type: Bug > Components: namenode, security >Reporter: Chao Sun >Assignee: Xiaoqiao He >Priority: Major > Labels: multi-sbnn > Fix For: 3.0.4, 3.3.0, 3.2.1, 3.1.3 > > Attachments: HDFS-14305-007.patch, HDFS-14305-008.patch, > HDFS-14305.001.patch, HDFS-14305.002.patch, HDFS-14305.003.patch, > HDFS-14305.004.patch, HDFS-14305.005.patch, HDFS-14305.006.patch > > > Currently, a {{BlockTokenSecretManager}} starts with a random integer as the > initial serial number, and then uses this formula to rotate it: > {code:java} > this.intRange = Integer.MAX_VALUE / numNNs; > this.nnRangeStart = intRange * nnIndex; > this.serialNo = (this.serialNo % intRange) + (nnRangeStart); > {code} > where {{numNNs}} is the total number of NameNodes in the cluster, and > {{nnIndex}} is the index of the current NameNode specified in the > configuration {{dfs.ha.namenodes.}}. > However, with this approach, different NameNodes could have overlapping ranges > for the serial number. For simplicity, let's assume {{Integer.MAX_VALUE}} is 100, > and we have 2 NameNodes {{nn1}} and {{nn2}} in the configuration. Then the ranges > for these two are: > {code} > nn1 -> [-49, 49] > nn2 -> [1, 99] > {code} > This is because the initial serial number could be any negative integer. > Moreover, when the keys are updated, the serial number will again be updated > with the formula: > {code} > this.serialNo = (this.serialNo % intRange) + (nnRangeStart); > {code} > which means the new serial number could be updated to a range that belongs to > a different NameNode, thus increasing the chance of collision again. > When the collision happens, DataNodes could overwrite an existing key, which > will cause clients to fail because of an {{InvalidToken}} error. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Updated] (HDFS-14305) Serial number in BlockTokenSecretManager could overlap between different namenodes
[ https://issues.apache.org/jira/browse/HDFS-14305?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Konstantin Shvachko updated HDFS-14305: --- Attachment: HDFS-14305-008.patch > Serial number in BlockTokenSecretManager could overlap between different > namenodes > -- > > Key: HDFS-14305 > URL: https://issues.apache.org/jira/browse/HDFS-14305 > Project: Hadoop HDFS > Issue Type: Bug > Components: namenode, security >Reporter: Chao Sun >Assignee: Xiaoqiao He >Priority: Major > Labels: multi-sbnn > Fix For: 3.0.4, 3.3.0, 3.2.1, 3.1.3 > > Attachments: HDFS-14305-007.patch, HDFS-14305-008.patch, > HDFS-14305.001.patch, HDFS-14305.002.patch, HDFS-14305.003.patch, > HDFS-14305.004.patch, HDFS-14305.005.patch, HDFS-14305.006.patch > > > Currently, a {{BlockTokenSecretManager}} starts with a random integer as the > initial serial number, and then uses this formula to rotate it: > {code:java} > this.intRange = Integer.MAX_VALUE / numNNs; > this.nnRangeStart = intRange * nnIndex; > this.serialNo = (this.serialNo % intRange) + (nnRangeStart); > {code} > where {{numNNs}} is the total number of NameNodes in the cluster, and > {{nnIndex}} is the index of the current NameNode specified in the > configuration {{dfs.ha.namenodes.}}. > However, with this approach, different NameNodes could have overlapping ranges > for the serial number. For simplicity, let's assume {{Integer.MAX_VALUE}} is 100, > and we have 2 NameNodes {{nn1}} and {{nn2}} in the configuration. Then the ranges > for these two are: > {code} > nn1 -> [-49, 49] > nn2 -> [1, 99] > {code} > This is because the initial serial number could be any negative integer. > Moreover, when the keys are updated, the serial number will again be updated > with the formula: > {code} > this.serialNo = (this.serialNo % intRange) + (nnRangeStart); > {code} > which means the new serial number could be updated to a range that belongs to > a different NameNode, thus increasing the chance of collision again. > When the collision happens, DataNodes could overwrite an existing key, which > will cause clients to fail because of an {{InvalidToken}} error. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Updated] (HDDS-2210) ContainerStateMachine should not be marked unhealthy if applyTransaction fails with closed container exception
[ https://issues.apache.org/jira/browse/HDDS-2210?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated HDDS-2210: - Labels: pull-request-available (was: ) > ContainerStateMachine should not be marked unhealthy if applyTransaction > fails with closed container exception > -- > > Key: HDDS-2210 > URL: https://issues.apache.org/jira/browse/HDDS-2210 > Project: Hadoop Distributed Data Store > Issue Type: Bug > Components: Ozone Datanode >Reporter: Shashikant Banerjee >Assignee: Shashikant Banerjee >Priority: Major > Labels: pull-request-available > Fix For: 0.5.0 > > > Currently, if applyTransaction fails, the stateMachine is marked unhealthy > and the next snapshot creation will fail. As a result, the raftServer > will close down, leading to pipeline failure. The ClosedContainer exception should > be ignored when marking the stateMachine unhealthy. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Updated] (HDDS-2210) ContainerStateMachine should not be marked unhealthy if applyTransaction fails with closed container exception
[ https://issues.apache.org/jira/browse/HDDS-2210?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Shashikant Banerjee updated HDDS-2210: -- Status: Patch Available (was: Open) > ContainerStateMachine should not be marked unhealthy if applyTransaction > fails with closed container exception > -- > > Key: HDDS-2210 > URL: https://issues.apache.org/jira/browse/HDDS-2210 > Project: Hadoop Distributed Data Store > Issue Type: Bug > Components: Ozone Datanode >Reporter: Shashikant Banerjee >Assignee: Shashikant Banerjee >Priority: Major > Labels: pull-request-available > Fix For: 0.5.0 > > Time Spent: 10m > Remaining Estimate: 0h > > Currently, if applyTransaction fails, the stateMachine is marked unhealthy > and the next snapshot creation will fail. As a result, the raftServer > will close down, leading to pipeline failure. The ClosedContainer exception should > be ignored when marking the stateMachine unhealthy. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Work logged] (HDDS-2210) ContainerStateMachine should not be marked unhealthy if applyTransaction fails with closed container exception
[ https://issues.apache.org/jira/browse/HDDS-2210?focusedWorklogId=320782=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-320782 ] ASF GitHub Bot logged work on HDDS-2210: Author: ASF GitHub Bot Created on: 30/Sep/19 19:14 Start Date: 30/Sep/19 19:14 Worklog Time Spent: 10m Work Description: bshashikant commented on pull request #1552: HDDS-2210. ContainerStateMachine should not be marked unhealthy if ap… URL: https://github.com/apache/hadoop/pull/1552 …plyTransaction fails with closed container exception. ## NOTICE Please create an issue in ASF JIRA before opening a pull request, and you need to set the title of the pull request which starts with the corresponding JIRA issue number. (e.g. HADOOP-X. Fix a typo in YYY.) For more details, please see https://cwiki.apache.org/confluence/display/HADOOP/How+To+Contribute This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 320782) Remaining Estimate: 0h Time Spent: 10m > ContainerStateMachine should not be marked unhealthy if applyTransaction > fails with closed container exception > -- > > Key: HDDS-2210 > URL: https://issues.apache.org/jira/browse/HDDS-2210 > Project: Hadoop Distributed Data Store > Issue Type: Bug > Components: Ozone Datanode >Reporter: Shashikant Banerjee >Assignee: Shashikant Banerjee >Priority: Major > Labels: pull-request-available > Fix For: 0.5.0 > > Time Spent: 10m > Remaining Estimate: 0h > > Currently, if applyTransaction fails, the stateMachine is marked unhealthy > and the next snapshot creation will fail. As a result, the raftServer > will close down, leading to pipeline failure. The ClosedContainer exception should > be ignored when marking the stateMachine unhealthy. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Work logged] (HDDS-2019) Handle Set DtService of token in S3Gateway for OM HA
[ https://issues.apache.org/jira/browse/HDDS-2019?focusedWorklogId=320780=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-320780 ] ASF GitHub Bot logged work on HDDS-2019: Author: ASF GitHub Bot Created on: 30/Sep/19 19:09 Start Date: 30/Sep/19 19:09 Worklog Time Spent: 10m Work Description: xiaoyuyao commented on pull request #1489: HDDS-2019. Handle Set DtService of token in S3Gateway for OM HA. URL: https://github.com/apache/hadoop/pull/1489#discussion_r329740302 ## File path: hadoop-ozone/s3gateway/src/test/java/org/apache/hadoop/ozone/s3/util/TestOzoneS3Util.java ## @@ -0,0 +1,129 @@ +/** + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ + +package org.apache.hadoop.ozone.s3.util; + +import org.apache.hadoop.hdds.conf.OzoneConfiguration; +import org.apache.hadoop.ozone.OmUtils; +import org.apache.hadoop.ozone.om.OMConfigKeys; +import org.apache.hadoop.security.SecurityUtil; +import org.apache.hadoop.test.GenericTestUtils; +import org.junit.Assert; +import org.junit.Before; +import org.junit.Test; + +import java.util.Collection; + +import static org.apache.hadoop.fs.CommonConfigurationKeys.HADOOP_SECURITY_TOKEN_SERVICE_USE_IP; +import static org.junit.Assert.fail; + +/** + * Class used to test OzoneS3Util. + */ +public class TestOzoneS3Util { + + + private OzoneConfiguration configuration; + + @Before + public void setConf() { +configuration = new OzoneConfiguration(); +String serviceID = "omService"; +String nodeIDs = "om1,om2,om3"; +configuration.set(OMConfigKeys.OZONE_OM_SERVICE_IDS_KEY, serviceID); +configuration.set(OMConfigKeys.OZONE_OM_NODE_ID_KEY, nodeIDs); +configuration.setBoolean(HADOOP_SECURITY_TOKEN_SERVICE_USE_IP, false); + } + + @Test + public void testBuildServiceNameForToken() { +String serviceID = "omService"; Review comment: Lines 55-59 are redundant since you have them in setConf before each test. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 320780) Time Spent: 2h 50m (was: 2h 40m) > Handle Set DtService of token in S3Gateway for OM HA > > > Key: HDDS-2019 > URL: https://issues.apache.org/jira/browse/HDDS-2019 > Project: Hadoop Distributed Data Store > Issue Type: Sub-task >Reporter: Bharat Viswanadham >Assignee: Bharat Viswanadham >Priority: Critical > Labels: pull-request-available > Time Spent: 2h 50m > Remaining Estimate: 0h > > When OM HA is enabled, when tokens are generated, the service name should be > set with the addresses of all OMs. > > Currently, without HA, it is set to the OM RpcAddress string. 
This Jira is to > handle: > # Set dtService with all OM addresses. Right now in OMClientProducer, the UGI is > created with the S3 token, and the token's serviceName is set to the OM address; for the HA > case, this should be set to all OM RPC addresses. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-14884) Add sanity check that zone key equals feinfo key while setting Xattrs
[ https://issues.apache.org/jira/browse/HDFS-14884?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16941254#comment-16941254 ] Mukul Kumar Singh commented on HDFS-14884: -- [^hdfs_distcp.patch] reproduces the failure. > Add sanity check that zone key equals feinfo key while setting Xattrs > - > > Key: HDFS-14884 > URL: https://issues.apache.org/jira/browse/HDFS-14884 > Project: Hadoop HDFS > Issue Type: Bug > Components: encryption, hdfs >Reporter: Mukul Kumar Singh >Priority: Major > Attachments: hdfs_distcp.patch > > > Currently, it is possible to set an external attribute where the zone key is > not the same as the feinfo key. This jira will add a precondition before setting > this. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Updated] (HDFS-14884) Add sanity check that zone key equals feinfo key while setting Xattrs
[ https://issues.apache.org/jira/browse/HDFS-14884?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Mukul Kumar Singh updated HDFS-14884: - Attachment: hdfs_distcp.patch > Add sanity check that zone key equals feinfo key while setting Xattrs > - > > Key: HDFS-14884 > URL: https://issues.apache.org/jira/browse/HDFS-14884 > Project: Hadoop HDFS > Issue Type: Bug > Components: encryption, hdfs >Reporter: Mukul Kumar Singh >Priority: Major > Attachments: hdfs_distcp.patch > > > Currently, it is possible to set an extended attribute where the zone key is > not the same as the feinfo key. This jira will add a precondition before > setting this. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Updated] (HDFS-14528) Failover from Active to Standby Failed
[ https://issues.apache.org/jira/browse/HDFS-14528?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wei-Chiu Chuang updated HDFS-14528: --- Labels: multi-sbnn (was: ) > Failover from Active to Standby Failed > > > Key: HDFS-14528 > URL: https://issues.apache.org/jira/browse/HDFS-14528 > Project: Hadoop HDFS > Issue Type: Bug > Components: ha >Reporter: Ravuri Sushma sree >Assignee: Ravuri Sushma sree >Priority: Major > Labels: multi-sbnn > Attachments: HDFS-14528.003.patch, HDFS-14528.004.patch, > HDFS-14528.005.patch, HDFS-14528.2.Patch, ZKFC_issue.patch > > > *In a cluster with more than one Standby namenode, manual failover throws an > exception in some cases* > *When trying to execute the failover command from active to standby,* > *._/hdfs haadmin -failover nn1 nn2, the below Exception is thrown_* > Operation failed: Call From X-X-X-X/X-X-X-X to Y-Y-Y-Y: failed on > connection exception: java.net.ConnectException: Connection refused > This is encountered in the following cases: > Scenario 1: > Namenodes - NN1(Active), NN2(Standby), NN3(Standby) > When trying to manually failover from NN1 to NN2 while NN3 is down, the > Exception is thrown > Scenario 2: > Namenodes - NN1(Active), NN2(Standby), NN3(Standby) > ZKFCs - ZKFC1, ZKFC2, ZKFC3 > When trying to manually failover from NN1 to NN3 while NN3's ZKFC (ZKFC3) is > down, the Exception is thrown -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Updated] (HDFS-14855) client always print standbyexception info with multi standby namenode
[ https://issues.apache.org/jira/browse/HDFS-14855?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wei-Chiu Chuang updated HDFS-14855: --- Labels: multi-sbnn (was: ) > client always print standbyexception info with multi standby namenode > - > > Key: HDFS-14855 > URL: https://issues.apache.org/jira/browse/HDFS-14855 > Project: Hadoop HDFS > Issue Type: Improvement >Reporter: Shen Yinjie >Assignee: Shen Yinjie >Priority: Major > Labels: multi-sbnn > Attachments: image-2019-09-19-20-04-54-591.png > > > When the cluster has more than two standby namenodes, a client executing a > shell command will print StandbyException info. May we change the log level > from INFO to DEBUG? > !image-2019-09-19-20-04-54-591.png! -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Work started] (HDDS-2211) Collect docker logs if env fails to start
[ https://issues.apache.org/jira/browse/HDDS-2211?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Work on HDDS-2211 started by Attila Doroszlai. -- > Collect docker logs if env fails to start > - > > Key: HDDS-2211 > URL: https://issues.apache.org/jira/browse/HDDS-2211 > Project: Hadoop Distributed Data Store > Issue Type: Improvement > Components: test >Reporter: Attila Doroszlai >Assignee: Attila Doroszlai >Priority: Major > > Occasionally some acceptance test docker environment fails to start up > properly. We need docker logs for analysis, but they are not being collected. > https://github.com/elek/ozone-ci-q4/blob/master/trunk/trunk-nightly-extra-20190930-74rp4/acceptance/output.log#L3765-L3768 -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Updated] (HDFS-14201) Ability to disallow safemode NN to become active
[ https://issues.apache.org/jira/browse/HDFS-14201?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wei-Chiu Chuang updated HDFS-14201: --- Labels: multi-sbnn (was: ) > Ability to disallow safemode NN to become active > > > Key: HDFS-14201 > URL: https://issues.apache.org/jira/browse/HDFS-14201 > Project: Hadoop HDFS > Issue Type: Improvement > Components: auto-failover >Affects Versions: 3.1.1, 2.9.2 >Reporter: Xiao Liang >Assignee: Xiao Liang >Priority: Major > Labels: multi-sbnn > Fix For: 3.3.0 > > Attachments: HDFS-14201.001.patch, HDFS-14201.002.patch, > HDFS-14201.003.patch, HDFS-14201.004.patch, HDFS-14201.005.patch, > HDFS-14201.006.patch, HDFS-14201.007.patch, HDFS-14201.008.patch, > HDFS-14201.009.patch > > > Currently with HA, a Namenode in safemode can possibly be selected as active; > for availability of both read and write, Namenodes not in safemode are better > choices to become active. > It can take tens of minutes for a cold-started Namenode to get out of > safemode, especially when there are a large number of files and blocks in HDFS. > That means if a Namenode in safemode becomes active, the cluster will not be > fully functioning for quite a while, even when there is some Namenode not in > safemode that could serve instead. > The proposal here is to add an option to allow a Namenode to report itself as > UNHEALTHY to ZKFC if it's in safemode, so as to only allow a fully functioning > Namenode to become active, improving the general availability of the cluster. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
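For reference, the option shipped as a boolean HDFS setting (Fix For: 3.3.0 above); the key name below is quoted from memory and should be verified against the hdfs-default.xml of your release:

{code:xml}
<!-- Assumed key name: a NameNode in safemode reports itself unhealthy to
     its ZKFC, so automatic failover skips it. -->
<property>
  <name>dfs.ha.nn.not-become-active-in-safemode</name>
  <value>true</value>
</property>
{code}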
[jira] [Commented] (HDFS-14305) Serial number in BlockTokenSecretManager could overlap between different namenodes
[ https://issues.apache.org/jira/browse/HDFS-14305?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16941238#comment-16941238 ] Wei-Chiu Chuang commented on HDFS-14305: HDFS-6440 was backported into branch-2 by HDFS-14205. I'm assuming the issue under debate also impacts the 2.10 release? > Serial number in BlockTokenSecretManager could overlap between different > namenodes > -- > > Key: HDFS-14305 > URL: https://issues.apache.org/jira/browse/HDFS-14305 > Project: Hadoop HDFS > Issue Type: Bug > Components: namenode, security >Reporter: Chao Sun >Assignee: Xiaoqiao He >Priority: Major > Labels: multi-sbnn > Fix For: 3.0.4, 3.3.0, 3.2.1, 3.1.3 > > Attachments: HDFS-14305-007.patch, HDFS-14305.001.patch, > HDFS-14305.002.patch, HDFS-14305.003.patch, HDFS-14305.004.patch, > HDFS-14305.005.patch, HDFS-14305.006.patch > > > Currently, a {{BlockTokenSecretManager}} starts with a random integer as the > initial serial number, and then uses this formula to rotate it: > {code:java} > this.intRange = Integer.MAX_VALUE / numNNs; > this.nnRangeStart = intRange * nnIndex; > this.serialNo = (this.serialNo % intRange) + (nnRangeStart); > {code} > where {{numNNs}} is the total number of NameNodes in the cluster, and > {{nnIndex}} is the index of the current NameNode specified in the > configuration {{dfs.ha.namenodes.}}. > However, with this approach, different NameNodes could have overlapping ranges > for the serial number. For simplicity, let's assume {{Integer.MAX_VALUE}} is 100, > and we have 2 NameNodes {{nn1}} and {{nn2}} in the configuration. Then the ranges > for these two are: > {code} > nn1 -> [-49, 49] > nn2 -> [1, 99] > {code} > This is because the initial serial number could be any negative integer. > Moreover, when the keys are updated, the serial number will again be updated > with the formula: > {code} > this.serialNo = (this.serialNo % intRange) + (nnRangeStart); > {code} > which means the new serial number could be updated to a range that belongs to > a different NameNode, thus increasing the chance of collision again. > When a collision happens, DataNodes could overwrite an existing key, which > will cause clients to fail because of an {{InvalidToken}} error. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
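To make the overlap concrete, a tiny self-contained demo of the range arithmetic quoted in the description, using the same simplified MAX_VALUE of 100:

{code:java}
// Reproduces the nn1 -> [-49, 49], nn2 -> [1, 99] example from the issue
// description, with 100 as a stand-in for Integer.MAX_VALUE.
public class SerialRangeDemo {
  public static void main(String[] args) {
    int maxValue = 100;
    int numNNs = 2;
    int intRange = maxValue / numNNs;        // 50
    for (int nnIndex = 0; nnIndex < numNNs; nnIndex++) {
      int nnRangeStart = intRange * nnIndex;
      // In Java, serialNo % intRange lies in (-intRange, intRange), so after
      // adding nnRangeStart the reachable serial numbers span this interval:
      System.out.printf("nn%d -> [%d, %d]%n", nnIndex + 1,
          nnRangeStart - intRange + 1, nnRangeStart + intRange - 1);
    }
  }
}
{code}

The two intervals overlap on [1, 49], which is exactly the collision window described.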
[jira] [Commented] (HDFS-14793) BlockTokenSecretManager should LOG block token range it operates on.
[ https://issues.apache.org/jira/browse/HDFS-14793?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16941235#comment-16941235 ] Wei-Chiu Chuang commented on HDFS-14793: Looks good, but it is superseded by HDFS-14305. > BlockTokenSecretManager should LOG block token range it operates on. > > > Key: HDFS-14793 > URL: https://issues.apache.org/jira/browse/HDFS-14793 > Project: Hadoop HDFS > Issue Type: Bug >Affects Versions: 2.10.0 >Reporter: Konstantin Shvachko >Assignee: hemanthboyina >Priority: Major > Attachments: HDFS-14793.001.patch > > > At startup, log enough information to identify the range of block token keys > for the NameNode. This should make it easier to debug issues with block > tokens. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Updated] (HDFS-14378) Simplify the design of multiple NN and both logic of edit log roll and checkpoint
[ https://issues.apache.org/jira/browse/HDFS-14378?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wei-Chiu Chuang updated HDFS-14378: --- Labels: multi-sbnn (was: ) > Simplify the design of multiple NN and both logic of edit log roll and > checkpoint > - > > Key: HDFS-14378 > URL: https://issues.apache.org/jira/browse/HDFS-14378 > Project: Hadoop HDFS > Issue Type: Improvement > Components: ha, namenode >Affects Versions: 3.1.2 >Reporter: star >Assignee: star >Priority: Major > Labels: multi-sbnn > Attachments: HDFS-14378-trunk.001.patch, HDFS-14378-trunk.002.patch, > HDFS-14378-trunk.003.patch, HDFS-14378-trunk.004.patch, > HDFS-14378-trunk.005.patch, HDFS-14378-trunk.006.patch > > > HDFS-6440 introduced a mechanism to support more than 2 NNs. It > implements a first-writer-wins policy to avoid duplicated fsimage downloading. > The variable 'isPrimaryCheckPointer' is used to hold the first-writer state, with > which the SNN will provide the fsimage for the ANN next time. Then we have three > roles in the NN cluster: ANN, one primary SNN, and one or more normal SNNs. > Since HDFS-12248, there may be more than two primary SNNs shortly after an > exception occurs. HDFS-12248 handles a scenario in which the SNN will not upload > the fsimage on IOE and Interrupted exceptions. Though it will not cause any > further functional issues, it is inconsistent. > Furthermore, the edit log may be rolled more frequently than necessary with > multiple Standby NameNodes, HDFS-14349. (I'm not so sure about this; I will > verify by unit tests, or anyone could point it out.) > Given all the above, I'm wondering if we could make this simpler with the > following changes: > * There are only two roles: ANN, SNN > * The ANN will roll its edit log every DFS_HA_LOGROLL_PERIOD_KEY period. > * The ANN will select an SNN to download the checkpoint from. > The SNN will just do log tailing and checkpointing, then provide a servlet for > fsimage downloading as normal. The SNN will not try to roll the edit log or send > checkpoint requests to the ANN. > In a word, the ANN will be more active. Suggestions are welcomed. > -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Updated] (HDFS-14646) Standby NameNode should not upload fsimage to an inappropriate NameNode.
[ https://issues.apache.org/jira/browse/HDFS-14646?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wei-Chiu Chuang updated HDFS-14646: --- Labels: multi-sbnn (was: ) > Standby NameNode should not upload fsimage to an inappropriate NameNode. > > > Key: HDFS-14646 > URL: https://issues.apache.org/jira/browse/HDFS-14646 > Project: Hadoop HDFS > Issue Type: Bug > Components: hdfs >Affects Versions: 3.1.2 >Reporter: Xudong Cao >Assignee: Xudong Cao >Priority: Major > Labels: multi-sbnn > Attachments: HDFS-14646.000.patch, HDFS-14646.001.patch, > HDFS-14646.002.patch, HDFS-14646.003.patch, HDFS-14646.004.patch > > > *Problem Description:* > In the multi-NameNode scenario, when an SNN uploads an FsImage, it will put > the image to all other NNs (whether the peer NN is an ANN or not), and even > if the peer NN immediately replies with an error (such as > TransferResult.NOT_ACTIVE_NAMENODE_FAILURE, TransferResult > .OLD_TRANSACTION_ID_FAILURE, etc.), the local SNN will not terminate the put > process immediately, but will put the FsImage completely to the peer NN, and > will not read the peer NN's reply until the put is completed. > Depending on the version of Jetty, this behavior can lead to different > consequences: > *1. Under Hadoop 2.7.2 (with Jetty 6.1.26)* > After the peer NN calls HttpServletResponse.sendError(), the underlying TCP > connection will still stay established, and the data the SNN sent will be read > by the Jetty framework itself on the peer NN side, so the SNN will pointlessly > keep sending the FsImage to the peer NN, causing a waste of time and > bandwidth. In a relatively large HDFS cluster, the size of the FsImage can often > reach about 30GB; this is indeed a big waste. > *2. Under the newest release-3.2.0-RC1 (with Jetty 9.3.24) and trunk (with Jetty > 9.3.27)* > After the peer NN calls HttpServletResponse.sendError(), the underlying TCP > connection will be closed automatically, and then the SNN will directly get an > "Error writing request body to server" exception, as below (note this test needs > a relatively big FsImage, e.g. at the 10MB level): {code:java} > 2019-08-17 03:59:25,413 INFO namenode.TransferFsImage: Sending fileName: > /tmp/hadoop-root/dfs/name/current/fsimage_3364240, fileSize: > 9864721. Sent total: 524288 bytes. Size of last segment intended to send: > 4096 bytes.
> java.io.IOException: Error writing request body to server > at > sun.net.www.protocol.http.HttpURLConnection$StreamingOutputStream.checkError(HttpURLConnection.java:3587) > at > sun.net.www.protocol.http.HttpURLConnection$StreamingOutputStream.write(HttpURLConnection.java:3570) > at > org.apache.hadoop.hdfs.server.namenode.TransferFsImage.copyFileToStream(TransferFsImage.java:396) > at > org.apache.hadoop.hdfs.server.namenode.TransferFsImage.writeFileToPutRequest(TransferFsImage.java:340) > at > org.apache.hadoop.hdfs.server.namenode.TransferFsImage.uploadImage(TransferFsImage.java:314) > at > org.apache.hadoop.hdfs.server.namenode.TransferFsImage.uploadImageFromStorage(TransferFsImage.java:249) > at > org.apache.hadoop.hdfs.server.namenode.ha.StandbyCheckpointer$1.call(StandbyCheckpointer.java:277) > at > org.apache.hadoop.hdfs.server.namenode.ha.StandbyCheckpointer$1.call(StandbyCheckpointer.java:272) > at java.util.concurrent.FutureTask.run(FutureTask.java:266) > at > java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) > at > java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) > at java.lang.Thread.run(Thread.java:748) > 2019-08-17 03:59:25,422 INFO namenode.TransferFsImage: Sending fileName: > /tmp/hadoop-root/dfs/name/current/fsimage_3364240, fileSize: > 9864721. Sent total: 851968 bytes. Size of last segment intended to send: > 4096 bytes. > java.io.IOException: Error writing request body to server > at > sun.net.www.protocol.http.HttpURLConnection$StreamingOutputStream.checkError(HttpURLConnection.java:3587) > at > sun.net.www.protocol.http.HttpURLConnection$StreamingOutputStream.write(HttpURLConnection.java:3570) > at > org.apache.hadoop.hdfs.server.namenode.TransferFsImage.copyFileToStream(TransferFsImage.java:396) > at > org.apache.hadoop.hdfs.server.namenode.TransferFsImage.writeFileToPutRequest(TransferFsImage.java:340) > {code} > > *Solution:* > A standby NameNode should not upload the fsimage to an inappropriate NameNode; > when it plans to put an FsImage to the peer NN, it needs to check whether it > really needs to put it at this time. > In detail, the local SNN should establish an HTTP connection with the peer NN, > send the put request, and then immediately read the response (this is the key > point). If the peer
[jira] [Updated] (HDFS-14305) Serial number in BlockTokenSecretManager could overlap between different namenodes
[ https://issues.apache.org/jira/browse/HDFS-14305?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wei-Chiu Chuang updated HDFS-14305: --- Labels: multi-sbnn (was: ) > Serial number in BlockTokenSecretManager could overlap between different > namenodes > -- > > Key: HDFS-14305 > URL: https://issues.apache.org/jira/browse/HDFS-14305 > Project: Hadoop HDFS > Issue Type: Bug > Components: namenode, security >Reporter: Chao Sun >Assignee: Xiaoqiao He >Priority: Major > Labels: multi-sbnn > Fix For: 3.0.4, 3.3.0, 3.2.1, 3.1.3 > > Attachments: HDFS-14305-007.patch, HDFS-14305.001.patch, > HDFS-14305.002.patch, HDFS-14305.003.patch, HDFS-14305.004.patch, > HDFS-14305.005.patch, HDFS-14305.006.patch > > > Currently, a {{BlockTokenSecretManager}} starts with a random integer as the > initial serial number, and then uses this formula to rotate it: > {code:java} > this.intRange = Integer.MAX_VALUE / numNNs; > this.nnRangeStart = intRange * nnIndex; > this.serialNo = (this.serialNo % intRange) + (nnRangeStart); > {code} > where {{numNNs}} is the total number of NameNodes in the cluster, and > {{nnIndex}} is the index of the current NameNode specified in the > configuration {{dfs.ha.namenodes.}}. > However, with this approach, different NameNodes could have overlapping ranges > for the serial number. For simplicity, let's assume {{Integer.MAX_VALUE}} is 100, > and we have 2 NameNodes {{nn1}} and {{nn2}} in the configuration. Then the ranges > for these two are: > {code} > nn1 -> [-49, 49] > nn2 -> [1, 99] > {code} > This is because the initial serial number could be any negative integer. > Moreover, when the keys are updated, the serial number will again be updated > with the formula: > {code} > this.serialNo = (this.serialNo % intRange) + (nnRangeStart); > {code} > which means the new serial number could be updated to a range that belongs to > a different NameNode, thus increasing the chance of collision again. > When a collision happens, DataNodes could overwrite an existing key, which > will cause clients to fail because of an {{InvalidToken}} error. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Created] (HDDS-2211) Collect docker logs if env fails to start
Attila Doroszlai created HDDS-2211: -- Summary: Collect docker logs if env fails to start Key: HDDS-2211 URL: https://issues.apache.org/jira/browse/HDDS-2211 Project: Hadoop Distributed Data Store Issue Type: Improvement Components: test Reporter: Attila Doroszlai Assignee: Attila Doroszlai Occasionally some acceptance test docker environment fails to start up properly. We need docker logs for analysis, but they are not being collected. https://github.com/elek/ozone-ci-q4/blob/master/trunk/trunk-nightly-extra-20190930-74rp4/acceptance/output.log#L3765-L3768 -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
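As a sketch of what the collection could look like in the acceptance-test scripts (the variable names are assumptions, not the actual change):

{noformat}
# Dump container logs before tearing down a compose env that failed to
# start, so runs like the one linked above become diagnosable.
docker-compose logs --no-color > "$RESULT_DIR/docker-$COMPOSE_ENV_NAME.log"
{noformat}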
[jira] [Updated] (HDFS-14883) NPE when the second SNN is starting
[ https://issues.apache.org/jira/browse/HDFS-14883?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wei-Chiu Chuang updated HDFS-14883: --- Labels: multi-sbnn (was: ) > NPE when the second SNN is starting > --- > > Key: HDFS-14883 > URL: https://issues.apache.org/jira/browse/HDFS-14883 > Project: Hadoop HDFS > Issue Type: Bug >Reporter: Ranith Sardar >Assignee: Ranith Sardar >Priority: Major > Labels: multi-sbnn > > > {{| WARN | qtp79782883-47 | /imagetransfer | ServletHandler.java:632 > java.io.IOException: PutImage failed. java.lang.NullPointerException > at > org.apache.hadoop.hdfs.server.namenode.ImageServlet.validateRequest(ImageServlet.java:198) > at > org.apache.hadoop.hdfs.server.namenode.ImageServlet.doPut(ImageServlet.java:485) > at javax.servlet.http.HttpServlet.service(HttpServlet.java:710) > at javax.servlet.http.HttpServlet.service(HttpServlet.java:790) > at org.eclipse.jetty.servlet.ServletHolder.handle(ServletHolder.java:848) > at > org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1772) > at > org.apache.hadoop.security.authentication.server.AuthenticationFilter.doFilter(AuthenticationFilter.java:644)}} -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Work logged] (HDDS-2187) ozone-mr test fails with No FileSystem for scheme "o3fs"
[ https://issues.apache.org/jira/browse/HDDS-2187?focusedWorklogId=320734=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-320734 ] ASF GitHub Bot logged work on HDDS-2187: Author: ASF GitHub Bot Created on: 30/Sep/19 18:27 Start Date: 30/Sep/19 18:27 Worklog Time Spent: 10m Work Description: adoroszlai commented on issue #1537: HDDS-2187. ozone-mr test fails with No FileSystem for scheme o3fs URL: https://github.com/apache/hadoop/pull/1537#issuecomment-536689992 > datanodes in ozonesecure-mr are not started AFAIK > > https://github.com/elek/ozone-ci-q4/blob/master/pr/pr-hdds-2187-2nl4x/acceptance/output.log I think it's `ozone-recon`, not `ozonesecure-mr`: https://github.com/elek/ozone-ci-q4/blob/c4e58095c9584185c75d6c6af4721c33edba854a/pr/pr-hdds-2187-2nl4x/acceptance/output.log#L1453 I've seen this problem intermittently in earlier runs, eg: https://github.com/elek/ozone-ci-q4/blob/f700ebb2254527527960593496f15008245094d6/trunk/trunk-nightly-extra-20190930-74rp4/acceptance/output.log#L3765-L3768 Working on a change to collect the docker logs. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 320734) Time Spent: 1.5h (was: 1h 20m) > ozone-mr test fails with No FileSystem for scheme "o3fs" > > > Key: HDDS-2187 > URL: https://issues.apache.org/jira/browse/HDDS-2187 > Project: Hadoop Distributed Data Store > Issue Type: Bug > Components: test >Reporter: Attila Doroszlai >Assignee: Attila Doroszlai >Priority: Major > Labels: pull-request-available > Time Spent: 1.5h > Remaining Estimate: 0h > > HDDS-2101 changed how Ozone filesystem provider is configured. {{ozone-mr}} > tests [started > failing|https://github.com/elek/ozone-ci/blob/2f2c99652af6b26a95f08eece9e545f0d72ccf45/pr/pr-hdds-2101-rtz55/acceptance/output.log#L255-L263], > but it [wasn't > noticed|https://github.com/elek/ozone-ci/blob/master/pr/pr-hdds-2101-rtz55/acceptance/result] > due to HDDS-2185. > {code} > Running command 'ozone fs -mkdir /user' > ${output} = mkdir: No FileSystem for scheme "o3fs" > {code} -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Created] (HDFS-14884) Add sanity check that zone key equals feinfo key while setting Xattrs
Mukul Kumar Singh created HDFS-14884: Summary: Add sanity check that zone key equals feinfo key while setting Xattrs Key: HDFS-14884 URL: https://issues.apache.org/jira/browse/HDFS-14884 Project: Hadoop HDFS Issue Type: Bug Components: encryption, hdfs Reporter: Mukul Kumar Singh Currently, it is possible to set an external attribute where the zone key is not the same as feinfo key. This jira will add a precondition before setting this. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-14305) Serial number in BlockTokenSecretManager could overlap between different namenodes
[ https://issues.apache.org/jira/browse/HDFS-14305?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16941217#comment-16941217 ] Konstantin Shvachko commented on HDFS-14305: Hey [~hexiaoqiao], I don't think I understand what you mean. The original bug was that the ranges are not disjoint, so they could cause collisions of block tokens issued by different NameNodes. Both the v06 and v07 patches solve this problem. We can still have a collision if we add new NameNodes to the cluster and restart them in arbitrary order. As I suggested, we should try to solve this problem in a follow-up jira. The v06 patch introduced smaller ranges, so upgrading to this version will create collisions even if one keeps the number of NameNodes unchanged. The v07 patch just fixes the arithmetic bug, and keeps the ranges as they were before. Hope this makes sense. > Serial number in BlockTokenSecretManager could overlap between different > namenodes > -- > > Key: HDFS-14305 > URL: https://issues.apache.org/jira/browse/HDFS-14305 > Project: Hadoop HDFS > Issue Type: Bug > Components: namenode, security >Reporter: Chao Sun >Assignee: Xiaoqiao He >Priority: Major > Fix For: 3.0.4, 3.3.0, 3.2.1, 3.1.3 > > Attachments: HDFS-14305-007.patch, HDFS-14305.001.patch, > HDFS-14305.002.patch, HDFS-14305.003.patch, HDFS-14305.004.patch, > HDFS-14305.005.patch, HDFS-14305.006.patch > > > Currently, a {{BlockTokenSecretManager}} starts with a random integer as the > initial serial number, and then uses this formula to rotate it: > {code:java} > this.intRange = Integer.MAX_VALUE / numNNs; > this.nnRangeStart = intRange * nnIndex; > this.serialNo = (this.serialNo % intRange) + (nnRangeStart); > {code} > where {{numNNs}} is the total number of NameNodes in the cluster, and > {{nnIndex}} is the index of the current NameNode specified in the > configuration {{dfs.ha.namenodes.}}. > However, with this approach, different NameNodes could have overlapping ranges > for the serial number. For simplicity, let's assume {{Integer.MAX_VALUE}} is 100, > and we have 2 NameNodes {{nn1}} and {{nn2}} in the configuration. Then the ranges > for these two are: > {code} > nn1 -> [-49, 49] > nn2 -> [1, 99] > {code} > This is because the initial serial number could be any negative integer. > Moreover, when the keys are updated, the serial number will again be updated > with the formula: > {code} > this.serialNo = (this.serialNo % intRange) + (nnRangeStart); > {code} > which means the new serial number could be updated to a range that belongs to > a different NameNode, thus increasing the chance of collision again. > When a collision happens, DataNodes could overwrite an existing key, which > will cause clients to fail because of an {{InvalidToken}} error. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-13270) RBF: Router audit logger
[ https://issues.apache.org/jira/browse/HDFS-13270?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16941215#comment-16941215 ] hemanthboyina commented on HDFS-13270: -- In the NameNode's implementation, most of the methods return a file status; based on the file status object we get the owner and permission values and form the audit log accordingly: {code:java} auditStat = FSDirMkdirOp.mkdirs(this, pc, src, permissions, . logAuditEvent(true, operationName, src, null, auditStat); {code} That's not the case with Routers: {code:java} public boolean mkdirs( return rpcClient.invokeSingle(firstLocation, method, Boolean.class); {code} Suggestions on how to implement this are welcome, thanks. > RBF: Router audit logger > > > Key: HDFS-13270 > URL: https://issues.apache.org/jira/browse/HDFS-13270 > Project: Hadoop HDFS > Issue Type: Sub-task > Components: hdfs >Affects Versions: 3.2.0 >Reporter: maobaolong >Assignee: hemanthboyina >Priority: Major > Attachments: HDFS-13270.001.patch, HDFS-13270.002.patch, > HDFS-13270.003.patch > > > We can use a router audit logger to log the client info and cmd, because the > FSNamesystem#AuditLogger's log thinks the clients all come from the router. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
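One possible shape for such a router-side audit call, purely as an illustrative sketch (all names are assumptions, not the RBF API): the Router, unlike the NameNode, sees the real client address, so it can log it alongside the command before invoking the downstream namespace.

{code:java}
import org.slf4j.Logger;
import org.slf4j.LoggerFactory;

// Illustrative sketch only: a router-side audit logger mirroring the
// NameNode audit log format, fed with the client info the Router sees.
public class RouterAuditLogger {
  private static final Logger AUDIT =
      LoggerFactory.getLogger(RouterAuditLogger.class.getName() + ".audit");

  public void logAuditEvent(boolean succeeded, String user, String ip,
      String cmd, String src) {
    if (AUDIT.isInfoEnabled()) {
      AUDIT.info("allowed={}\tugi={}\tip={}\tcmd={}\tsrc={}",
          succeeded, user, ip, cmd, src);
    }
  }
}
{code}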
[jira] [Commented] (HDFS-14498) LeaseManager can loop forever on the file for which create has failed
[ https://issues.apache.org/jira/browse/HDFS-14498?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16941211#comment-16941211 ] leigh commented on HDFS-14498: -- Hi, We also encountered this issue today: Hadoop 3.2.0 Source code repository https://github.com/apache/hadoop.git -r e97acb3bd8f3befd27418996fa5d4b50bf2e17bf Compiled by sunilg on 2019-01-08T06:08Z Compiled with protoc 2.5.0 >From source with checksum d3f0795ed0d9dc378e2c785d3668f39 We did have an issue in our cluster before this happened where some of our DN stopped receiving heartbeats from the active NN (although heartbeats from the standby NN's could be seen). I ran the recoverLease command on the bad files but that did not help. I restarted the NN's and all the DN's. It stopped the spamming of the logs but we were still unable to write to the file. In the end I had to delete the bad files. We have a sizeable cluster. Is there anything in particular you would like to see from the logs? Thanks in advance. > LeaseManager can loop forever on the file for which create has failed > -- > > Key: HDFS-14498 > URL: https://issues.apache.org/jira/browse/HDFS-14498 > Project: Hadoop HDFS > Issue Type: Bug >Affects Versions: 2.9.0 >Reporter: Sergey Shelukhin >Priority: Major > > The logs from file creation are long gone due to infinite lease logging, > however it presumably failed... the client who was trying to write this file > is definitely long dead. > The version includes HDFS-4882. > We get this log pattern repeating infinitely: > {noformat} > 2019-05-16 14:00:16,893 INFO > [org.apache.hadoop.hdfs.server.namenode.LeaseManager$Monitor@b27557f] > org.apache.hadoop.hdfs.server.namenode.LeaseManager: [Lease. Holder: > DFSClient_NONMAPREDUCE_-20898906_61, pending creates: 1] has expired hard > limit > 2019-05-16 14:00:16,893 INFO > [org.apache.hadoop.hdfs.server.namenode.LeaseManager$Monitor@b27557f] > org.apache.hadoop.hdfs.server.namenode.FSNamesystem: Recovering [Lease. > Holder: DFSClient_NONMAPREDUCE_-20898906_61, pending creates: 1], src= > 2019-05-16 14:00:16,893 WARN > [org.apache.hadoop.hdfs.server.namenode.LeaseManager$Monitor@b27557f] > org.apache.hadoop.hdfs.StateChange: DIR* NameSystem.internalReleaseLease: > Failed to release lease for file . Committed blocks are waiting to be > minimally replicated. Try again later. > 2019-05-16 14:00:16,893 WARN > [org.apache.hadoop.hdfs.server.namenode.LeaseManager$Monitor@b27557f] > org.apache.hadoop.hdfs.server.namenode.LeaseManager: Cannot release the path > in the lease [Lease. Holder: DFSClient_NONMAPREDUCE_-20898906_61, > pending creates: 1]. It will be retried. > org.apache.hadoop.hdfs.protocol.AlreadyBeingCreatedException: DIR* > NameSystem.internalReleaseLease: Failed to release lease for file . > Committed blocks are waiting to be minimally replicated. Try again later. > at > org.apache.hadoop.hdfs.server.namenode.FSNamesystem.internalReleaseLease(FSNamesystem.java:3357) > at > org.apache.hadoop.hdfs.server.namenode.LeaseManager.checkLeases(LeaseManager.java:573) > at > org.apache.hadoop.hdfs.server.namenode.LeaseManager$Monitor.run(LeaseManager.java:509) > at java.lang.Thread.run(Thread.java:745) > $ grep -c "Recovering.*DFSClient_NONMAPREDUCE_-20898906_61, pending creates: > 1" hdfs_nn* > hdfs_nn.log:1068035 > hdfs_nn.log.2019-05-16-14:1516179 > hdfs_nn.log.2019-05-16-15:1538350 > {noformat} > Aside from an actual bug fix, it might make sense to make LeaseManager not > log so much, in case if there are more bugs like this... 
-- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
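For readers hitting the same loop: the manual lease-recovery command mentioned in the comment above is part of the HDFS debug CLI; a minimal invocation (the path is a placeholder) is:

{noformat}
hdfs debug recoverLease -path /path/to/stuck/file -retries 5
{noformat}

As the reporter notes, this did not help here, since the committed blocks could not be minimally replicated.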
[jira] [Commented] (HDFS-14495) RBF: Duplicate FederationRPCMetrics
[ https://issues.apache.org/jira/browse/HDFS-14495?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16941198#comment-16941198 ] hemanthboyina commented on HDFS-14495: -- The interfaces RouterRpcMonitor and FederationRPCMBean have mostly the same metrics. Can we remove the dependency and duplication by removing either one of them? > RBF: Duplicate FederationRPCMetrics > --- > > Key: HDFS-14495 > URL: https://issues.apache.org/jira/browse/HDFS-14495 > Project: Hadoop HDFS > Issue Type: Sub-task > Components: metrics >Reporter: Akira Ajisaka >Assignee: hemanthboyina >Priority: Major > > There are two FederationRPCMetrics displayed in the Web UI > (http://<hostname>:<port>/jmx) and most of the metrics are the same. > * FederationRPCMetrics via {{@Metrics}} and {{@Metric}} annotations > * FederationRPCMetrics via registering FederationRPCMBean > Can we remove the {{@Metrics}} and {{@Metric}} annotations to remove > duplication? -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Work logged] (HDDS-2187) ozone-mr test fails with No FileSystem for scheme "o3fs"
[ https://issues.apache.org/jira/browse/HDDS-2187?focusedWorklogId=320715&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-320715 ] ASF GitHub Bot logged work on HDDS-2187: Author: ASF GitHub Bot Created on: 30/Sep/19 17:58 Start Date: 30/Sep/19 17:58 Worklog Time Spent: 10m Work Description: elek commented on issue #1537: HDDS-2187. ozone-mr test fails with No FileSystem for scheme o3fs URL: https://github.com/apache/hadoop/pull/1537#issuecomment-536677464 Thanks for the update @adoroszlai Looks good. Good to have `ozone fs` fixed. One problem: datanodes in ozonesecure-mr are not started AFAIK https://github.com/elek/ozone-ci-q4/blob/master/pr/pr-hdds-2187-2nl4x/acceptance/output.log This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 320715) Time Spent: 1h 20m (was: 1h 10m) > ozone-mr test fails with No FileSystem for scheme "o3fs" > > > Key: HDDS-2187 > URL: https://issues.apache.org/jira/browse/HDDS-2187 > Project: Hadoop Distributed Data Store > Issue Type: Bug > Components: test >Reporter: Attila Doroszlai >Assignee: Attila Doroszlai >Priority: Major > Labels: pull-request-available > Time Spent: 1h 20m > Remaining Estimate: 0h > > HDDS-2101 changed how the Ozone filesystem provider is configured. {{ozone-mr}} > tests [started > failing|https://github.com/elek/ozone-ci/blob/2f2c99652af6b26a95f08eece9e545f0d72ccf45/pr/pr-hdds-2101-rtz55/acceptance/output.log#L255-L263], > but it [wasn't > noticed|https://github.com/elek/ozone-ci/blob/master/pr/pr-hdds-2101-rtz55/acceptance/result] > due to HDDS-2185. > {code} > Running command 'ozone fs -mkdir /user' > ${output} = mkdir: No FileSystem for scheme "o3fs" > {code} -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Created] (HDDS-2210) ContainerStateMachine should not be marked unhealthy if applyTransaction fails with closed container exception
Shashikant Banerjee created HDDS-2210: - Summary: ContainerStateMachine should not be marked unhealthy if applyTransaction fails with closed container exception Key: HDDS-2210 URL: https://issues.apache.org/jira/browse/HDDS-2210 Project: Hadoop Distributed Data Store Issue Type: Bug Components: Ozone Datanode Reporter: Shashikant Banerjee Assignee: Shashikant Banerjee Fix For: 0.5.0 Currently, if applyTransaction fails, the stateMachine is marked unhealthy and the next snapshot creation will fail. As a result, the RaftServer will close down, leading to pipeline failure. The ClosedContainer exception should be ignored while marking the stateMachine unhealthy. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
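A minimal sketch of the proposed behavior (all names are assumptions, not the actual HDDS code): treat a closed-container failure in applyTransaction as benign instead of poisoning the state machine.

{code:java}
// Sketch only: ignore closed-container failures when deciding whether the
// state machine is unhealthy, so the pipeline is not torn down for a
// benign condition.
public class StateMachineHealth {
  static class ClosedContainerException extends RuntimeException { }

  private volatile boolean unhealthy = false;

  void onApplyTransactionFailure(Throwable t) {
    if (t instanceof ClosedContainerException) {
      return;          // benign: the write raced with a container close
    }
    unhealthy = true;  // genuine failure: fail future snapshot creation
  }
}
{code}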
[jira] [Work logged] (HDDS-1720) Add ability to configure RocksDB logs for Ozone Manager
[ https://issues.apache.org/jira/browse/HDDS-1720?focusedWorklogId=320708=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-320708 ] ASF GitHub Bot logged work on HDDS-1720: Author: ASF GitHub Bot Created on: 30/Sep/19 17:52 Start Date: 30/Sep/19 17:52 Worklog Time Spent: 10m Work Description: avijayanhwx commented on issue #1538: HDDS-1720 : Add ability to configure RocksDB logs for Ozone Manager. URL: https://github.com/apache/hadoop/pull/1538#issuecomment-536674819 /retest This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 320708) Time Spent: 40m (was: 0.5h) > Add ability to configure RocksDB logs for Ozone Manager > --- > > Key: HDDS-1720 > URL: https://issues.apache.org/jira/browse/HDDS-1720 > Project: Hadoop Distributed Data Store > Issue Type: Task >Reporter: Aravindan Vijayan >Assignee: Aravindan Vijayan >Priority: Major > Labels: pull-request-available > Fix For: 0.5.0 > > Time Spent: 40m > Remaining Estimate: 0h > > While doing performance testing, it was seen that there was no way to get > RocksDB logs for Ozone Manager. Along with Rocksdb metrics, this may be a > useful mechanism to understand the health of Rocksdb while investigating > large clusters. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-14495) RBF: Duplicate FederationRPCMetrics
[ https://issues.apache.org/jira/browse/HDFS-14495?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16941186#comment-16941186 ] hemanthboyina commented on HDFS-14495: -- Hi [~elgoiri], any suggestions for this? > RBF: Duplicate FederationRPCMetrics > --- > > Key: HDFS-14495 > URL: https://issues.apache.org/jira/browse/HDFS-14495 > Project: Hadoop HDFS > Issue Type: Sub-task > Components: metrics >Reporter: Akira Ajisaka >Assignee: hemanthboyina >Priority: Major > > There are two FederationRPCMetrics displayed in the Web UI > (http://<hostname>:<port>/jmx) and most of the metrics are the same. > * FederationRPCMetrics via {{@Metrics}} and {{@Metric}} annotations > * FederationRPCMetrics via registering FederationRPCMBean > Can we remove the {{@Metrics}} and {{@Metric}} annotations to remove > duplication? -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDDS-2169) Avoid buffer copies while submitting client requests in Ratis
[ https://issues.apache.org/jira/browse/HDDS-2169?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16941170#comment-16941170 ] Hadoop QA commented on HDDS-2169: - | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 37m 24s{color} | {color:blue} Docker mode activated. {color} | || || || || {color:brown} Prechecks {color} || | {color:green}+1{color} | {color:green} dupname {color} | {color:green} 0m 0s{color} | {color:green} No case conflicting files found. {color} | | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s{color} | {color:green} The patch does not contain any @author tags. {color} | | {color:red}-1{color} | {color:red} test4tests {color} | {color:red} 0m 0s{color} | {color:red} The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. {color} | || || || || {color:brown} trunk Compile Tests {color} || | {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 36s{color} | {color:blue} Maven dependency ordering for branch {color} | | {color:red}-1{color} | {color:red} mvninstall {color} | {color:red} 0m 35s{color} | {color:red} hadoop-hdds in trunk failed. {color} | | {color:red}-1{color} | {color:red} mvninstall {color} | {color:red} 0m 38s{color} | {color:red} hadoop-ozone in trunk failed. {color} | | {color:red}-1{color} | {color:red} compile {color} | {color:red} 0m 19s{color} | {color:red} hadoop-hdds in trunk failed. {color} | | {color:red}-1{color} | {color:red} compile {color} | {color:red} 0m 13s{color} | {color:red} hadoop-ozone in trunk failed. {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 59s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 0s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 15m 40s{color} | {color:green} branch has no errors when building and testing our client artifacts. {color} | | {color:red}-1{color} | {color:red} javadoc {color} | {color:red} 0m 19s{color} | {color:red} hadoop-hdds in trunk failed. {color} | | {color:red}-1{color} | {color:red} javadoc {color} | {color:red} 0m 17s{color} | {color:red} hadoop-ozone in trunk failed. {color} | | {color:blue}0{color} | {color:blue} spotbugs {color} | {color:blue} 17m 8s{color} | {color:blue} Used deprecated FindBugs config; considering switching to SpotBugs. {color} | | {color:red}-1{color} | {color:red} findbugs {color} | {color:red} 0m 30s{color} | {color:red} hadoop-hdds in trunk failed. {color} | | {color:red}-1{color} | {color:red} findbugs {color} | {color:red} 0m 17s{color} | {color:red} hadoop-ozone in trunk failed. {color} | || || || || {color:brown} Patch Compile Tests {color} || | {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 15s{color} | {color:blue} Maven dependency ordering for patch {color} | | {color:red}-1{color} | {color:red} mvninstall {color} | {color:red} 0m 32s{color} | {color:red} hadoop-hdds in the patch failed. {color} | | {color:red}-1{color} | {color:red} mvninstall {color} | {color:red} 0m 34s{color} | {color:red} hadoop-ozone in the patch failed. 
{color} | | {color:red}-1{color} | {color:red} compile {color} | {color:red} 0m 22s{color} | {color:red} hadoop-hdds in the patch failed. {color} | | {color:red}-1{color} | {color:red} compile {color} | {color:red} 0m 15s{color} | {color:red} hadoop-ozone in the patch failed. {color} | | {color:red}-1{color} | {color:red} javac {color} | {color:red} 0m 22s{color} | {color:red} hadoop-hdds in the patch failed. {color} | | {color:red}-1{color} | {color:red} javac {color} | {color:red} 0m 15s{color} | {color:red} hadoop-ozone in the patch failed. {color} | | {color:orange}-0{color} | {color:orange} checkstyle {color} | {color:orange} 0m 25s{color} | {color:orange} hadoop-hdds: The patch generated 2 new + 0 unchanged - 0 fixed = 2 total (was 0) {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 0s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s{color} | {color:green} The patch has no whitespace issues. {color} | | {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 13m 9s{color} | {color:green} patch has no errors when building and testing our client artifacts. {color} | | {color:red}-1{color} | {color:red} javadoc {color} | {color:red} 0m 19s{color} | {color:red} hadoop-hdds in the patch failed. {color} | | {color:red}-1{color} | {color:red} javadoc {color} | {color:red} 0m 16s{color} |
[jira] [Work logged] (HDDS-2169) Avoid buffer copies while submitting client requests in Ratis
[ https://issues.apache.org/jira/browse/HDDS-2169?focusedWorklogId=320700=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-320700 ] ASF GitHub Bot logged work on HDDS-2169: Author: ASF GitHub Bot Created on: 30/Sep/19 17:37 Start Date: 30/Sep/19 17:37 Worklog Time Spent: 10m Work Description: hadoop-yetus commented on issue #1517: HDDS-2169 URL: https://github.com/apache/hadoop/pull/1517#issuecomment-536668387 :broken_heart: **-1 overall** | Vote | Subsystem | Runtime | Comment | |::|--:|:|:| | 0 | reexec | 2244 | Docker mode activated. | ||| _ Prechecks _ | | +1 | dupname | 0 | No case conflicting files found. | | +1 | @author | 0 | The patch does not contain any @author tags. | | -1 | test4tests | 0 | The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. | ||| _ trunk Compile Tests _ | | 0 | mvndep | 36 | Maven dependency ordering for branch | | -1 | mvninstall | 35 | hadoop-hdds in trunk failed. | | -1 | mvninstall | 38 | hadoop-ozone in trunk failed. | | -1 | compile | 19 | hadoop-hdds in trunk failed. | | -1 | compile | 13 | hadoop-ozone in trunk failed. | | +1 | checkstyle | 59 | trunk passed | | +1 | mvnsite | 0 | trunk passed | | +1 | shadedclient | 940 | branch has no errors when building and testing our client artifacts. | | -1 | javadoc | 19 | hadoop-hdds in trunk failed. | | -1 | javadoc | 17 | hadoop-ozone in trunk failed. | | 0 | spotbugs | 1028 | Used deprecated FindBugs config; considering switching to SpotBugs. | | -1 | findbugs | 30 | hadoop-hdds in trunk failed. | | -1 | findbugs | 17 | hadoop-ozone in trunk failed. | ||| _ Patch Compile Tests _ | | 0 | mvndep | 15 | Maven dependency ordering for patch | | -1 | mvninstall | 32 | hadoop-hdds in the patch failed. | | -1 | mvninstall | 34 | hadoop-ozone in the patch failed. | | -1 | compile | 22 | hadoop-hdds in the patch failed. | | -1 | compile | 15 | hadoop-ozone in the patch failed. | | -1 | javac | 22 | hadoop-hdds in the patch failed. | | -1 | javac | 15 | hadoop-ozone in the patch failed. | | -0 | checkstyle | 25 | hadoop-hdds: The patch generated 2 new + 0 unchanged - 0 fixed = 2 total (was 0) | | +1 | mvnsite | 0 | the patch passed | | +1 | whitespace | 0 | The patch has no whitespace issues. | | +1 | shadedclient | 789 | patch has no errors when building and testing our client artifacts. | | -1 | javadoc | 19 | hadoop-hdds in the patch failed. | | -1 | javadoc | 16 | hadoop-ozone in the patch failed. | | -1 | findbugs | 29 | hadoop-hdds in the patch failed. | | -1 | findbugs | 17 | hadoop-ozone in the patch failed. | ||| _ Other Tests _ | | -1 | unit | 25 | hadoop-hdds in the patch failed. | | -1 | unit | 22 | hadoop-ozone in the patch failed. | | +1 | asflicense | 30 | The patch does not generate ASF License warnings. 
| | | | 4686 | | | Subsystem | Report/Notes | |--:|:-| | Docker | Client=19.03.0 Server=19.03.0 base: https://builds.apache.org/job/hadoop-multibranch/job/PR-1517/6/artifact/out/Dockerfile | | JIRA Issue | HDDS-2169 | | JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12981115/o2169_20190923.patch | | Optional Tests | dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient findbugs checkstyle | | uname | Linux 11b7ced0c5c4 4.15.0-58-generic #64-Ubuntu SMP Tue Aug 6 11:12:41 UTC 2019 x86_64 x86_64 x86_64 GNU/Linux | | Build tool | maven | | Personality | personality/hadoop.sh | | git revision | trunk / 98ca07e | | Default Java | 1.8.0_222 | | mvninstall | https://builds.apache.org/job/hadoop-multibranch/job/PR-1517/6/artifact/out/branch-mvninstall-hadoop-hdds.txt | | mvninstall | https://builds.apache.org/job/hadoop-multibranch/job/PR-1517/6/artifact/out/branch-mvninstall-hadoop-ozone.txt | | compile | https://builds.apache.org/job/hadoop-multibranch/job/PR-1517/6/artifact/out/branch-compile-hadoop-hdds.txt | | compile | https://builds.apache.org/job/hadoop-multibranch/job/PR-1517/6/artifact/out/branch-compile-hadoop-ozone.txt | | javadoc | https://builds.apache.org/job/hadoop-multibranch/job/PR-1517/6/artifact/out/branch-javadoc-hadoop-hdds.txt | | javadoc | https://builds.apache.org/job/hadoop-multibranch/job/PR-1517/6/artifact/out/branch-javadoc-hadoop-ozone.txt | | findbugs | https://builds.apache.org/job/hadoop-multibranch/job/PR-1517/6/artifact/out/branch-findbugs-hadoop-hdds.txt | | findbugs | https://builds.apache.org/job/hadoop-multibranch/job/PR-1517/6/artifact/out/branch-findbugs-hadoop-ozone.txt | |
[jira] [Assigned] (HDDS-1984) Fix listBucket API
[ https://issues.apache.org/jira/browse/HDDS-1984?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] YiSheng Lien reassigned HDDS-1984: -- Assignee: YiSheng Lien > Fix listBucket API > -- > > Key: HDDS-1984 > URL: https://issues.apache.org/jira/browse/HDDS-1984 > Project: Hadoop Distributed Data Store > Issue Type: Sub-task >Reporter: Bharat Viswanadham >Assignee: YiSheng Lien >Priority: Major > > This Jira is to fix the listBucket API in the HA code path. > In HA, we have an in-memory cache: we put the result into the in-memory cache > and return the response; later it will be picked up by the double buffer thread > and flushed to disk. So now, when we do listBuckets, it should use both the > in-memory cache and the RocksDB bucket table to list the buckets in a volume. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
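For illustration, a minimal self-contained sketch of the merge the description calls for (the types are simplified assumptions; the real tables map bucket keys to BucketInfo objects):

{code:java}
import java.util.Map;
import java.util.TreeMap;

// Sketch only: list buckets by merging the in-memory table cache (entries
// not yet flushed by the double buffer) with the persisted RocksDB bucket
// table. Cache entries win on key clashes; null values act as tombstones.
public class ListBucketsSketch {
  static Map<String, String> listBuckets(Map<String, String> dbTable,
      Map<String, String> cache) {
    Map<String, String> merged = new TreeMap<>(dbTable); // sorted by key
    for (Map.Entry<String, String> e : cache.entrySet()) {
      if (e.getValue() == null) {
        merged.remove(e.getKey());            // bucket deleted in cache
      } else {
        merged.put(e.getKey(), e.getValue()); // newer value from cache
      }
    }
    return merged;
  }
}
{code}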
[jira] [Assigned] (HDDS-1984) Fix listBucket API
[ https://issues.apache.org/jira/browse/HDDS-1984?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] YiSheng Lien reassigned HDDS-1984: -- Assignee: (was: YiSheng Lien) > Fix listBucket API > -- > > Key: HDDS-1984 > URL: https://issues.apache.org/jira/browse/HDDS-1984 > Project: Hadoop Distributed Data Store > Issue Type: Sub-task >Reporter: Bharat Viswanadham >Priority: Major > > This Jira is to fix the listBucket API in the HA code path. > In HA, we have an in-memory cache: we put the result into the in-memory cache > and return the response; later it will be picked up by the double buffer thread > and flushed to disk. So now, when we do listBuckets, it should use both the > in-memory cache and the RocksDB bucket table to list the buckets in a volume. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org