[jira] [Commented] (HDFS-15185) StartupProgress reports edits segments until the entire startup completes
[ https://issues.apache.org/jira/browse/HDFS-15185?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17040602#comment-17040602 ]

Hadoop QA commented on HDFS-15185:
----------------------------------

| (x) *-1 overall* |

|| Vote || Subsystem || Runtime || Comment ||
| 0 | reexec | 1m 21s | Docker mode activated. |
|| || || || Prechecks ||
| +1 | @author | 0m 0s | The patch does not contain any @author tags. |
| +1 | test4tests | 0m 0s | The patch appears to include 1 new or modified test files. |
|| || || || trunk Compile Tests ||
| +1 | mvninstall | 29m 35s | trunk passed |
| +1 | compile | 1m 34s | trunk passed |
| +1 | checkstyle | 1m 0s | trunk passed |
| +1 | mvnsite | 1m 48s | trunk passed |
| +1 | shadedclient | 22m 8s | branch has no errors when building and testing our client artifacts. |
| -1 | findbugs | 4m 23s | hadoop-hdfs in trunk failed. |
| -1 | javadoc | 0m 32s | hadoop-hdfs in trunk failed. |
|| || || || Patch Compile Tests ||
| -1 | mvninstall | 0m 35s | hadoop-hdfs in the patch failed. |
| -1 | compile | 0m 31s | hadoop-hdfs in the patch failed. |
| -1 | javac | 0m 31s | hadoop-hdfs in the patch failed. |
| -0 | checkstyle | 0m 28s | The patch fails to run checkstyle in hadoop-hdfs |
| -1 | mvnsite | 0m 30s | hadoop-hdfs in the patch failed. |
| -1 | whitespace | 0m 0s | The patch has 50 line(s) that end in whitespace. Use git apply --whitespace=fix <>. Refer https://git-scm.com/docs/git-apply |
| +1 | shadedclient | 0m 32s | patch has no errors when building and testing our client artifacts. |
| -1 | findbugs | 0m 30s | hadoop-hdfs in the patch failed. |
| -1 | javadoc | 0m 42s | hadoop-hdfs-project_hadoop-hdfs generated 100 new + 0 unchanged - 0 fixed = 100 total (was 0) |
|| || || || Other Tests ||
| -1 | unit | 17m 2s | hadoop-hdfs in the patch passed. |
| 0 | asflicense | 0m 31s | ASF License check generated no output? |
| | | 83m 43s | |

|| Reason || Tests ||
| Failed junit tests | hadoop.hdfs.server.blockmanagement.TestBlockTokenWithDFSStriped |
| | hadoop.hdfs.server.blockmanagement.TestBlockManager |
| | hadoop.hdfs.server.blockmanagement.TestSequentialBlockId |
| | hadoop.hdfs.server.blockmanagement.TestBlocksWithNotEnoughRacks |

|| Subsystem || Report/Notes ||
| Docker | Client=19.03.6 Server=19.03.6 Image:yetus/hadoop:c44943d1fc3 |
| JIRA Issue | HDFS-15185 |
| JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12993964/HDFS-15185.001.patch |
| Optional Tests | dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient findbugs checkstyle |
| uname | Linux 6ef1930e7d42 4.15.0-74-generic #84-Ubuntu SMP Thu Dec 19 08:06:28 UTC 2019 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/patchprocess/precommit/personality/provided.sh |
| git revision | trunk / ec75071 |
| maven | version: Apache Maven 3.3.9 |
| Default Java | 1.8.0_232 |
| findbugs |
[jira] [Commented] (HDFS-15185) StartupProgress reports edits segments until the entire startup completes
[ https://issues.apache.org/jira/browse/HDFS-15185?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17040576#comment-17040576 ]

Konstantin Shvachko commented on HDFS-15185:
--------------------------------------------

New segments are added via {{StartupProgress.setTotal()}}, which adds steps even though the phase is complete. {{setTotal()}} is called in {{FSEditLogLoader.loadEditRecords()}} while tailing edits through {{EditLogTailer.doTailEdits()}} after the {{LOADING_EDITS}} phase ended.

> StartupProgress reports edits segments until the entire startup completes
> -------------------------------------------------------------------------
>
>                 Key: HDFS-15185
>                 URL: https://issues.apache.org/jira/browse/HDFS-15185
>             Project: Hadoop HDFS
>          Issue Type: Bug
>          Components: namenode
>    Affects Versions: 2.10.0
>            Reporter: Konstantin Shvachko
>            Assignee: Konstantin Shvachko
>            Priority: Major
>         Attachments: HDFS-15185.001.patch
>
>
> Startup Progress page keeps reporting edits segments after the {{LOAD_EDITS}}
> stage is complete. New steps are added to StartupProgress while journal
> tailing until all startup phases are completed. This adds a lot of edits
> steps, since {{SAFEMODE}} phase can take a long time on a large cluster.
> With fast tailing the segments are small, but the number of them is large -
> 160K. This makes the page load forever.

--
This message was sent by Atlassian Jira
(v8.3.4#803005)

---------------------------------------------------------------------
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
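The behaviour described in the comment above can be illustrated with a small, self-contained sketch. It is not the Hadoop code and not necessarily what HDFS-15185.001.patch does: `SketchPhase`, `StartupProgressSketch`, `setTotalUnguarded` and `stepCount` are hypothetical names invented for illustration. The sketch assumes one plausible remedy, namely ignoring `setTotal()` calls for a phase that has already ended, so that edits tailed later no longer add steps.

```java
import java.util.EnumMap;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical stand-ins; these are NOT the Hadoop StartupProgress classes.
enum SketchPhase { LOADING_FSIMAGE, LOADING_EDITS, SAVING_CHECKPOINT, SAFEMODE }

class StartupProgressSketch {
  // For each phase: step name -> total recorded for that step.
  private final Map<SketchPhase, Map<String, Long>> totals =
      new EnumMap<>(SketchPhase.class);
  // For each phase: whether the phase has already ended.
  private final Map<SketchPhase, Boolean> ended =
      new EnumMap<>(SketchPhase.class);

  StartupProgressSketch() {
    for (SketchPhase p : SketchPhase.values()) {
      totals.put(p, new LinkedHashMap<>());
      ended.put(p, Boolean.FALSE);
    }
  }

  void endPhase(SketchPhase phase) {
    ended.put(phase, Boolean.TRUE);
  }

  // Mirrors the reported behaviour: a step is created even if the phase ended.
  void setTotalUnguarded(SketchPhase phase, String step, long total) {
    totals.get(phase).put(step, total);
  }

  // Assumed remedy: silently ignore updates for a phase that has ended.
  void setTotal(SketchPhase phase, String step, long total) {
    if (ended.get(phase)) {
      return;
    }
    totals.get(phase).put(step, total);
  }

  int stepCount(SketchPhase phase) {
    return totals.get(phase).size();
  }
}
```

With the guard in place, segments tailed during a long {{SAFEMODE}} phase would leave the step list for the edits phase unchanged, which keeps the progress page bounded in size.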
[jira] [Updated] (HDFS-15185) StartupProgress reports edits segments until the entire startup completes
[ https://issues.apache.org/jira/browse/HDFS-15185?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Konstantin Shvachko updated HDFS-15185:
---------------------------------------
    Release Note: New segments are added via {{StartupProgress.setTotal()}}, which adds steps even though the phase is complete. {{setTotal()}} is called in {{FSEditLogLoader.loadEditRecords()}} while tailing edits through {{EditLogTailer.doTailEdits()}} after the {{LOADING_EDITS}} phase ended.
          Status: Patch Available  (was: Open)
[jira] [Updated] (HDFS-15185) StartupProgress reports edits segments until the entire startup completes
[ https://issues.apache.org/jira/browse/HDFS-15185?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Konstantin Shvachko updated HDFS-15185:
---------------------------------------
    Release Note:   (was: New segments are added via {{StartupProgress.setTotal()}}, which adds steps even though the phase is complete. {{setTotal()}} is called in {{FSEditLogLoader.loadEditRecords()}} while tailing edits through {{EditLogTailer.doTailEdits()}} after the {{LOADING_EDITS}} phase ended.)
[jira] [Updated] (HDFS-15185) StartupProgress reports edits segments until the entire startup completes
[ https://issues.apache.org/jira/browse/HDFS-15185?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Konstantin Shvachko updated HDFS-15185:
---------------------------------------
    Attachment: HDFS-15185.001.patch
[jira] [Comment Edited] (HDFS-15185) StartupProgress reports edits segments until the entire startup completes
[ https://issues.apache.org/jira/browse/HDFS-15185?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17040562#comment-17040562 ] Konstantin Shvachko edited comment on HDFS-15185 at 2/20/20 1:56 AM: - HDFS-14500 tried to fix it, but it is still happening. Most of the excessive segments look like this: {code} ByteStringEditLog[406685107236, 406685108346] 203.99 KB (0/) 100% ByteStringEditLog[406685108347, 406685110031] 348.07 KB (0/1685) 100% ByteStringEditLog[406685110032, 406685111308] 277.39 KB (0/1277) 100% ByteStringEditLog[406685111309, 406685112369] 239.67 KB (0/1061) 100% ByteStringEditLog[406685112370, 406685113032] 129.18 KB (0/663) 100% {code} was (Author: shv): HDFS-14500 tried to fix it, but it is still happening. Most of the excessive segments look like this: {code} ByteStringEditLog[406685107236, 406685108346] 203.99 KB (0/) 100% ByteStringEditLog[406685108347, 406685110031] 348.07 KB (0/1685) 100% ByteStringEditLog[406685110032, 406685111308] 277.39 KB (0/1277) 100% ByteStringEditLog[406685111309, 406685112369] 239.67 KB (0/1061) 100% ByteStringEditLog[406685112370, 406685113032] 129.18 KB (0/663) 100% {code} > StartupProgress reports edits segments until the entire startup completes > - > > Key: HDFS-15185 > URL: https://issues.apache.org/jira/browse/HDFS-15185 > Project: Hadoop HDFS > Issue Type: Bug > Components: namenode >Affects Versions: 2.10.0 >Reporter: Konstantin Shvachko >Assignee: Konstantin Shvachko >Priority: Major > > Startup Progress page keeps reporting edits segments after the {{LOAD_EDITS}} > stage is complete. New steps are added to StartupProgress while journal > tailing until all startup phases are completed. This adds a lot of edits > steps, since {{SAFEMODE}} phase can take a long time on a large cluster. > With fast tailing the segments are small, but the number of them is large - > 160K. This makes the page load forever. 
[jira] [Commented] (HDFS-15185) StartupProgress reports edits segments until the entire startup completes
[ https://issues.apache.org/jira/browse/HDFS-15185?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17040562#comment-17040562 ]

Konstantin Shvachko commented on HDFS-15185:
--------------------------------------------

HDFS-14500 tried to fix it, but it is still happening. Most of the excessive segments look like this:
{code}
ByteStringEditLog[406685107236, 406685108346] 203.99 KB (0/) 100%
ByteStringEditLog[406685108347, 406685110031] 348.07 KB (0/1685) 100%
ByteStringEditLog[406685110032, 406685111308] 277.39 KB (0/1277) 100%
ByteStringEditLog[406685111309, 406685112369] 239.67 KB (0/1061) 100%
ByteStringEditLog[406685112370, 406685113032] 129.18 KB (0/663) 100%
{code}

> StartupProgress reports edits segments until the entire startup completes
> -------------------------------------------------------------------------
>
>                 Key: HDFS-15185
>                 URL: https://issues.apache.org/jira/browse/HDFS-15185
>             Project: Hadoop HDFS
>          Issue Type: Bug
>          Components: namenode
>    Affects Versions: 2.10.0
>            Reporter: Konstantin Shvachko
>            Assignee: Konstantin Shvachko
>            Priority: Major
>
> Startup Progress page keeps reporting edits segments after the {{LOAD_EDITS}}
> stage is complete. New steps are added to StartupProgress while journal
> tailing until all startup phases are completed. This adds a lot of edits
> steps, since {{SAFEMODE}} phase can take a long time on a large cluster.
> With fast tailing the segments are small, but the number of them is large -
> 160K. This makes the page load forever.
[jira] [Created] (HDFS-15185) StartupProgress reports edits segments until the entire startup completes
Konstantin Shvachko created HDFS-15185:
---------------------------------------

             Summary: StartupProgress reports edits segments until the entire startup completes
                 Key: HDFS-15185
                 URL: https://issues.apache.org/jira/browse/HDFS-15185
             Project: Hadoop HDFS
          Issue Type: Bug
          Components: namenode
    Affects Versions: 2.10.0
            Reporter: Konstantin Shvachko
            Assignee: Konstantin Shvachko


Startup Progress page keeps reporting edits segments after the {{LOAD_EDITS}} stage is complete. New steps are added to StartupProgress while journal tailing until all startup phases are completed. This adds a lot of edits steps, since {{SAFEMODE}} phase can take a long time on a large cluster.
With fast tailing the segments are small, but the number of them is large - 160K. This makes the page load forever.
[jira] [Commented] (HDFS-15115) Namenode crash caused by NPE in BlockPlacementPolicyDefault when dynamically change logger to debug
[ https://issues.apache.org/jira/browse/HDFS-15115?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17040507#comment-17040507 ]

Kihwal Lee commented on HDFS-15115:
-----------------------------------

In your test,
{code}
repl.chooseRandom(1, "/", excludeNodes, 1024L, 3, results, false, types);
{code}
{{"/"}} is not the correct root. Take a look at {{NodeBase.ROOT}} to find out.

This test case was failing in our testing because we have an internal optimization that prevents wasting time when an unusable rack is specified.

> Namenode crash caused by NPE in BlockPlacementPolicyDefault when dynamically change logger to debug
> ---------------------------------------------------------------------------------------------------
>
>                 Key: HDFS-15115
>                 URL: https://issues.apache.org/jira/browse/HDFS-15115
>             Project: Hadoop HDFS
>          Issue Type: Bug
>            Reporter: wangzhixiang
>            Assignee: wangzhixiang
>            Priority: Major
>             Fix For: 3.3.0, 3.1.4, 3.2.2, 2.10.1
>
>         Attachments: HDFS-15115.001.patch, HDFS-15115.003.patch,
> HDFS-15115.004.patch, HDFS-15115.005.patch, HDFS-15115.2.patch
>
>
> To get debug info, we dynamically changed the logger of
> BlockPlacementPolicyDefault to debug while the namenode was running. However,
> the Namenode crashes. From the log, we found NPEs in
> BlockPlacementPolicyDefault.chooseRandom. *StringBuilder builder* is used
> 4 times in the BlockPlacementPolicyDefault.chooseRandom method, but the
> *builder* is only initialized at the first of these. If we change the logger
> of BlockPlacementPolicyDefault to debug after that part, the *builder* in the
> remaining part is *NULL* and causes an *NPE*.
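The failure mode in the issue description quoted above can be reproduced in miniature. The sketch below is not the code of BlockPlacementPolicyDefault and does not claim to be the committed fix: `DebugBuilderSketch`, `debugEnabled`, `unsafe` and `safe` are hypothetical names, and the static flag merely stands in for a logger level that can change at runtime. `unsafe` re-reads the flag at each use, so a flip between initialization and a later use dereferences a null builder; `safe` assumes one possible remedy, deciding once and then checking the local reference.

```java
// Hypothetical illustration; not the Hadoop BlockPlacementPolicyDefault code.
class DebugBuilderSketch {
  // Stands in for a debug-level check on a logger; may flip at any time.
  static volatile boolean debugEnabled = false;

  // Re-checks the flag at every use: fails if the flag turns on mid-method.
  static String unsafe(Runnable work) {
    StringBuilder builder = null;
    if (debugEnabled) {
      builder = new StringBuilder();
      builder.append("start;");
    }
    work.run(); // the log level may change while this runs
    if (debugEnabled) {
      builder.append("end;"); // NullPointerException if builder was never created
    }
    return builder == null ? "" : builder.toString();
  }

  // Decides once, then guards every later use on the local reference.
  static String safe(Runnable work) {
    StringBuilder builder = debugEnabled ? new StringBuilder() : null;
    if (builder != null) {
      builder.append("start;");
    }
    work.run();
    if (builder != null) {
      builder.append("end;");
    }
    return builder == null ? "" : builder.toString();
  }
}
```

The trade-off in `safe` is that a level change in the middle of a call only takes effect on the next call, which is usually acceptable for diagnostic output.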
[jira] [Commented] (HDFS-15115) Namenode crash caused by NPE in BlockPlacementPolicyDefault when dynamically change logger to debug
[ https://issues.apache.org/jira/browse/HDFS-15115?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17040398#comment-17040398 ]

Yuval Degani commented on HDFS-15115:
-------------------------------------

Thanks, [~ayushtkn]!
[jira] [Commented] (HDFS-15165) In Du missed calling getAttributesProvider
[ https://issues.apache.org/jira/browse/HDFS-15165?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17040380#comment-17040380 ]

Hudson commented on HDFS-15165:
-------------------------------

SUCCESS: Integrated in Jenkins build Hadoop-trunk-Commit #17968 (See [https://builds.apache.org/job/Hadoop-trunk-Commit/17968/])
HDFS-15165. In Du missed calling getAttributesProvider. Contributed by (inigoiri: rev ec7507162c7e23c0cd251e09b6be0030a500f1ca)
* (edit) hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/FSPermissionChecker.java
* (edit) hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/namenode/TestINodeAttributeProvider.java

> In Du missed calling getAttributesProvider
> ------------------------------------------
>
>                 Key: HDFS-15165
>                 URL: https://issues.apache.org/jira/browse/HDFS-15165
>             Project: Hadoop HDFS
>          Issue Type: Bug
>            Reporter: Bharat Viswanadham
>            Assignee: Bharat Viswanadham
>            Priority: Major
>             Fix For: 3.3.0
>
>         Attachments: HDFS-15165.00.patch, HDFS-15165.01.patch,
> example-test.patch
>
>
> HDFS-12130 changed the behavior of the DU command.
> It merged permission checking and computation into a single step.
> During this change, where it is required to get the inode attributes, it just
> used inode.getAttributes(). But when an attribute provider class is
> configured, we should call the configured attribute provider object to get
> the InodeAttributes and use the returned InodeAttributes during
> checkPermission.
> After HDFS-12130, the code is as below.
>
> {code:java}
> byte[][] localComponents = {inode.getLocalNameBytes()};
> INodeAttributes[] iNodeAttr = {inode.getSnapshotINode(snapshotId)};
> enforcer.checkPermission(
>     fsOwner, supergroup, callerUgi,
>     iNodeAttr, // single inode attr in the array
>     new INode[]{inode}, // single inode in the array
>     localComponents, snapshotId,
>     null, -1, // this will skip checkTraverse() because
>               // not checking ancestor here
>     false, null, null,
>     access, // the target access to be checked against the inode
>     null, // passing null sub access avoids checking children
>     false);
> {code}
>
> The 2nd line is missing a check: if an attribute provider class is
> configured, it should be used to get the InodeAttributes. Because of this,
> when an hdfs path is managed by Sentry and the InodeAttributeProvider class
> is configured with SentryINodeAttributeProvider, it does not get the
> SentryInodeAttributeProvider object and does not use the AclFeature from it
> if any ACLs are set. This has caused an AccessControlException when the du
> command is run against an hdfs path managed by Sentry.
>
> {code:java}
> [root@gg-620-1 ~]# hdfs dfs -du /dev/edl/sc/consumer/lpfg/str/edf/abc/
> du: Permission denied: user=systest, access=READ_EXECUTE,
> inode="/dev/edl/sc/consumer/lpfg/str/lpfg_wrk/PRISMA_TO_ICERTIS_OUTBOUND_RM_MASTER/_impala_insert_staging":impala:hive:drwxrwx--x{code}
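The missing step described in the issue above amounts to a fallback rule: use the inode's own attributes unless an attribute provider is configured, in which case ask the provider. The sketch below shows only that rule. It is not the change made to FSPermissionChecker in the committed patch, and `AttrsSketch`, `AttrProviderSketch` and `DuPermissionSketch.resolve` are hypothetical names with deliberately simplified signatures.

```java
// Hypothetical, simplified types; not the Hadoop INodeAttributes API.
interface AttrsSketch {
  String owner();
}

interface AttrProviderSketch {
  // Returns the attributes to enforce for the path, given the inode defaults.
  AttrsSketch getAttributes(String path, AttrsSketch defaults);
}

class DuPermissionSketch {
  // Picks the attributes that a permission check should be run against.
  static AttrsSketch resolve(AttrProviderSketch provider, String path,
                             AttrsSketch fromInode) {
    if (provider == null) {
      // No provider configured: the inode's own attributes apply.
      return fromInode;
    }
    // Provider configured: it may override owner, group, permissions or ACLs.
    return provider.getAttributes(path, fromInode);
  }
}
```

In this simplified model, a permission check for du would receive the result of `resolve(...)` rather than the raw inode attributes, so an external authorizer's view of the path is what gets enforced.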
[jira] [Commented] (HDFS-15159) Prevent adding same DN multiple times in PendingReconstructionBlocks
[ https://issues.apache.org/jira/browse/HDFS-15159?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17040377#comment-17040377 ]

Hadoop QA commented on HDFS-15159:
----------------------------------

| (x) *-1 overall* |

|| Vote || Subsystem || Runtime || Comment ||
| 0 | reexec | 0m 49s | Docker mode activated. |
|| || || || Prechecks ||
| +1 | @author | 0m 1s | The patch does not contain any @author tags. |
| -1 | test4tests | 0m 0s | The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. |
|| || || || trunk Compile Tests ||
| +1 | mvninstall | 21m 29s | trunk passed |
| +1 | compile | 1m 8s | trunk passed |
| +1 | checkstyle | 0m 44s | trunk passed |
| +1 | mvnsite | 1m 13s | trunk passed |
| +1 | shadedclient | 16m 35s | branch has no errors when building and testing our client artifacts. |
| +1 | findbugs | 2m 57s | trunk passed |
| +1 | javadoc | 0m 44s | trunk passed |
|| || || || Patch Compile Tests ||
| +1 | mvninstall | 1m 23s | the patch passed |
| +1 | compile | 1m 16s | the patch passed |
| +1 | javac | 1m 16s | the patch passed |
| +1 | checkstyle | 0m 51s | the patch passed |
| +1 | mvnsite | 1m 20s | the patch passed |
| +1 | whitespace | 0m 0s | The patch has no whitespace issues. |
| +1 | shadedclient | 17m 2s | patch has no errors when building and testing our client artifacts. |
| +1 | findbugs | 3m 43s | the patch passed |
| +1 | javadoc | 0m 47s | the patch passed |
|| || || || Other Tests ||
| -1 | unit | 39m 50s | hadoop-hdfs in the patch passed. |
| -1 | asflicense | 0m 33s | The patch generated 1 ASF License warnings. |
| | | 112m 50s | |

|| Reason || Tests ||
| Failed junit tests | hadoop.hdfs.server.datanode.TestDataNodeVolumeFailureReporting |
| | hadoop.hdfs.server.namenode.TestEditLog |
| | hadoop.hdfs.server.namenode.TestAddStripedBlockInFBR |
| | hadoop.hdfs.server.namenode.TestParallelImageWrite |
| | hadoop.hdfs.server.datanode.fsdataset.impl.TestSpaceReservation |
| | hadoop.hdfs.server.namenode.TestLeaseManager |

|| Subsystem || Report/Notes ||
| Docker | Client=19.03.6 Server=19.03.6 Image:yetus/hadoop:c44943d1fc3 |
| JIRA Issue | HDFS-15159 |
| JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12993918/HDFS-15159.001.patch |
| Optional Tests | dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient findbugs checkstyle |
| uname | Linux 12bf223395cf 4.15.0-74-generic #84-Ubuntu SMP Thu Dec 19 08:06:28 UTC 2019 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/patchprocess/precommit/personality/provided.sh |
| git revision | trunk / 3f1aad0 |
| maven | version: Apache Maven 3.3.9 |
| Default Java | 1.8.0_232 |
| findbugs | v3.1.0-RC1 |
| unit |
[jira] [Commented] (HDFS-15165) In Du missed calling getAttributesProvider
[ https://issues.apache.org/jira/browse/HDFS-15165?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17040372#comment-17040372 ]

Íñigo Goiri commented on HDFS-15165:
------------------------------------

Thanks [~bharat] for the patch and [~sodonnell] for the review.
Committed to trunk.
[jira] [Updated] (HDFS-15165) In Du missed calling getAttributesProvider
[ https://issues.apache.org/jira/browse/HDFS-15165?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Íñigo Goiri updated HDFS-15165:
-------------------------------
    Fix Version/s: 3.3.0
     Hadoop Flags: Reviewed
       Resolution: Fixed
           Status: Resolved  (was: Patch Available)
[jira] [Commented] (HDFS-15165) In Du missed calling getAttributesProvider
[ https://issues.apache.org/jira/browse/HDFS-15165?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17040361#comment-17040361 ] Íñigo Goiri commented on HDFS-15165: +1 on [^HDFS-15165.01.patch]. Committing. > In Du missed calling getAttributesProvider > -- > > Key: HDFS-15165 > URL: https://issues.apache.org/jira/browse/HDFS-15165 > Project: Hadoop HDFS > Issue Type: Bug >Reporter: Bharat Viswanadham >Assignee: Bharat Viswanadham >Priority: Major > Attachments: HDFS-15165.00.patch, HDFS-15165.01.patch, > example-test.patch > > > HDFS-12130 changed the behavior of DU command. > It merged both check permission and computation in to a single step. > During this change, when it is required to getInodeAttributes, it just used > inode.getAttributes(). But when attribute provider class is configured, we > should call attribute provider configured object to get InodeAttributes and > use the returned InodeAttributes during checkPermission. > So, if we see after HDFS-12130, code is changed as below. > > {code:java} > byte[][] localComponents = {inode.getLocalNameBytes()}; > INodeAttributes[] iNodeAttr = {inode.getSnapshotINode(snapshotId)}; > enforcer.checkPermission( > fsOwner, supergroup, callerUgi, > iNodeAttr, // single inode attr in the array > new INode[]{inode}, // single inode in the array > localComponents, snapshotId, > null, -1, // this will skip checkTraverse() because > // not checking ancestor here > false, null, null, > access, // the target access to be checked against the inode > null, // passing null sub access avoids checking children > false); > {code} > > If we observe 2nd line it is missing the check if attribute provider class is > configured use that to get InodeAttributeProvider. Because of this when hdfs > path is managed by sentry, and InodeAttributeProvider class is configured > with SentryINodeAttributeProvider, it does not get > SentryInodeAttributeProvider object and not using AclFeature from that if any > Acl’s are set. 
This has caused the issue of AccessControlException when du > command is run against hdfs path managed by Sentry. > > {code:java} > [root@gg-620-1 ~]# hdfs dfs -du /dev/edl/sc/consumer/lpfg/str/edf/abc/ > du: Permission denied: user=systest, access=READ_EXECUTE, > inode="/dev/edl/sc/consumer/lpfg/str/lpfg_wrk/PRISMA_TO_ICERTIS_OUTBOUND_RM_MASTER/_impala_insert_staging":impala:hive:drwxrwx--x{code} -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-15176) Enable GcTimePercentage Metric in NameNode's JvmMetrics.
[ https://issues.apache.org/jira/browse/HDFS-15176?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17040354#comment-17040354 ] Íñigo Goiri commented on HDFS-15176: +1 on [^HDFS-15176.004.patch]. > Enable GcTimePercentage Metric in NameNode's JvmMetrics. > > > Key: HDFS-15176 > URL: https://issues.apache.org/jira/browse/HDFS-15176 > Project: Hadoop HDFS > Issue Type: Improvement >Reporter: Jinglun >Assignee: Jinglun >Priority: Minor > Attachments: HDFS-15176.001.patch, HDFS-15176.002.patch, > HDFS-15176.003.patch, HDFS-15176.004.patch > > > The GcTimePercentage(computed by GcTimeMonitor) could be used as a dimension > to analyze the NameNode GC. We should add a switch config to enable the > GcTimePercentage metric in HDFS. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-15159) Prevent adding same DN multiple times in PendingReconstructionBlocks
[ https://issues.apache.org/jira/browse/HDFS-15159?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17040351#comment-17040351 ] Íñigo Goiri commented on HDFS-15159: We definitely need a test here. > Prevent adding same DN multiple times in PendingReconstructionBlocks > > > Key: HDFS-15159 > URL: https://issues.apache.org/jira/browse/HDFS-15159 > Project: Hadoop HDFS > Issue Type: Improvement >Reporter: hemanthboyina >Assignee: hemanthboyina >Priority: Major > Attachments: HDFS-15159.001.patch > > -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
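As a rough illustration of the invariant such a test could assert, the sketch below keeps a list of pending targets and refuses to add the same DataNode twice. It is an assumption-laden toy, not the code in HDFS-15159.001.patch: the class PendingTargetsSketch, the method addTarget, and the use of plain string ids are all hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

class PendingTargetsSketch {
  // Hypothetical: DataNodes are identified by plain strings here.
  private final List<String> targets = new ArrayList<>();

  /** Adds a DataNode id only if it is not already pending; returns true when the list changed. */
  boolean addTarget(String dnId) {
    if (targets.contains(dnId)) {
      return false;
    }
    return targets.add(dnId);
  }

  int size() {
    return targets.size();
  }

  public static void main(String[] args) {
    PendingTargetsSketch pending = new PendingTargetsSketch();
    pending.addTarget("dn-1");
    pending.addTarget("dn-1"); // duplicate, ignored
    pending.addTarget("dn-2");
    System.out.println(pending.size()); // prints 2
  }
}
```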
[jira] [Commented] (HDFS-14731) [FGL] Remove redundant locking on NameNode.
[ https://issues.apache.org/jira/browse/HDFS-14731?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17040325#comment-17040325 ] Yuval Degani commented on HDFS-14731: - [~shv], the patch LGTM. This is a great step in untangling some of the locking mess. +1 > [FGL] Remove redundant locking on NameNode. > --- > > Key: HDFS-14731 > URL: https://issues.apache.org/jira/browse/HDFS-14731 > Project: Hadoop HDFS > Issue Type: Sub-task > Components: namenode >Reporter: Konstantin Shvachko >Assignee: Konstantin Shvachko >Priority: Major > Attachments: HDFS-14731.001.patch > > > Currently NameNode has two global locks: FSNamesystemLock and > FSDirectoryLock. An analysis shows that a single FSNamesystemLock is sufficient > to guarantee consistency of the NameNode state. FSDirectoryLock can be > removed. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-15154) Allow only hdfs superusers the ability to assign HDFS storage policies
[ https://issues.apache.org/jira/browse/HDFS-15154?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17040323#comment-17040323 ] Siddharth Wagle commented on HDFS-15154: I realized that the deprecated key is now ignored; I am trying to figure out a clean way to handle deprecation. > Allow only hdfs superusers the ability to assign HDFS storage policies > -- > > Key: HDFS-15154 > URL: https://issues.apache.org/jira/browse/HDFS-15154 > Project: Hadoop HDFS > Issue Type: Improvement > Components: hdfs >Affects Versions: 3.0.0 >Reporter: Bob Cauthen >Assignee: Siddharth Wagle >Priority: Major > Attachments: HDFS-15154.01.patch, HDFS-15154.02.patch, > HDFS-15154.03.patch, HDFS-15154.04.patch > > > Please provide a way to limit only HDFS superusers the ability to assign HDFS > Storage Policies to HDFS directories. > Currently, and based on Jira HDFS-7093, all storage policies can be disabled > cluster wide by setting the following: > dfs.storage.policy.enabled to false > But we need a way to allow only HDFS superusers the ability to assign an HDFS > Storage Policy to an HDFS directory. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Comment Edited] (HDFS-15154) Allow only hdfs superusers the ability to assign HDFS storage policies
[ https://issues.apache.org/jira/browse/HDFS-15154?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17040323#comment-17040323 ] Siddharth Wagle edited comment on HDFS-15154 at 2/19/20 6:38 PM: - I realized that the deprecated key is now ignored, trying to figure out a clean way to handle deprecation. was (Author: swagle): I realized that the deprecated key is now ignored, trying to figure out a clean way to hand deprecation. > Allow only hdfs superusers the ability to assign HDFS storage policies > -- > > Key: HDFS-15154 > URL: https://issues.apache.org/jira/browse/HDFS-15154 > Project: Hadoop HDFS > Issue Type: Improvement > Components: hdfs >Affects Versions: 3.0.0 >Reporter: Bob Cauthen >Assignee: Siddharth Wagle >Priority: Major > Attachments: HDFS-15154.01.patch, HDFS-15154.02.patch, > HDFS-15154.03.patch, HDFS-15154.04.patch > > > Please provide a way to limit only HDFS superusers the ability to assign HDFS > Storage Policies to HDFS directories. > Currently, and based on Jira HDFS-7093, all storage policies can be disabled > cluster wide by setting the following: > dfs.storage.policy.enabled to false > But we need a way to allow only HDFS superusers the ability to assign an HDFS > Storage Policy to an HDFS directory. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-15149) TestDeadNodeDetection test cases time-out
[ https://issues.apache.org/jira/browse/HDFS-15149?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17040280#comment-17040280 ] Ahmed Hussein commented on HDFS-15149: -- Hi [~leosun08], {{TestDeadNodeDetection}} is still flaky. To reproduce, run the test case inside a loop and leave it running for some time. {code:bash} ## from root directory mvn test -Dtest=TestDeadNodeDetection cd hadoop-hdfs-project/hadoop-hdfs/ mvn test -Dtest=TestDeadNodeDetection while :;do mvn surefire:test -Dtest=TestDeadNodeDetection || break;done {code} {{testDeadNodeDetectionDeadNodeRecovery}} failed on every single run. > TestDeadNodeDetection test cases time-out > - > > Key: HDFS-15149 > URL: https://issues.apache.org/jira/browse/HDFS-15149 > Project: Hadoop HDFS > Issue Type: Bug > Components: datanode >Reporter: Ahmed Hussein >Assignee: Lisheng Sun >Priority: Major > Attachments: HDFS-15149-001.patch, HDFS-15149-002.patch > > > TestDeadNodeDetection JUnit tests time out with the following stack > traces: > * 1- testDeadNodeDetectionInBackground* > {code:bash} > [ERROR] Tests run: 5, Failures: 0, Errors: 1, Skipped: 0, Time elapsed: > 264.757 s <<< FAILURE! - in org.apache.hadoop.hdfs.TestDeadNodeDetection > [ERROR] > testDeadNodeDetectionInBackground(org.apache.hadoop.hdfs.TestDeadNodeDetection) > Time elapsed: 125.806 s <<< ERROR! > java.util.concurrent.TimeoutException: > Timed out waiting for condition. 
Thread diagnostics: > Timestamp: 2020-01-24 08:31:07,023 > "client DomainSocketWatcher" daemon prio=5 tid=117 runnable > java.lang.Thread.State: RUNNABLE > at org.apache.hadoop.net.unix.DomainSocketWatcher.doPoll0(Native > Method) > at > org.apache.hadoop.net.unix.DomainSocketWatcher.access$900(DomainSocketWatcher.java:52) > at > org.apache.hadoop.net.unix.DomainSocketWatcher$2.run(DomainSocketWatcher.java:503) > at java.lang.Thread.run(Thread.java:748) > "Session-HouseKeeper-48c3205a" prio=5 tid=350 timed_waiting > java.lang.Thread.State: TIMED_WAITING > at sun.misc.Unsafe.park(Native Method) > at > java.util.concurrent.locks.LockSupport.parkNanos(LockSupport.java:215) > at > java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.awaitNanos(AbstractQueuedSynchronizer.java:2078) > at > java.util.concurrent.ScheduledThreadPoolExecutor$DelayedWorkQueue.take(ScheduledThreadPoolExecutor.java:1093) > at > java.util.concurrent.ScheduledThreadPoolExecutor$DelayedWorkQueue.take(ScheduledThreadPoolExecutor.java:809) > at > java.util.concurrent.ThreadPoolExecutor.getTask(ThreadPoolExecutor.java:1074) > at > java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1134) > at > java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) > at java.lang.Thread.run(Thread.java:748) > "java.util.concurrent.ThreadPoolExecutor$Worker@3ae54156[State = -1, empty > queue]" daemon prio=5 tid=752 in Object.wait() > java.lang.Thread.State: WAITING (on object monitor) > at sun.misc.Unsafe.park(Native Method) > at java.util.concurrent.locks.LockSupport.park(LockSupport.java:175) > at > java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.await(AbstractQueuedSynchronizer.java:2039) > at > java.util.concurrent.LinkedBlockingQueue.take(LinkedBlockingQueue.java:442) > at > java.util.concurrent.ThreadPoolExecutor.getTask(ThreadPoolExecutor.java:1074) > at > 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1134) > at > java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) > at java.lang.Thread.run(Thread.java:748) > "CacheReplicationMonitor(1960356187)" prio=5 tid=386 timed_waiting > java.lang.Thread.State: TIMED_WAITING > at sun.misc.Unsafe.park(Native Method) > at > java.util.concurrent.locks.LockSupport.parkNanos(LockSupport.java:215) > at > java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.await(AbstractQueuedSynchronizer.java:2163) > at > org.apache.hadoop.hdfs.server.blockmanagement.CacheReplicationMonitor.run(CacheReplicationMonitor.java:181) > "Timer for 'NameNode' metrics system" daemon prio=5 tid=339 timed_waiting > java.lang.Thread.State: TIMED_WAITING > at java.lang.Object.wait(Native Method) > at java.util.TimerThread.mainLoop(Timer.java:552) > at java.util.TimerThread.run(Timer.java:505) > "org.apache.hadoop.hdfs.server.namenode.FSNamesystem$LazyPersistFileScrubber@6b760460" > daemon prio=5 tid=385 timed_waiting > java.lang.Thread.State: TIMED_WAITING > at java.lang.Thread.sleep(Native Method) > at >
[jira] [Commented] (HDFS-15149) TestDeadNodeDetection test cases time-out
[ https://issues.apache.org/jira/browse/HDFS-15149?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17040277#comment-17040277 ] Íñigo Goiri commented on HDFS-15149: I was referring to syncing through interruptions and making it more explicit. I'm also in favor of adding the type of coordination that [~ahussein] is bringing up. > TestDeadNodeDetection test cases time-out > - > > Key: HDFS-15149 > URL: https://issues.apache.org/jira/browse/HDFS-15149 > Project: Hadoop HDFS > Issue Type: Bug > Components: datanode >Reporter: Ahmed Hussein >Assignee: Lisheng Sun >Priority: Major > Attachments: HDFS-15149-001.patch, HDFS-15149-002.patch > > > TestDeadNodeDetection JUnit tests time out with the following stack > traces: > * 1- testDeadNodeDetectionInBackground* > {code:bash} > [ERROR] Tests run: 5, Failures: 0, Errors: 1, Skipped: 0, Time elapsed: > 264.757 s <<< FAILURE! - in org.apache.hadoop.hdfs.TestDeadNodeDetection > [ERROR] > testDeadNodeDetectionInBackground(org.apache.hadoop.hdfs.TestDeadNodeDetection) > Time elapsed: 125.806 s <<< ERROR! > java.util.concurrent.TimeoutException: > Timed out waiting for condition. 
Thread diagnostics: > Timestamp: 2020-01-24 08:31:07,023 > "client DomainSocketWatcher" daemon prio=5 tid=117 runnable > java.lang.Thread.State: RUNNABLE > at org.apache.hadoop.net.unix.DomainSocketWatcher.doPoll0(Native > Method) > at > org.apache.hadoop.net.unix.DomainSocketWatcher.access$900(DomainSocketWatcher.java:52) > at > org.apache.hadoop.net.unix.DomainSocketWatcher$2.run(DomainSocketWatcher.java:503) > at java.lang.Thread.run(Thread.java:748) > "Session-HouseKeeper-48c3205a" prio=5 tid=350 timed_waiting > java.lang.Thread.State: TIMED_WAITING > at sun.misc.Unsafe.park(Native Method) > at > java.util.concurrent.locks.LockSupport.parkNanos(LockSupport.java:215) > at > java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.awaitNanos(AbstractQueuedSynchronizer.java:2078) > at > java.util.concurrent.ScheduledThreadPoolExecutor$DelayedWorkQueue.take(ScheduledThreadPoolExecutor.java:1093) > at > java.util.concurrent.ScheduledThreadPoolExecutor$DelayedWorkQueue.take(ScheduledThreadPoolExecutor.java:809) > at > java.util.concurrent.ThreadPoolExecutor.getTask(ThreadPoolExecutor.java:1074) > at > java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1134) > at > java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) > at java.lang.Thread.run(Thread.java:748) > "java.util.concurrent.ThreadPoolExecutor$Worker@3ae54156[State = -1, empty > queue]" daemon prio=5 tid=752 in Object.wait() > java.lang.Thread.State: WAITING (on object monitor) > at sun.misc.Unsafe.park(Native Method) > at java.util.concurrent.locks.LockSupport.park(LockSupport.java:175) > at > java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.await(AbstractQueuedSynchronizer.java:2039) > at > java.util.concurrent.LinkedBlockingQueue.take(LinkedBlockingQueue.java:442) > at > java.util.concurrent.ThreadPoolExecutor.getTask(ThreadPoolExecutor.java:1074) > at > 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1134) > at > java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) > at java.lang.Thread.run(Thread.java:748) > "CacheReplicationMonitor(1960356187)" prio=5 tid=386 timed_waiting > java.lang.Thread.State: TIMED_WAITING > at sun.misc.Unsafe.park(Native Method) > at > java.util.concurrent.locks.LockSupport.parkNanos(LockSupport.java:215) > at > java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.await(AbstractQueuedSynchronizer.java:2163) > at > org.apache.hadoop.hdfs.server.blockmanagement.CacheReplicationMonitor.run(CacheReplicationMonitor.java:181) > "Timer for 'NameNode' metrics system" daemon prio=5 tid=339 timed_waiting > java.lang.Thread.State: TIMED_WAITING > at java.lang.Object.wait(Native Method) > at java.util.TimerThread.mainLoop(Timer.java:552) > at java.util.TimerThread.run(Timer.java:505) > "org.apache.hadoop.hdfs.server.namenode.FSNamesystem$LazyPersistFileScrubber@6b760460" > daemon prio=5 tid=385 timed_waiting > java.lang.Thread.State: TIMED_WAITING > at java.lang.Thread.sleep(Native Method) > at > org.apache.hadoop.hdfs.server.namenode.FSNamesystem$LazyPersistFileScrubber.run(FSNamesystem.java:4420) > at java.lang.Thread.run(Thread.java:748) > "qtp164757726-349" daemon prio=5 tid=349 runnable > java.lang.Thread.State: RUNNABLE > at sun.nio.ch.EPollArrayWrapper.epollWait(Native
[jira] [Created] (HDFS-15184) Failed to execute goal org.apache.maven.plugins:maven-antrun-plugin:1.7:run (make) on project hadoop-hdfs-native-client: An Ant BuildException has occured: exec returned:
任建亭 created HDFS-15184: -- Summary: Failed to execute goal org.apache.maven.plugins:maven-antrun-plugin:1.7:run (make) on project hadoop-hdfs-native-client: An Ant BuildException has occured: exec returned: 1 Key: HDFS-15184 URL: https://issues.apache.org/jira/browse/HDFS-15184 Project: Hadoop HDFS Issue Type: Bug Components: hdfs Affects Versions: 3.2.1 Environment: Windows 10, JDK 1.8, Maven 3.6.1, ProtocolBuffer 2.5.0, CMake 3.1.3, Git 2.25.0, zlib 1.2.5, Visual Studio 2010 Professional Reporter: 任建亭 Fix For: 3.2.1 Building Hadoop 3.2.1 on Windows 10 fails. The command is 'mvn clean package -Pdist,native-win -DskipTests -Dtar'. {code:java} [ERROR] Failed to execute goal org.apache.maven.plugins:maven-antrun-plugin:1.7:run (make) on project hadoop-hdfs-native-client: An Ant BuildException has occured: exec returned: 1 [ERROR] around Ant part .. @ 9:122 in D:\h3s\hadoop-hdfs-project\hadoop-hdfs-native-client\target\antrun\build-main.xml {code} -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Updated] (HDFS-14816) TestFileCorruption#testCorruptionWithDiskFailure logic is not correct
[ https://issues.apache.org/jira/browse/HDFS-14816?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] hemanthboyina updated HDFS-14816: - Resolution: Not A Problem Status: Resolved (was: Patch Available) > TestFileCorruption#testCorruptionWithDiskFailure logic is not correct > - > > Key: HDFS-14816 > URL: https://issues.apache.org/jira/browse/HDFS-14816 > Project: Hadoop HDFS > Issue Type: Bug >Reporter: hemanthboyina >Assignee: hemanthboyina >Priority: Major > Attachments: HDFS-14816.001.patch > > -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Updated] (HDFS-15159) Prevent adding same DN multiple times in PendingReconstructionBlocks
[ https://issues.apache.org/jira/browse/HDFS-15159?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] hemanthboyina updated HDFS-15159: - Attachment: HDFS-15159.001.patch Status: Patch Available (was: Open) > Prevent adding same DN multiple times in PendingReconstructionBlocks > > > Key: HDFS-15159 > URL: https://issues.apache.org/jira/browse/HDFS-15159 > Project: Hadoop HDFS > Issue Type: Improvement >Reporter: hemanthboyina >Assignee: hemanthboyina >Priority: Major > Attachments: HDFS-15159.001.patch > > -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Created] (HDFS-15183) For AzureNativeFS, when BlockCompaction is enabled, FileSystem.create(path).close() would throw exception.
Xiaolei Liu created HDFS-15183: -- Summary: For AzureNativeFS, when BlockCompaction is enabled, FileSystem.create(path).close() would throw exception. Key: HDFS-15183 URL: https://issues.apache.org/jira/browse/HDFS-15183 Project: Hadoop HDFS Issue Type: Bug Components: fs/azure Affects Versions: 3.2.1, 2.9.2 Environment: macOS Mojave 10.14.6 Reporter: Xiaolei Liu For AzureNativeFS, when BlockCompaction is enabled, FileSystem.create(path).close() throws a "blob does not exist" exception. Block Compaction Setting: fs.azure.block.blob.with.compaction.dir The exception is thrown from close() and occurs only when nothing was written to the stream. When any content is written to the file, close() in the same context does not trigger the exception. When BlockCompaction is not enabled, the issue does not occur. Call Stack: org.apache.hadoop.fs.azure.AzureException: Source blob _$azuretmpfolder$/956457df-4a3e-4285-bc68-29f68b9b36c4test1911.log does not exist. org.apache.hadoop.fs.azure.AzureException: Source blob _$azuretmpfolder$/956457df-4a3e-4285-bc68-29f68b9b36c4test1911.log does not exist. at org.apache.hadoop.fs.azure.AzureNativeFileSystemStore.rename(AzureNativeFileSystemStore.java:2648) at org.apache.hadoop.fs.azure.AzureNativeFileSystemStore.rename(AzureNativeFileSystemStore.java:2608) at org.apache.hadoop.fs.azure.NativeAzureFileSystem$NativeAzureFsOutputStream.restoreKey(NativeAzureFileSystem.java:1199) at org.apache.hadoop.fs.azure.NativeAzureFileSystem$NativeAzureFsOutputStream.close(NativeAzureFileSystem.java:1068) at org.apache.hadoop.fs.FSDataOutputStream$PositionCache.close(FSDataOutputStream.java:72) at org.apache.hadoop.fs.FSDataOutputStream.close(FSDataOutputStream.java:106) -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-15149) TestDeadNodeDetection test cases time-out
[ https://issues.apache.org/jira/browse/HDFS-15149?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17040235#comment-17040235 ] Hadoop QA commented on HDFS-15149: -- | (/) *{color:green}+1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 20s{color} | {color:blue} Docker mode activated. {color} | || || || || {color:brown} Prechecks {color} || | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s{color} | {color:green} The patch does not contain any @author tags. {color} | | {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 0s{color} | {color:green} The patch appears to include 1 new or modified test files. {color} | || || || || {color:brown} trunk Compile Tests {color} || | {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 22s{color} | {color:blue} Maven dependency ordering for branch {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 19m 53s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 3m 17s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 53s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 1m 53s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 16m 56s{color} | {color:green} branch has no errors when building and testing our client artifacts. 
{color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 5m 1s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 7s{color} | {color:green} trunk passed {color} | || || || || {color:brown} Patch Compile Tests {color} || | {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 9s{color} | {color:blue} Maven dependency ordering for patch {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 1m 41s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 3m 9s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 3m 9s{color} | {color:green} the patch passed {color} | | {color:orange}-0{color} | {color:orange} checkstyle {color} | {color:orange} 0m 47s{color} | {color:orange} hadoop-hdfs-project: The patch generated 1 new + 48 unchanged - 0 fixed = 49 total (was 48) {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 1m 41s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s{color} | {color:green} The patch has no whitespace issues. {color} | | {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 14m 5s{color} | {color:green} patch has no errors when building and testing our client artifacts. {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 5m 13s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 2s{color} | {color:green} the patch passed {color} | || || || || {color:brown} Other Tests {color} || | {color:green}+1{color} | {color:green} unit {color} | {color:green} 1m 50s{color} | {color:green} hadoop-hdfs-client in the patch passed. 
{color} | | {color:green}+1{color} | {color:green} unit {color} | {color:green} 88m 25s{color} | {color:green} hadoop-hdfs in the patch passed. {color} | | {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 31s{color} | {color:green} The patch does not generate ASF License warnings. {color} | | {color:black}{color} | {color:black} {color} | {color:black}167m 36s{color} | {color:black} {color} | \\ \\ || Subsystem || Report/Notes || | Docker | Client=19.03.4 Server=19.03.4 Image:yetus/hadoop:c44943d1fc3 | | JIRA Issue | HDFS-15149 | | JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12993899/HDFS-15149-002.patch | | Optional Tests | dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient findbugs checkstyle | | uname | Linux 4e72eef0be54 4.15.0-74-generic #84-Ubuntu SMP Thu Dec 19 08:06:28 UTC 2019 x86_64 x86_64 x86_64 GNU/Linux | | Build tool | maven | | Personality | /testptch/patchprocess/precommit/personality/provided.sh | | git revision | trunk / cb3f3cc | | maven | version: Apache Maven 3.3.9 | | Default Java | 1.8.0_242 | | findbugs | v3.1.0-RC1 | | checkstyle |
[jira] [Commented] (HDFS-15182) TestBlockManager#testOneOfTwoRacksDecommissioned() fail in trunk
[ https://issues.apache.org/jira/browse/HDFS-15182?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17040199#comment-17040199 ] Hadoop QA commented on HDFS-15182: -- | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 47s{color} | {color:blue} Docker mode activated. {color} | || || || || {color:brown} Prechecks {color} || | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s{color} | {color:green} The patch does not contain any @author tags. {color} | | {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 0s{color} | {color:green} The patch appears to include 1 new or modified test files. {color} | || || || || {color:brown} trunk Compile Tests {color} || | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 20m 21s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 1m 16s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 50s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 1m 17s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 18m 5s{color} | {color:green} branch has no errors when building and testing our client artifacts. 
{color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 3m 29s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 46s{color} | {color:green} trunk passed {color} | || || || || {color:brown} Patch Compile Tests {color} || | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 1m 12s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 1m 1s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 1m 1s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 42s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 1m 20s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s{color} | {color:green} The patch has no whitespace issues. {color} | | {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 16m 33s{color} | {color:green} patch has no errors when building and testing our client artifacts. {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 3m 25s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 44s{color} | {color:green} the patch passed {color} | || || || || {color:brown} Other Tests {color} || | {color:red}-1{color} | {color:red} unit {color} | {color:red}118m 43s{color} | {color:red} hadoop-hdfs in the patch passed. {color} | | {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 32s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} | | {color:black}{color} | {color:black} {color} | {color:black}191m 20s{color} | {color:black} {color} | \\ \\ || Reason || Tests || | Failed junit tests | hadoop.hdfs.TestDeadNodeDetection | | | hadoop.hdfs.server.blockmanagement.TestBlockManager | | | hadoop.hdfs.server.namenode.ha.TestBootstrapStandby | | | hadoop.hdfs.server.namenode.TestAddOverReplicatedStripedBlocks | | | hadoop.hdfs.TestReconstructStripedFile | \\ \\ || Subsystem || Report/Notes || | Docker | Client=19.03.6 Server=19.03.6 Image:yetus/hadoop:c44943d1fc3 | | JIRA Issue | HDFS-15182 | | JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12993889/HDFS-15182-002.patch | | Optional Tests | dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient findbugs checkstyle | | uname | Linux 61343db41a0a 4.15.0-74-generic #84-Ubuntu SMP Thu Dec 19 08:06:28 UTC 2019 x86_64 x86_64 x86_64 GNU/Linux | | Build tool | maven | | Personality | /testptch/patchprocess/precommit/personality/provided.sh | | git revision | trunk / cb3f3cc | | maven | version: Apache Maven 3.3.9 | | Default Java | 1.8.0_232 | | findbugs | v3.1.0-RC1 | | unit | https://builds.apache.org/job/PreCommit-HDFS-Build/28809/artifact/out/patch-unit-hadoop-hdfs-project_hadoop-hdfs.txt | | Test Results | https://builds.apache.org/job/PreCommit-HDFS-Build/28809/testReport/ | | Max. process+thread count | 2956 (vs. ulimit of 5500) | | modules | C:
[jira] [Commented] (HDFS-15115) Namenode crash caused by NPE in BlockPlacementPolicyDefault when dynamically change logger to debug
[ https://issues.apache.org/jira/browse/HDFS-15115?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17040179#comment-17040179 ] Ayush Saxena commented on HDFS-15115: - Cherry-picked to branch-2.10 > Namenode crash caused by NPE in BlockPlacementPolicyDefault when dynamically > change logger to debug > --- > > Key: HDFS-15115 > URL: https://issues.apache.org/jira/browse/HDFS-15115 > Project: Hadoop HDFS > Issue Type: Bug >Reporter: wangzhixiang >Assignee: wangzhixiang >Priority: Major > Fix For: 3.3.0, 3.1.4, 3.2.2, 2.10.1 > > Attachments: HDFS-15115.001.patch, HDFS-15115.003.patch, > HDFS-15115.004.patch, HDFS-15115.005.patch, HDFS-15115.2.patch > > > To get debug info, we dynamically change the logger of > BlockPlacementPolicyDefault to debug while the namenode is running. However, the > Namenode crashes. From the log, we find some NPEs in > BlockPlacementPolicyDefault.chooseRandom. The *StringBuilder builder* > is used 4 times in the BlockPlacementPolicyDefault.chooseRandom method, > but the *builder* is only initialized at the first of those places. If we > change the logger of BlockPlacementPolicyDefault to debug after that point, the > *builder* in the remaining part is *NULL* and causes an *NPE* -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
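The failure mode described in HDFS-15115 can be reproduced in isolation. The sketch below is not the Hadoop code and not the committed patch: LazyBuilderDemo, the debugEnabled flag, and both methods are hypothetical stand-ins for a logger whose level changes between two checks inside one method call.

```java
import java.util.concurrent.atomic.AtomicBoolean;

class LazyBuilderDemo {
    // Hypothetical stand-in for a logger level that an operator can flip at runtime.
    static final AtomicBoolean debugEnabled = new AtomicBoolean(false);

    // Buggy pattern: the builder is created only if debug is on at the first check,
    // but a later check re-reads the flag and assumes the builder exists.
    static String buggy(Runnable betweenChecks) {
        StringBuilder builder = null;
        if (debugEnabled.get()) {
            builder = new StringBuilder();
        }
        betweenChecks.run(); // the log level may change here
        if (debugEnabled.get()) {
            builder.append("node excluded"); // NPE if debug was enabled after the first check
        }
        return builder == null ? "" : builder.toString();
    }

    // One possible guard: test the builder itself, so a late level change is harmless.
    static String guarded(Runnable betweenChecks) {
        StringBuilder builder = debugEnabled.get() ? new StringBuilder() : null;
        betweenChecks.run();
        if (builder != null) {
            builder.append("node excluded");
        }
        return builder == null ? "" : builder.toString();
    }

    public static void main(String[] args) {
        Runnable enableDebug = () -> debugEnabled.set(true);
        try {
            buggy(enableDebug);
        } catch (NullPointerException e) {
            System.out.println("buggy variant threw NullPointerException");
        }
        debugEnabled.set(false);
        System.out.println("guarded variant returned: '" + guarded(enableDebug) + "'");
    }
}
```

Checking the builder rather than the log level means the decision is made once per call, which is the property the issue description says was missing.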
[jira] [Updated] (HDFS-15115) Namenode crash caused by NPE in BlockPlacementPolicyDefault when dynamically change logger to debug
[ https://issues.apache.org/jira/browse/HDFS-15115?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ayush Saxena updated HDFS-15115: Fix Version/s: 2.10.1 > Namenode crash caused by NPE in BlockPlacementPolicyDefault when dynamically > change logger to debug > --- > > Key: HDFS-15115 > URL: https://issues.apache.org/jira/browse/HDFS-15115 > Project: Hadoop HDFS > Issue Type: Bug >Reporter: wangzhixiang >Assignee: wangzhixiang >Priority: Major > Fix For: 3.3.0, 3.1.4, 3.2.2, 2.10.1 > > Attachments: HDFS-15115.001.patch, HDFS-15115.003.patch, > HDFS-15115.004.patch, HDFS-15115.005.patch, HDFS-15115.2.patch > > > To get debug info, we dynamically change the logger of > BlockPlacementPolicyDefault to debug while the namenode is running. However, the > Namenode crashes. From the log, we find some NPEs in > BlockPlacementPolicyDefault.chooseRandom. The *StringBuilder builder* > is used 4 times in the BlockPlacementPolicyDefault.chooseRandom method, > but the *builder* is only initialized at the first of those places. If we > change the logger of BlockPlacementPolicyDefault to debug after that point, the > *builder* in the remaining part is *NULL* and causes an *NPE*
[jira] [Commented] (HDFS-15176) Enable GcTimePercentage Metric in NameNode's JvmMetrics.
[ https://issues.apache.org/jira/browse/HDFS-15176?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17040167#comment-17040167 ] Hadoop QA commented on HDFS-15176: -- | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 47s{color} | {color:blue} Docker mode activated. {color} | || || || || {color:brown} Prechecks {color} || | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s{color} | {color:green} The patch does not contain any @author tags. {color} | | {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 0s{color} | {color:green} The patch appears to include 1 new or modified test files. {color} | || || || || {color:brown} trunk Compile Tests {color} || | {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 21s{color} | {color:blue} Maven dependency ordering for branch {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 19m 30s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 15m 47s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 2m 42s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 2m 32s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 20m 16s{color} | {color:green} branch has no errors when building and testing our client artifacts. 
{color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 4m 57s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 41s{color} | {color:green} trunk passed {color} | || || || || {color:brown} Patch Compile Tests {color} || | {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 22s{color} | {color:blue} Maven dependency ordering for patch {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 1m 51s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 15m 13s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 15m 13s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 2m 42s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 2m 26s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s{color} | {color:green} The patch has no whitespace issues. {color} | | {color:green}+1{color} | {color:green} xml {color} | {color:green} 0m 1s{color} | {color:green} The patch has no ill-formed XML file. {color} | | {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 15m 7s{color} | {color:green} patch has no errors when building and testing our client artifacts. 
{color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 5m 21s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 39s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} unit {color} | {color:green} 9m 30s{color} | {color:green} hadoop-common in the patch passed. {color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red}119m 17s{color} | {color:red} hadoop-hdfs in the patch passed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 59s{color} | {color:green} The patch does not generate ASF License warnings. {color} |
| {color:black}{color} | {color:black} {color} | {color:black}240m 54s{color} | {color:black} {color} |
\\ \\
|| Reason || Tests ||
| Failed junit tests | hadoop.hdfs.server.namenode.ha.TestEditLogTailer |
| | hadoop.hdfs.qjournal.server.TestJournalNodeSync |
| | hadoop.hdfs.TestRollingUpgrade |
\\ \\
|| Subsystem || Report/Notes ||
| Docker | Client=19.03.6 Server=19.03.6 Image:yetus/hadoop:c44943d1fc3 |
| JIRA Issue | HDFS-15176 |
| JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12993879/HDFS-15176.004.patch |
| Optional Tests | dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient findbugs checkstyle xml |
| uname | Linux c1481eec50e7 4.15.0-74-generic #84-Ubuntu SMP Thu Dec 19 08:06:28 UTC 2019 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality |
[jira] [Updated] (HDFS-15181) Webhdfs getTrashRoot() causes internal AccessControlException
[ https://issues.apache.org/jira/browse/HDFS-15181?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kihwal Lee updated HDFS-15181: -- Resolution: Duplicate Status: Resolved (was: Patch Available) Duping this to HDFS-15052. > Webhdfs getTrashRoot() causes internal AccessControlException > - > > Key: HDFS-15181 > URL: https://issues.apache.org/jira/browse/HDFS-15181 > Project: Hadoop HDFS > Issue Type: Bug > Components: webhdfs >Reporter: Kihwal Lee >Assignee: Kihwal Lee >Priority: Blocker > Attachments: HDFS-15181.trunk.patch > > > HDFS-10756 added the {{getTrashRoot()}} support for WebHdfs. However, it was > done by creating a FileSystem instance in the namenode. This is unacceptable > for many reasons and also the implementation is not correct. The current > implementation only works when security is off. When security is on, the > internal client receives an AccessControlException and does not work. > A similar bug was present in HDFS-11156. Again, this is not merely a > "performance bug". These don't work with security on. Fortunately > HDFS-11156 was reverted and reworked. I've recently reverted it and ported > the rework to branch-2.10. > Unless HDFS-10756 can be remedied quickly, it needs to be reverted.
[jira] [Updated] (HDFS-15149) TestDeadNodeDetection test cases time-out
[ https://issues.apache.org/jira/browse/HDFS-15149?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Lisheng Sun updated HDFS-15149: --- Attachment: HDFS-15149-002.patch > TestDeadNodeDetection test cases time-out > - > > Key: HDFS-15149 > URL: https://issues.apache.org/jira/browse/HDFS-15149 > Project: Hadoop HDFS > Issue Type: Bug > Components: datanode >Reporter: Ahmed Hussein >Assignee: Lisheng Sun >Priority: Major > Attachments: HDFS-15149-001.patch, HDFS-15149-002.patch > > > TestDeadNodeDetection JUnit time out times out with the following stack > traces: > * 1- testDeadNodeDetectionInBackground* > {code:bash} > [ERROR] Tests run: 5, Failures: 0, Errors: 1, Skipped: 0, Time elapsed: > 264.757 s <<< FAILURE! - in org.apache.hadoop.hdfs.TestDeadNodeDetection > [ERROR] > testDeadNodeDetectionInBackground(org.apache.hadoop.hdfs.TestDeadNodeDetection) > Time elapsed: 125.806 s <<< ERROR! > java.util.concurrent.TimeoutException: > Timed out waiting for condition. Thread diagnostics: > Timestamp: 2020-01-24 08:31:07,023 > "client DomainSocketWatcher" daemon prio=5 tid=117 runnable > java.lang.Thread.State: RUNNABLE > at org.apache.hadoop.net.unix.DomainSocketWatcher.doPoll0(Native > Method) > at > org.apache.hadoop.net.unix.DomainSocketWatcher.access$900(DomainSocketWatcher.java:52) > at > org.apache.hadoop.net.unix.DomainSocketWatcher$2.run(DomainSocketWatcher.java:503) > at java.lang.Thread.run(Thread.java:748) > "Session-HouseKeeper-48c3205a" prio=5 tid=350 timed_waiting > java.lang.Thread.State: TIMED_WAITING > at sun.misc.Unsafe.park(Native Method) > at > java.util.concurrent.locks.LockSupport.parkNanos(LockSupport.java:215) > at > java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.awaitNanos(AbstractQueuedSynchronizer.java:2078) > at > java.util.concurrent.ScheduledThreadPoolExecutor$DelayedWorkQueue.take(ScheduledThreadPoolExecutor.java:1093) > at > 
java.util.concurrent.ScheduledThreadPoolExecutor$DelayedWorkQueue.take(ScheduledThreadPoolExecutor.java:809) > at > java.util.concurrent.ThreadPoolExecutor.getTask(ThreadPoolExecutor.java:1074) > at > java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1134) > at > java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) > at java.lang.Thread.run(Thread.java:748) > "java.util.concurrent.ThreadPoolExecutor$Worker@3ae54156[State = -1, empty > queue]" daemon prio=5 tid=752 in Object.wait() > java.lang.Thread.State: WAITING (on object monitor) > at sun.misc.Unsafe.park(Native Method) > at java.util.concurrent.locks.LockSupport.park(LockSupport.java:175) > at > java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.await(AbstractQueuedSynchronizer.java:2039) > at > java.util.concurrent.LinkedBlockingQueue.take(LinkedBlockingQueue.java:442) > at > java.util.concurrent.ThreadPoolExecutor.getTask(ThreadPoolExecutor.java:1074) > at > java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1134) > at > java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) > at java.lang.Thread.run(Thread.java:748) > "CacheReplicationMonitor(1960356187)" prio=5 tid=386 timed_waiting > java.lang.Thread.State: TIMED_WAITING > at sun.misc.Unsafe.park(Native Method) > at > java.util.concurrent.locks.LockSupport.parkNanos(LockSupport.java:215) > at > java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.await(AbstractQueuedSynchronizer.java:2163) > at > org.apache.hadoop.hdfs.server.blockmanagement.CacheReplicationMonitor.run(CacheReplicationMonitor.java:181) > "Timer for 'NameNode' metrics system" daemon prio=5 tid=339 timed_waiting > java.lang.Thread.State: TIMED_WAITING > at java.lang.Object.wait(Native Method) > at java.util.TimerThread.mainLoop(Timer.java:552) > at java.util.TimerThread.run(Timer.java:505) > 
"org.apache.hadoop.hdfs.server.namenode.FSNamesystem$LazyPersistFileScrubber@6b760460" > daemon prio=5 tid=385 timed_waiting > java.lang.Thread.State: TIMED_WAITING > at java.lang.Thread.sleep(Native Method) > at > org.apache.hadoop.hdfs.server.namenode.FSNamesystem$LazyPersistFileScrubber.run(FSNamesystem.java:4420) > at java.lang.Thread.run(Thread.java:748) > "qtp164757726-349" daemon prio=5 tid=349 runnable > java.lang.Thread.State: RUNNABLE > at sun.nio.ch.EPollArrayWrapper.epollWait(Native Method) > at sun.nio.ch.EPollArrayWrapper.poll(EPollArrayWrapper.java:269) > at sun.nio.ch.EPollSelectorImpl.doSelect(EPollSelectorImpl.java:93) >
[jira] [Commented] (HDFS-15149) TestDeadNodeDetection test cases time-out
[ https://issues.apache.org/jira/browse/HDFS-15149?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17040027#comment-17040027 ] Lisheng Sun commented on HDFS-15149: Hi [~elgoiri], {quote} I like the rest of the solution though. {quote} What does "the rest of the solution" refer to? > TestDeadNodeDetection test cases time-out > - > > Key: HDFS-15149 > URL: https://issues.apache.org/jira/browse/HDFS-15149 > Project: Hadoop HDFS > Issue Type: Bug > Components: datanode >Reporter: Ahmed Hussein >Assignee: Lisheng Sun >Priority: Major > Attachments: HDFS-15149-001.patch > > > TestDeadNodeDetection JUnit time out times out with the following stack > traces: > * 1- testDeadNodeDetectionInBackground* > {code:bash} > [ERROR] Tests run: 5, Failures: 0, Errors: 1, Skipped: 0, Time elapsed: > 264.757 s <<< FAILURE! - in org.apache.hadoop.hdfs.TestDeadNodeDetection > [ERROR] > testDeadNodeDetectionInBackground(org.apache.hadoop.hdfs.TestDeadNodeDetection) > Time elapsed: 125.806 s <<< ERROR! > java.util.concurrent.TimeoutException: > Timed out waiting for condition. 
[jira] [Commented] (HDFS-15149) TestDeadNodeDetection test cases time-out
[ https://issues.apache.org/jira/browse/HDFS-15149?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17040025#comment-17040025 ] Lisheng Sun commented on HDFS-15149:

Thanks [~ahussein] for your suggestions.

{quote} The poll period and waiting time (5000 and 10) in waitFoDeadNode is very large. I assume you had to use large numbers to match the delays of the detector threads. {quote}

The poll period and waiting time are indeed too long, and I can reduce them.

{quote} I have a question about clearAndGetDetectedDeadNodes(): As far as I understand Calling the method in a loop means that a "deadnode" can be removed from the deadNodes map. In other words, the count may never reach 3, because the map does not for the removed nodes from the list. Please feel free to correct my understanding of the code if I am wrong. {quote}

The purpose of clearAndGetDetectedDeadNodes() is to return the new deadNodes, excluding any dead node that is not used by any DFSInputStream.

{quote} IMHO, DeadNodeDetector.java needs to introduce more aggressive mechanisms to coordinate between the threads. Instead of just racing between each other, tasks can use conditional variables to communicate like synchronized queues, or object monitors. Another benefit from using conditional variables is that the runtime of the tests will be improved because there won't be need to wait for a full cycle. The DefaultSpeculator.java has a synchronized queue just for the purpose of testing: "DefaultSpeculator.scanControl". {quote}

I will optimize it based on your good suggestion. Thank you.
> TestDeadNodeDetection test cases time-out > - > > Key: HDFS-15149 > URL: https://issues.apache.org/jira/browse/HDFS-15149 > Project: Hadoop HDFS > Issue Type: Bug > Components: datanode >Reporter: Ahmed Hussein >Assignee: Lisheng Sun >Priority: Major > Attachments: HDFS-15149-001.patch > > > TestDeadNodeDetection JUnit time out times out with the following stack > traces: > * 1- testDeadNodeDetectionInBackground* > {code:bash} > [ERROR] Tests run: 5, Failures: 0, Errors: 1, Skipped: 0, Time elapsed: > 264.757 s <<< FAILURE! - in org.apache.hadoop.hdfs.TestDeadNodeDetection > [ERROR] > testDeadNodeDetectionInBackground(org.apache.hadoop.hdfs.TestDeadNodeDetection) > Time elapsed: 125.806 s <<< ERROR! > java.util.concurrent.TimeoutException: > Timed out waiting for condition. Thread diagnostics: > Timestamp: 2020-01-24 08:31:07,023 > "client DomainSocketWatcher" daemon prio=5 tid=117 runnable > java.lang.Thread.State: RUNNABLE > at org.apache.hadoop.net.unix.DomainSocketWatcher.doPoll0(Native > Method) > at > org.apache.hadoop.net.unix.DomainSocketWatcher.access$900(DomainSocketWatcher.java:52) > at > org.apache.hadoop.net.unix.DomainSocketWatcher$2.run(DomainSocketWatcher.java:503) > at java.lang.Thread.run(Thread.java:748) > "Session-HouseKeeper-48c3205a" prio=5 tid=350 timed_waiting > java.lang.Thread.State: TIMED_WAITING > at sun.misc.Unsafe.park(Native Method) > at > java.util.concurrent.locks.LockSupport.parkNanos(LockSupport.java:215) > at > java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.awaitNanos(AbstractQueuedSynchronizer.java:2078) > at > java.util.concurrent.ScheduledThreadPoolExecutor$DelayedWorkQueue.take(ScheduledThreadPoolExecutor.java:1093) > at > java.util.concurrent.ScheduledThreadPoolExecutor$DelayedWorkQueue.take(ScheduledThreadPoolExecutor.java:809) > at > java.util.concurrent.ThreadPoolExecutor.getTask(ThreadPoolExecutor.java:1074) > at > java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1134) > 
at > java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) > at java.lang.Thread.run(Thread.java:748) > "java.util.concurrent.ThreadPoolExecutor$Worker@3ae54156[State = -1, empty > queue]" daemon prio=5 tid=752 in Object.wait() > java.lang.Thread.State: WAITING (on object monitor) > at sun.misc.Unsafe.park(Native Method) > at java.util.concurrent.locks.LockSupport.park(LockSupport.java:175) > at > java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.await(AbstractQueuedSynchronizer.java:2039) > at > java.util.concurrent.LinkedBlockingQueue.take(LinkedBlockingQueue.java:442) > at > java.util.concurrent.ThreadPoolExecutor.getTask(ThreadPoolExecutor.java:1074) > at > java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1134) > at > java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) > at java.lang.Thread.run(Thread.java:748) > "CacheReplicationMonitor(1960356187)" prio=5 tid=386 timed_waiting
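The coordination idea quoted in this thread (signal through a synchronized queue instead of polling on a fixed period) could look roughly like the sketch below. It is only an illustration: DetectorSignalDemo, reportDeadNode, and awaitDeadNodes are hypothetical names, and none of this is taken from DeadNodeDetector or DefaultSpeculator.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.TimeUnit;

class DetectorSignalDemo {
    // The (hypothetical) detector thread publishes each newly detected node here.
    private final BlockingQueue<String> detected = new LinkedBlockingQueue<>();

    // Called by the detector when it marks a node dead.
    void reportDeadNode(String node) {
        detected.add(node);
    }

    // Test-side wait: returns as soon as `expected` nodes have been reported,
    // or with fewer nodes once the overall timeout elapses.
    List<String> awaitDeadNodes(int expected, long timeoutMs) {
        long deadline = System.nanoTime() + TimeUnit.MILLISECONDS.toNanos(timeoutMs);
        List<String> seen = new ArrayList<>();
        try {
            while (seen.size() < expected) {
                long remaining = deadline - System.nanoTime();
                String node = remaining > 0
                        ? detected.poll(remaining, TimeUnit.NANOSECONDS)
                        : null;
                if (node == null) {
                    break; // timed out before enough nodes were reported
                }
                seen.add(node);
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt(); // preserve the interrupt for the caller
        }
        return seen;
    }

    public static void main(String[] args) throws InterruptedException {
        DetectorSignalDemo demo = new DetectorSignalDemo();
        Thread detector = new Thread(() -> {
            for (int i = 1; i <= 3; i++) {
                demo.reportDeadNode("dn" + i);
            }
        });
        detector.start();
        System.out.println(demo.awaitDeadNodes(3, 5000));
        detector.join();
    }
}
```

Because the waiting side blocks on the queue, it wakes up as soon as a node is reported rather than after a full poll cycle, which is the runtime benefit the quoted comment mentions.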
[jira] [Commented] (HDFS-15177) Split datanode invalide block deletion, to avoid the FsDatasetImpl lock too much time.
[ https://issues.apache.org/jira/browse/HDFS-15177?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17040016#comment-17040016 ] Stephen O'Donnell commented on HDFS-15177: -- The FoldedTreeSet structure is used in both NN and DN. I have seen it behave slowly in both, so when the DN gets slowed down by all the deletes, it's worth scanning your jstack traces to see if you see anything like that mentioned in HDFS-15131. > Split datanode invalide block deletion, to avoid the FsDatasetImpl lock too > much time. > -- > > Key: HDFS-15177 > URL: https://issues.apache.org/jira/browse/HDFS-15177 > Project: Hadoop HDFS > Issue Type: Improvement > Components: datanode >Reporter: zhuqi >Assignee: zhuqi >Priority: Major > Attachments: image-2020-02-18-22-39-00-642.png, > image-2020-02-18-22-51-28-624.png, image-2020-02-18-22-52-59-202.png, > image-2020-02-18-22-55-38-661.png > > > In our cluster, a datanode can receive a delete command covering too many blocks > when many blockpools share the same datanode and the > datanode has about 30 storage dirs. This causes the FsDatasetImpl lock to be held for too > long.
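The general idea in the HDFS-15177 title, splitting one large deletion into smaller pieces so that a shared lock is not held for the whole operation, can be sketched as follows. This is an assumption-laden illustration, not the proposed patch: BatchedDeletionDemo, datasetLock, and the batch size are all hypothetical, and a plain ReentrantLock stands in for the FsDatasetImpl lock.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.locks.ReentrantLock;

class BatchedDeletionDemo {
    // Hypothetical stand-in for the dataset-wide lock.
    private final ReentrantLock datasetLock = new ReentrantLock();
    private final List<Long> deleted = new ArrayList<>();
    private int lockAcquisitions = 0;

    // Deletes blocks in chunks of batchSize, releasing the lock between chunks
    // so that other threads waiting on it can make progress.
    void deleteInBatches(List<Long> blockIds, int batchSize) {
        if (batchSize <= 0) {
            throw new IllegalArgumentException("batchSize must be positive");
        }
        for (int start = 0; start < blockIds.size(); start += batchSize) {
            int end = Math.min(start + batchSize, blockIds.size());
            datasetLock.lock();
            try {
                lockAcquisitions++;
                deleted.addAll(blockIds.subList(start, end));
            } finally {
                datasetLock.unlock();
            }
        }
    }

    int lockAcquisitions() {
        return lockAcquisitions;
    }

    int deletedCount() {
        return deleted.size();
    }

    public static void main(String[] args) {
        List<Long> ids = new ArrayList<>();
        for (long i = 0; i < 10; i++) {
            ids.add(i);
        }
        BatchedDeletionDemo demo = new BatchedDeletionDemo();
        demo.deleteInBatches(ids, 4);
        System.out.println(demo.deletedCount() + " blocks deleted in "
                + demo.lockAcquisitions() + " lock acquisitions");
    }
}
```

The trade-off is more lock acquisitions in exchange for a shorter maximum hold time; a suitable batch size would have to be measured on a real cluster.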
[jira] [Commented] (HDFS-15182) TestBlockManager#testOneOfTwoRacksDecommissioned() fail in trunk
[ https://issues.apache.org/jira/browse/HDFS-15182?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17039993#comment-17039993 ] Lisheng Sun commented on HDFS-15182: The v002 patch fixes it in @Before. > TestBlockManager#testOneOfTwoRacksDecommissioned() fail in trunk > > > Key: HDFS-15182 > URL: https://issues.apache.org/jira/browse/HDFS-15182 > Project: Hadoop HDFS > Issue Type: Bug >Reporter: Lisheng Sun >Assignee: Lisheng Sun >Priority: Minor > Attachments: HDFS-15182-001.patch, HDFS-15182-002.patch > > > When only the single UT TestBlockManager#testOneOfTwoRacksDecommissioned() is run, it > fails with a NullPointerException. > NameNode#metrics is a static variable, so when all UTs in TestBlockManager are run, > another UT has already initialized metrics. > Running only testOneOfTwoRacksDecommissioned, without initializing > metrics, throws a NullPointerException. > {code:java} > java.lang.NullPointerException > at > org.apache.hadoop.hdfs.server.blockmanagement.BlockManager.addBlock(BlockManager.java:4088) > at > org.apache.hadoop.hdfs.server.blockmanagement.TestBlockManager.fulfillPipeline(TestBlockManager.java:518) > at > org.apache.hadoop.hdfs.server.blockmanagement.TestBlockManager.doTestOneOfTwoRacksDecommissioned(TestBlockManager.java:388) > at > org.apache.hadoop.hdfs.server.blockmanagement.TestBlockManager.testOneOfTwoRacksDecommissioned(TestBlockManager.java:353) > at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) > at > sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) > at > sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) > at java.lang.reflect.Method.invoke(Method.java:498) > at > org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:50) > at > org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12) > at > org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:47) > at > 
org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17) > at > org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:26) > at org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:325) > at > org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:78) > at > org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:57) > at org.junit.runners.ParentRunner$3.run(ParentRunner.java:290) > at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:71) > at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:288) > at org.junit.runners.ParentRunner.access$000(ParentRunner.java:58) > at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:268) > at org.junit.runners.ParentRunner.run(ParentRunner.java:363) > at org.junit.runner.JUnitCore.run(JUnitCore.java:137) > at > com.intellij.junit4.JUnit4IdeaTestRunner.startRunnerWithArgs(JUnit4IdeaTestRunner.java:68) > at > com.intellij.rt.execution.junit.IdeaTestRunner$Repeater.startRunnerWithArgs(IdeaTestRunner.java:47) > at > com.intellij.rt.execution.junit.JUnitStarter.prepareStreamsAndStart(JUnitStarter.java:242) > at > com.intellij.rt.execution.junit.JUnitStarter.main(JUnitStarter.java:70) > {code} -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
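The order dependence described in HDFS-15182 (a static field that is only initialized when some other test happens to run first) can be shown without JUnit. The sketch below is hypothetical throughout: StaticMetricsDemo, Metrics, addBlock, and setUp are illustrative names, and setUp only imitates what a JUnit @Before method might do; it is not the content of the v002 patch.

```java
class StaticMetricsDemo {
    // Hypothetical metrics holder.
    static class Metrics {
        int blocksAdded;
    }

    // Stand-in for a static field that some other code path normally initializes.
    static Metrics metrics;

    // Code under test: dereferences the static field without checking it.
    static void addBlock() {
        metrics.blocksAdded++; // NullPointerException when metrics was never initialized
    }

    // What a JUnit @Before method could do: give every test a freshly
    // initialized field, so the result no longer depends on run order.
    static void setUp() {
        metrics = new Metrics();
    }

    public static void main(String[] args) {
        try {
            addBlock(); // simulates running the single test in isolation
        } catch (NullPointerException e) {
            System.out.println("without setUp: NullPointerException");
        }
        setUp();
        addBlock();
        System.out.println("with setUp: blocksAdded = " + metrics.blocksAdded);
    }
}
```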
[jira] [Updated] (HDFS-15182) TestBlockManager#testOneOfTwoRacksDecommissioned() fail in trunk
[ https://issues.apache.org/jira/browse/HDFS-15182?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Lisheng Sun updated HDFS-15182: --- Attachment: HDFS-15182-002.patch > TestBlockManager#testOneOfTwoRacksDecommissioned() fail in trunk > > > Key: HDFS-15182 > URL: https://issues.apache.org/jira/browse/HDFS-15182 > Project: Hadoop HDFS > Issue Type: Bug >Reporter: Lisheng Sun >Assignee: Lisheng Sun >Priority: Minor > Attachments: HDFS-15182-001.patch, HDFS-15182-002.patch > > > When only the single UT TestBlockManager#testOneOfTwoRacksDecommissioned() is run, it > fails with a NullPointerException. > NameNode#metrics is a static variable, so when all UTs in TestBlockManager are run, > another UT has already initialized metrics. > Running only testOneOfTwoRacksDecommissioned, without initializing > metrics, throws a NullPointerException. > {code:java} > java.lang.NullPointerException > at > org.apache.hadoop.hdfs.server.blockmanagement.BlockManager.addBlock(BlockManager.java:4088) > at > org.apache.hadoop.hdfs.server.blockmanagement.TestBlockManager.fulfillPipeline(TestBlockManager.java:518) > at > org.apache.hadoop.hdfs.server.blockmanagement.TestBlockManager.doTestOneOfTwoRacksDecommissioned(TestBlockManager.java:388) > at > org.apache.hadoop.hdfs.server.blockmanagement.TestBlockManager.testOneOfTwoRacksDecommissioned(TestBlockManager.java:353) > at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) > at > sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) > at > sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) > at java.lang.reflect.Method.invoke(Method.java:498) > at > org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:50) > at > org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12) > at > org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:47) > at > 
org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17) > at > org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:26) > at org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:325) > at > org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:78) > at > org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:57) > at org.junit.runners.ParentRunner$3.run(ParentRunner.java:290) > at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:71) > at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:288) > at org.junit.runners.ParentRunner.access$000(ParentRunner.java:58) > at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:268) > at org.junit.runners.ParentRunner.run(ParentRunner.java:363) > at org.junit.runner.JUnitCore.run(JUnitCore.java:137) > at > com.intellij.junit4.JUnit4IdeaTestRunner.startRunnerWithArgs(JUnit4IdeaTestRunner.java:68) > at > com.intellij.rt.execution.junit.IdeaTestRunner$Repeater.startRunnerWithArgs(IdeaTestRunner.java:47) > at > com.intellij.rt.execution.junit.JUnitStarter.prepareStreamsAndStart(JUnitStarter.java:242) > at > com.intellij.rt.execution.junit.JUnitStarter.main(JUnitStarter.java:70) > {code} -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-15167) Block Report Interval shouldn't be reset apart from first Block Report
[ https://issues.apache.org/jira/browse/HDFS-15167?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17039928#comment-17039928 ] Ayush Saxena commented on HDFS-15167: - Thanks [~elgoiri] for the review! [~surendrasingh], can you take a look too? > Block Report Interval shouldn't be reset apart from first Block Report > -- > > Key: HDFS-15167 > URL: https://issues.apache.org/jira/browse/HDFS-15167 > Project: Hadoop HDFS > Issue Type: Bug >Reporter: Ayush Saxena >Assignee: Ayush Saxena >Priority: Major > Attachments: HDFS-15167-01.patch, HDFS-15167-02.patch, > HDFS-15167-03.patch, HDFS-15167-04.patch, HDFS-15167-05.patch, > HDFS-15167-06.patch, HDFS-15167-07.patch, HDFS-15167-08.patch > > > Presently the BlockReport interval is reset even when the BR is manually > triggered or triggered for a diskError, > which isn't required. As per the code comment, the reset is intended for the first BR > only: > {code:java} > // If we have sent the first set of block reports, then wait a random > // time before we start the periodic block reports. > if (resetBlockReportTime) { > {code} >
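The behaviour requested in HDFS-15167 (apply the random delay once, after the first block report, and never again) can be sketched with a flag that is consumed on first use. Only the resetBlockReportTime name comes from the quoted snippet; BlockReportSchedulerDemo, reportSent, and the scheduling arithmetic are hypothetical and are not the DataNode's actual scheduler.

```java
import java.util.Random;

class BlockReportSchedulerDemo {
    private final long intervalMs;
    private final Random random;
    // True until the first report has been sent; consumed exactly once.
    private boolean resetBlockReportTime = true;
    private long nextReportTime;

    BlockReportSchedulerDemo(long intervalMs, Random random) {
        this.intervalMs = intervalMs;
        this.random = random;
    }

    // Called after any report (periodic or manually triggered) is sent at time `now`.
    void reportSent(long now) {
        if (resetBlockReportTime) {
            // First report only: wait a random fraction of the interval
            // before the periodic cycle starts.
            nextReportTime = now + (long) (random.nextDouble() * intervalMs);
            resetBlockReportTime = false;
        } else {
            // Every later report keeps the plain periodic schedule.
            nextReportTime = now + intervalMs;
        }
    }

    long nextReportTime() {
        return nextReportTime;
    }

    public static void main(String[] args) {
        BlockReportSchedulerDemo s = new BlockReportSchedulerDemo(6000L, new Random());
        s.reportSent(0L);
        System.out.println("after first report, next at " + s.nextReportTime());
        s.reportSent(s.nextReportTime());
        System.out.println("after second report, next at " + s.nextReportTime());
    }
}
```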
[jira] [Commented] (HDFS-15176) Enable GcTimePercentage Metric in NameNode's JvmMetrics.
[ https://issues.apache.org/jira/browse/HDFS-15176?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17039927#comment-17039927 ]

Jinglun commented on HDFS-15176:
--------------------------------

Hi [~elgoiri], thanks for your nice and detailed comments! Uploaded v04.

> Enable GcTimePercentage Metric in NameNode's JvmMetrics.
> --------------------------------------------------------
>
> Key: HDFS-15176
> URL: https://issues.apache.org/jira/browse/HDFS-15176
> Project: Hadoop HDFS
> Issue Type: Improvement
> Reporter: Jinglun
> Assignee: Jinglun
> Priority: Minor
> Attachments: HDFS-15176.001.patch, HDFS-15176.002.patch, HDFS-15176.003.patch, HDFS-15176.004.patch
>
> The GcTimePercentage (computed by GcTimeMonitor) could be used as a dimension to analyze the NameNode GC. We should add a switch config to enable the GcTimePercentage metric in HDFS.
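To make the metric concrete, here is a minimal JDK-only sketch of how a GC time percentage over an observation window could be computed. It does not reproduce Hadoop's GcTimeMonitor; the class `GcTimePercentageSampler` and its methods are hypothetical, and the only real APIs used are the standard `java.lang.management` beans.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;

/** Hypothetical sampler; not Hadoop's GcTimeMonitor. */
class GcTimePercentageSampler {
    private long lastGcTimeMs;
    private long lastSampleTimeMs;

    GcTimePercentageSampler(long nowMs) {
        this.lastGcTimeMs = totalGcTimeMs();
        this.lastSampleTimeMs = nowMs;
    }

    /** Sums the accumulated collection time of all collectors in this JVM. */
    static long totalGcTimeMs() {
        long total = 0;
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            long t = gc.getCollectionTime(); // -1 when the collector does not report it
            if (t > 0) {
                total += t;
            }
        }
        return total;
    }

    /** Share of the window spent in GC, clamped to 0..100. */
    static int percentage(long gcDeltaMs, long windowMs) {
        if (windowMs <= 0) {
            return 0;
        }
        long pct = gcDeltaMs * 100 / windowMs;
        return (int) Math.max(0, Math.min(100, pct));
    }

    /** Returns the GC time percentage since the previous sample. */
    int sample(long nowMs) {
        long gcNow = totalGcTimeMs();
        int pct = percentage(gcNow - lastGcTimeMs, nowMs - lastSampleTimeMs);
        lastGcTimeMs = gcNow;
        lastSampleTimeMs = nowMs;
        return pct;
    }

    public static void main(String[] args) throws InterruptedException {
        GcTimePercentageSampler sampler = new GcTimePercentageSampler(System.currentTimeMillis());
        Thread.sleep(100);
        System.out.println("GC time %: " + sampler.sample(System.currentTimeMillis()));
    }
}
```

The clamp is there because the summed time of several collectors can exceed the wall-clock window. A switch config, as the issue proposes, would simply decide whether such a sampler is started and its value published.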
[jira] [Updated] (HDFS-15176) Enable GcTimePercentage Metric in NameNode's JvmMetrics.
[ https://issues.apache.org/jira/browse/HDFS-15176?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jinglun updated HDFS-15176:
---------------------------
    Attachment: HDFS-15176.004.patch
    Status: Patch Available (was: Open)

> Enable GcTimePercentage Metric in NameNode's JvmMetrics.
[jira] [Updated] (HDFS-15176) Enable GcTimePercentage Metric in NameNode's JvmMetrics.
[ https://issues.apache.org/jira/browse/HDFS-15176?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jinglun updated HDFS-15176:
---------------------------
    Status: Open (was: Patch Available)

> Enable GcTimePercentage Metric in NameNode's JvmMetrics.
[jira] [Commented] (HDFS-15154) Allow only hdfs superusers the ability to assign HDFS storage policies
[ https://issues.apache.org/jira/browse/HDFS-15154?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17039912#comment-17039912 ] Hadoop QA commented on HDFS-15154: -- | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 41s{color} | {color:blue} Docker mode activated. {color} | || || || || {color:brown} Prechecks {color} || | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s{color} | {color:green} The patch does not contain any @author tags. {color} | | {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 0s{color} | {color:green} The patch appears to include 6 new or modified test files. {color} | || || || || {color:brown} trunk Compile Tests {color} || | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 19m 50s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 1m 1s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 57s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 1m 6s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 16m 24s{color} | {color:green} branch has no errors when building and testing our client artifacts. 
{color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 2m 50s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 39s{color} | {color:green} trunk passed {color} | || || || || {color:brown} Patch Compile Tests {color} || | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 1m 1s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 56s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 56s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 52s{color} | {color:green} hadoop-hdfs-project/hadoop-hdfs: The patch generated 0 new + 1018 unchanged - 1 fixed = 1018 total (was 1019) {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 1m 1s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s{color} | {color:green} The patch has no whitespace issues. {color} | | {color:green}+1{color} | {color:green} xml {color} | {color:green} 0m 1s{color} | {color:green} The patch has no ill-formed XML file. {color} | | {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 14m 2s{color} | {color:green} patch has no errors when building and testing our client artifacts. {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 2m 51s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 34s{color} | {color:green} the patch passed {color} | || || || || {color:brown} Other Tests {color} || | {color:red}-1{color} | {color:red} unit {color} | {color:red} 88m 22s{color} | {color:red} hadoop-hdfs in the patch passed. 
{color} | | {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 31s{color} | {color:green} The patch does not generate ASF License warnings. {color} | | {color:black}{color} | {color:black} {color} | {color:black}153m 28s{color} | {color:black} {color} | \\ \\ || Reason || Tests || | Failed junit tests | hadoop.hdfs.server.blockmanagement.TestBlockTokenWithDFSStriped | \\ \\ || Subsystem || Report/Notes || | Docker | Client=19.03.4 Server=19.03.4 Image:yetus/hadoop:c44943d1fc3 | | JIRA Issue | HDFS-15154 | | JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12993852/HDFS-15154.04.patch | | Optional Tests | dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient findbugs checkstyle xml | | uname | Linux 6b9e35c24845 4.15.0-74-generic #84-Ubuntu SMP Thu Dec 19 08:06:28 UTC 2019 x86_64 x86_64 x86_64 GNU/Linux | | Build tool | maven | | Personality | /testptch/patchprocess/precommit/personality/provided.sh | | git revision | trunk / cb3f3cc | | maven | version: Apache Maven 3.3.9 | | Default Java | 1.8.0_242 | | findbugs | v3.1.0-RC1 | | unit | https://builds.apache.org/job/PreCommit-HDFS-Build/28807/artifact/out/patch-unit-hadoop-hdfs-project_hadoop-hdfs.txt | | Test Results | https://builds.apache.org/job/PreCommit-HDFS-Build/28807/testReport/ | | Max. process+thread count | 4325
[jira] [Commented] (HDFS-15165) In Du missed calling getAttributesProvider
[ https://issues.apache.org/jira/browse/HDFS-15165?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17039902#comment-17039902 ]

Stephen O'Donnell commented on HDFS-15165:
------------------------------------------

The 01 patch LGTM. Non-binding +1.

> In Du missed calling getAttributesProvider
> ------------------------------------------
>
> Key: HDFS-15165
> URL: https://issues.apache.org/jira/browse/HDFS-15165
> Project: Hadoop HDFS
> Issue Type: Bug
> Reporter: Bharat Viswanadham
> Assignee: Bharat Viswanadham
> Priority: Major
> Attachments: HDFS-15165.00.patch, HDFS-15165.01.patch, example-test.patch
>
> HDFS-12130 changed the behavior of the du command: it merged the permission check and the computation into a single step. In that change, wherever INodeAttributes are needed, the code simply uses inode.getAttributes(). But when an attribute provider class is configured, we should call the configured attribute provider to get the INodeAttributes and use the returned INodeAttributes during checkPermission.
> After HDFS-12130, the code looks like this:
>
> {code:java}
> byte[][] localComponents = {inode.getLocalNameBytes()};
> INodeAttributes[] iNodeAttr = {inode.getSnapshotINode(snapshotId)};
> enforcer.checkPermission(
>     fsOwner, supergroup, callerUgi,
>     iNodeAttr, // single inode attr in the array
>     new INode[]{inode}, // single inode in the array
>     localComponents, snapshotId,
>     null, -1, // this will skip checkTraverse() because
>               // not checking ancestor here
>     false, null, null,
>     access, // the target access to be checked against the inode
>     null, // passing null sub access avoids checking children
>     false);
> {code}
>
> The second line does not check whether an attribute provider class is configured and, if so, use it to obtain the INodeAttributes. As a result, when an HDFS path is managed by Sentry and the INodeAttributeProvider class is configured as SentryINodeAttributeProvider, the code never consults the SentryINodeAttributeProvider object and so ignores the AclFeature it would supply when ACLs are set. This causes an AccessControlException when the du command is run against an HDFS path managed by Sentry:
>
> {code:java}
> [root@gg-620-1 ~]# hdfs dfs -du /dev/edl/sc/consumer/lpfg/str/edf/abc/
> du: Permission denied: user=systest, access=READ_EXECUTE, inode="/dev/edl/sc/consumer/lpfg/str/lpfg_wrk/PRISMA_TO_ICERTIS_OUTBOUND_RM_MASTER/_impala_insert_staging":impala:hive:drwxrwx--x
> {code}
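The missing step described in HDFS-15165 amounts to "ask the configured provider first, fall back to the inode's own attributes otherwise". The sketch below shows that selection in isolation. It is not the HDFS-15165 patch: `Attrs`, `AttributeProvider`, and `resolve` are hypothetical stand-ins that only loosely mirror HDFS's INodeAttributes and INodeAttributeProvider.

```java
/** Hypothetical stand-ins; not the real HDFS classes. */
class AttributeProviderSketch {
    /** Stand-in for INodeAttributes: only what the example needs. */
    interface Attrs {
        String aclSource();
    }

    /** Stand-in for an external provider such as the Sentry one. */
    interface AttributeProvider {
        Attrs getAttributes(String[] pathComponents, Attrs defaults);
    }

    /**
     * Returns the attributes to hand to the permission check.
     * A null provider means no attribute provider class is configured.
     */
    static Attrs resolve(AttributeProvider provider, String[] path, Attrs local) {
        if (provider == null) {
            return local;
        }
        // A configured provider may override ACLs and permissions for the path.
        return provider.getAttributes(path, local);
    }

    public static void main(String[] args) {
        Attrs local = () -> "inode";
        AttributeProvider external = (path, defaults) -> () -> "provider";
        String[] path = {"dev", "edl"};
        System.out.println(resolve(null, path, local).aclSource());
        System.out.println(resolve(external, path, local).aclSource());
    }
}
```

In the du code path quoted in the issue, the result of such a lookup would replace the attributes placed in the `iNodeAttr` array before `checkPermission` is called.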
[jira] [Commented] (HDFS-15120) Refresh BlockPlacementPolicy at runtime.
[ https://issues.apache.org/jira/browse/HDFS-15120?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17039897#comment-17039897 ] Hadoop QA commented on HDFS-15120: -- | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 1m 7s{color} | {color:blue} Docker mode activated. {color} | || || || || {color:brown} Prechecks {color} || | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s{color} | {color:green} The patch does not contain any @author tags. {color} | | {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 0s{color} | {color:green} The patch appears to include 1 new or modified test files. {color} | || || || || {color:brown} trunk Compile Tests {color} || | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 23m 16s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 1m 20s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 58s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 1m 35s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 19m 4s{color} | {color:green} branch has no errors when building and testing our client artifacts. 
{color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 3m 38s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 56s{color} | {color:green} trunk passed {color} | || || || || {color:brown} Patch Compile Tests {color} || | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 1m 12s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 1m 13s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 1m 13s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 54s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 1m 19s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s{color} | {color:green} The patch has no whitespace issues. {color} | | {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 16m 53s{color} | {color:green} patch has no errors when building and testing our client artifacts. {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 3m 34s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 47s{color} | {color:green} the patch passed {color} | || || || || {color:brown} Other Tests {color} || | {color:red}-1{color} | {color:red} unit {color} | {color:red}106m 11s{color} | {color:red} hadoop-hdfs in the patch passed. {color} | | {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 32s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} | | {color:black}{color} | {color:black} {color} | {color:black}184m 26s{color} | {color:black} {color} | \\ \\ || Reason || Tests || | Failed junit tests | hadoop.hdfs.server.balancer.TestBalancerWithHANameNodes | \\ \\ || Subsystem || Report/Notes || | Docker | Client=19.03.6 Server=19.03.6 Image:yetus/hadoop:c44943d1fc3 | | JIRA Issue | HDFS-15120 | | JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12993847/HDFS-15120.005.patch | | Optional Tests | dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient findbugs checkstyle | | uname | Linux c8db7b7d6d28 4.15.0-74-generic #84-Ubuntu SMP Thu Dec 19 08:06:28 UTC 2019 x86_64 x86_64 x86_64 GNU/Linux | | Build tool | maven | | Personality | /testptch/patchprocess/precommit/personality/provided.sh | | git revision | trunk / cb3f3cc | | maven | version: Apache Maven 3.3.9 | | Default Java | 1.8.0_232 | | findbugs | v3.1.0-RC1 | | unit | https://builds.apache.org/job/PreCommit-HDFS-Build/28806/artifact/out/patch-unit-hadoop-hdfs-project_hadoop-hdfs.txt | | Test Results | https://builds.apache.org/job/PreCommit-HDFS-Build/28806/testReport/ | | Max. process+thread count | 2994 (vs. ulimit of 5500) | | modules | C: hadoop-hdfs-project/hadoop-hdfs U: hadoop-hdfs-project/hadoop-hdfs | | Console output | https://builds.apache.org/job/PreCommit-HDFS-Build/28806/console | | Powered by | Apache Yetus 0.8.0 http://yetus.apache.org |
[jira] [Updated] (HDFS-10659) Namenode crashes after Journalnode re-installation in an HA cluster due to missing paxos directory
[ https://issues.apache.org/jira/browse/HDFS-10659?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Akira Ajisaka updated HDFS-10659:
---------------------------------
    Fix Version/s: 2.10.1
                   2.9.3

Cherry-picked to branch-2.10 and branch-2.9. The logging API is different between branch-3.1 and branch-2.10, so I had to fix it while cherry-picking.

> Namenode crashes after Journalnode re-installation in an HA cluster due to missing paxos directory
> --------------------------------------------------------------------------------------------------
>
> Key: HDFS-10659
> URL: https://issues.apache.org/jira/browse/HDFS-10659
> Project: Hadoop HDFS
> Issue Type: Improvement
> Components: ha, journal-node
> Affects Versions: 2.7.0
> Reporter: Amit Anand
> Assignee: star
> Priority: Major
> Fix For: 3.3.0, 3.2.1, 2.9.3, 3.1.3, 2.10.1
> Attachments: HDFS-10659.000.patch, HDFS-10659.001.patch, HDFS-10659.002.patch, HDFS-10659.003.patch, HDFS-10659.004.patch, HDFS-10659.005.patch, HDFS-10659.006.patch
>
> In my environment I am seeing {{Namenodes}} crash after a majority of {{Journalnodes}} are re-installed. We manage multiple clusters and do rolling upgrades followed by a rolling re-install of each node, including master (NN, JN, RM, ZK) nodes. When a journal node is re-installed or moved to a new disk/host, instead of running the {{"initializeSharedEdits"}} command, I copy the {{VERSION}} file from one of the other {{Journalnodes}}, and that allows my {{NN}} to start writing data to the newly installed {{Journalnode}}.
> To achieve quorum for the JNs and recover unfinalized segments, the NN creates .tmp files under the {{"/jn/current/paxos"}} directory during startup. In the current implementation the "paxos" directory is only created by the {{"initializeSharedEdits"}} command, so if a JN is re-installed, the "paxos" directory is not created on JN startup or by the NN while writing .tmp files, which causes the NN to crash with the following error message:
> {code}
> 192.168.100.16:8485: /disk/1/dfs/jn/Test-Laptop/current/paxos/64044.tmp (No such file or directory)
>     at java.io.FileOutputStream.open(Native Method)
>     at java.io.FileOutputStream.<init>(FileOutputStream.java:221)
>     at java.io.FileOutputStream.<init>(FileOutputStream.java:171)
>     at org.apache.hadoop.hdfs.util.AtomicFileOutputStream.<init>(AtomicFileOutputStream.java:58)
>     at org.apache.hadoop.hdfs.qjournal.server.Journal.persistPaxosData(Journal.java:971)
>     at org.apache.hadoop.hdfs.qjournal.server.Journal.acceptRecovery(Journal.java:846)
>     at org.apache.hadoop.hdfs.qjournal.server.JournalNodeRpcServer.acceptRecovery(JournalNodeRpcServer.java:205)
>     at org.apache.hadoop.hdfs.qjournal.protocolPB.QJournalProtocolServerSideTranslatorPB.acceptRecovery(QJournalProtocolServerSideTranslatorPB.java:249)
>     at org.apache.hadoop.hdfs.qjournal.protocol.QJournalProtocolProtos$QJournalProtocolService$2.callBlockingMethod(QJournalProtocolProtos.java:25435)
>     at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:616)
>     at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:969)
>     at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2151)
>     at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2147)
>     at java.security.AccessController.doPrivileged(Native Method)
>     at javax.security.auth.Subject.doAs(Subject.java:415)
>     at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1657)
>     at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2145)
> {code}
> The current [getPaxosFile|https://github.com/apache/hadoop/blob/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/qjournal/server/JNStorage.java#L128-L130] method simply returns a path to a file under the "paxos" directory without verifying its existence. Since the "paxos" directory holds files that are required for NN recovery and for achieving JN quorum, my proposed solution is to add a check to the "getPaxosFile" method and create the {{"paxos"}} directory if it is missing.
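The proposed fix (create the "paxos" directory on demand before handing out a file path under it) can be sketched as below. This is not the committed HDFS-10659 patch or the real JNStorage class: `PaxosDirSketch` and its constructor are hypothetical, and the only library calls used are from the standard `java.io` and `java.nio.file` packages.

```java
import java.io.File;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;

/** Hypothetical stand-in for the journal storage class. */
class PaxosDirSketch {
    private final File currentDir;

    PaxosDirSketch(File currentDir) {
        this.currentDir = currentDir;
    }

    /** Returns the paxos directory, creating it if it is missing. */
    File getOrCreatePaxosDir() {
        File paxosDir = new File(currentDir, "paxos");
        try {
            // No-op if the directory already exists; also creates missing parents.
            Files.createDirectories(paxosDir.toPath());
        } catch (IOException e) {
            throw new UncheckedIOException("Could not create " + paxosDir, e);
        }
        return paxosDir;
    }

    /** Path of the paxos file for a segment; its parent directory is guaranteed to exist. */
    File getPaxosFile(long segmentTxId) {
        return new File(getOrCreatePaxosDir(), String.valueOf(segmentTxId));
    }

    public static void main(String[] args) {
        File base = new File(System.getProperty("java.io.tmpdir"),
            "jn-sketch-" + System.nanoTime());
        File f = new PaxosDirSketch(new File(base, "current")).getPaxosFile(64044L);
        System.out.println(f + " parent exists: " + f.getParentFile().isDirectory());
    }
}
```

Creating the directory lazily means a re-installed JournalNode that only received a copied VERSION file can still accept recovery data, which is the scenario that produced the stack trace above.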
[jira] [Commented] (HDFS-10659) Namenode crashes after Journalnode re-installation in an HA cluster due to missing paxos directory
[ https://issues.apache.org/jira/browse/HDFS-10659?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17039839#comment-17039839 ]

Akira Ajisaka commented on HDFS-10659:
--------------------------------------

Our production cluster faced this issue today during the maintenance. Cherry-picking this to branch-2.10 and branch-2.9.

> Namenode crashes after Journalnode re-installation in an HA cluster due to missing paxos directory
> --------------------------------------------------------------------------------------------------
>
> Key: HDFS-10659
> URL: https://issues.apache.org/jira/browse/HDFS-10659
> Project: Hadoop HDFS
> Issue Type: Improvement
> Components: ha, journal-node
> Affects Versions: 2.7.0
> Reporter: Amit Anand
> Assignee: star
> Priority: Major
> Fix For: 3.3.0, 3.2.1, 3.1.3
> Attachments: HDFS-10659.000.patch, HDFS-10659.001.patch, HDFS-10659.002.patch, HDFS-10659.003.patch, HDFS-10659.004.patch, HDFS-10659.005.patch, HDFS-10659.006.patch
[jira] [Commented] (HDFS-15154) Allow only hdfs superusers the ability to assign HDFS storage policies
[ https://issues.apache.org/jira/browse/HDFS-15154?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17039821#comment-17039821 ] Hadoop QA commented on HDFS-15154: -- | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 54s{color} | {color:blue} Docker mode activated. {color} | || || || || {color:brown} Prechecks {color} || | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s{color} | {color:green} The patch does not contain any @author tags. {color} | | {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 0s{color} | {color:green} The patch appears to include 8 new or modified test files. {color} | || || || || {color:brown} trunk Compile Tests {color} || | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 21m 5s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 1m 0s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 56s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 1m 6s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 16m 9s{color} | {color:green} branch has no errors when building and testing our client artifacts. 
{color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 2m 48s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 40s{color} | {color:green} trunk passed {color} | || || || || {color:brown} Patch Compile Tests {color} || | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 1m 1s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 58s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 58s{color} | {color:green} the patch passed {color} | | {color:orange}-0{color} | {color:orange} checkstyle {color} | {color:orange} 0m 52s{color} | {color:orange} hadoop-hdfs-project/hadoop-hdfs: The patch generated 6 new + 1024 unchanged - 0 fixed = 1030 total (was 1024) {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 1m 2s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s{color} | {color:green} The patch has no whitespace issues. {color} | | {color:green}+1{color} | {color:green} xml {color} | {color:green} 0m 1s{color} | {color:green} The patch has no ill-formed XML file. {color} | | {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 14m 17s{color} | {color:green} patch has no errors when building and testing our client artifacts. {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 2m 57s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 37s{color} | {color:green} the patch passed {color} | || || || || {color:brown} Other Tests {color} || | {color:red}-1{color} | {color:red} unit {color} | {color:red}115m 43s{color} | {color:red} hadoop-hdfs in the patch passed. 
{color} | | {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 37s{color} | {color:green} The patch does not generate ASF License warnings. {color} | | {color:black}{color} | {color:black} {color} | {color:black}182m 35s{color} | {color:black} {color} | \\ \\ || Reason || Tests || | Failed junit tests | hadoop.hdfs.TestDeadNodeDetection | | | hadoop.hdfs.server.namenode.ha.TestHAAppend | \\ \\ || Subsystem || Report/Notes || | Docker | Client=19.03.6 Server=19.03.6 Image:yetus/hadoop:c44943d1fc3 | | JIRA Issue | HDFS-15154 | | JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12993838/HDFS-15154.03.patch | | Optional Tests | dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient findbugs checkstyle xml | | uname | Linux 323329b07ee6 4.15.0-74-generic #84-Ubuntu SMP Thu Dec 19 08:06:28 UTC 2019 x86_64 x86_64 x86_64 GNU/Linux | | Build tool | maven | | Personality | /testptch/patchprocess/precommit/personality/provided.sh | | git revision | trunk / cb3f3cc | | maven | version: Apache Maven 3.3.9 | | Default Java | 1.8.0_232 | | findbugs | v3.1.0-RC1 | | checkstyle | https://builds.apache.org/job/PreCommit-HDFS-Build/28805/artifact/out/diff-checkstyle-hadoop-hdfs-project_hadoop-hdfs.txt | | unit |
[jira] [Comment Edited] (HDFS-15154) Allow only hdfs superusers the ability to assign HDFS storage policies
[ https://issues.apache.org/jira/browse/HDFS-15154?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17039720#comment-17039720 ]

Siddharth Wagle edited comment on HDFS-15154 at 2/19/20 8:29 AM:
-----------------------------------------------------------------

Looking into the test failures, most of them seem to fail due to OOM, retriggering.

04 => checkstyle fixes.

was (Author: swagle):
Looking into the test failures, most of them seem to fail due to OOM, retriggering.

> Allow only hdfs superusers the ability to assign HDFS storage policies
> ----------------------------------------------------------------------
>
> Key: HDFS-15154
> URL: https://issues.apache.org/jira/browse/HDFS-15154
> Project: Hadoop HDFS
> Issue Type: Improvement
> Components: hdfs
> Affects Versions: 3.0.0
> Reporter: Bob Cauthen
> Assignee: Siddharth Wagle
> Priority: Major
> Attachments: HDFS-15154.01.patch, HDFS-15154.02.patch, HDFS-15154.03.patch, HDFS-15154.04.patch
>
> Please provide a way to restrict the ability to assign HDFS Storage Policies to HDFS directories to HDFS superusers only.
> Currently, and based on Jira HDFS-7093, all storage policies can be disabled cluster-wide by setting the following:
> dfs.storage.policy.enabled to false
> But we need a way to allow only HDFS superusers to assign an HDFS Storage Policy to an HDFS directory.
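The two controls discussed in HDFS-15154 can be pictured as a pair of checks made before a storage policy is assigned. The sketch below is only an illustration of that gating logic: `StoragePolicyGuard` is a hypothetical class, the `policiesEnabled` flag mirrors the existing dfs.storage.policy.enabled setting named in the issue, and `superuserOnly` stands for the new switch being requested (its real configuration key is not given in this thread).

```java
/** Hypothetical guard; not the NameNode implementation. */
class StoragePolicyGuard {
    private final boolean policiesEnabled; // mirrors dfs.storage.policy.enabled
    private final boolean superuserOnly;   // assumed new switch requested by the issue

    StoragePolicyGuard(boolean policiesEnabled, boolean superuserOnly) {
        this.policiesEnabled = policiesEnabled;
        this.superuserOnly = superuserOnly;
    }

    /** Throws if the caller may not assign a storage policy. */
    void checkSetStoragePolicy(boolean callerIsSuperuser) {
        if (!policiesEnabled) {
            // Existing behavior: the feature is off for everyone.
            throw new UnsupportedOperationException("Storage policies are disabled");
        }
        if (superuserOnly && !callerIsSuperuser) {
            // Requested behavior: only superusers may assign policies.
            throw new SecurityException("Only superusers may set storage policies");
        }
    }

    public static void main(String[] args) {
        StoragePolicyGuard guard = new StoragePolicyGuard(true, true);
        guard.checkSetStoragePolicy(true);
        try {
            guard.checkSetStoragePolicy(false);
        } catch (SecurityException e) {
            System.out.println("denied: " + e.getMessage());
        }
    }
}
```

Keeping the two flags separate preserves the existing cluster-wide off switch while letting administrators leave policies enabled for superusers only.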
[jira] [Updated] (HDFS-15154) Allow only hdfs superusers the ability to assign HDFS storage policies
[ https://issues.apache.org/jira/browse/HDFS-15154?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Siddharth Wagle updated HDFS-15154:
-----------------------------------
    Attachment: HDFS-15154.04.patch

> Allow only hdfs superusers the ability to assign HDFS storage policies