[jira] [Commented] (HDFS-14740) Recover data blocks from persistent memory read cache during datanode restarts
[ https://issues.apache.org/jira/browse/HDFS-14740?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16997937#comment-16997937 ] Feilong He commented on HDFS-14740: --- Thanks [~rakeshr] for your suggestion. '{{dfs.datanode.pmem.cache.restore}}' and '{{dfs.datanode.pmem.cache.dirs}}' look good to me. [^HDFS-14740.008.patch] has some updates covering this. > Recover data blocks from persistent memory read cache during datanode restarts > -- > > Key: HDFS-14740 > URL: https://issues.apache.org/jira/browse/HDFS-14740 > Project: Hadoop HDFS > Issue Type: Improvement > Components: caching, datanode >Reporter: Feilong He >Assignee: Feilong He >Priority: Major > Attachments: HDFS-14740.000.patch, HDFS-14740.001.patch, > HDFS-14740.002.patch, HDFS-14740.003.patch, HDFS-14740.004.patch, > HDFS-14740.005.patch, HDFS-14740.006.patch, HDFS-14740.007.patch, > HDFS-14740.008.patch, HDFS_Persistent_Read-Cache_Design-v1.pdf, > HDFS_Persistent_Read-Cache_Test-v1.1.pdf, > HDFS_Persistent_Read-Cache_Test-v1.pdf, HDFS_Persistent_Read-Cache_Test-v2.pdf > > > In HDFS-13762, persistent memory (PM) is enabled in HDFS centralized cache > management. Even though PM can persist cache data, to simplify the initial > implementation the previous cache data is cleaned up during DataNode > restarts. Here we propose to improve the HDFS PM cache by taking advantage > of PM's data persistence characteristic, i.e., recovering the state of the > cached data, if any, when the DataNode restarts, so that cache warm-up time > can be saved for users. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
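As an aside for readers, here is a hypothetical sketch of how the two property names agreed on above might be consumed on the DataNode side; the default values and the recovery helper are assumptions for illustration, not taken from the attached patch.

{code:java}
// Illustrative only: the key names come from the discussion above, but the
// defaults and the recovery helper are assumed, not part of the patch.
Configuration conf = new HdfsConfiguration();
boolean restoreCache =
    conf.getBoolean("dfs.datanode.pmem.cache.restore", false);
String[] pmemCacheDirs =
    conf.getTrimmedStrings("dfs.datanode.pmem.cache.dirs");

if (restoreCache) {
  for (String dir : pmemCacheDirs) {
    // Re-register blocks whose cache files survived the restart instead of
    // wiping them, so the read cache does not have to warm up again.
    // recoverCachedBlocksFrom(dir);   // hypothetical helper
  }
}
{code}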
[jira] [Commented] (HDFS-15068) DataNode could meet deadlock if invoke refreshVolumes when register
[ https://issues.apache.org/jira/browse/HDFS-15068?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16997936#comment-16997936 ] Xiaoqiao He commented on HDFS-15068: Thanks [~Aiphag0] for your contribution. I agree that it is safe to move the `get namespace information` logic out of the `synchronized` block. It would be better to add a unit test to cover this corner case. > DataNode could meet deadlock if invoke refreshVolumes when register > --- > > Key: HDFS-15068 > URL: https://issues.apache.org/jira/browse/HDFS-15068 > Project: Hadoop HDFS > Issue Type: Bug > Components: datanode >Reporter: Xiaoqiao He >Assignee: Aiphago >Priority: Critical > Attachments: HDFS-15068.001.patch > > > DataNode could meet deadlock when `dfsadmin -reconfig datanode ip:host > start` is invoked to trigger #refreshVolumes. > 1. DataNode#refreshVolumes first holds the DataNode instance's ownable > {{synchronizer}} when entering the method, then tries to hold the > BPOfferService {{readlock}} via `bpos.getNamespaceInfo()` in the following > code segment. > {code:java} > for (BPOfferService bpos : blockPoolManager.getAllNamenodeThreads()) { > nsInfos.add(bpos.getNamespaceInfo()); > } > {code} > 2. BPOfferService#registrationSucceeded (which is invoked by #register when > the DataNode starts, or by #reregister from processCommandFromActor) first > holds the BPOfferService {{writelock}}, then tries to hold the DataNode > instance's ownable {{synchronizer}} in the following method. > {code:java} > synchronized void bpRegistrationSucceeded(DatanodeRegistration > bpRegistration, > String blockPoolId) throws IOException { > id = bpRegistration; > if(!storage.getDatanodeUuid().equals(bpRegistration.getDatanodeUuid())) { > throw new IOException("Inconsistent Datanode IDs. Name-node returned " > + bpRegistration.getDatanodeUuid() > + ". Expecting " + storage.getDatanodeUuid()); > } > > registerBlockPoolWithSecretManager(bpRegistration, blockPoolId); > } > {code} -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
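For readers skimming the thread, the reordering being discussed can be illustrated with a self-contained sketch that uses plain java.util.concurrent locks instead of the real HDFS classes; the class and method names below are illustrative and this is not the attached patch. The point is that refreshVolumes gathers the per-block-pool information before taking the node-wide monitor, so it can no longer nest the locks in the opposite order from the registration path.

{code:java}
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.locks.ReentrantReadWriteLock;

/** Illustrative model of the proposed lock ordering, not the actual patch. */
class LockOrderSketch {
  private final ReentrantReadWriteLock bpLock = new ReentrantReadWriteLock();
  private final List<String> namespaceInfos = new ArrayList<>();

  /** Analogue of DataNode#refreshVolumes after moving the lookup out. */
  void refreshVolumes() {
    // 1. Collect namespace info while holding only the block-pool read lock.
    List<String> nsInfos;
    bpLock.readLock().lock();
    try {
      nsInfos = new ArrayList<>(namespaceInfos);
    } finally {
      bpLock.readLock().unlock();
    }
    // 2. Only now take the node-wide monitor to mutate volume state, so the
    // monitor is never held while waiting for a block-pool lock.
    synchronized (this) {
      // ... add/remove volumes using the pre-fetched nsInfos ...
    }
  }

  /** Analogue of bpRegistrationSucceeded: write lock first, then monitor. */
  void registrationSucceeded(String nsInfo) {
    bpLock.writeLock().lock();
    try {
      synchronized (this) {
        namespaceInfos.add(nsInfo);
      }
    } finally {
      bpLock.writeLock().unlock();
    }
  }
}
{code}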
[jira] [Updated] (HDFS-14740) Recover data blocks from persistent memory read cache during datanode restarts
[ https://issues.apache.org/jira/browse/HDFS-14740?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Feilong He updated HDFS-14740: -- Attachment: HDFS-14740.008.patch > Recover data blocks from persistent memory read cache during datanode restarts > -- > > Key: HDFS-14740 > URL: https://issues.apache.org/jira/browse/HDFS-14740 > Project: Hadoop HDFS > Issue Type: Improvement > Components: caching, datanode >Reporter: Feilong He >Assignee: Feilong He >Priority: Major > Attachments: HDFS-14740.000.patch, HDFS-14740.001.patch, > HDFS-14740.002.patch, HDFS-14740.003.patch, HDFS-14740.004.patch, > HDFS-14740.005.patch, HDFS-14740.006.patch, HDFS-14740.007.patch, > HDFS-14740.008.patch, HDFS_Persistent_Read-Cache_Design-v1.pdf, > HDFS_Persistent_Read-Cache_Test-v1.1.pdf, > HDFS_Persistent_Read-Cache_Test-v1.pdf, HDFS_Persistent_Read-Cache_Test-v2.pdf > > > In HDFS-13762, persistent memory (PM) is enabled in HDFS centralized cache > management. Even though PM can persist cache data, to simplify the initial > implementation the previous cache data is cleaned up during DataNode > restarts. Here we propose to improve the HDFS PM cache by taking advantage > of PM's data persistence characteristic, i.e., recovering the state of the > cached data, if any, when the DataNode restarts, so that cache warm-up time > can be saved for users. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Updated] (HDFS-15068) DataNode could meet deadlock if invoke refreshVolumes when register
[ https://issues.apache.org/jira/browse/HDFS-15068?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Aiphago updated HDFS-15068: --- Attachment: HDFS-15068.001.patch > DataNode could meet deadlock if invoke refreshVolumes when register > --- > > Key: HDFS-15068 > URL: https://issues.apache.org/jira/browse/HDFS-15068 > Project: Hadoop HDFS > Issue Type: Bug > Components: datanode >Reporter: Xiaoqiao He >Assignee: Aiphago >Priority: Critical > Attachments: HDFS-15068.001.patch > > > DataNode could meet deadlock when `dfsadmin -reconfig datanode ip:host > start` is invoked to trigger #refreshVolumes. > 1. DataNode#refreshVolumes first holds the DataNode instance's ownable > {{synchronizer}} when entering the method, then tries to hold the > BPOfferService {{readlock}} via `bpos.getNamespaceInfo()` in the following > code segment. > {code:java} > for (BPOfferService bpos : blockPoolManager.getAllNamenodeThreads()) { > nsInfos.add(bpos.getNamespaceInfo()); > } > {code} > 2. BPOfferService#registrationSucceeded (which is invoked by #register when > the DataNode starts, or by #reregister from processCommandFromActor) first > holds the BPOfferService {{writelock}}, then tries to hold the DataNode > instance's ownable {{synchronizer}} in the following method. > {code:java} > synchronized void bpRegistrationSucceeded(DatanodeRegistration > bpRegistration, > String blockPoolId) throws IOException { > id = bpRegistration; > if(!storage.getDatanodeUuid().equals(bpRegistration.getDatanodeUuid())) { > throw new IOException("Inconsistent Datanode IDs. Name-node returned " > + bpRegistration.getDatanodeUuid() > + ". Expecting " + storage.getDatanodeUuid()); > } > > registerBlockPoolWithSecretManager(bpRegistration, blockPoolId); > } > {code} -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-15068) DataNode could meet deadlock if invoke refreshVolumes when register
[ https://issues.apache.org/jira/browse/HDFS-15068?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16997922#comment-16997922 ] Aiphago commented on HDFS-15068: I changed the lock order of refreshVolumes(); here is the patch: [^HDFS-15068.001.patch] > DataNode could meet deadlock if invoke refreshVolumes when register > --- > > Key: HDFS-15068 > URL: https://issues.apache.org/jira/browse/HDFS-15068 > Project: Hadoop HDFS > Issue Type: Bug > Components: datanode >Reporter: Xiaoqiao He >Assignee: Aiphago >Priority: Critical > Attachments: HDFS-15068.001.patch > > > DataNode could meet deadlock when `dfsadmin -reconfig datanode ip:host > start` is invoked to trigger #refreshVolumes. > 1. DataNode#refreshVolumes first holds the DataNode instance's ownable > {{synchronizer}} when entering the method, then tries to hold the > BPOfferService {{readlock}} via `bpos.getNamespaceInfo()` in the following > code segment. > {code:java} > for (BPOfferService bpos : blockPoolManager.getAllNamenodeThreads()) { > nsInfos.add(bpos.getNamespaceInfo()); > } > {code} > 2. BPOfferService#registrationSucceeded (which is invoked by #register when > the DataNode starts, or by #reregister from processCommandFromActor) first > holds the BPOfferService {{writelock}}, then tries to hold the DataNode > instance's ownable {{synchronizer}} in the following method. > {code:java} > synchronized void bpRegistrationSucceeded(DatanodeRegistration > bpRegistration, > String blockPoolId) throws IOException { > id = bpRegistration; > if(!storage.getDatanodeUuid().equals(bpRegistration.getDatanodeUuid())) { > throw new IOException("Inconsistent Datanode IDs. Name-node returned " > + bpRegistration.getDatanodeUuid() > + ". Expecting " + storage.getDatanodeUuid()); > } > > registerBlockPoolWithSecretManager(bpRegistration, blockPoolId); > } > {code} -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Created] (HDFS-15068) DataNode could meet deadlock if invoke refreshVolumes when register
Xiaoqiao He created HDFS-15068: -- Summary: DataNode could meet deadlock if invoke refreshVolumes when register Key: HDFS-15068 URL: https://issues.apache.org/jira/browse/HDFS-15068 Project: Hadoop HDFS Issue Type: Bug Components: datanode Reporter: Xiaoqiao He Assignee: Aiphago DataNode could meet deadlock when `dfsadmin -reconfig datanode ip:host start` is invoked to trigger #refreshVolumes. 1. DataNode#refreshVolumes first holds the DataNode instance's ownable {{synchronizer}} when entering the method, then tries to hold the BPOfferService {{readlock}} via `bpos.getNamespaceInfo()` in the following code segment. {code:java} for (BPOfferService bpos : blockPoolManager.getAllNamenodeThreads()) { nsInfos.add(bpos.getNamespaceInfo()); } {code} 2. BPOfferService#registrationSucceeded (which is invoked by #register when the DataNode starts, or by #reregister from processCommandFromActor) first holds the BPOfferService {{writelock}}, then tries to hold the DataNode instance's ownable {{synchronizer}} in the following method. {code:java} synchronized void bpRegistrationSucceeded(DatanodeRegistration bpRegistration, String blockPoolId) throws IOException { id = bpRegistration; if(!storage.getDatanodeUuid().equals(bpRegistration.getDatanodeUuid())) { throw new IOException("Inconsistent Datanode IDs. Name-node returned " + bpRegistration.getDatanodeUuid() + ". Expecting " + storage.getDatanodeUuid()); } registerBlockPoolWithSecretManager(bpRegistration, blockPoolId); } {code} -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-15062) Add LOG when sendIBRs failed
[ https://issues.apache.org/jira/browse/HDFS-15062?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16997857#comment-16997857 ] Hadoop QA commented on HDFS-15062: -- | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 42s{color} | {color:blue} Docker mode activated. {color} | || || || || {color:brown} Prechecks {color} || | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s{color} | {color:green} The patch does not contain any @author tags. {color} | | {color:red}-1{color} | {color:red} test4tests {color} | {color:red} 0m 0s{color} | {color:red} The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. {color} | || || || || {color:brown} trunk Compile Tests {color} || | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 19m 17s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 58s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 41s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 1m 3s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 14m 24s{color} | {color:green} branch has no errors when building and testing our client artifacts. {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 2m 12s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 14s{color} | {color:green} trunk passed {color} | || || || || {color:brown} Patch Compile Tests {color} || | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 0m 59s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 57s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 57s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 38s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 1m 0s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s{color} | {color:green} The patch has no whitespace issues. {color} | | {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 13m 34s{color} | {color:green} patch has no errors when building and testing our client artifacts. {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 2m 20s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 9s{color} | {color:green} the patch passed {color} | || || || || {color:brown} Other Tests {color} || | {color:red}-1{color} | {color:red} unit {color} | {color:red}102m 44s{color} | {color:red} hadoop-hdfs in the patch failed. 
{color} | | {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 31s{color} | {color:green} The patch does not generate ASF License warnings. {color} | | {color:black}{color} | {color:black} {color} | {color:black}164m 20s{color} | {color:black} {color} | \\ \\ || Reason || Tests || | Failed junit tests | hadoop.hdfs.TestDeadNodeDetection | \\ \\ || Subsystem || Report/Notes || | Docker | Client=19.03.5 Server=19.03.5 Image:yetus/hadoop:e573ea49085 | | JIRA Issue | HDFS-15062 | | JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12988957/HDFS-15062.002.patch | | Optional Tests | dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient findbugs checkstyle | | uname | Linux 68f1356b8742 4.15.0-66-generic #75-Ubuntu SMP Tue Oct 1 05:24:09 UTC 2019 x86_64 x86_64 x86_64 GNU/Linux | | Build tool | maven | | Personality | /testptch/patchprocess/precommit/personality/provided.sh | | git revision | trunk / 578bd10 | | maven | version: Apache Maven 3.3.9 | | Default Java | 1.8.0_222 | | findbugs | v3.1.0-RC1 | | unit | https://builds.apache.org/job/PreCommit-HDFS-Build/28535/artifact/out/patch-unit-hadoop-hdfs-project_hadoop-hdfs.txt | | Test Results | https://builds.apache.org/job/PreCommit-HDFS-Build/28535/testReport/ | | Max. process+thread count | 2767 (vs. ulimit of 5500) | | modules | C: hadoop-hdfs-project/hadoop-hdfs U: hadoop-hdfs-project/hadoop-hdfs | | Console output | https://bu
[jira] [Commented] (HDFS-14908) LeaseManager should check parent-child relationship when filter open files.
[ https://issues.apache.org/jira/browse/HDFS-14908?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16997820#comment-16997820 ] Hudson commented on HDFS-14908: --- SUCCESS: Integrated in Jenkins build Hadoop-trunk-Commit #17770 (See [https://builds.apache.org/job/Hadoop-trunk-Commit/17770/]) HDFS-14908. LeaseManager should check parent-child relationship when (inigoiri: rev 24080666e5e2214d4a362c889cd9aa617be5de81) * (edit) hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/LeaseManager.java * (edit) hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/namenode/TestListOpenFiles.java * (edit) hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/DFSTestUtil.java * (edit) hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/FSNamesystem.java > LeaseManager should check parent-child relationship when filter open files. > --- > > Key: HDFS-14908 > URL: https://issues.apache.org/jira/browse/HDFS-14908 > Project: Hadoop HDFS > Issue Type: Bug >Affects Versions: 3.1.0, 3.0.1 >Reporter: Jinglun >Assignee: Jinglun >Priority: Minor > Fix For: 3.3.0 > > Attachments: HDFS-14908.001.patch, HDFS-14908.002.patch, > HDFS-14908.003.patch, HDFS-14908.004.patch, HDFS-14908.005.patch, > HDFS-14908.006.patch, HDFS-14908.007.patch, HDFS-14908.008.patch, > HDFS-14908.009.patch, HDFS-14908.010.patch, HDFS-14908.TestV4.patch, > Test.java, TestV2.java, TestV3.java > > > Now when doing listOpenFiles(), LeaseManager only checks whether the filter > path is the prefix of the open files. We should check whether the filter path > is the parent/ancestor of the open files. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
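To make the prefix-vs-ancestor distinction from the description above concrete: a plain string-prefix test would accept "/a/bc/f" for the filter "/a/b", even though "/a/bc" is not under "/a/b". A hedged sketch of the kind of check that avoids this (the helper name is made up; this is not the committed code):

{code:java}
/** Returns true only if filterPath is the open file itself or a true
 *  ancestor directory of openFilePath, not merely a string prefix. */
static boolean isAncestorOrSelf(String filterPath, String openFilePath) {
  if (openFilePath.equals(filterPath)) {
    return true;
  }
  // Append a separator so "/a/b" matches "/a/b/f" but not "/a/bc/f".
  String prefix = filterPath.endsWith("/") ? filterPath : filterPath + "/";
  return openFilePath.startsWith(prefix);
}
{code}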
[jira] [Commented] (HDFS-14908) LeaseManager should check parent-child relationship when filter open files.
[ https://issues.apache.org/jira/browse/HDFS-14908?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16997802#comment-16997802 ] Íñigo Goiri commented on HDFS-14908: Thanks [~LiJinglun] for the patch and [~weichiu] and [~hemanthboyina] for checking! Committed to trunk. > LeaseManager should check parent-child relationship when filter open files. > --- > > Key: HDFS-14908 > URL: https://issues.apache.org/jira/browse/HDFS-14908 > Project: Hadoop HDFS > Issue Type: Bug >Affects Versions: 3.1.0, 3.0.1 >Reporter: Jinglun >Assignee: Jinglun >Priority: Minor > Fix For: 3.3.0 > > Attachments: HDFS-14908.001.patch, HDFS-14908.002.patch, > HDFS-14908.003.patch, HDFS-14908.004.patch, HDFS-14908.005.patch, > HDFS-14908.006.patch, HDFS-14908.007.patch, HDFS-14908.008.patch, > HDFS-14908.009.patch, HDFS-14908.010.patch, HDFS-14908.TestV4.patch, > Test.java, TestV2.java, TestV3.java > > > Now when doing listOpenFiles(), LeaseManager only checks whether the filter > path is the prefix of the open files. We should check whether the filter path > is the parent/ancestor of the open files. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Updated] (HDFS-14908) LeaseManager should check parent-child relationship when filter open files.
[ https://issues.apache.org/jira/browse/HDFS-14908?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Íñigo Goiri updated HDFS-14908: --- Fix Version/s: 3.3.0 Hadoop Flags: Reviewed Resolution: Fixed Status: Resolved (was: Patch Available) > LeaseManager should check parent-child relationship when filter open files. > --- > > Key: HDFS-14908 > URL: https://issues.apache.org/jira/browse/HDFS-14908 > Project: Hadoop HDFS > Issue Type: Bug >Affects Versions: 3.1.0, 3.0.1 >Reporter: Jinglun >Assignee: Jinglun >Priority: Minor > Fix For: 3.3.0 > > Attachments: HDFS-14908.001.patch, HDFS-14908.002.patch, > HDFS-14908.003.patch, HDFS-14908.004.patch, HDFS-14908.005.patch, > HDFS-14908.006.patch, HDFS-14908.007.patch, HDFS-14908.008.patch, > HDFS-14908.009.patch, HDFS-14908.010.patch, HDFS-14908.TestV4.patch, > Test.java, TestV2.java, TestV3.java > > > Now when doing listOpenFiles(), LeaseManager only checks whether the filter > path is the prefix of the open files. We should check whether the filter path > is the parent/ancestor of the open files. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-15062) Add LOG when sendIBRs failed
[ https://issues.apache.org/jira/browse/HDFS-15062?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16997773#comment-16997773 ] Íñigo Goiri commented on HDFS-15062: +1 on [^HDFS-15062.002.patch]. > Add LOG when sendIBRs failed > > > Key: HDFS-15062 > URL: https://issues.apache.org/jira/browse/HDFS-15062 > Project: Hadoop HDFS > Issue Type: Improvement > Components: datanode >Affects Versions: 3.0.3, 3.2.1, 3.1.3 >Reporter: Fei Hui >Assignee: Fei Hui >Priority: Major > Attachments: HDFS-15062.001.patch, HDFS-15062.002.patch > > > {code} > /** Send IBRs to namenode. */ > void sendIBRs(DatanodeProtocol namenode, DatanodeRegistration registration, > String bpid, String nnRpcLatencySuffix) throws IOException { > // Generate a list of the pending reports for each storage under the lock > final StorageReceivedDeletedBlocks[] reports = generateIBRs(); > if (reports.length == 0) { > // Nothing new to report. > return; > } > // Send incremental block reports to the Namenode outside the lock > if (LOG.isDebugEnabled()) { > LOG.debug("call blockReceivedAndDeleted: " + Arrays.toString(reports)); > } > boolean success = false; > final long startTime = monotonicNow(); > try { > namenode.blockReceivedAndDeleted(registration, bpid, reports); > success = true; > } finally { > if (success) { > dnMetrics.addIncrementalBlockReport(monotonicNow() - startTime, > nnRpcLatencySuffix); > lastIBR = startTime; > } else { > // If we didn't succeed in sending the report, put all of the > // blocks back onto our queue, but only in the case where we > // didn't put something newer in the meantime. > putMissing(reports); > } > } > } > {code} > When the call to namenode.blockReceivedAndDeleted fails, the reports are > put back into pendingIBRs. Maybe we should add a log for the failed case; > it would be helpful for troubleshooting. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Updated] (HDFS-15062) Add LOG when sendIBRs failed
[ https://issues.apache.org/jira/browse/HDFS-15062?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Fei Hui updated HDFS-15062: --- Attachment: HDFS-15062.002.patch > Add LOG when sendIBRs failed > > > Key: HDFS-15062 > URL: https://issues.apache.org/jira/browse/HDFS-15062 > Project: Hadoop HDFS > Issue Type: Improvement > Components: datanode >Affects Versions: 3.0.3, 3.2.1, 3.1.3 >Reporter: Fei Hui >Assignee: Fei Hui >Priority: Major > Attachments: HDFS-15062.001.patch, HDFS-15062.002.patch > > > {code} > /** Send IBRs to namenode. */ > void sendIBRs(DatanodeProtocol namenode, DatanodeRegistration registration, > String bpid, String nnRpcLatencySuffix) throws IOException { > // Generate a list of the pending reports for each storage under the lock > final StorageReceivedDeletedBlocks[] reports = generateIBRs(); > if (reports.length == 0) { > // Nothing new to report. > return; > } > // Send incremental block reports to the Namenode outside the lock > if (LOG.isDebugEnabled()) { > LOG.debug("call blockReceivedAndDeleted: " + Arrays.toString(reports)); > } > boolean success = false; > final long startTime = monotonicNow(); > try { > namenode.blockReceivedAndDeleted(registration, bpid, reports); > success = true; > } finally { > if (success) { > dnMetrics.addIncrementalBlockReport(monotonicNow() - startTime, > nnRpcLatencySuffix); > lastIBR = startTime; > } else { > // If we didn't succeed in sending the report, put all of the > // blocks back onto our queue, but only in the case where we > // didn't put something newer in the meantime. > putMissing(reports); > } > } > } > {code} > When the call to namenode.blockReceivedAndDeleted fails, the reports are > put back into pendingIBRs. Maybe we should add a log for the failed case; > it would be helpful for troubleshooting. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-15062) Add LOG when sendIBRs failed
[ https://issues.apache.org/jira/browse/HDFS-15062?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16997749#comment-16997749 ] Fei Hui commented on HDFS-15062: [~elgoiri] Thanks for your comments! Uploaded the v002 patch. > Add LOG when sendIBRs failed > > > Key: HDFS-15062 > URL: https://issues.apache.org/jira/browse/HDFS-15062 > Project: Hadoop HDFS > Issue Type: Improvement > Components: datanode >Affects Versions: 3.0.3, 3.2.1, 3.1.3 >Reporter: Fei Hui >Assignee: Fei Hui >Priority: Major > Attachments: HDFS-15062.001.patch, HDFS-15062.002.patch > > > {code} > /** Send IBRs to namenode. */ > void sendIBRs(DatanodeProtocol namenode, DatanodeRegistration registration, > String bpid, String nnRpcLatencySuffix) throws IOException { > // Generate a list of the pending reports for each storage under the lock > final StorageReceivedDeletedBlocks[] reports = generateIBRs(); > if (reports.length == 0) { > // Nothing new to report. > return; > } > // Send incremental block reports to the Namenode outside the lock > if (LOG.isDebugEnabled()) { > LOG.debug("call blockReceivedAndDeleted: " + Arrays.toString(reports)); > } > boolean success = false; > final long startTime = monotonicNow(); > try { > namenode.blockReceivedAndDeleted(registration, bpid, reports); > success = true; > } finally { > if (success) { > dnMetrics.addIncrementalBlockReport(monotonicNow() - startTime, > nnRpcLatencySuffix); > lastIBR = startTime; > } else { > // If we didn't succeed in sending the report, put all of the > // blocks back onto our queue, but only in the case where we > // didn't put something newer in the meantime. > putMissing(reports); > } > } > } > {code} > When the call to namenode.blockReceivedAndDeleted fails, the reports are > put back into pendingIBRs. Maybe we should add a log for the failed case; > it would be helpful for troubleshooting. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-10648) Expose Balancer metrics through Metrics2
[ https://issues.apache.org/jira/browse/HDFS-10648?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16997664#comment-16997664 ] Leon Gao commented on HDFS-10648: - [~zhangchen] Just following up a bit: are you still working on this? Thanks! > Expose Balancer metrics through Metrics2 > > > Key: HDFS-10648 > URL: https://issues.apache.org/jira/browse/HDFS-10648 > Project: Hadoop HDFS > Issue Type: New Feature > Components: balancer & mover, metrics >Reporter: Mark Wagner >Assignee: Chen Zhang >Priority: Major > Labels: metrics > > The Balancer currently prints progress information to the console. For > deployments that run the balancer frequently, it would be helpful to collect > those metrics for publishing to the available sinks. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
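As background on what publishing these through Metrics2 could look like, here is a hypothetical source class; the class name, metric names, fields, and registration strings are illustrative and are not taken from any attached patch.

{code:java}
import org.apache.hadoop.metrics2.annotation.Metric;
import org.apache.hadoop.metrics2.annotation.Metrics;
import org.apache.hadoop.metrics2.lib.DefaultMetricsSystem;
import org.apache.hadoop.metrics2.lib.MutableCounterLong;
import org.apache.hadoop.metrics2.lib.MutableGaugeLong;

/** Hypothetical Metrics2 source for Balancer progress (illustrative only). */
@Metrics(name = "BalancerActivity", about = "Balancer progress", context = "dfs")
class BalancerMetricsSketch {
  @Metric("Bytes moved by the balancer so far")
  MutableCounterLong bytesMoved;

  @Metric("Bytes still left to move to reach the threshold")
  MutableGaugeLong bytesLeftToMove;

  static BalancerMetricsSketch create() {
    // Registering the annotated object lets Metrics2 publish the fields to
    // whatever sinks are configured, instead of console-only progress output.
    return DefaultMetricsSystem.instance().register(
        "BalancerActivity", "Balancer progress", new BalancerMetricsSketch());
  }
}
{code}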
[jira] [Commented] (HDFS-15067) Optimize heartbeat for large cluster
[ https://issues.apache.org/jira/browse/HDFS-15067?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16997588#comment-16997588 ] hemanthboyina commented on HDFS-15067: -- Good proposal, [~surendrasingh]. Rather than a fixed 6 seconds, it would be better to make the time period configurable. > Optimize heartbeat for large cluster > > > Key: HDFS-15067 > URL: https://issues.apache.org/jira/browse/HDFS-15067 > Project: Hadoop HDFS > Issue Type: Improvement > Components: datanode >Affects Versions: 3.1.1 >Reporter: Surendra Singh Lilhore >Assignee: Surendra Singh Lilhore >Priority: Major > > In a large cluster the NameNode spends some time processing heartbeats. For > example, in a 10K-node cluster the NameNode processes 10K heartbeat RPCs > every 3 seconds. This impacts client response time. The heartbeat can be > optimized: the DN can start skipping heartbeats if no > work (write/replication/delete) has been allocated to it for a long time, > and send heartbeats every 6 seconds instead. Once the DN starts getting work > from the NN again, it can go back to sending heartbeats normally. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Created] (HDFS-15067) Optimize heartbeat for large cluster
Surendra Singh Lilhore created HDFS-15067: - Summary: Optimize heartbeat for large cluster Key: HDFS-15067 URL: https://issues.apache.org/jira/browse/HDFS-15067 Project: Hadoop HDFS Issue Type: Improvement Components: datanode Affects Versions: 3.1.1 Reporter: Surendra Singh Lilhore Assignee: Surendra Singh Lilhore In a large cluster the NameNode spends some time processing heartbeats. For example, in a 10K-node cluster the NameNode processes 10K heartbeat RPCs every 3 seconds. This impacts client response time. The heartbeat can be optimized: the DN can start skipping heartbeats if no work (write/replication/delete) has been allocated to it for a long time, and send heartbeats every 6 seconds instead. Once the DN starts getting work from the NN again, it can go back to sending heartbeats normally. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
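A self-contained sketch of the policy described above; the class name, thresholds, and intervals are illustrative, and (as suggested in the earlier comment) the values would likely come from configuration rather than being hard-coded.

{code:java}
/** Illustrative sketch of the proposal, not an actual patch. */
class AdaptiveHeartbeatPolicy {
  private final long normalIntervalMs;  // e.g. 3000 ms today
  private final long idleIntervalMs;    // e.g. 6000 ms, or a configured value
  private final long idleThresholdMs;   // how long without work counts as idle
  private long lastWorkAssignedMs;

  AdaptiveHeartbeatPolicy(long normalIntervalMs, long idleIntervalMs,
      long idleThresholdMs, long nowMs) {
    this.normalIntervalMs = normalIntervalMs;
    this.idleIntervalMs = idleIntervalMs;
    this.idleThresholdMs = idleThresholdMs;
    this.lastWorkAssignedMs = nowMs;
  }

  /** Call when the NameNode returns a write/replication/delete command. */
  void onWorkAssigned(long nowMs) {
    lastWorkAssignedMs = nowMs;
  }

  /** Interval to wait before sending the next heartbeat. */
  long nextHeartbeatIntervalMs(long nowMs) {
    boolean idle = nowMs - lastWorkAssignedMs >= idleThresholdMs;
    return idle ? idleIntervalMs : normalIntervalMs;
  }
}
{code}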
[jira] [Commented] (HDFS-15054) Delete Snapshot not updating new modification time
[ https://issues.apache.org/jira/browse/HDFS-15054?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16997529#comment-16997529 ] hemanthboyina commented on HDFS-15054: -- Thanks for the comment, [~elgoiri]. The failed test cases are unrelated. > Delete Snapshot not updating new modification time > -- > > Key: HDFS-15054 > URL: https://issues.apache.org/jira/browse/HDFS-15054 > Project: Hadoop HDFS > Issue Type: Bug >Reporter: hemanthboyina >Assignee: hemanthboyina >Priority: Major > Attachments: HDFS-15054.001.patch, HDFS-15054.002.patch > > > On creating a snapshot, we set the modification time for the snapshot and > also update the modification time of the directory the snapshot was created on: > {code:java} > snapshotRoot.updateModificationTime(now, Snapshot.CURRENT_STATE_ID); > s.getRoot().setModificationTime(now, Snapshot.CURRENT_STATE_ID); {code} > So on deleting a snapshot, we should likewise update the modification time of > the directory the snapshot was created on. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-15062) Add LOG when sendIBRs failed
[ https://issues.apache.org/jira/browse/HDFS-15062?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16997515#comment-16997515 ] Íñigo Goiri commented on HDFS-15062: Not that it makes much of a difference with warn and this setting, but it makes sense to use the logger format: {code} LOG.warn("Failed to call blockReceivedAndDeleted: {}", Arrays.toString(reports)); {code} > Add LOG when sendIBRs failed > > > Key: HDFS-15062 > URL: https://issues.apache.org/jira/browse/HDFS-15062 > Project: Hadoop HDFS > Issue Type: Improvement > Components: datanode >Affects Versions: 3.0.3, 3.2.1, 3.1.3 >Reporter: Fei Hui >Assignee: Fei Hui >Priority: Major > Attachments: HDFS-15062.001.patch > > > {code} > /** Send IBRs to namenode. */ > void sendIBRs(DatanodeProtocol namenode, DatanodeRegistration registration, > String bpid, String nnRpcLatencySuffix) throws IOException { > // Generate a list of the pending reports for each storage under the lock > final StorageReceivedDeletedBlocks[] reports = generateIBRs(); > if (reports.length == 0) { > // Nothing new to report. > return; > } > // Send incremental block reports to the Namenode outside the lock > if (LOG.isDebugEnabled()) { > LOG.debug("call blockReceivedAndDeleted: " + Arrays.toString(reports)); > } > boolean success = false; > final long startTime = monotonicNow(); > try { > namenode.blockReceivedAndDeleted(registration, bpid, reports); > success = true; > } finally { > if (success) { > dnMetrics.addIncrementalBlockReport(monotonicNow() - startTime, > nnRpcLatencySuffix); > lastIBR = startTime; > } else { > // If we didn't succeed in sending the report, put all of the > // blocks back onto our queue, but only in the case where we > // didn't put something newer in the meantime. > putMissing(reports); > } > } > } > {code} > When the call to namenode.blockReceivedAndDeleted fails, the reports are > put back into pendingIBRs. Maybe we should add a log for the failed case; > it would be helpful for troubleshooting. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
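Putting the suggestion in context, the warning would land in the failure branch of the {{finally}} block of sendIBRs quoted in the description. A sketch of just that portion follows; only the else branch changes, and the exact message wording is illustrative.

{code:java}
boolean success = false;
final long startTime = monotonicNow();
try {
  namenode.blockReceivedAndDeleted(registration, bpid, reports);
  success = true;
} finally {
  if (success) {
    dnMetrics.addIncrementalBlockReport(monotonicNow() - startTime,
        nnRpcLatencySuffix);
    lastIBR = startTime;
  } else {
    // New: make the failure visible before re-queueing, using the
    // parameterized logger format suggested above.
    LOG.warn("Failed to call blockReceivedAndDeleted: {}",
        Arrays.toString(reports));
    putMissing(reports);
  }
}
{code}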
[jira] [Commented] (HDFS-15054) Delete Snapshot not updating new modification time
[ https://issues.apache.org/jira/browse/HDFS-15054?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16997505#comment-16997505 ] Íñigo Goiri commented on HDFS-15054: I don't think the failed unit tests are related. [~hemanthboyina], do you mind double checking? > Delete Snapshot not updating new modification time > -- > > Key: HDFS-15054 > URL: https://issues.apache.org/jira/browse/HDFS-15054 > Project: Hadoop HDFS > Issue Type: Bug >Reporter: hemanthboyina >Assignee: hemanthboyina >Priority: Major > Attachments: HDFS-15054.001.patch, HDFS-15054.002.patch > > > On creating a snapshot, we set the modification time for the snapshot and > also update the modification time of the directory the snapshot was created on: > {code:java} > snapshotRoot.updateModificationTime(now, Snapshot.CURRENT_STATE_ID); > s.getRoot().setModificationTime(now, Snapshot.CURRENT_STATE_ID); {code} > So on deleting a snapshot, we should likewise update the modification time of > the directory the snapshot was created on. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-15051) RBF: Propose to revoke WRITE MountTableEntry privilege to super user only
[ https://issues.apache.org/jira/browse/HDFS-15051?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16997474#comment-16997474 ] Hadoop QA commented on HDFS-15051: -- | (/) *{color:green}+1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 44s{color} | {color:blue} Docker mode activated. {color} | || || || || {color:brown} Prechecks {color} || | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s{color} | {color:green} The patch does not contain any @author tags. {color} | | {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 0s{color} | {color:green} The patch appears to include 1 new or modified test files. {color} | || || || || {color:brown} trunk Compile Tests {color} || | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 19m 5s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 29s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 18s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 32s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 13m 25s{color} | {color:green} branch has no errors when building and testing our client artifacts. {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 0m 55s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 48s{color} | {color:green} trunk passed {color} | || || || || {color:brown} Patch Compile Tests {color} || | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 0m 27s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 24s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 24s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 14s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 27s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s{color} | {color:green} The patch has no whitespace issues. {color} | | {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 13m 47s{color} | {color:green} patch has no errors when building and testing our client artifacts. {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 1m 0s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 44s{color} | {color:green} the patch passed {color} | || || || || {color:brown} Other Tests {color} || | {color:green}+1{color} | {color:green} unit {color} | {color:green} 7m 14s{color} | {color:green} hadoop-hdfs-rbf in the patch passed. {color} | | {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 25s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} | | {color:black}{color} | {color:black} {color} | {color:black} 61m 50s{color} | {color:black} {color} | \\ \\ || Subsystem || Report/Notes || | Docker | Client=19.03.5 Server=19.03.5 Image:yetus/hadoop:e573ea49085 | | JIRA Issue | HDFS-15051 | | JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12988941/HDFS-15051.004.patch | | Optional Tests | dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient findbugs checkstyle | | uname | Linux 067ddc215241 4.15.0-66-generic #75-Ubuntu SMP Tue Oct 1 05:24:09 UTC 2019 x86_64 x86_64 x86_64 GNU/Linux | | Build tool | maven | | Personality | /testptch/patchprocess/precommit/personality/provided.sh | | git revision | trunk / 3821516 | | maven | version: Apache Maven 3.3.9 | | Default Java | 1.8.0_222 | | findbugs | v3.1.0-RC1 | | Test Results | https://builds.apache.org/job/PreCommit-HDFS-Build/28534/testReport/ | | Max. process+thread count | 2789 (vs. ulimit of 5500) | | modules | C: hadoop-hdfs-project/hadoop-hdfs-rbf U: hadoop-hdfs-project/hadoop-hdfs-rbf | | Console output | https://builds.apache.org/job/PreCommit-HDFS-Build/28534/console | | Powered by | Apache Yetus 0.8.0 http://yetus.apache.org | This message was automatically generated. > RBF: Propose to revoke WRITE MountTableEntry privilege to super user only > - > >
[jira] [Created] (HDFS-15066) HttpFS: Implement setErasureCodingPolicy , unsetErasureCodingPolicy , getErasureCodingPolicy
hemanthboyina created HDFS-15066: Summary: HttpFS: Implement setErasureCodingPolicy , unsetErasureCodingPolicy , getErasureCodingPolicy Key: HDFS-15066 URL: https://issues.apache.org/jira/browse/HDFS-15066 Project: Hadoop HDFS Issue Type: Sub-task Reporter: hemanthboyina Assignee: hemanthboyina -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Assigned] (HDFS-15065) HttpFS : Support Enable / Disable EC policy
[ https://issues.apache.org/jira/browse/HDFS-15065?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] hemanthboyina reassigned HDFS-15065: Assignee: hemanthboyina > HttpFS : Support Enable / Disable EC policy > --- > > Key: HDFS-15065 > URL: https://issues.apache.org/jira/browse/HDFS-15065 > Project: Hadoop HDFS > Issue Type: Sub-task >Reporter: hemanthboyina >Assignee: hemanthboyina >Priority: Major > -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-15063) HttpFS : getFileStatus doesn't return ecPolicy
[ https://issues.apache.org/jira/browse/HDFS-15063?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16997456#comment-16997456 ] Wei-Chiu Chuang commented on HDFS-15063: Good catch. Missed this one in HDFS-14683. > HttpFS : getFileStatus doesn't return ecPolicy > -- > > Key: HDFS-15063 > URL: https://issues.apache.org/jira/browse/HDFS-15063 > Project: Hadoop HDFS > Issue Type: Bug >Reporter: hemanthboyina >Assignee: hemanthboyina >Priority: Major > > Currently a LISTSTATUS call to HttpFS returns JSON, and the jsonArray > elements contain the ecPolicy name. But when HttpFSFileSystem converts them > back into FileStatus objects, the ecPolicy is not added. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Created] (HDFS-15065) HttpFS : Support Enable / Disable EC policy
hemanthboyina created HDFS-15065: Summary: HttpFS : Support Enable / Disable EC policy Key: HDFS-15065 URL: https://issues.apache.org/jira/browse/HDFS-15065 Project: Hadoop HDFS Issue Type: Sub-task Reporter: hemanthboyina -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Created] (HDFS-15064) HttpFS : Implement Erasure Coding Operations
hemanthboyina created HDFS-15064: Summary: HttpFS : Implement Erasure Coding Operations Key: HDFS-15064 URL: https://issues.apache.org/jira/browse/HDFS-15064 Project: Hadoop HDFS Issue Type: Improvement Reporter: hemanthboyina Assignee: hemanthboyina -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Created] (HDFS-15063) HttpFS : getFileStatus doesn't return ecPolicy
hemanthboyina created HDFS-15063: Summary: HttpFS : getFileStatus doesn't return ecPolicy Key: HDFS-15063 URL: https://issues.apache.org/jira/browse/HDFS-15063 Project: Hadoop HDFS Issue Type: Bug Reporter: hemanthboyina Assignee: hemanthboyina Currently a LISTSTATUS call to HttpFS returns JSON, and the jsonArray elements contain the ecPolicy name. But when HttpFSFileSystem converts them back into FileStatus objects, the ecPolicy is not added. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-15051) RBF: Propose to revoke WRITE MountTableEntry privilege to super user only
[ https://issues.apache.org/jira/browse/HDFS-15051?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16997423#comment-16997423 ] Xiaoqiao He commented on HDFS-15051: [^HDFS-15051.004.patch] fixes the checkstyle issues. > RBF: Propose to revoke WRITE MountTableEntry privilege to super user only > - > > Key: HDFS-15051 > URL: https://issues.apache.org/jira/browse/HDFS-15051 > Project: Hadoop HDFS > Issue Type: Sub-task > Components: rbf >Reporter: Xiaoqiao He >Assignee: Xiaoqiao He >Priority: Major > Attachments: HDFS-15051.001.patch, HDFS-15051.002.patch, > HDFS-15051.003.patch, HDFS-15051.004.patch > > > The current permission checker of #MountTableStoreImpl is not very strict. > In some cases, any user could add/update/remove a MountTableEntry without the > expected permission checking. > The following code segment tries to check permissions when operating on a > MountTableEntry; however, the mountTable object comes from the > client/RouterAdmin ({{MountTable mountTable = request.getEntry();}}), so a > user could pass any mode and thereby bypass the permission checker. > {code:java} > public void checkPermission(MountTable mountTable, FsAction access) > throws AccessControlException { > if (isSuperUser()) { > return; > } > FsPermission mode = mountTable.getMode(); > if (getUser().equals(mountTable.getOwnerName()) > && mode.getUserAction().implies(access)) { > return; > } > if (isMemberOfGroup(mountTable.getGroupName()) > && mode.getGroupAction().implies(access)) { > return; > } > if (!getUser().equals(mountTable.getOwnerName()) > && !isMemberOfGroup(mountTable.getGroupName()) > && mode.getOtherAction().implies(access)) { > return; > } > throw new AccessControlException( > "Permission denied while accessing mount table " > + mountTable.getSourcePath() > + ": user " + getUser() + " does not have " + access.toString() > + " permissions."); > } > {code} > I propose to restrict the WRITE MountTableEntry privilege to super users only. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
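A minimal sketch of the proposal in the description, reusing the existing helper methods shown in the quoted permission checker; it is illustrative only and not the attached patch.

{code:java}
/** Only router super users may create/update/remove mount table entries. */
public void checkWritePermission(MountTable mountTable)
    throws AccessControlException {
  if (isSuperUser()) {
    return;
  }
  throw new AccessControlException(
      "Permission denied while modifying mount table "
          + mountTable.getSourcePath() + ": user " + getUser()
          + " is not a super user.");
}
{code}

Read operations could keep the existing owner/group/other logic; only the write path would be narrowed to super users.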
[jira] [Updated] (HDFS-15051) RBF: Propose to revoke WRITE MountTableEntry privilege to super user only
[ https://issues.apache.org/jira/browse/HDFS-15051?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiaoqiao He updated HDFS-15051: --- Attachment: HDFS-15051.004.patch > RBF: Propose to revoke WRITE MountTableEntry privilege to super user only > - > > Key: HDFS-15051 > URL: https://issues.apache.org/jira/browse/HDFS-15051 > Project: Hadoop HDFS > Issue Type: Sub-task > Components: rbf >Reporter: Xiaoqiao He >Assignee: Xiaoqiao He >Priority: Major > Attachments: HDFS-15051.001.patch, HDFS-15051.002.patch, > HDFS-15051.003.patch, HDFS-15051.004.patch > > > The current permission checker of #MountTableStoreImpl is not very strict. > In some cases, any user could add/update/remove a MountTableEntry without the > expected permission checking. > The following code segment tries to check permissions when operating on a > MountTableEntry; however, the mountTable object comes from the > client/RouterAdmin ({{MountTable mountTable = request.getEntry();}}), so a > user could pass any mode and thereby bypass the permission checker. > {code:java} > public void checkPermission(MountTable mountTable, FsAction access) > throws AccessControlException { > if (isSuperUser()) { > return; > } > FsPermission mode = mountTable.getMode(); > if (getUser().equals(mountTable.getOwnerName()) > && mode.getUserAction().implies(access)) { > return; > } > if (isMemberOfGroup(mountTable.getGroupName()) > && mode.getGroupAction().implies(access)) { > return; > } > if (!getUser().equals(mountTable.getOwnerName()) > && !isMemberOfGroup(mountTable.getGroupName()) > && mode.getOtherAction().implies(access)) { > return; > } > throw new AccessControlException( > "Permission denied while accessing mount table " > + mountTable.getSourcePath() > + ": user " + getUser() + " does not have " + access.toString() > + " permissions."); > } > {code} > I propose to restrict the WRITE MountTableEntry privilege to super users only. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-15048) Fix findbug in DirectoryScanner
[ https://issues.apache.org/jira/browse/HDFS-15048?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16997393#comment-16997393 ] Takanobu Asanuma commented on HDFS-15048: - Thanks, [~iwasakims], [~ayushtkn] and [~weichiu]! > Fix findbug in DirectoryScanner > --- > > Key: HDFS-15048 > URL: https://issues.apache.org/jira/browse/HDFS-15048 > Project: Hadoop HDFS > Issue Type: Bug >Reporter: Takanobu Asanuma >Assignee: Masatake Iwasaki >Priority: Major > Fix For: 3.3.0 > > Attachments: HDFS-15048.001.patch > > > There is a findbug in DirectoryScanner. > {noformat} > Multithreaded correctness Warnings > org.apache.hadoop.hdfs.server.datanode.DirectoryScanner.reconcile() calls > Thread.sleep() with a lock held > Bug type SWL_SLEEP_WITH_LOCK_HELD (click for details) > In class org.apache.hadoop.hdfs.server.datanode.DirectoryScanner > In method org.apache.hadoop.hdfs.server.datanode.DirectoryScanner.reconcile() > At DirectoryScanner.java:[line 441] > {noformat} > https://builds.apache.org/job/PreCommit-HDFS-Build/28498/artifact/out/branch-findbugs-hadoop-hdfs-project_hadoop-hdfs-warnings.html -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-15051) RBF: Propose to revoke WRITE MountTableEntry privilege to super user only
[ https://issues.apache.org/jira/browse/HDFS-15051?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16997372#comment-16997372 ] Hadoop QA commented on HDFS-15051: -- | (/) *{color:green}+1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 42s{color} | {color:blue} Docker mode activated. {color} | || || || || {color:brown} Prechecks {color} || | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s{color} | {color:green} The patch does not contain any @author tags. {color} | | {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 0s{color} | {color:green} The patch appears to include 1 new or modified test files. {color} | || || || || {color:brown} trunk Compile Tests {color} || | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 19m 26s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 29s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 21s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 32s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 13m 30s{color} | {color:green} branch has no errors when building and testing our client artifacts. {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 0m 56s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 46s{color} | {color:green} trunk passed {color} | || || || || {color:brown} Patch Compile Tests {color} || | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 0m 28s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 23s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 23s{color} | {color:green} the patch passed {color} | | {color:orange}-0{color} | {color:orange} checkstyle {color} | {color:orange} 0m 14s{color} | {color:orange} hadoop-hdfs-project/hadoop-hdfs-rbf: The patch generated 3 new + 0 unchanged - 0 fixed = 3 total (was 0) {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 26s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s{color} | {color:green} The patch has no whitespace issues. {color} | | {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 13m 44s{color} | {color:green} patch has no errors when building and testing our client artifacts. {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 1m 1s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 44s{color} | {color:green} the patch passed {color} | || || || || {color:brown} Other Tests {color} || | {color:green}+1{color} | {color:green} unit {color} | {color:green} 7m 16s{color} | {color:green} hadoop-hdfs-rbf in the patch passed. 
{color} | | {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 26s{color} | {color:green} The patch does not generate ASF License warnings. {color} | | {color:black}{color} | {color:black} {color} | {color:black} 62m 6s{color} | {color:black} {color} | \\ \\ || Subsystem || Report/Notes || | Docker | Client=19.03.5 Server=19.03.5 Image:yetus/hadoop:e573ea49085 | | JIRA Issue | HDFS-15051 | | JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12988933/HDFS-15051.003.patch | | Optional Tests | dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient findbugs checkstyle | | uname | Linux 0aab7771482a 4.15.0-66-generic #75-Ubuntu SMP Tue Oct 1 05:24:09 UTC 2019 x86_64 x86_64 x86_64 GNU/Linux | | Build tool | maven | | Personality | /testptch/patchprocess/precommit/personality/provided.sh | | git revision | trunk / 3821516 | | maven | version: Apache Maven 3.3.9 | | Default Java | 1.8.0_222 | | findbugs | v3.1.0-RC1 | | checkstyle | https://builds.apache.org/job/PreCommit-HDFS-Build/28533/artifact/out/diff-checkstyle-hadoop-hdfs-project_hadoop-hdfs-rbf.txt | | Test Results | https://builds.apache.org/job/PreCommit-HDFS-Build/28533/testReport/ | | Max. process+thread count | 2757 (vs. ulimit of 5500) | | modules | C: hadoop-hdfs-project/hadoop-hdfs-rbf U: hadoop-hdfs-project/hadoop-hdfs-rbf | | Console output | https://builds.apache.org/job/PreCommit-HDFS-Build/28533/console | | Powered by | Apache Yetus
[jira] [Commented] (HDFS-15051) RBF: Propose to revoke WRITE MountTableEntry privilege to super user only
[ https://issues.apache.org/jira/browse/HDFS-15051?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16997311#comment-16997311 ] Xiaoqiao He commented on HDFS-15051: If there is no objection, [^HDFS-15051.003.patch] tries to fix the permission check logic based on the current implementation. > RBF: Propose to revoke WRITE MountTableEntry privilege to super user only > - > > Key: HDFS-15051 > URL: https://issues.apache.org/jira/browse/HDFS-15051 > Project: Hadoop HDFS > Issue Type: Sub-task > Components: rbf >Reporter: Xiaoqiao He >Assignee: Xiaoqiao He >Priority: Major > Attachments: HDFS-15051.001.patch, HDFS-15051.002.patch, > HDFS-15051.003.patch > > > The current permission checker of #MountTableStoreImpl is not very strict. > In some cases, any user could add/update/remove a MountTableEntry without the > expected permission checking. > The following code segment tries to check permissions when operating on a > MountTableEntry; however, the mountTable object comes from the > client/RouterAdmin ({{MountTable mountTable = request.getEntry();}}), so a > user could pass any mode and thereby bypass the permission checker. > {code:java} > public void checkPermission(MountTable mountTable, FsAction access) > throws AccessControlException { > if (isSuperUser()) { > return; > } > FsPermission mode = mountTable.getMode(); > if (getUser().equals(mountTable.getOwnerName()) > && mode.getUserAction().implies(access)) { > return; > } > if (isMemberOfGroup(mountTable.getGroupName()) > && mode.getGroupAction().implies(access)) { > return; > } > if (!getUser().equals(mountTable.getOwnerName()) > && !isMemberOfGroup(mountTable.getGroupName()) > && mode.getOtherAction().implies(access)) { > return; > } > throw new AccessControlException( > "Permission denied while accessing mount table " > + mountTable.getSourcePath() > + ": user " + getUser() + " does not have " + access.toString() > + " permissions."); > } > {code} > I propose to restrict the WRITE MountTableEntry privilege to super users only. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Updated] (HDFS-15051) RBF: Propose to revoke WRITE MountTableEntry privilege to super user only
[ https://issues.apache.org/jira/browse/HDFS-15051?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiaoqiao He updated HDFS-15051: --- Attachment: HDFS-15051.003.patch > RBF: Propose to revoke WRITE MountTableEntry privilege to super user only > - > > Key: HDFS-15051 > URL: https://issues.apache.org/jira/browse/HDFS-15051 > Project: Hadoop HDFS > Issue Type: Sub-task > Components: rbf >Reporter: Xiaoqiao He >Assignee: Xiaoqiao He >Priority: Major > Attachments: HDFS-15051.001.patch, HDFS-15051.002.patch, > HDFS-15051.003.patch > > > The current permission checker of #MountTableStoreImpl is not very strict. > In some cases, any user could add/update/remove a MountTableEntry without the > expected permission checking. > The following code segment tries to check permissions when operating on a > MountTableEntry; however, the mountTable object comes from the Client/RouterAdmin > ({{MountTable mountTable = request.getEntry();}}), so the user could pass any mode > and bypass the permission checker. > {code:java} > public void checkPermission(MountTable mountTable, FsAction access) > throws AccessControlException { > if (isSuperUser()) { > return; > } > FsPermission mode = mountTable.getMode(); > if (getUser().equals(mountTable.getOwnerName()) > && mode.getUserAction().implies(access)) { > return; > } > if (isMemberOfGroup(mountTable.getGroupName()) > && mode.getGroupAction().implies(access)) { > return; > } > if (!getUser().equals(mountTable.getOwnerName()) > && !isMemberOfGroup(mountTable.getGroupName()) > && mode.getOtherAction().implies(access)) { > return; > } > throw new AccessControlException( > "Permission denied while accessing mount table " > + mountTable.getSourcePath() > + ": user " + getUser() + " does not have " + access.toString() > + " permissions."); > } > {code} > I propose to revoke the WRITE MountTableEntry privilege to super user only. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-15062) Add LOG when sendIBRs failed
[ https://issues.apache.org/jira/browse/HDFS-15062?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16997294#comment-16997294 ] Hadoop QA commented on HDFS-15062: -- | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 43s{color} | {color:blue} Docker mode activated. {color} | || || || || {color:brown} Prechecks {color} || | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s{color} | {color:green} The patch does not contain any @author tags. {color} | | {color:red}-1{color} | {color:red} test4tests {color} | {color:red} 0m 0s{color} | {color:red} The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. {color} | || || || || {color:brown} trunk Compile Tests {color} || | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 19m 21s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 59s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 45s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 1m 3s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 14m 38s{color} | {color:green} branch has no errors when building and testing our client artifacts. {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 2m 13s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 11s{color} | {color:green} trunk passed {color} | || || || || {color:brown} Patch Compile Tests {color} || | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 1m 0s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 54s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 54s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 38s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 1m 1s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s{color} | {color:green} The patch has no whitespace issues. {color} | | {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 13m 27s{color} | {color:green} patch has no errors when building and testing our client artifacts. {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 2m 17s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 8s{color} | {color:green} the patch passed {color} | || || || || {color:brown} Other Tests {color} || | {color:red}-1{color} | {color:red} unit {color} | {color:red}103m 12s{color} | {color:red} hadoop-hdfs in the patch failed. 
{color} | | {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 32s{color} | {color:green} The patch does not generate ASF License warnings. {color} | | {color:black}{color} | {color:black} {color} | {color:black}164m 52s{color} | {color:black} {color} | \\ \\ || Reason || Tests || | Failed junit tests | hadoop.hdfs.TestDeadNodeDetection | | | hadoop.hdfs.server.blockmanagement.TestBlockTokenWithDFSStriped | \\ \\ || Subsystem || Report/Notes || | Docker | Client=19.03.5 Server=19.03.5 Image:yetus/hadoop:e573ea49085 | | JIRA Issue | HDFS-15062 | | JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12988913/HDFS-15062.001.patch | | Optional Tests | dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient findbugs checkstyle | | uname | Linux 7c1e2b89cbc7 4.15.0-66-generic #75-Ubuntu SMP Tue Oct 1 05:24:09 UTC 2019 x86_64 x86_64 x86_64 GNU/Linux | | Build tool | maven | | Personality | /testptch/patchprocess/precommit/personality/provided.sh | | git revision | trunk / dc6cf17 | | maven | version: Apache Maven 3.3.9 | | Default Java | 1.8.0_222 | | findbugs | v3.1.0-RC1 | | unit | https://builds.apache.org/job/PreCommit-HDFS-Build/28532/artifact/out/patch-unit-hadoop-hdfs-project_hadoop-hdfs.txt | | Test Results | https://builds.apache.org/job/PreCommit-HDFS-Build/28532/testReport/ | | Max. process+thread count | 2778 (vs. ulimit of 5500) | | modules | C: hadoop-hdfs-project/hadoop-h
[jira] [Commented] (HDFS-15062) Add LOG when sendIBRs failed
[ https://issues.apache.org/jira/browse/HDFS-15062?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16997224#comment-16997224 ] Fei Hui commented on HDFS-15062: [~weichiu] [~ayushtkn] Could you please take a look? Thanks. > Add LOG when sendIBRs failed > > > Key: HDFS-15062 > URL: https://issues.apache.org/jira/browse/HDFS-15062 > Project: Hadoop HDFS > Issue Type: Improvement > Components: datanode >Affects Versions: 3.0.3, 3.2.1, 3.1.3 >Reporter: Fei Hui >Assignee: Fei Hui >Priority: Major > Attachments: HDFS-15062.001.patch > > > {code} > /** Send IBRs to namenode. */ > void sendIBRs(DatanodeProtocol namenode, DatanodeRegistration registration, > String bpid, String nnRpcLatencySuffix) throws IOException { > // Generate a list of the pending reports for each storage under the lock > final StorageReceivedDeletedBlocks[] reports = generateIBRs(); > if (reports.length == 0) { > // Nothing new to report. > return; > } > // Send incremental block reports to the Namenode outside the lock > if (LOG.isDebugEnabled()) { > LOG.debug("call blockReceivedAndDeleted: " + Arrays.toString(reports)); > } > boolean success = false; > final long startTime = monotonicNow(); > try { > namenode.blockReceivedAndDeleted(registration, bpid, reports); > success = true; > } finally { > if (success) { > dnMetrics.addIncrementalBlockReport(monotonicNow() - startTime, > nnRpcLatencySuffix); > lastIBR = startTime; > } else { > // If we didn't succeed in sending the report, put all of the > // blocks back onto our queue, but only in the case where we > // didn't put something newer in the meantime. > putMissing(reports); > } > } > } > {code} > When the call to namenode.blockReceivedAndDeleted fails, the reports are put back into > pendingIBRs. Maybe we should add a log for the failed case; it is helpful for > troubleshooting. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
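For illustration, a log in the failure branch could look like the sketch below. This is only a sketch of the idea, not the contents of [^HDFS-15062.001.patch]; it reuses the names visible in the snippet above ({{LOG}}, {{reports}}, {{bpid}}, {{putMissing}}) and assumes nothing else.
{code:java}
// Sketch only -- not the actual HDFS-15062.001.patch. Same finally block as
// in the quoted sendIBRs(), with a warning added before the reports are
// re-queued.
} finally {
  if (success) {
    dnMetrics.addIncrementalBlockReport(monotonicNow() - startTime,
        nnRpcLatencySuffix);
    lastIBR = startTime;
  } else {
    // Make repeated IBR failures visible in the DataNode log before
    // putting the reports back onto the pending queue.
    LOG.warn("Failed to send " + reports.length
        + " incremental block report(s) to namenode for block pool " + bpid
        + "; re-queueing them.");
    // If we didn't succeed in sending the report, put all of the
    // blocks back onto our queue, but only in the case where we
    // didn't put something newer in the meantime.
    putMissing(reports);
  }
}
{code}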
[jira] [Updated] (HDFS-15062) Add LOG when sendIBRs failed
[ https://issues.apache.org/jira/browse/HDFS-15062?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Fei Hui updated HDFS-15062: --- Status: Patch Available (was: Open) > Add LOG when sendIBRs failed > > > Key: HDFS-15062 > URL: https://issues.apache.org/jira/browse/HDFS-15062 > Project: Hadoop HDFS > Issue Type: Improvement > Components: datanode >Affects Versions: 3.1.3, 3.2.1, 3.0.3 >Reporter: Fei Hui >Assignee: Fei Hui >Priority: Major > Attachments: HDFS-15062.001.patch > > > {code} > /** Send IBRs to namenode. */ > void sendIBRs(DatanodeProtocol namenode, DatanodeRegistration registration, > String bpid, String nnRpcLatencySuffix) throws IOException { > // Generate a list of the pending reports for each storage under the lock > final StorageReceivedDeletedBlocks[] reports = generateIBRs(); > if (reports.length == 0) { > // Nothing new to report. > return; > } > // Send incremental block reports to the Namenode outside the lock > if (LOG.isDebugEnabled()) { > LOG.debug("call blockReceivedAndDeleted: " + Arrays.toString(reports)); > } > boolean success = false; > final long startTime = monotonicNow(); > try { > namenode.blockReceivedAndDeleted(registration, bpid, reports); > success = true; > } finally { > if (success) { > dnMetrics.addIncrementalBlockReport(monotonicNow() - startTime, > nnRpcLatencySuffix); > lastIBR = startTime; > } else { > // If we didn't succeed in sending the report, put all of the > // blocks back onto our queue, but only in the case where we > // didn't put something newer in the meantime. > putMissing(reports); > } > } > } > {code} > When the call to namenode.blockReceivedAndDeleted fails, the reports are put back into > pendingIBRs. Maybe we should add a log for the failed case; it is helpful for > troubleshooting. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Updated] (HDFS-15048) Fix findbug in DirectoryScanner
[ https://issues.apache.org/jira/browse/HDFS-15048?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Masatake Iwasaki updated HDFS-15048: Fix Version/s: 3.3.0 Hadoop Flags: Reviewed Resolution: Fixed Status: Resolved (was: Patch Available) Committed. Thanks, [~ayushtkn]. > Fix findbug in DirectoryScanner > --- > > Key: HDFS-15048 > URL: https://issues.apache.org/jira/browse/HDFS-15048 > Project: Hadoop HDFS > Issue Type: Bug >Reporter: Takanobu Asanuma >Assignee: Masatake Iwasaki >Priority: Major > Fix For: 3.3.0 > > Attachments: HDFS-15048.001.patch > > > There is a findbug in DirectoryScanner. > {noformat} > Multithreaded correctness Warnings > org.apache.hadoop.hdfs.server.datanode.DirectoryScanner.reconcile() calls > Thread.sleep() with a lock held > Bug type SWL_SLEEP_WITH_LOCK_HELD (click for details) > In class org.apache.hadoop.hdfs.server.datanode.DirectoryScanner > In method org.apache.hadoop.hdfs.server.datanode.DirectoryScanner.reconcile() > At DirectoryScanner.java:[line 441] > {noformat} > https://builds.apache.org/job/PreCommit-HDFS-Build/28498/artifact/out/branch-findbugs-hadoop-hdfs-project_hadoop-hdfs-warnings.html -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-15048) Fix findbug in DirectoryScanner
[ https://issues.apache.org/jira/browse/HDFS-15048?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16997133#comment-16997133 ] Hudson commented on HDFS-15048: --- SUCCESS: Integrated in Jenkins build Hadoop-trunk-Commit #17766 (See [https://builds.apache.org/job/Hadoop-trunk-Commit/17766/]) HDFS-15048. Fix findbug in DirectoryScanner. (iwasakims: rev dc6cf17b3405a5f03b75b1f7bf3b9e79663deaf1) * (edit) hadoop-hdfs-project/hadoop-hdfs/dev-support/findbugsExcludeFile.xml > Fix findbug in DirectoryScanner > --- > > Key: HDFS-15048 > URL: https://issues.apache.org/jira/browse/HDFS-15048 > Project: Hadoop HDFS > Issue Type: Bug >Reporter: Takanobu Asanuma >Assignee: Masatake Iwasaki >Priority: Major > Attachments: HDFS-15048.001.patch > > > There is a findbug in DirectoryScanner. > {noformat} > Multithreaded correctness Warnings > org.apache.hadoop.hdfs.server.datanode.DirectoryScanner.reconcile() calls > Thread.sleep() with a lock held > Bug type SWL_SLEEP_WITH_LOCK_HELD (click for details) > In class org.apache.hadoop.hdfs.server.datanode.DirectoryScanner > In method org.apache.hadoop.hdfs.server.datanode.DirectoryScanner.reconcile() > At DirectoryScanner.java:[line 441] > {noformat} > https://builds.apache.org/job/PreCommit-HDFS-Build/28498/artifact/out/branch-findbugs-hadoop-hdfs-project_hadoop-hdfs-warnings.html -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
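Per the commit above, the warning is handled by adding an entry to findbugsExcludeFile.xml rather than by restructuring reconcile() itself. For context, the general shape that SWL_SLEEP_WITH_LOCK_HELD flags is sketched below; this is illustrative only and is not the actual DirectoryScanner code.
{code:java}
// Illustrative only -- not the actual DirectoryScanner code. findbugs raises
// SWL_SLEEP_WITH_LOCK_HELD when a thread sleeps while still holding a monitor,
// because every other thread waiting on that monitor stays blocked for the
// whole sleep.
public class ThrottledScanner {
  private final Object lock = new Object();

  public void reconcile() throws InterruptedException {
    synchronized (lock) {
      // ... work that genuinely needs the lock ...
      Thread.sleep(1000); // flagged: the lock is still held while sleeping
    }
  }
}
{code}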
[jira] [Commented] (HDFS-15048) Fix findbug in DirectoryScanner
[ https://issues.apache.org/jira/browse/HDFS-15048?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16997121#comment-16997121 ] Ayush Saxena commented on HDFS-15048: - +1 > Fix findbug in DirectoryScanner > --- > > Key: HDFS-15048 > URL: https://issues.apache.org/jira/browse/HDFS-15048 > Project: Hadoop HDFS > Issue Type: Bug >Reporter: Takanobu Asanuma >Assignee: Masatake Iwasaki >Priority: Major > Attachments: HDFS-15048.001.patch > > > There is a findbug in DirectoryScanner. > {noformat} > Multithreaded correctness Warnings > org.apache.hadoop.hdfs.server.datanode.DirectoryScanner.reconcile() calls > Thread.sleep() with a lock held > Bug type SWL_SLEEP_WITH_LOCK_HELD (click for details) > In class org.apache.hadoop.hdfs.server.datanode.DirectoryScanner > In method org.apache.hadoop.hdfs.server.datanode.DirectoryScanner.reconcile() > At DirectoryScanner.java:[line 441] > {noformat} > https://builds.apache.org/job/PreCommit-HDFS-Build/28498/artifact/out/branch-findbugs-hadoop-hdfs-project_hadoop-hdfs-warnings.html -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Updated] (HDFS-15062) Add LOG when sendIBRs failed
[ https://issues.apache.org/jira/browse/HDFS-15062?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Fei Hui updated HDFS-15062: --- Attachment: HDFS-15062.001.patch > Add LOG when sendIBRs failed > > > Key: HDFS-15062 > URL: https://issues.apache.org/jira/browse/HDFS-15062 > Project: Hadoop HDFS > Issue Type: Improvement > Components: datanode >Affects Versions: 3.0.3, 3.2.1, 3.1.3 >Reporter: Fei Hui >Assignee: Fei Hui >Priority: Major > Attachments: HDFS-15062.001.patch > > > {code} > /** Send IBRs to namenode. */ > void sendIBRs(DatanodeProtocol namenode, DatanodeRegistration registration, > String bpid, String nnRpcLatencySuffix) throws IOException { > // Generate a list of the pending reports for each storage under the lock > final StorageReceivedDeletedBlocks[] reports = generateIBRs(); > if (reports.length == 0) { > // Nothing new to report. > return; > } > // Send incremental block reports to the Namenode outside the lock > if (LOG.isDebugEnabled()) { > LOG.debug("call blockReceivedAndDeleted: " + Arrays.toString(reports)); > } > boolean success = false; > final long startTime = monotonicNow(); > try { > namenode.blockReceivedAndDeleted(registration, bpid, reports); > success = true; > } finally { > if (success) { > dnMetrics.addIncrementalBlockReport(monotonicNow() - startTime, > nnRpcLatencySuffix); > lastIBR = startTime; > } else { > // If we didn't succeed in sending the report, put all of the > // blocks back onto our queue, but only in the case where we > // didn't put something newer in the meantime. > putMissing(reports); > } > } > } > {code} > When the call to namenode.blockReceivedAndDeleted fails, the reports are put back into > pendingIBRs. Maybe we should add a log for the failed case; it is helpful for > troubleshooting. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Created] (HDFS-15062) Add LOG when sendIBRs failed
Fei Hui created HDFS-15062: -- Summary: Add LOG when sendIBRs failed Key: HDFS-15062 URL: https://issues.apache.org/jira/browse/HDFS-15062 Project: Hadoop HDFS Issue Type: Improvement Components: datanode Affects Versions: 3.1.3, 3.2.1, 3.0.3 Reporter: Fei Hui Assignee: Fei Hui {code} /** Send IBRs to namenode. */ void sendIBRs(DatanodeProtocol namenode, DatanodeRegistration registration, String bpid, String nnRpcLatencySuffix) throws IOException { // Generate a list of the pending reports for each storage under the lock final StorageReceivedDeletedBlocks[] reports = generateIBRs(); if (reports.length == 0) { // Nothing new to report. return; } // Send incremental block reports to the Namenode outside the lock if (LOG.isDebugEnabled()) { LOG.debug("call blockReceivedAndDeleted: " + Arrays.toString(reports)); } boolean success = false; final long startTime = monotonicNow(); try { namenode.blockReceivedAndDeleted(registration, bpid, reports); success = true; } finally { if (success) { dnMetrics.addIncrementalBlockReport(monotonicNow() - startTime, nnRpcLatencySuffix); lastIBR = startTime; } else { // If we didn't succeed in sending the report, put all of the // blocks back onto our queue, but only in the case where we // didn't put something newer in the meantime. putMissing(reports); } } } {code} When the call to namenode.blockReceivedAndDeleted fails, the reports are put back into pendingIBRs. Maybe we should add a log for the failed case; it is helpful for troubleshooting. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org