[jira] [Work logged] (HDFS-16004) BackupNode and QJournal lack Permission check.
[ https://issues.apache.org/jira/browse/HDFS-16004?focusedWorklogId=593658&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-593658 ]

ASF GitHub Bot logged work on HDFS-16004:
-----------------------------------------
Author: ASF GitHub Bot
Created on: 08/May/21 05:55
Start Date: 08/May/21 05:55
Worklog Time Spent: 10m

Work Description: hadoop-yetus commented on pull request #2966:
URL: https://github.com/apache/hadoop/pull/2966#issuecomment-835124045

:broken_heart: **-1 overall**

| Vote | Subsystem | Runtime | Logfile | Comment |
|:----:|----------:|:--------|:-------:|:-------:|
| +0 :ok: | reexec | 0m 42s | | Docker mode activated. |
|||| _ Prechecks _ |
| +1 :green_heart: | dupname | 0m 0s | | No case conflicting files found. |
| +0 :ok: | codespell | 0m 0s | | codespell was not available. |
| +1 :green_heart: | @author | 0m 0s | | The patch does not contain any @author tags. |
| -1 :x: | test4tests | 0m 0s | | The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. |
|||| _ trunk Compile Tests _ |
| -1 :x: | mvninstall | 6m 8s | [/branch-mvninstall-root.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-2966/2/artifact/out/branch-mvninstall-root.txt) | root in trunk failed. |
| -1 :x: | compile | 0m 23s | [/branch-compile-hadoop-hdfs-project_hadoop-hdfs-jdkUbuntu-11.0.10+9-Ubuntu-0ubuntu1.20.04.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-2966/2/artifact/out/branch-compile-hadoop-hdfs-project_hadoop-hdfs-jdkUbuntu-11.0.10+9-Ubuntu-0ubuntu1.20.04.txt) | hadoop-hdfs in trunk failed with JDK Ubuntu-11.0.10+9-Ubuntu-0ubuntu1.20.04. |
| -1 :x: | compile | 0m 10s | [/branch-compile-hadoop-hdfs-project_hadoop-hdfs-jdkPrivateBuild-1.8.0_282-8u282-b08-0ubuntu1~20.04-b08.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-2966/2/artifact/out/branch-compile-hadoop-hdfs-project_hadoop-hdfs-jdkPrivateBuild-1.8.0_282-8u282-b08-0ubuntu1~20.04-b08.txt) | hadoop-hdfs in trunk failed with JDK Private Build-1.8.0_282-8u282-b08-0ubuntu1~20.04-b08. |
| -0 :warning: | checkstyle | 0m 20s | [/buildtool-branch-checkstyle-hadoop-hdfs-project_hadoop-hdfs.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-2966/2/artifact/out/buildtool-branch-checkstyle-hadoop-hdfs-project_hadoop-hdfs.txt) | The patch fails to run checkstyle in hadoop-hdfs. |
| -1 :x: | mvnsite | 0m 23s | [/branch-mvnsite-hadoop-hdfs-project_hadoop-hdfs.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-2966/2/artifact/out/branch-mvnsite-hadoop-hdfs-project_hadoop-hdfs.txt) | hadoop-hdfs in trunk failed. |
| -1 :x: | javadoc | 0m 23s | [/branch-javadoc-hadoop-hdfs-project_hadoop-hdfs-jdkUbuntu-11.0.10+9-Ubuntu-0ubuntu1.20.04.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-2966/2/artifact/out/branch-javadoc-hadoop-hdfs-project_hadoop-hdfs-jdkUbuntu-11.0.10+9-Ubuntu-0ubuntu1.20.04.txt) | hadoop-hdfs in trunk failed with JDK Ubuntu-11.0.10+9-Ubuntu-0ubuntu1.20.04. |
| -1 :x: | javadoc | 0m 23s | [/branch-javadoc-hadoop-hdfs-project_hadoop-hdfs-jdkPrivateBuild-1.8.0_282-8u282-b08-0ubuntu1~20.04-b08.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-2966/2/artifact/out/branch-javadoc-hadoop-hdfs-project_hadoop-hdfs-jdkPrivateBuild-1.8.0_282-8u282-b08-0ubuntu1~20.04-b08.txt) | hadoop-hdfs in trunk failed with JDK Private Build-1.8.0_282-8u282-b08-0ubuntu1~20.04-b08. |
| -1 :x: | spotbugs | 0m 23s | [/branch-spotbugs-hadoop-hdfs-project_hadoop-hdfs.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-2966/2/artifact/out/branch-spotbugs-hadoop-hdfs-project_hadoop-hdfs.txt) | hadoop-hdfs in trunk failed. |
| +1 :green_heart: | shadedclient | 2m 16s | | branch has no errors when building and testing our client artifacts. |
|||| _ Patch Compile Tests _ |
| -1 :x: | mvninstall | 0m 22s | [/patch-mvninstall-hadoop-hdfs-project_hadoop-hdfs.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-2966/2/artifact/out/patch-mvninstall-hadoop-hdfs-project_hadoop-hdfs.txt) | hadoop-hdfs in the patch failed. |
| -1 :x: | compile | 0m 22s | [/patch-compile-hadoop-hdfs-project_hadoop-hdfs-jdkUbuntu-11.0.10+9-Ubuntu-0ubuntu1.20.04.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-2966/2/artifact/out/patch-compile-hadoop-hdfs-project_hadoop-hdfs-jdkUbuntu-11.0.10+9-Ubuntu-0ubuntu1.20.04.txt) | hadoop-hdfs in the patch failed with JDK Ubuntu-11.0.10+9-Ubuntu-0ubuntu1.20.04. |
| -1 :x: | javac | 0m 22s |
[jira] [Commented] (HDFS-16004) BackupNode and QJournal lack Permission check.
[ https://issues.apache.org/jira/browse/HDFS-16004?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17341178#comment-17341178 ]

lujie commented on HDFS-16004:
-------------------------------
Pushed the new commit. All the interfaces of QJournal are now checked.

> BackupNode and QJournal lack Permission check.
> ----------------------------------------------
>
> Key: HDFS-16004
> URL: https://issues.apache.org/jira/browse/HDFS-16004
> Project: Hadoop HDFS
> Issue Type: Bug
> Reporter: lujie
> Assignee: lujie
> Priority: Critical
> Labels: pull-request-available
> Time Spent: 40m
> Remaining Estimate: 0h
>
> I had some doubts while configuring secure HDFS. I know we have Service Level
> Authorization for protocols such as NamenodeProtocol, DatanodeProtocol, and so
> on. But I could not find such authorization for JournalProtocol after reading
> the code in HDFSPolicyProvider. If we do have it, how can I configure it?
>
> Besides, even though NamenodeProtocol has Service Level Authorization, its
> methods still perform a permission check. Take startCheckpoint in
> NameNodeRpcServer, which implements NamenodeProtocol, for example:
>
> {code:java}
> public NamenodeCommand startCheckpoint(NamenodeRegistration registration)
>     throws IOException {
>   String operationName = "startCheckpoint";
>   checkNNStartup();
>   namesystem.checkSuperuserPrivilege(operationName);
>   ...
> {code}
>
> I found that the methods in BackupNodeRpcServer, which implements
> JournalProtocol, lack such a permission check. See below:
>
> {code:java}
> public void startLogSegment(JournalInfo journalInfo, long epoch,
>     long txid) throws IOException {
>   namesystem.checkOperation(OperationCategory.JOURNAL);
>   verifyJournalRequest(journalInfo);
>   getBNImage().namenodeStartedLogSegment(txid);
> }
>
> @Override
> public void journal(JournalInfo journalInfo, long epoch, long firstTxId,
>     int numTxns, byte[] records) throws IOException {
>   namesystem.checkOperation(OperationCategory.JOURNAL);
>   verifyJournalRequest(journalInfo);
>   getBNImage().journal(firstTxId, numTxns, records);
> }
> {code}
>
> Do we need to add permission checks for them?
>
> Please point out my mistakes if I am wrong or missed something.

--
This message was sent by Atlassian Jira (v8.3.4#803005)
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
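The missing guard described above can be illustrated outside Hadoop. Below is a minimal, self-contained Java sketch of the pattern under discussion: every journal RPC runs a superuser check before touching state, mirroring the namesystem.checkSuperuserPrivilege guard in startCheckpoint. The JournalRpcServer class, its Set-based superuser list, and the explicit caller parameter are hypothetical stand-ins for the real BackupNodeRpcServer, UserGroupInformation, and edit-log machinery; only the shape of the check follows the HDFS code quoted above.

```java
import java.io.IOException;
import java.util.Set;

// Minimal sketch (not the Hadoop implementation): every journal RPC
// first verifies that the caller is a superuser before appending.
public class JournalRpcServer {
    private final Set<String> superusers;
    private final StringBuilder journal = new StringBuilder();

    public JournalRpcServer(Set<String> superusers) {
        this.superusers = superusers;
    }

    // Analogue of namesystem.checkSuperuserPrivilege(operationName).
    private void checkSuperuserPrivilege(String caller, String operationName)
            throws IOException {
        if (!superusers.contains(caller)) {
            throw new IOException("Access denied for user " + caller
                + ": superuser privilege is required for " + operationName);
        }
    }

    // Analogue of BackupNodeRpcServer#journal with the missing guard added.
    public void journal(String caller, long firstTxId, byte[] records)
            throws IOException {
        checkSuperuserPrivilege(caller, "journal"); // the added check
        journal.append(firstTxId).append(':').append(records.length).append(';');
    }

    public String contents() {
        return journal.toString();
    }
}
```

In the real patch the check would consult the RPC caller's UserGroupInformation rather than a name passed as a parameter; the parameter here only keeps the sketch self-contained.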
[jira] [Updated] (HDFS-16004) BackupNode and QJournal lack Permission check.
[ https://issues.apache.org/jira/browse/HDFS-16004?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

lujie updated HDFS-16004:
--------------------------
Summary: BackupNode and QJournal lack Permission check. (was: startLogSegment and journal in BackupNode lack Permission check.)
[jira] [Work logged] (HDFS-16001) TestOfflineEditsViewer.testStored() fails reading negative value of FSEditLogOpCodes
[ https://issues.apache.org/jira/browse/HDFS-16001?focusedWorklogId=593655&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-593655 ]

ASF GitHub Bot logged work on HDFS-16001:
-----------------------------------------
Author: ASF GitHub Bot
Created on: 08/May/21 05:32
Start Date: 08/May/21 05:32
Worklog Time Spent: 10m

Work Description: qizhu-lucas removed a comment on pull request #2989:
URL: https://github.com/apache/hadoop/pull/2989#issuecomment-835112687

+1

--
This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org

Issue Time Tracking
-------------------
Worklog Id: (was: 593655) Time Spent: 1h 10m (was: 1h)

> TestOfflineEditsViewer.testStored() fails reading negative value of FSEditLogOpCodes
> ------------------------------------------------------------------------------------
>
> Key: HDFS-16001
> URL: https://issues.apache.org/jira/browse/HDFS-16001
> Project: Hadoop HDFS
> Issue Type: Bug
> Components: hdfs
> Reporter: Konstantin Shvachko
> Assignee: Akira Ajisaka
> Priority: Blocker
> Labels: pull-request-available
> Fix For: 3.3.1, 3.4.0
>
> Time Spent: 1h 10m
> Remaining Estimate: 0h
>
> {{TestOfflineEditsViewer.testStored()}} fails consistently with an exception:
> {noformat}
> java.io.IOException: Op -54 has size -1314247195, but the minimum op size is 17
> {noformat}
> Seems like there is a corrupt record in the {{editsStored}} file.
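The failure quoted in the issue comes from the edit-log reader rejecting a record whose decoded length is negative. The sketch below is a deliberately simplified, self-contained version of that validation; the framing (one signed opcode byte followed by a 4-byte signed length) is an assumption for illustration, as the real FSEditLogOp layout is more involved.

```java
import java.io.ByteArrayInputStream;
import java.io.DataInputStream;
import java.io.IOException;

// Simplified sketch of the failing check: a corrupt record can decode to a
// negative op size, which must be rejected before any buffer is allocated.
public class OpSizeCheck {
    // Minimum valid op size, matching the number in the error message.
    static final int MIN_OP_SIZE = 17;

    public static int readOpSize(byte[] record) throws IOException {
        DataInputStream in = new DataInputStream(new ByteArrayInputStream(record));
        byte opCode = in.readByte();
        int opSize = in.readInt(); // signed: corrupt bytes can decode as negative
        if (opSize < MIN_OP_SIZE) {
            throw new IOException("Op " + opCode + " has size " + opSize
                + ", but the minimum op size is " + MIN_OP_SIZE);
        }
        return opSize;
    }
}
```

Feeding it the values from the Jira report (opcode -54, size -1314247195) reproduces the exception text from the test failure.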
[jira] [Work logged] (HDFS-16001) TestOfflineEditsViewer.testStored() fails reading negative value of FSEditLogOpCodes
[ https://issues.apache.org/jira/browse/HDFS-16001?focusedWorklogId=593652&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-593652 ]

ASF GitHub Bot logged work on HDFS-16001:
-----------------------------------------
Created on: 08/May/21 05:30
Worklog Time Spent: 10m

Work Description: qizhu-lucas commented on pull request #2989:
URL: https://github.com/apache/hadoop/pull/2989#issuecomment-835112687

+1

Issue Time Tracking
-------------------
Worklog Id: (was: 593652) Time Spent: 1h (was: 50m)
[jira] [Comment Edited] (HDFS-16004) startLogSegment and journal in BackupNode lack Permission check.
[ https://issues.apache.org/jira/browse/HDFS-16004?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17341165#comment-17341165 ]

lujie edited comment on HDFS-16004 at 5/8/21, 4:42 AM:
--------------------------------------------------------
[~hexiaoqiao] I also think we need to add permission checks for QJournal (including InterQJournal) and am working on a new patch!

was (Author: xiaoheipangzi):
[~hexiaoqiao] I also think we need to add permission checks for QJournal and am working on a new patch!
[jira] [Commented] (HDFS-16004) startLogSegment and journal in BackupNode lack Permission check.
[ https://issues.apache.org/jira/browse/HDFS-16004?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17341165#comment-17341165 ]

lujie commented on HDFS-16004:
-------------------------------
[~hexiaoqiao] I also think we need to add permission checks for QJournal and am working on a new patch!
[jira] [Commented] (HDFS-16004) startLogSegment and journal in BackupNode lack Permission check.
[ https://issues.apache.org/jira/browse/HDFS-16004?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17341112#comment-17341112 ]

Xiaoqiao He commented on HDFS-16004:
-------------------------------------
Assigned this issue to [~xiaoheipangzi]. Thanks [~xiaoheipangzi] for your report; it is a good catch in my opinion. BTW, I think we should cover every RPC interface of BackupNode and QJournal, because I just found that both of them have no permission checks. cc [~shv], do you mind giving it another check? Thanks.
[jira] [Assigned] (HDFS-16004) startLogSegment and journal in BackupNode lack Permission check.
[ https://issues.apache.org/jira/browse/HDFS-16004?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Xiaoqiao He reassigned HDFS-16004:
-----------------------------------
Assignee: lujie
[jira] [Commented] (HDFS-16011) HDFS: Support viewfs nested mount
[ https://issues.apache.org/jira/browse/HDFS-16011?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17341109#comment-17341109 ]

zhu commented on HDFS-16011:
-----------------------------
[~hexiaoqiao], [~umamaheswararao], [~ayushtkn] Looking forward to your comments.

> HDFS: Support viewfs nested mount
> ---------------------------------
>
> Key: HDFS-16011
> URL: https://issues.apache.org/jira/browse/HDFS-16011
> Project: Hadoop HDFS
> Issue Type: Improvement
> Components: hdfs, viewfs
> Affects Versions: 3.2.2, 3.3.1
> Reporter: zhu
> Assignee: zhu
> Priority: Major
> Labels: pull-request-available
> Time Spent: 20m
> Remaining Estimate: 0h
>
> The current viewfs does not support nested mount points.
> *E.g:*
> 1. /home/ => /home/
> 2. /home/work => /home/work
> If mount point 1 is loaded, mount point 2 cannot be added, and the following
> exception will be thrown when loading 2:
> {code:java}
> throw new FileAlreadyExistsException("Path " + nextInode.fullPath +
>     " already exists as link");
> {code}
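The restriction the issue describes boils down to a path-prefix test during mount-table construction: a new link is rejected when its path nests inside an existing link's path, or vice versa. The sketch below is a rough, self-contained approximation of that rule; the MountTable class and its flat list of links are hypothetical simplifications, not the viewfs InodeTree implementation.

```java
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;

// Rough approximation of why viewfs rejects nested mounts: adding a link
// whose path is a path-component prefix (or extension) of an existing link
// fails, so /home/ and /home/work cannot both be mount points.
public class MountTable {
    private final List<String> links = new ArrayList<>();

    // True when b lies under a (or equals it), comparing whole components
    // so that /data does not falsely nest /database.
    private static boolean nests(String a, String b) {
        return a.equals(b) || b.startsWith(a.endsWith("/") ? a : a + "/");
    }

    public void addLink(String path) throws IOException {
        for (String existing : links) {
            if (nests(existing, path) || nests(path, existing)) {
                throw new IOException("Path " + path + " already exists as link");
            }
        }
        links.add(path);
    }
}
```

Supporting nested mounts, as the issue proposes, would mean relaxing exactly this rejection and resolving a path against the longest matching link instead.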
[jira] [Updated] (HDFS-16011) HDFS: Support viewfs nested mount
[ https://issues.apache.org/jira/browse/HDFS-16011?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

zhu updated HDFS-16011:
------------------------
Component/s: viewfs
[jira] [Updated] (HDFS-16001) TestOfflineEditsViewer.testStored() fails reading negative value of FSEditLogOpCodes
[ https://issues.apache.org/jira/browse/HDFS-16001?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Xiaoqiao He updated HDFS-16001:
--------------------------------
Fix Version/s: 3.3.1
Hadoop Flags: Reviewed
Resolution: Fixed
Status: Resolved (was: Patch Available)

Committed to branch-3.3. Thanks [~aajisaka] for your work! Thanks [~shv] for your reviews!
[jira] [Work logged] (HDFS-16001) TestOfflineEditsViewer.testStored() fails reading negative value of FSEditLogOpCodes
[ https://issues.apache.org/jira/browse/HDFS-16001?focusedWorklogId=593621&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-593621 ]

ASF GitHub Bot logged work on HDFS-16001:
-----------------------------------------
Created on: 08/May/21 02:49
Worklog Time Spent: 10m

Work Description: Hexiaoqiao merged pull request #2989:
URL: https://github.com/apache/hadoop/pull/2989

Issue Time Tracking
-------------------
Worklog Id: (was: 593621) Time Spent: 50m (was: 40m)
[jira] [Work logged] (HDFS-16008) RBF: Tool to initialize ViewFS Mapping to Router
[ https://issues.apache.org/jira/browse/HDFS-16008?focusedWorklogId=593614&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-593614 ]

ASF GitHub Bot logged work on HDFS-16008:
-----------------------------------------
Author: ASF GitHub Bot
Created on: 08/May/21 02:01
Start Date: 08/May/21 02:01
Worklog Time Spent: 10m

Work Description: zhuxiangyi commented on a change in pull request #2981:
URL: https://github.com/apache/hadoop/pull/2981#discussion_r628667273

## File path: hadoop-hdfs-project/hadoop-hdfs-rbf/src/site/markdown/HDFSRouterFederation.md
## @@ -241,6 +241,22 @@ Mount table permission can be set by following command:
 The option mode is UNIX-style permissions for the mount table. Permissions are
 specified in octal, e.g. 0755. By default, this is set to 0755.

+#### Init ViewFs To Router
+Router supports initializing ViewFS mount points to the Router. The target
+protocol of a ViewFS mapping must be HDFS, and the initializer only supports
+one-to-one mappings.
+
+For example, use the following ViewFS configuration to initialize the mount
+table to the Router:
+
+    <property>
+      <name>fs.viewfs.mounttable.ClusterX.link./data</name>
+      <value>hdfs://nn1-clusterx.example.com:8020/data</value>
+    </property>
+
+The ViewFS mount table can be initialized to the Router by using the following
+command:
+
+    [hdfs]$ $HADOOP_HOME/bin/hdfs dfsrouteradmin -initViewFsToMountTable ClusterX

Review comment: Do you mean that when there are mount points for multiple clusters, the user should be able to import all of them without specifying a cluster name? For example, with the following configuration:

    <property>
      <name>fs.viewfs.mounttable.ClusterX.link./data</name>
      <value>hdfs://nn1-clusterx.example.com:8020/data</value>
    </property>
    <property>
      <name>fs.viewfs.mounttable.ClusterY.link./project</name>
      <value>hdfs://nn2-clusterx.example.com:8020/project</value>
    </property>

Issue Time Tracking
-------------------
Worklog Id: (was: 593614) Time Spent: 1h 40m (was: 1.5h)

> RBF: Tool to initialize ViewFS Mapping to Router
> ------------------------------------------------
>
> Key: HDFS-16008
> URL: https://issues.apache.org/jira/browse/HDFS-16008
> Project: Hadoop HDFS
> Issue Type: Improvement
> Components: rbf
> Affects Versions: 3.3.1
> Reporter: zhu
> Assignee: zhu
> Priority: Major
> Labels: pull-request-available
> Time Spent: 1h 40m
> Remaining Estimate: 0h
>
> This is a tool for initializing ViewFS mappings to the Router.
> Some companies are currently migrating from ViewFS to Router; I think they
> will need this tool.
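An importer like the one discussed in this review has to recognize keys of the form fs.viewfs.mounttable.&lt;cluster&gt;.link.&lt;mount-path&gt; and map each mount path to its target URI. The key format is the standard ViewFS convention; the ViewFsLinkParser class below is a hypothetical illustration of the parsing step, not the dfsrouteradmin code under review.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Sketch of the key parsing a ViewFS-to-Router importer needs: collect the
// link entries belonging to one cluster as a mount-path -> target map.
public class ViewFsLinkParser {
    private static final String PREFIX = "fs.viewfs.mounttable.";
    private static final String LINK = ".link.";

    /** Returns mount-path -> target for links belonging to one cluster. */
    public static Map<String, String> linksFor(String cluster,
                                               Map<String, String> conf) {
        String keyPrefix = PREFIX + cluster + LINK;
        Map<String, String> result = new LinkedHashMap<>();
        for (Map.Entry<String, String> e : conf.entrySet()) {
            if (e.getKey().startsWith(keyPrefix)) {
                // Everything after ".link." is the mount path, e.g. "/data".
                result.put(e.getKey().substring(keyPrefix.length()), e.getValue());
            }
        }
        return result;
    }
}
```

The reviewer's question then amounts to whether the tool should call this per cluster (as -initViewFsToMountTable ClusterX implies) or iterate over every cluster name found in the configuration.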
[jira] [Work logged] (HDFS-16008) RBF: Tool to initialize ViewFS Mapping to Router
[ https://issues.apache.org/jira/browse/HDFS-16008?focusedWorklogId=593612&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-593612 ]

ASF GitHub Bot logged work on HDFS-16008:
-----------------------------------------
Created on: 08/May/21 01:58
Worklog Time Spent: 10m

Work Description: zhuxiangyi commented on a change in pull request #2981:
URL: https://github.com/apache/hadoop/pull/2981#discussion_r628667273

Issue Time Tracking
-------------------
Worklog Id: (was: 593612) Time Spent: 1h 20m (was: 1h 10m)
[jira] [Work logged] (HDFS-16008) RBF: Tool to initialize ViewFS Mapping to Router
[ https://issues.apache.org/jira/browse/HDFS-16008?focusedWorklogId=593613=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-593613 ] ASF GitHub Bot logged work on HDFS-16008: - Author: ASF GitHub Bot Created on: 08/May/21 01:58 Start Date: 08/May/21 01:58 Worklog Time Spent: 10m Work Description: zhuxiangyi commented on a change in pull request #2981: URL: https://github.com/apache/hadoop/pull/2981#discussion_r628667273 ## File path: hadoop-hdfs-project/hadoop-hdfs-rbf/src/site/markdown/HDFSRouterFederation.md ## @@ -241,6 +241,22 @@ Mount table permission can be set by following command: The option mode is UNIX-style permissions for the mount table. Permissions are specified in octal, e.g. 0755. By default, this is set to 0755. + Init ViewFs To Router +Router supports initializing the ViewFS mount point to the Router. The mapping directory protocol of ViewFS must be HDFS, and the initializer only supports one-to-one mapping. + +For example, use the following viewfs to configure the initial mount table to the router. + + + +fs.viewfs.mounttable.ClusterX.link./data +hdfs://nn1-clusterx.example.com:8020/data + + + +The ViewFS mount table can be initialized to the Router by using the following command: + +[hdfs]$ $HADOOP_HOME/bin/hdfs dfsrouteradmin -initViewFsToMountTable ClusterX Review comment: Do you mean that there are mount points for multiple clusters, can the user import all of them without specifying the cluster name? For example, the following configuration. ` fs.viewfs.mounttable.ClusterX.link./data hdfs://nn1-clusterx.example.com:8020/data fs.viewfs.mounttable.ClusterY.link./project hdfs://nn1-clusterx.example.com:8020/project ` -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. 
For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 593613) Time Spent: 1.5h (was: 1h 20m) > RBF: Tool to initialize ViewFS Mapping to Router > > > Key: HDFS-16008 > URL: https://issues.apache.org/jira/browse/HDFS-16008 > Project: Hadoop HDFS > Issue Type: Improvement > Components: rbf >Affects Versions: 3.3.1 >Reporter: zhu >Assignee: zhu >Priority: Major > Labels: pull-request-available > Time Spent: 1.5h > Remaining Estimate: 0h > > This is a tool for initializing ViewFS Mapping to Router. > Some companies are currently migrating from viewfs to router, I think they > need this tool. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
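The ViewFS mount-table configuration quoted in the review comments above was flattened by the mail digest. In standard Hadoop configuration XML (property names and hosts taken from the comment itself; this is a reconstruction, not the patch text), it would read roughly:

```xml
<configuration>
  <!-- One-to-one ViewFS link; per the discussion above, the target
       scheme must be hdfs:// for the proposed initializer. -->
  <property>
    <name>fs.viewfs.mounttable.ClusterX.link./data</name>
    <value>hdfs://nn1-clusterx.example.com:8020/data</value>
  </property>
  <!-- Second cluster's mount point from the reviewer's example. -->
  <property>
    <name>fs.viewfs.mounttable.ClusterY.link./project</name>
    <value>hdfs://nn1-clusterx.example.com:8020/project</value>
  </property>
</configuration>
```

The proposed import command from the same comment is `hdfs dfsrouteradmin -initViewFsToMountTable ClusterX`.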
[jira] [Comment Edited] (HDFS-14703) NameNode Fine-Grained Locking via Metadata Partitioning
[ https://issues.apache.org/jira/browse/HDFS-14703?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17341089#comment-17341089 ] Konstantin Shvachko edited comment on HDFS-14703 at 5/8/21, 1:05 AM: - Updated the POC patches to current trunk. There were indeed some missing parts in the first patch. See [^003-partitioned-inodeMap-POC.tar.gz]. was (Author: shv): Updated the POC patches. There were indeed some missing parts in the first patch. See [003-partitioned-inodeMap-POC.tar.gz|https://issues.apache.org/jira/secure/attachment/13025177/003-partitioned-inodeMap-POC.tar.gz]. > NameNode Fine-Grained Locking via Metadata Partitioning > --- > > Key: HDFS-14703 > URL: https://issues.apache.org/jira/browse/HDFS-14703 > Project: Hadoop HDFS > Issue Type: Improvement > Components: hdfs, namenode >Reporter: Konstantin Shvachko >Priority: Major > Attachments: 001-partitioned-inodeMap-POC.tar.gz, > 002-partitioned-inodeMap-POC.tar.gz, 003-partitioned-inodeMap-POC.tar.gz, > NameNode Fine-Grained Locking.pdf, NameNode Fine-Grained Locking.pdf > > > We target to enable fine-grained locking by splitting the in-memory namespace > into multiple partitions each having a separate lock. Intended to improve > performance of NameNode write operations. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Comment Edited] (HDFS-14703) NameNode Fine-Grained Locking via Metadata Partitioning
[ https://issues.apache.org/jira/browse/HDFS-14703?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17341089#comment-17341089 ] Konstantin Shvachko edited comment on HDFS-14703 at 5/8/21, 1:04 AM: - Updated the POC patches. There were indeed some missing parts in the first patch. See [003-partitioned-inodeMap-POC.tar.gz|https://issues.apache.org/jira/secure/attachment/13025177/003-partitioned-inodeMap-POC.tar.gz]. was (Author: shv): Updated the POC patches. There were indeed some missing parts in the first patch. See [https://issues.apache.org/jira/secure/attachment/13025177/003-partitioned-inodeMap-POC.tar.gz|https://issues.apache.org/jira/secure/attachment/13025177/003-partitioned-inodeMap-POC.tar.gz]. > NameNode Fine-Grained Locking via Metadata Partitioning > --- > > Key: HDFS-14703 > URL: https://issues.apache.org/jira/browse/HDFS-14703 > Project: Hadoop HDFS > Issue Type: Improvement > Components: hdfs, namenode >Reporter: Konstantin Shvachko >Priority: Major > Attachments: 001-partitioned-inodeMap-POC.tar.gz, > 002-partitioned-inodeMap-POC.tar.gz, 003-partitioned-inodeMap-POC.tar.gz, > NameNode Fine-Grained Locking.pdf, NameNode Fine-Grained Locking.pdf > > > We target to enable fine-grained locking by splitting the in-memory namespace > into multiple partitions each having a separate lock. Intended to improve > performance of NameNode write operations. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-14703) NameNode Fine-Grained Locking via Metadata Partitioning
[ https://issues.apache.org/jira/browse/HDFS-14703?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17341089#comment-17341089 ] Konstantin Shvachko commented on HDFS-14703: Updated the POC patches. There were indeed some missing parts in the first patch. See [https://issues.apache.org/jira/secure/attachment/13025177/003-partitioned-inodeMap-POC.tar.gz|https://issues.apache.org/jira/secure/attachment/13025177/003-partitioned-inodeMap-POC.tar.gz]. > NameNode Fine-Grained Locking via Metadata Partitioning > --- > > Key: HDFS-14703 > URL: https://issues.apache.org/jira/browse/HDFS-14703 > Project: Hadoop HDFS > Issue Type: Improvement > Components: hdfs, namenode >Reporter: Konstantin Shvachko >Priority: Major > Attachments: 001-partitioned-inodeMap-POC.tar.gz, > 002-partitioned-inodeMap-POC.tar.gz, 003-partitioned-inodeMap-POC.tar.gz, > NameNode Fine-Grained Locking.pdf, NameNode Fine-Grained Locking.pdf > > > We target to enable fine-grained locking by splitting the in-memory namespace > into multiple partitions each having a separate lock. Intended to improve > performance of NameNode write operations. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Updated] (HDFS-14703) NameNode Fine-Grained Locking via Metadata Partitioning
[ https://issues.apache.org/jira/browse/HDFS-14703?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Konstantin Shvachko updated HDFS-14703: --- Attachment: 003-partitioned-inodeMap-POC.tar.gz > NameNode Fine-Grained Locking via Metadata Partitioning > --- > > Key: HDFS-14703 > URL: https://issues.apache.org/jira/browse/HDFS-14703 > Project: Hadoop HDFS > Issue Type: Improvement > Components: hdfs, namenode >Reporter: Konstantin Shvachko >Priority: Major > Attachments: 001-partitioned-inodeMap-POC.tar.gz, > 002-partitioned-inodeMap-POC.tar.gz, 003-partitioned-inodeMap-POC.tar.gz, > NameNode Fine-Grained Locking.pdf, NameNode Fine-Grained Locking.pdf > > > We target to enable fine-grained locking by splitting the in-memory namespace > into multiple partitions each having a separate lock. Intended to improve > performance of NameNode write operations. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
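The HDFS-14703 messages above describe splitting the in-memory namespace into partitions, each with its own lock. The following is a minimal sketch of that partitioned-lock idea only; the class and method names are illustrative and are not taken from the attached POC patches.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.locks.ReentrantReadWriteLock;

/**
 * Sketch of the partitioning idea in HDFS-14703: shard an in-memory
 * map into a fixed number of partitions, each guarded by its own
 * read-write lock, so writers touching different partitions do not
 * contend on a single global lock. Illustrative only, not the POC code.
 */
public class PartitionedInodeMap {
  private final int partitions;
  private final Map<Long, String>[] shards;
  private final ReentrantReadWriteLock[] locks;

  @SuppressWarnings("unchecked")
  public PartitionedInodeMap(int partitions) {
    this.partitions = partitions;
    this.shards = (Map<Long, String>[]) new Map[partitions];
    this.locks = new ReentrantReadWriteLock[partitions];
    for (int i = 0; i < partitions; i++) {
      shards[i] = new HashMap<>();
      locks[i] = new ReentrantReadWriteLock();
    }
  }

  /** Maps an inode id to its partition; unsigned so negative ids are safe. */
  private int shardFor(long inodeId) {
    return (int) Long.remainderUnsigned(inodeId, partitions);
  }

  public void put(long inodeId, String inode) {
    int s = shardFor(inodeId);
    locks[s].writeLock().lock();   // only this partition is blocked
    try {
      shards[s].put(inodeId, inode);
    } finally {
      locks[s].writeLock().unlock();
    }
  }

  public String get(long inodeId) {
    int s = shardFor(inodeId);
    locks[s].readLock().lock();
    try {
      return shards[s].get(inodeId);
    } finally {
      locks[s].readLock().unlock();
    }
  }
}
```

Two writers updating inodes in different shards proceed concurrently, which is the property the jira targets for NameNode write operations.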
[jira] [Commented] (HDFS-16001) TestOfflineEditsViewer.testStored() fails reading negative value of FSEditLogOpCodes
[ https://issues.apache.org/jira/browse/HDFS-16001?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17341046#comment-17341046 ] Konstantin Shvachko commented on HDFS-16001: Checked the test. This fixes it. +1 thanks [~aajisaka] > TestOfflineEditsViewer.testStored() fails reading negative value of > FSEditLogOpCodes > > > Key: HDFS-16001 > URL: https://issues.apache.org/jira/browse/HDFS-16001 > Project: Hadoop HDFS > Issue Type: Bug > Components: hdfs >Reporter: Konstantin Shvachko >Assignee: Akira Ajisaka >Priority: Blocker > Labels: pull-request-available > Fix For: 3.4.0 > > Time Spent: 40m > Remaining Estimate: 0h > > {{TestOfflineEditsViewer.testStored()}} fails consistently with an exception > {noformat} > java.io.IOException: Op -54 has size -1314247195, but the minimum op size is > 17 > {noformat} > Seems like there is a corrupt record in {{editsStored}} file. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Work logged] (HDFS-16012) Improve DataXceiver#copyBlock() log printing
[ https://issues.apache.org/jira/browse/HDFS-16012?focusedWorklogId=593537=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-593537 ] ASF GitHub Bot logged work on HDFS-16012: - Author: ASF GitHub Bot Created on: 07/May/21 21:13 Start Date: 07/May/21 21:13 Worklog Time Spent: 10m Work Description: hadoop-yetus commented on pull request #2988: URL: https://github.com/apache/hadoop/pull/2988#issuecomment-834784136 :broken_heart: **-1 overall** | Vote | Subsystem | Runtime | Logfile | Comment | |::|--:|:|::|:---:| | +0 :ok: | reexec | 0m 55s | | Docker mode activated. | _ Prechecks _ | | +1 :green_heart: | dupname | 0m 0s | | No case conflicting files found. | | +0 :ok: | codespell | 0m 1s | | codespell was not available. | | +1 :green_heart: | @author | 0m 0s | | The patch does not contain any @author tags. | | -1 :x: | test4tests | 0m 0s | | The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. | _ trunk Compile Tests _ | | +1 :green_heart: | mvninstall | 35m 10s | | trunk passed | | +1 :green_heart: | compile | 1m 23s | | trunk passed with JDK Ubuntu-11.0.10+9-Ubuntu-0ubuntu1.20.04 | | +1 :green_heart: | compile | 1m 16s | | trunk passed with JDK Private Build-1.8.0_282-8u282-b08-0ubuntu1~20.04-b08 | | +1 :green_heart: | checkstyle | 1m 4s | | trunk passed | | +1 :green_heart: | mvnsite | 1m 27s | | trunk passed | | +1 :green_heart: | javadoc | 0m 57s | | trunk passed with JDK Ubuntu-11.0.10+9-Ubuntu-0ubuntu1.20.04 | | +1 :green_heart: | javadoc | 1m 32s | | trunk passed with JDK Private Build-1.8.0_282-8u282-b08-0ubuntu1~20.04-b08 | | +1 :green_heart: | spotbugs | 3m 54s | | trunk passed | | +1 :green_heart: | shadedclient | 17m 34s | | branch has no errors when building and testing our client artifacts. 
| _ Patch Compile Tests _ | | +1 :green_heart: | mvninstall | 1m 16s | | the patch passed | | +1 :green_heart: | compile | 1m 19s | | the patch passed with JDK Ubuntu-11.0.10+9-Ubuntu-0ubuntu1.20.04 | | +1 :green_heart: | javac | 1m 19s | | the patch passed | | +1 :green_heart: | compile | 1m 10s | | the patch passed with JDK Private Build-1.8.0_282-8u282-b08-0ubuntu1~20.04-b08 | | +1 :green_heart: | javac | 1m 10s | | the patch passed | | +1 :green_heart: | blanks | 0m 0s | | The patch has no blanks issues. | | +1 :green_heart: | checkstyle | 0m 51s | | the patch passed | | +1 :green_heart: | mvnsite | 1m 13s | | the patch passed | | +1 :green_heart: | javadoc | 0m 48s | | the patch passed with JDK Ubuntu-11.0.10+9-Ubuntu-0ubuntu1.20.04 | | +1 :green_heart: | javadoc | 1m 20s | | the patch passed with JDK Private Build-1.8.0_282-8u282-b08-0ubuntu1~20.04-b08 | | +1 :green_heart: | spotbugs | 3m 11s | | the patch passed | | +1 :green_heart: | shadedclient | 16m 27s | | patch has no errors when building and testing our client artifacts. | _ Other Tests _ | | -1 :x: | unit | 421m 3s | [/patch-unit-hadoop-hdfs-project_hadoop-hdfs.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-2988/1/artifact/out/patch-unit-hadoop-hdfs-project_hadoop-hdfs.txt) | hadoop-hdfs in the patch passed. | | +1 :green_heart: | asflicense | 0m 44s | | The patch does not generate ASF License warnings. 
| | | | 511m 39s | | | | Reason | Tests | |---:|:--| | Failed junit tests | hadoop.hdfs.server.namenode.TestDecommissioningStatus | | | hadoop.hdfs.TestViewDistributedFileSystemContract | | | hadoop.hdfs.TestSnapshotCommands | | | hadoop.hdfs.TestPersistBlocks | | | hadoop.hdfs.TestDFSShell | | | hadoop.hdfs.server.datanode.fsdataset.impl.TestFsVolumeList | | | hadoop.hdfs.tools.offlineEditsViewer.TestOfflineEditsViewer | | | hadoop.hdfs.TestMaintenanceState | | | hadoop.hdfs.TestStateAlignmentContextWithHA | | | hadoop.hdfs.server.namenode.ha.TestSeveralNameNodes | | | hadoop.fs.viewfs.TestViewFSOverloadSchemeWithMountTableConfigInHDFS | | | hadoop.hdfs.server.namenode.TestDecommissioningStatusWithBackoffMonitor | | | hadoop.hdfs.server.blockmanagement.TestBlockTokenWithDFSStriped | | | hadoop.hdfs.server.namenode.ha.TestEditLogTailer | | | hadoop.hdfs.server.datanode.TestBlockScanner | | | hadoop.hdfs.server.datanode.TestIncrementalBrVariations | | | hadoop.hdfs.server.diskbalancer.command.TestDiskBalancerCommand | | |
[jira] [Work logged] (HDFS-16001) TestOfflineEditsViewer.testStored() fails reading negative value of FSEditLogOpCodes
[ https://issues.apache.org/jira/browse/HDFS-16001?focusedWorklogId=593447=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-593447 ] ASF GitHub Bot logged work on HDFS-16001: - Author: ASF GitHub Bot Created on: 07/May/21 18:03 Start Date: 07/May/21 18:03 Worklog Time Spent: 10m Work Description: hadoop-yetus commented on pull request #2989: URL: https://github.com/apache/hadoop/pull/2989#issuecomment-834660569 :confetti_ball: **+1 overall** | Vote | Subsystem | Runtime | Logfile | Comment | |::|--:|:|::|:---:| | +0 :ok: | reexec | 21m 5s | | Docker mode activated. | _ Prechecks _ | | +1 :green_heart: | dupname | 0m 0s | | No case conflicting files found. | | +0 :ok: | codespell | 0m 0s | | codespell was not available. | | +0 :ok: | shelldocs | 0m 0s | | Shelldocs was not available. | | +1 :green_heart: | @author | 0m 0s | | The patch does not contain any @author tags. | | +1 :green_heart: | test4tests | 0m 0s | | The patch appears to include 2 new or modified test files. | _ branch-3.3 Compile Tests _ | | +1 :green_heart: | mvninstall | 30m 21s | | branch-3.3 passed | | +1 :green_heart: | shadedclient | 15m 21s | | branch has no errors when building and testing our client artifacts. | _ Patch Compile Tests _ | | +1 :green_heart: | mvninstall | 1m 13s | | the patch passed | | +1 :green_heart: | blanks | 0m 0s | | The patch has no blanks issues. | | +1 :green_heart: | shellcheck | 0m 0s | | No new issues. | | +1 :green_heart: | xml | 0m 1s | | The patch has no ill-formed XML file. | | +1 :green_heart: | shadedclient | 15m 8s | | patch has no errors when building and testing our client artifacts. | _ Other Tests _ | | +1 :green_heart: | unit | 1m 14s | | hadoop-hdfs in the patch passed. | | +1 :green_heart: | asflicense | 0m 38s | | The patch does not generate ASF License warnings. 
| | | | 88m 4s | | | | Subsystem | Report/Notes | |--:|:-| | Docker | ClientAPI=1.41 ServerAPI=1.41 base: https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-2989/1/artifact/out/Dockerfile | | GITHUB PR | https://github.com/apache/hadoop/pull/2989 | | Optional Tests | dupname asflicense unit codespell shellcheck shelldocs xml | | uname | Linux d37ad0da4914 4.15.0-112-generic #113-Ubuntu SMP Thu Jul 9 23:41:39 UTC 2020 x86_64 x86_64 x86_64 GNU/Linux | | Build tool | maven | | Personality | dev-support/bin/hadoop.sh | | git revision | branch-3.3 / 8046991fb77b53f555d63c2f51f9d1f8037dad9b | | Test Results | https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-2989/1/testReport/ | | Max. process+thread count | 631 (vs. ulimit of 5500) | | modules | C: hadoop-hdfs-project/hadoop-hdfs U: hadoop-hdfs-project/hadoop-hdfs | | Console output | https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-2989/1/console | | versions | git=2.17.1 maven=3.6.0 shellcheck=0.4.6 | | Powered by | Apache Yetus 0.14.0-SNAPSHOT https://yetus.apache.org | This message was automatically generated. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. 
For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 593447) Time Spent: 40m (was: 0.5h) > TestOfflineEditsViewer.testStored() fails reading negative value of > FSEditLogOpCodes > > > Key: HDFS-16001 > URL: https://issues.apache.org/jira/browse/HDFS-16001 > Project: Hadoop HDFS > Issue Type: Bug > Components: hdfs >Reporter: Konstantin Shvachko >Assignee: Akira Ajisaka >Priority: Blocker > Labels: pull-request-available > Fix For: 3.4.0 > > Time Spent: 40m > Remaining Estimate: 0h > > {{TestOfflineEditsViewer.testStored()}} fails consistently with an exception > {noformat} > java.io.IOException: Op -54 has size -1314247195, but the minimum op size is > 17 > {noformat} > Seems like there is a corrupt record in {{editsStored}} file. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
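The HDFS-16001 failure quoted above ("Op -54 has size -1314247195, but the minimum op size is 17") comes from a length-field sanity check on a corrupt edit-log record. The sketch below shows that general validation pattern only: read the opcode and length, and reject an implausible length before trusting it. The class and method are illustrative, not the actual FSEditLogOp reader.

```java
/**
 * Illustration of a length-field sanity check on a binary record:
 * a negative or too-small length signals corruption and is rejected
 * before any buffer allocation. Not the actual HDFS edit-log reader.
 */
public class OpLengthCheck {
  static final int MIN_OP_SIZE = 17;  // minimum plausible op size, per the error above

  /** Reads a 1-byte opcode followed by a 4-byte big-endian length. */
  public static int readOpLength(byte[] record) {
    byte opCode = record[0];
    int length = ((record[1] & 0xFF) << 24) | ((record[2] & 0xFF) << 16)
        | ((record[3] & 0xFF) << 8) | (record[4] & 0xFF);
    if (length < MIN_OP_SIZE) {
      throw new IllegalStateException("Op " + opCode + " has size " + length
          + ", but the minimum op size is " + MIN_OP_SIZE);
    }
    return length;
  }
}
```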
[jira] [Resolved] (HDFS-15878) RBF: Flaky test TestRouterWebHDFSContractCreate>AbstractContractCreateTest#testSyncable in Trunk
[ https://issues.apache.org/jira/browse/HDFS-15878?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Fengnan Li resolved HDFS-15878. --- Resolution: Not A Problem > RBF: Flaky test > TestRouterWebHDFSContractCreate>AbstractContractCreateTest#testSyncable in > Trunk > > > Key: HDFS-15878 > URL: https://issues.apache.org/jira/browse/HDFS-15878 > Project: Hadoop HDFS > Issue Type: Sub-task > Components: hdfs, rbf >Reporter: Renukaprasad C >Assignee: Fengnan Li >Priority: Major > > ERROR] Tests run: 16, Failures: 0, Errors: 1, Skipped: 2, Time elapsed: > 24.627 s <<< FAILURE! - in > org.apache.hadoop.fs.contract.router.web.TestRouterWebHDFSContractCreate > [ERROR] > testSyncable(org.apache.hadoop.fs.contract.router.web.TestRouterWebHDFSContractCreate) > Time elapsed: 0.222 s <<< ERROR! > java.io.FileNotFoundException: File /test/testSyncable not found. > at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method) > at > sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62) > at > sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45) > at java.lang.reflect.Constructor.newInstance(Constructor.java:423) > at > org.apache.hadoop.ipc.RemoteException.instantiateException(RemoteException.java:121) > at > org.apache.hadoop.ipc.RemoteException.unwrapRemoteException(RemoteException.java:110) > at > org.apache.hadoop.hdfs.web.WebHdfsFileSystem.toIOException(WebHdfsFileSystem.java:576) > at > org.apache.hadoop.hdfs.web.WebHdfsFileSystem.access$900(WebHdfsFileSystem.java:146) > at > org.apache.hadoop.hdfs.web.WebHdfsFileSystem$AbstractRunner.shouldRetry(WebHdfsFileSystem.java:892) > at > org.apache.hadoop.hdfs.web.WebHdfsFileSystem$AbstractRunner.runWithRetry(WebHdfsFileSystem.java:858) > at > org.apache.hadoop.hdfs.web.WebHdfsFileSystem$AbstractRunner.access$100(WebHdfsFileSystem.java:652) > at > 
org.apache.hadoop.hdfs.web.WebHdfsFileSystem$AbstractRunner$1.run(WebHdfsFileSystem.java:690) > at java.security.AccessController.doPrivileged(Native Method) > at javax.security.auth.Subject.doAs(Subject.java:422) > at > org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1899) > at > org.apache.hadoop.hdfs.web.WebHdfsFileSystem$AbstractRunner.run(WebHdfsFileSystem.java:686) > at > org.apache.hadoop.hdfs.web.WebHdfsFileSystem$ReadRunner.getRedirectedUrl(WebHdfsFileSystem.java:2307) > at > org.apache.hadoop.hdfs.web.WebHdfsFileSystem$ReadRunner.(WebHdfsFileSystem.java:2296) > at > org.apache.hadoop.hdfs.web.WebHdfsFileSystem$WebHdfsInputStream.(WebHdfsFileSystem.java:2176) > at > org.apache.hadoop.hdfs.web.WebHdfsFileSystem.open(WebHdfsFileSystem.java:1610) > at org.apache.hadoop.fs.FileSystem.open(FileSystem.java:975) > at > org.apache.hadoop.fs.contract.AbstractContractCreateTest.validateSyncableSemantics(AbstractContractCreateTest.java:556) > at > org.apache.hadoop.fs.contract.AbstractContractCreateTest.testSyncable(AbstractContractCreateTest.java:459) > at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) > at > sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) > at > sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) > at java.lang.reflect.Method.invoke(Method.java:498) > at > org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:50) > at > org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12) > at > org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:47) > at > org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17) > at > org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:26) > at > org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:27) > at org.junit.rules.TestWatcher$1.evaluate(TestWatcher.java:55) > at > 
org.junit.internal.runners.statements.FailOnTimeout$CallableStatement.call(FailOnTimeout.java:298) > at > org.junit.internal.runners.statements.FailOnTimeout$CallableStatement.call(FailOnTimeout.java:292) > at java.util.concurrent.FutureTask.run(FutureTask.java:266) > at java.lang.Thread.run(Thread.java:748) > Caused by: > org.apache.hadoop.ipc.RemoteException(java.io.FileNotFoundException): File > /test/testSyncable not found. > at > org.apache.hadoop.hdfs.web.JsonUtilClient.toRemoteException(JsonUtilClient.java:90) > at >
[jira] [Reopened] (HDFS-15878) RBF: Flaky test TestRouterWebHDFSContractCreate>AbstractContractCreateTest#testSyncable in Trunk
[ https://issues.apache.org/jira/browse/HDFS-15878?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Fengnan Li reopened HDFS-15878: --- Reopen to change the closing status. > RBF: Flaky test > TestRouterWebHDFSContractCreate>AbstractContractCreateTest#testSyncable in > Trunk > > Key: HDFS-15878 > URL: https://issues.apache.org/jira/browse/HDFS-15878 > Project: Hadoop HDFS > Issue Type: Sub-task > Components: hdfs, rbf > Reporter: Renukaprasad C > Assignee: Fengnan Li > Priority: Major
[jira] [Commented] (HDFS-15878) RBF: Flaky test TestRouterWebHDFSContractCreate>AbstractContractCreateTest#testSyncable in Trunk
[ https://issues.apache.org/jira/browse/HDFS-15878?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17340967#comment-17340967 ] Fengnan Li commented on HDFS-15878: --- [~hexiaoqiao] [~ayushtkn] Updated the status as suggested. Thanks. > RBF: Flaky test > TestRouterWebHDFSContractCreate>AbstractContractCreateTest#testSyncable in > Trunk > > Key: HDFS-15878 > URL: https://issues.apache.org/jira/browse/HDFS-15878 > Project: Hadoop HDFS > Issue Type: Sub-task > Components: hdfs, rbf > Reporter: Renukaprasad C > Assignee: Fengnan Li > Priority: Major
[jira] [Commented] (HDFS-16013) DirectoryScan operation holds dataset lock for long time
[ https://issues.apache.org/jira/browse/HDFS-16013?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17340954#comment-17340954 ] Renukaprasad C commented on HDFS-16013: --- Thanks [~sodonnell] for the review & quick update on the issue. We already had all the patches (HDFS-15406, HDFS-15574, HDFS-14476, HDFS-14751 and HDFS-15048), except HDFS-15415. We have tested with the HDFS-15415 changes, and the delay has come down by almost 60% (the initial delay of 4-8 seconds has reduced to 1-3 seconds). We will continue to test & post any further updates. > DirectoryScan operation holds dataset lock for long time > > > Key: HDFS-16013 > URL: https://issues.apache.org/jira/browse/HDFS-16013 > Project: Hadoop HDFS > Issue Type: Bug >Reporter: Renukaprasad C >Priority: Critical > > Environment: 3-node cluster with around 2M files & the same number of blocks. > All file operations are normal; only the directory scan takes more > memory and causes some long GC pauses. This directory scan runs every 6 hours > (the default), which causes slow responses to any file operations. The delay is > around 5-8 seconds (in production this delay increased to 30+ seconds > with 8M blocks). > GC Configuration: > -Xms6144M > -Xmx12288M /8G > -XX:NewSize=614M > -XX:MaxNewSize=1228M > -XX:MetaspaceSize=128M > -XX:MaxMetaspaceSize=128M > -XX:CMSFullGCsBeforeCompaction=1 > -XX:MaxDirectMemorySize=1G > -XX:+UseConcMarkSweepGC > -XX:+CMSParallelRemarkEnabled > -XX:+UseCMSCompactAtFullCollection > -XX:CMSInitiatingOccupancyFraction=80 > We also tried G1 GC, but couldn't find much difference in the result. 
> -XX:+UseG1GC > -XX:MaxGCPauseMillis=200 > -XX:InitiatingHeapOccupancyPercent=45 > -XX:G1ReservePercent=10 > {code:java} > 2021-05-07 16:32:23,508 INFO > org.apache.hadoop.hdfs.server.datanode.DirectoryScanner: BlockPool > BP-345634799--1619695417333 Total blocks: 2767211, missing metadata > files: 22, missing block files: 22, missing blocks in memory: 0, mismatched > blocks: 0 > 2021-05-07 16:32:23,508 WARN > org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl: Lock > held time above threshold: lock identifier: FsDatasetRWLock > lockHeldTimeMs=7061 ms. Suppressed 0 lock warnings. The stack trace is: > java.lang.Thread.getStackTrace(Thread.java:1559) > org.apache.hadoop.util.StringUtils.getStackTrace(StringUtils.java:1032) > org.apache.hadoop.util.InstrumentedLock.logWarning(InstrumentedLock.java:148) > org.apache.hadoop.util.InstrumentedLock.check(InstrumentedLock.java:186) > org.apache.hadoop.util.InstrumentedReadLock.unlock(InstrumentedReadLock.java:78) > org.apache.hadoop.util.AutoCloseableLock.release(AutoCloseableLock.java:84) > org.apache.hadoop.util.AutoCloseableLock.close(AutoCloseableLock.java:96) > org.apache.hadoop.hdfs.server.datanode.DirectoryScanner.scan(DirectoryScanner.java:539) > org.apache.hadoop.hdfs.server.datanode.DirectoryScanner.reconcile(DirectoryScanner.java:416) > org.apache.hadoop.hdfs.server.datanode.DirectoryScanner.run(DirectoryScanner.java:359) > java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) > java.util.concurrent.FutureTask.runAndReset(FutureTask.java:308) > java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(ScheduledThreadPoolExecutor.java:180) > java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:294) > java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) > java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) > {code} > We have the following Jiras our 
code already. But still facing long lock > held. - https://issues.apache.org/jira/browse/HDFS-15621, > https://issues.apache.org/jira/browse/HDFS-15150, > https://issues.apache.org/jira/browse/HDFS-15160, > https://issues.apache.org/jira/browse/HDFS-13947 > cc: [~brahma] [~belugabehr] [~sodonnell] [~ayushsaxena] [~weichiu] -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
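The "Lock held time above threshold" warning in the log above comes from wrapping the dataset lock in an instrumented lock that timestamps acquisition and, on release, logs when the hold time exceeds a configured threshold. A simplified sketch of that pattern follows; it is illustrative only, not Hadoop's actual `InstrumentedLock`, which additionally rate-limits warnings and captures the offending stack trace:

```java
import java.util.concurrent.locks.ReentrantLock;

public class TimedLock {
  private final ReentrantLock lock = new ReentrantLock();
  private final long thresholdMs;
  private long acquiredAt;

  public TimedLock(long thresholdMs) {
    this.thresholdMs = thresholdMs;
  }

  public void lock() {
    lock.lock();
    acquiredAt = System.currentTimeMillis();
  }

  // Measures the hold time on release and warns when it exceeds the
  // threshold, mirroring the "Lock held time above threshold" message.
  // Returns the held time so callers can observe it.
  public long unlock() {
    long heldMs = System.currentTimeMillis() - acquiredAt;
    lock.unlock();
    if (heldMs > thresholdMs) {
      System.err.println("Lock held time above threshold: lockHeldTimeMs="
          + heldMs + " ms.");
    }
    return heldMs;
  }

  public static void main(String[] args) throws InterruptedException {
    TimedLock dataset = new TimedLock(10); // warn when held longer than 10 ms
    dataset.lock();
    Thread.sleep(30); // simulate a long directory scan under the lock
    System.out.println("heldMs=" + dataset.unlock());
  }
}
```

The `lockHeldTimeMs=7061` in the quoted warning means one scan pass held the dataset lock for about 7 seconds, which is why every other handler stalled for that long.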
[jira] [Updated] (HDFS-16001) TestOfflineEditsViewer.testStored() fails reading negative value of FSEditLogOpCodes
[ https://issues.apache.org/jira/browse/HDFS-16001?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Akira Ajisaka updated HDFS-16001: - Fix Version/s: 3.4.0 Merged PR #2980 into trunk. Opened #2989 for branch-3.3. > TestOfflineEditsViewer.testStored() fails reading negative value of > FSEditLogOpCodes > > > Key: HDFS-16001 > URL: https://issues.apache.org/jira/browse/HDFS-16001 > Project: Hadoop HDFS > Issue Type: Bug > Components: hdfs >Reporter: Konstantin Shvachko >Assignee: Akira Ajisaka >Priority: Blocker > Labels: pull-request-available > Fix For: 3.4.0 > > Time Spent: 0.5h > Remaining Estimate: 0h > > {{TestOfflineEditsViewer.testStored()}} fails consistently with an exception > {noformat} > java.io.IOException: Op -54 has size -1314247195, but the minimum op size is > 17 > {noformat} > Seems like there is a corrupt record in {{editsStored}} file. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
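For context on the failure above ("Op -54 has size -1314247195, but the minimum op size is 17"): an op-code byte at or above 0x80 read as a signed Java byte comes out negative, and the length field that follows a corrupt record is garbage, which is exactly what the minimum-size check catches. A hedged sketch of the arithmetic and the sanity check, with hypothetical helper names rather than the actual OfflineEditsViewer code:

```java
public class OpHeaderDemo {
  static final int MIN_OP_SIZE = 17; // smallest legal op, per the error message

  // A raw op-code byte of 0x80 or above becomes negative when read as a
  // signed Java byte: 0xCA widens to -54, the value in the reported failure.
  static int opCode(byte raw) {
    return raw; // int widening preserves the sign
  }

  // Length sanity check modeled on the error message in the test failure.
  static void checkOpSize(int opCode, int size) throws java.io.IOException {
    if (size < MIN_OP_SIZE) {
      throw new java.io.IOException("Op " + opCode + " has size " + size
          + ", but the minimum op size is " + MIN_OP_SIZE);
    }
  }

  public static void main(String[] args) {
    System.out.println(opCode((byte) 0xCA)); // prints -54
    try {
      checkOpSize(-54, -1314247195); // the corrupt record from the report
    } catch (java.io.IOException e) {
      System.out.println(e.getMessage());
    }
  }
}
```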
[jira] [Work logged] (HDFS-16008) RBF: Tool to initialize ViewFS Mapping to Router
[ https://issues.apache.org/jira/browse/HDFS-16008?focusedWorklogId=593412&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-593412 ] ASF GitHub Bot logged work on HDFS-16008: - Author: ASF GitHub Bot Created on: 07/May/21 16:34 Start Date: 07/May/21 16:34 Worklog Time Spent: 10m Work Description: goiri commented on a change in pull request #2981: URL: https://github.com/apache/hadoop/pull/2981#discussion_r628356897
## File path: hadoop-hdfs-project/hadoop-hdfs-rbf/src/site/markdown/HDFSRouterFederation.md ##
@@ -241,6 +241,22 @@ Mount table permission can be set by following command: The option mode is UNIX-style permissions for the mount table. Permissions are specified in octal, e.g. 0755. By default, this is set to 0755.
+ Init ViewFs To Router
+Router supports initializing the ViewFS mount points to the Router. The mapped directory protocol of ViewFS must be HDFS, and the initializer only supports one-to-one mapping.
+
+For example, use the following ViewFS configuration to initialize the mount table in the Router:
+
+<property>
+  <name>fs.viewfs.mounttable.ClusterX.link./data</name>
+  <value>hdfs://nn1-clusterx.example.com:8020/data</value>
+</property>
+
+The ViewFS mount table can be initialized to the Router by using the following command:
+
+[hdfs]$ $HADOOP_HOME/bin/hdfs dfsrouteradmin -initViewFsToMountTable ClusterX
Review comment: Can we also support an option to import the full table from ViewFs? 
## File path: hadoop-hdfs-project/hadoop-hdfs-rbf/src/test/java/org/apache/hadoop/hdfs/server/federation/router/TestRouterAdminCLI.java ##
@@ -852,6 +908,7 @@ public void testInvalidArgumentMessage() throws Exception {
 + " ]\n"
 + "\t[-clrQuota ]\n"
 + "\t[-clrStorageTypeQuota ]\n"
++"\t[-initViewFsToMountTable ]\n"
Review comment: Space after +
## File path: hadoop-hdfs-project/hadoop-hdfs-rbf/src/test/java/org/apache/hadoop/hdfs/server/federation/router/TestRouterAdminCLI.java ##
@@ -700,6 +706,49 @@ public void testAddMountTableIfParentExist() throws Exception { } }
+  @Test
+  public void testInitViewFsToMountTable() throws Exception {
+    // re-set system out for testing
+    System.setOut(new PrintStream(out));
+    stateStore.loadCache(MountTableStoreImpl.class, true);
+    String nnAddress = cluster.getRandomNamenode().getNamenode().getHostAndPort();
+
+    String src = "/data";
+    Path destPath = new Path("hdfs://" + nnAddress + "/data");
+    String user = "user1";
+    String group = "group1";
+    String clusterName = "ClusterX";
+
+    // 0. mkdir destPath
+    hdfs.mkdirs(destPath);
+    // 1. set owner
+    hdfs.setOwner(destPath, user, group);
+    // 2. set viewFs mapping
+    admin.getConf().set("fs.viewfs.mounttable.ClusterX.link." + src, destPath.toString());
+    // 3. run initialization
+    String[] argv = new String[]{"-initViewFsToMountTable", clusterName};
+    assertEquals(0, ToolRunner.run(admin, argv));
+    // 4. get the mount point entries
+    stateStore.loadCache(MountTableStoreImpl.class, true);
+    GetMountTableEntriesRequest getRequest = GetMountTableEntriesRequest.newInstance(src);
+    GetMountTableEntriesResponse getResponse = client.getMountTableManager().getMountTableEntries(getRequest);
+    List<MountTable> mountTables = getResponse.getEntries();
+    // 5. check
+    assertEquals(1, mountTables.size());
+    MountTable mountTable = mountTables.get(0);
+    assertEquals(user, mountTable.getOwnerName());
+    assertEquals(group, mountTable.getGroupName());
+    assertEquals(destPath.toUri().getPath(), mountTable. 
Review comment: Let's assert the size of mountTable.getDestinations() and extract the get(0) ## File path: hadoop-hdfs-project/hadoop-hdfs-rbf/src/site/markdown/HDFSRouterFederation.md ## @@ -241,6 +241,22 @@ Mount table permission can be set by following command: The option mode is UNIX-style permissions for the mount table. Permissions are specified in octal, e.g. 0755. By default, this is set to 0755. + Init ViewFs To Router +Router supports initializing the ViewFS mount point to the Router. The mapping directory protocol of ViewFS must be HDFS, and the initializer only supports one-to-one mapping. Review comment: Pointer to ViewFs doc? -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 593412) Time Spent: 1h 10m (was: 1h) > RBF: Tool to initialize ViewFS Mapping to Router > > > Key: HDFS-16008
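Conceptually, the `-initViewFsToMountTable` tool reviewed above walks configuration keys of the form `fs.viewfs.mounttable.<cluster>.link.<src>` and turns each HDFS-backed link into a Router mount entry. A minimal sketch of that key parsing; the class and method names here are hypothetical, and the real logic lives in the Router admin tooling:

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class ViewFsLinkParser {
  // Collects src -> dest pairs for one cluster from viewfs-style properties.
  // Only hdfs:// destinations qualify, matching the restriction above that
  // the mapped directory protocol must be HDFS.
  static Map<String, String> parseLinks(Map<String, String> conf, String cluster) {
    String prefix = "fs.viewfs.mounttable." + cluster + ".link.";
    Map<String, String> mounts = new LinkedHashMap<>();
    for (Map.Entry<String, String> e : conf.entrySet()) {
      if (e.getKey().startsWith(prefix) && e.getValue().startsWith("hdfs://")) {
        // the remainder of the key after "link." is the mount source path
        mounts.put(e.getKey().substring(prefix.length()), e.getValue());
      }
    }
    return mounts;
  }

  public static void main(String[] args) {
    Map<String, String> conf = new LinkedHashMap<>();
    conf.put("fs.viewfs.mounttable.ClusterX.link./data",
        "hdfs://nn1-clusterx.example.com:8020/data");
    // prints {/data=hdfs://nn1-clusterx.example.com:8020/data}
    System.out.println(parseLinks(conf, "ClusterX"));
  }
}
```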
[jira] [Work logged] (HDFS-16001) TestOfflineEditsViewer.testStored() fails reading negative value of FSEditLogOpCodes
[ https://issues.apache.org/jira/browse/HDFS-16001?focusedWorklogId=593411=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-593411 ] ASF GitHub Bot logged work on HDFS-16001: - Author: ASF GitHub Bot Created on: 07/May/21 16:33 Start Date: 07/May/21 16:33 Worklog Time Spent: 10m Work Description: aajisaka opened a new pull request #2989: URL: https://github.com/apache/hadoop/pull/2989 JIRA: HDFS-16001 Backport of #2980 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 593411) Time Spent: 0.5h (was: 20m) > TestOfflineEditsViewer.testStored() fails reading negative value of > FSEditLogOpCodes > > > Key: HDFS-16001 > URL: https://issues.apache.org/jira/browse/HDFS-16001 > Project: Hadoop HDFS > Issue Type: Bug > Components: hdfs >Reporter: Konstantin Shvachko >Assignee: Akira Ajisaka >Priority: Blocker > Labels: pull-request-available > Time Spent: 0.5h > Remaining Estimate: 0h > > {{TestOfflineEditsViewer.testStored()}} fails consistently with an exception > {noformat} > java.io.IOException: Op -54 has size -1314247195, but the minimum op size is > 17 > {noformat} > Seems like there is a corrupt record in {{editsStored}} file. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-16006) TestRouterFederationRename is flaky
[ https://issues.apache.org/jira/browse/HDFS-16006?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17340922#comment-17340922 ] Íñigo Goiri commented on HDFS-16006: HDFS-15923 is adjusting the timeout for this test, is that fixing this issue? > TestRouterFederationRename is flaky > --- > > Key: HDFS-16006 > URL: https://issues.apache.org/jira/browse/HDFS-16006 > Project: Hadoop HDFS > Issue Type: Sub-task > Components: rbf >Reporter: Akira Ajisaka >Priority: Major > Attachments: patch-unit-hadoop-hdfs-project_hadoop-hdfs-rbf.txt > > > {quote} > [ERROR] Errors: > [ERROR] > TestRouterFederationRename.testCounter:440->Object.wait:502->Object.wait:-2 ? > TestTimedOut > [ERROR] TestRouterFederationRename.testSetup:145 ? Remote The directory > /src cannot be... > [ERROR] TestRouterFederationRename.testSetup:145 ? Remote The directory > /src cannot be... > {quote} > https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-2970/2/artifact/out/patch-unit-hadoop-hdfs-project_hadoop-hdfs-rbf.txt -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Work logged] (HDFS-16011) HDFS: Support viewfs nested mount
[ https://issues.apache.org/jira/browse/HDFS-16011?focusedWorklogId=593368=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-593368 ] ASF GitHub Bot logged work on HDFS-16011: - Author: ASF GitHub Bot Created on: 07/May/21 15:32 Start Date: 07/May/21 15:32 Worklog Time Spent: 10m Work Description: hadoop-yetus commented on pull request #2987: URL: https://github.com/apache/hadoop/pull/2987#issuecomment-834531151 :broken_heart: **-1 overall** | Vote | Subsystem | Runtime | Logfile | Comment | |::|--:|:|::|:---:| | +0 :ok: | reexec | 0m 50s | | Docker mode activated. | _ Prechecks _ | | +1 :green_heart: | dupname | 0m 0s | | No case conflicting files found. | | +0 :ok: | codespell | 0m 0s | | codespell was not available. | | +1 :green_heart: | @author | 0m 0s | | The patch does not contain any @author tags. | | -1 :x: | test4tests | 0m 0s | | The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. | _ trunk Compile Tests _ | | +1 :green_heart: | mvninstall | 35m 29s | | trunk passed | | +1 :green_heart: | compile | 21m 47s | | trunk passed with JDK Ubuntu-11.0.10+9-Ubuntu-0ubuntu1.20.04 | | +1 :green_heart: | compile | 19m 9s | | trunk passed with JDK Private Build-1.8.0_282-8u282-b08-0ubuntu1~20.04-b08 | | +1 :green_heart: | checkstyle | 1m 5s | | trunk passed | | +1 :green_heart: | mvnsite | 1m 34s | | trunk passed | | +1 :green_heart: | javadoc | 1m 2s | | trunk passed with JDK Ubuntu-11.0.10+9-Ubuntu-0ubuntu1.20.04 | | +1 :green_heart: | javadoc | 1m 32s | | trunk passed with JDK Private Build-1.8.0_282-8u282-b08-0ubuntu1~20.04-b08 | | +1 :green_heart: | spotbugs | 2m 24s | | trunk passed | | +1 :green_heart: | shadedclient | 16m 30s | | branch has no errors when building and testing our client artifacts. 
| _ Patch Compile Tests _ | | +1 :green_heart: | mvninstall | 0m 56s | | the patch passed | | +1 :green_heart: | compile | 20m 4s | | the patch passed with JDK Ubuntu-11.0.10+9-Ubuntu-0ubuntu1.20.04 | | +1 :green_heart: | javac | 20m 4s | | the patch passed | | +1 :green_heart: | compile | 18m 9s | | the patch passed with JDK Private Build-1.8.0_282-8u282-b08-0ubuntu1~20.04-b08 | | +1 :green_heart: | javac | 18m 9s | | the patch passed | | +1 :green_heart: | blanks | 0m 0s | | The patch has no blanks issues. | | -0 :warning: | checkstyle | 1m 4s | [/results-checkstyle-hadoop-common-project_hadoop-common.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-2987/1/artifact/out/results-checkstyle-hadoop-common-project_hadoop-common.txt) | hadoop-common-project/hadoop-common: The patch generated 13 new + 11 unchanged - 3 fixed = 24 total (was 14) | | +1 :green_heart: | mvnsite | 1m 30s | | the patch passed | | +1 :green_heart: | javadoc | 1m 3s | | the patch passed with JDK Ubuntu-11.0.10+9-Ubuntu-0ubuntu1.20.04 | | +1 :green_heart: | javadoc | 1m 38s | | the patch passed with JDK Private Build-1.8.0_282-8u282-b08-0ubuntu1~20.04-b08 | | +1 :green_heart: | spotbugs | 2m 31s | | the patch passed | | +1 :green_heart: | shadedclient | 15m 52s | | patch has no errors when building and testing our client artifacts. | _ Other Tests _ | | -1 :x: | unit | 16m 59s | [/patch-unit-hadoop-common-project_hadoop-common.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-2987/1/artifact/out/patch-unit-hadoop-common-project_hadoop-common.txt) | hadoop-common in the patch passed. | | +1 :green_heart: | asflicense | 0m 55s | | The patch does not generate ASF License warnings. 
| | | | 182m 32s | | | | Reason | Tests | |---:|:--| | Failed junit tests | hadoop.fs.viewfs.TestViewFsConfig | | Subsystem | Report/Notes | |--:|:-| | Docker | ClientAPI=1.41 ServerAPI=1.41 base: https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-2987/1/artifact/out/Dockerfile | | GITHUB PR | https://github.com/apache/hadoop/pull/2987 | | Optional Tests | dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient spotbugs checkstyle codespell | | uname | Linux a4ca4648d84b 4.15.0-112-generic #113-Ubuntu SMP Thu Jul 9 23:41:39 UTC 2020 x86_64 x86_64 x86_64 GNU/Linux | | Build tool | maven | | Personality | dev-support/bin/hadoop.sh | | git revision | trunk / b78668be9a82e5fd252b2ed4b2e926446ff228d6 | | Default Java | Private
[jira] [Work logged] (HDFS-16001) TestOfflineEditsViewer.testStored() fails reading negative value of FSEditLogOpCodes
[ https://issues.apache.org/jira/browse/HDFS-16001?focusedWorklogId=593363=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-593363 ] ASF GitHub Bot logged work on HDFS-16001: - Author: ASF GitHub Bot Created on: 07/May/21 15:24 Start Date: 07/May/21 15:24 Worklog Time Spent: 10m Work Description: aajisaka commented on pull request #2980: URL: https://github.com/apache/hadoop/pull/2980#issuecomment-834519639 Merged. Thank you @Hexiaoqiao for your review. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 593363) Time Spent: 20m (was: 10m) > TestOfflineEditsViewer.testStored() fails reading negative value of > FSEditLogOpCodes > > > Key: HDFS-16001 > URL: https://issues.apache.org/jira/browse/HDFS-16001 > Project: Hadoop HDFS > Issue Type: Bug > Components: hdfs >Reporter: Konstantin Shvachko >Assignee: Akira Ajisaka >Priority: Blocker > Labels: pull-request-available > Time Spent: 20m > Remaining Estimate: 0h > > {{TestOfflineEditsViewer.testStored()}} fails consistently with an exception > {noformat} > java.io.IOException: Op -54 has size -1314247195, but the minimum op size is > 17 > {noformat} > Seems like there is a corrupt record in {{editsStored}} file. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Updated] (HDFS-16001) TestOfflineEditsViewer.testStored() fails reading negative value of FSEditLogOpCodes
[ https://issues.apache.org/jira/browse/HDFS-16001?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated HDFS-16001: -- Labels: pull-request-available (was: ) > TestOfflineEditsViewer.testStored() fails reading negative value of > FSEditLogOpCodes > > > Key: HDFS-16001 > URL: https://issues.apache.org/jira/browse/HDFS-16001 > Project: Hadoop HDFS > Issue Type: Bug > Components: hdfs >Reporter: Konstantin Shvachko >Assignee: Akira Ajisaka >Priority: Blocker > Labels: pull-request-available > Time Spent: 10m > Remaining Estimate: 0h > > {{TestOfflineEditsViewer.testStored()}} fails consistently with an exception > {noformat} > java.io.IOException: Op -54 has size -1314247195, but the minimum op size is > 17 > {noformat} > Seems like there is a corrupt record in {{editsStored}} file. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Work logged] (HDFS-16001) TestOfflineEditsViewer.testStored() fails reading negative value of FSEditLogOpCodes
[ https://issues.apache.org/jira/browse/HDFS-16001?focusedWorklogId=593362=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-593362 ] ASF GitHub Bot logged work on HDFS-16001: - Author: ASF GitHub Bot Created on: 07/May/21 15:24 Start Date: 07/May/21 15:24 Worklog Time Spent: 10m Work Description: aajisaka merged pull request #2980: URL: https://github.com/apache/hadoop/pull/2980 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 593362) Remaining Estimate: 0h Time Spent: 10m > TestOfflineEditsViewer.testStored() fails reading negative value of > FSEditLogOpCodes > > > Key: HDFS-16001 > URL: https://issues.apache.org/jira/browse/HDFS-16001 > Project: Hadoop HDFS > Issue Type: Bug > Components: hdfs >Reporter: Konstantin Shvachko >Assignee: Akira Ajisaka >Priority: Blocker > Time Spent: 10m > Remaining Estimate: 0h > > {{TestOfflineEditsViewer.testStored()}} fails consistently with an exception > {noformat} > java.io.IOException: Op -54 has size -1314247195, but the minimum op size is > 17 > {noformat} > Seems like there is a corrupt record in {{editsStored}} file. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-16013) DirectoryScan operation holds dataset lock for long time
[ https://issues.apache.org/jira/browse/HDFS-16013?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17340866#comment-17340866 ] Stephen O'Donnell commented on HDFS-16013: -- You want these ones:
HDFS-15406. Improve the speed of Datanode Block Scan. Contributed by hemanthboyina
HDFS-15574. Remove unnecessary sort of block list in DirectoryScanner. Contributed by Stephen O'Donnell.
HDFS-15583. Backport DirectoryScanner improvements HDFS-14476, HDFS-14751 and HDFS-15048 to branch 3.2 and 3.1. Contributed by Stephen O'Donnell
HDFS-15415. Reduce locking in Datanode DirectoryScanner. Contributed by Stephen O'Donnell
The last one is key, as it removes the lock completely, but the first one above makes a big difference to the speed too. > DirectoryScan operation holds dataset lock for long time > > > Key: HDFS-16013 > URL: https://issues.apache.org/jira/browse/HDFS-16013 > Project: Hadoop HDFS > Issue Type: Bug >Reporter: Renukaprasad C >Priority: Critical > > Environment: 3-node cluster with around 2M files & the same number of blocks. > All file operations are normal; only the directory scan takes more > memory and causes some long GC pauses. This directory scan runs every 6 hours > (the default), which causes slow responses to any file operations. The delay is > around 5-8 seconds (in production this delay increased to 30+ seconds > with 8M blocks). > GC Configuration: > -Xms6144M > -Xmx12288M /8G > -XX:NewSize=614M > -XX:MaxNewSize=1228M > -XX:MetaspaceSize=128M > -XX:MaxMetaspaceSize=128M > -XX:CMSFullGCsBeforeCompaction=1 > -XX:MaxDirectMemorySize=1G > -XX:+UseConcMarkSweepGC > -XX:+CMSParallelRemarkEnabled > -XX:+UseCMSCompactAtFullCollection > -XX:CMSInitiatingOccupancyFraction=80 > We also tried G1 GC, but couldn't find much difference in the result. 
> -XX:+UseG1GC > -XX:MaxGCPauseMillis=200 > -XX:InitiatingHeapOccupancyPercent=45 > -XX:G1ReservePercent=10 > {code:java} > 2021-05-07 16:32:23,508 INFO > org.apache.hadoop.hdfs.server.datanode.DirectoryScanner: BlockPool > BP-345634799--1619695417333 Total blocks: 2767211, missing metadata > files: 22, missing block files: 22, missing blocks in memory: 0, mismatched > blocks: 0 > 2021-05-07 16:32:23,508 WARN > org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl: Lock > held time above threshold: lock identifier: FsDatasetRWLock > lockHeldTimeMs=7061 ms. Suppressed 0 lock warnings. The stack trace is: > java.lang.Thread.getStackTrace(Thread.java:1559) > org.apache.hadoop.util.StringUtils.getStackTrace(StringUtils.java:1032) > org.apache.hadoop.util.InstrumentedLock.logWarning(InstrumentedLock.java:148) > org.apache.hadoop.util.InstrumentedLock.check(InstrumentedLock.java:186) > org.apache.hadoop.util.InstrumentedReadLock.unlock(InstrumentedReadLock.java:78) > org.apache.hadoop.util.AutoCloseableLock.release(AutoCloseableLock.java:84) > org.apache.hadoop.util.AutoCloseableLock.close(AutoCloseableLock.java:96) > org.apache.hadoop.hdfs.server.datanode.DirectoryScanner.scan(DirectoryScanner.java:539) > org.apache.hadoop.hdfs.server.datanode.DirectoryScanner.reconcile(DirectoryScanner.java:416) > org.apache.hadoop.hdfs.server.datanode.DirectoryScanner.run(DirectoryScanner.java:359) > java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) > java.util.concurrent.FutureTask.runAndReset(FutureTask.java:308) > java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(ScheduledThreadPoolExecutor.java:180) > java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:294) > java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) > java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) > {code} > We have the following Jiras our 
code already. But still facing long lock > held. - https://issues.apache.org/jira/browse/HDFS-15621, > https://issues.apache.org/jira/browse/HDFS-15150, > https://issues.apache.org/jira/browse/HDFS-15160, > https://issues.apache.org/jira/browse/HDFS-13947 > cc: [~brahma] [~belugabehr] [~sodonnell] [~ayushsaxena] [~weichiu] -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Work logged] (HDFS-16007) Vulnerabilities found when serializing enum value
[ https://issues.apache.org/jira/browse/HDFS-16007?focusedWorklogId=593322=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-593322 ] ASF GitHub Bot logged work on HDFS-16007: - Author: ASF GitHub Bot Created on: 07/May/21 14:15 Start Date: 07/May/21 14:15 Worklog Time Spent: 10m Work Description: virajjasani commented on pull request #2982: URL: https://github.com/apache/hadoop/pull/2982#issuecomment-834440718 Thanks for the review @aajisaka. All test failures seem flakies. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 593322) Time Spent: 50m (was: 40m) > Vulnerabilities found when serializing enum value > - > > Key: HDFS-16007 > URL: https://issues.apache.org/jira/browse/HDFS-16007 > Project: Hadoop HDFS > Issue Type: Improvement >Reporter: junwen yang >Assignee: Viraj Jasani >Priority: Major > Labels: pull-request-available > Time Spent: 50m > Remaining Estimate: 0h > > ReplicaState enum is using ordinal to conduct serialization and > deserialization, which is vulnerable to the order, to cause issues similar to > HDFS-15624. > To avoid it, either adding comments to let later developer not to change this > enum, or add index checking in the read and getState function to avoid index > out of bound error. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
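The ordinal-serialization hazard described above is easy to demonstrate: the enum's declaration order IS the wire format, so reordering or inserting a constant silently corrupts serialized data, and a corrupt byte read back as an index blows up deep inside deserialization. A bounds check on the read path (one of the two mitigations the issue suggests) turns that into a clear error. A minimal sketch with a hypothetical stand-in enum, not the actual `ReplicaState` code:

```java
public class EnumWireDemo {
  // Hypothetical stand-in for ReplicaState: the declaration order is the
  // wire format, so constants must never be reordered or inserted mid-list.
  enum State { FINALIZED, RBW, RWR, RUR, TEMPORARY }

  private static final State[] CACHED = State.values(); // values() clones; cache it

  // Bounds-checked lookup: a corrupt serialized index becomes a clear error
  // instead of an ArrayIndexOutOfBoundsException deep in deserialization.
  static State getState(int ordinal) {
    if (ordinal < 0 || ordinal >= CACHED.length) {
      throw new IllegalArgumentException("Invalid replica state index: " + ordinal);
    }
    return CACHED[ordinal];
  }

  public static void main(String[] args) {
    System.out.println(getState(1)); // prints RBW
    try {
      getState(200); // e.g. a corrupt byte read from disk
    } catch (IllegalArgumentException e) {
      System.out.println(e.getMessage());
    }
  }
}
```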
[jira] [Comment Edited] (HDFS-15994) Deletion should sleep some time, when there are too many pending deletion blocks.
[ https://issues.apache.org/jira/browse/HDFS-15994?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17340857#comment-17340857 ] Qi Zhu edited comment on HDFS-15994 at 5/7/21, 2:08 PM: Thanks a lot [~hexiaoqiao] for the reply. The deletion process: 1) Namespace delete: remove the file-related metadata from the INode tree. 2) Remove block: remove the blocks from the BlockMap, and add the blocks to InvalidateBlocks. 3) Wait for the ReplicationMonitor to trigger the delete work and send heartbeats to the DNs for deleting. Step 2 dominates, taking about 90% of the RPC handler time for deletion; step 3 is async and will not affect the RPC handlers. About using `release lock - sleep - acquire lock` to avoid the NameNode hanging for a long time: I am not sure how else to keep the lock from being too busy, so I propose this option. As for multi-threaded deletion possibly using too many handlers, we can make the deletion async to release the handlers in a follow-up Jira. For this Jira, we can discuss how to avoid the lock being too busy besides the `release lock - sleep - acquire lock` choice. cc [~weichiu] [~sodonnell] [~ayushtkn] What are your opinions? Thanks. was (Author: zhuqi): Thanks a lot [~hexiaoqiao] for reply. The deletion process: 1)Namespace Delete:remove the file related meta from Inode Tree; 2)Remove Block: remove the blocks from BlockMap, and add the blocks to InvalidateBlocks. 3)Waiting ReplicationMonitor to trigger Delete Work,send heartbeat to DN for deleting。 The 2 step dominant about 90% RPC handler for deletion, and the 3 step are async, will not affect the RPC handler. About the using `release lock - sleep - acquire lock` to avoid NameNode hang for long time, i am not sure how to avoid lock too busy so just give this choice to release. And about the multi-thread deletion may use too many handler, we can make the deletion async to release the handler in a following Jira. For this jira, we can discuss how to avoid the lock too busy beside the `release lock - sleep - acquire lock` choice. 
cc [~weichiu] [~sodonnell] [~ayushtkn] What's your opinions ? Thanks. > Deletion should sleep some time, when there are too many pending deletion > blocks. > - > > Key: HDFS-15994 > URL: https://issues.apache.org/jira/browse/HDFS-15994 > Project: Hadoop HDFS > Issue Type: Improvement >Reporter: Qi Zhu >Assignee: Qi Zhu >Priority: Major > Attachments: HDFS-15994.001.patch > > > HDFS-13831 realized that we can control the frequency at which other waiters get > a chance at the lock. > But actually in our big cluster with heavy deletion: > The problem still happened; the pending deletion blocks will sometimes be more > than ten million, and the size regularly becomes more than 1 million in huge > clusters. > So I think we should sleep for some time when too many deletion blocks are > pending.
[jira] [Commented] (HDFS-15994) Deletion should sleep some time, when there are too many pending deletion blocks.
[ https://issues.apache.org/jira/browse/HDFS-15994?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17340857#comment-17340857 ] Qi Zhu commented on HDFS-15994: --- Thanks a lot [~hexiaoqiao] for the reply. The deletion process: 1) Namespace delete: remove the file-related metadata from the INode tree. 2) Remove block: remove the blocks from the BlockMap, and add the blocks to InvalidateBlocks. 3) Wait for the ReplicationMonitor to trigger the delete work and send heartbeats to the DNs for deleting. Step 2 dominates, taking about 90% of the RPC handler time for deletion; step 3 is async and will not affect the RPC handlers. About using `release lock - sleep - acquire lock` to avoid the NameNode hanging for a long time: I am not sure how else to keep the lock from being too busy, so I propose this option. As for multi-threaded deletion possibly using too many handlers, we can make the deletion async to release the handlers in a follow-up Jira. For this Jira, we can discuss how to avoid the lock being too busy besides the `release lock - sleep - acquire lock` choice. cc [~weichiu] [~sodonnell] [~ayushtkn] What are your opinions? Thanks. > Deletion should sleep some time, when there are too many pending deletion > blocks. > - > > Key: HDFS-15994 > URL: https://issues.apache.org/jira/browse/HDFS-15994 > Project: Hadoop HDFS > Issue Type: Improvement >Reporter: Qi Zhu >Assignee: Qi Zhu >Priority: Major > Attachments: HDFS-15994.001.patch > > > HDFS-13831 realized that we can control the frequency at which other waiters get > a chance at the lock. > But actually in our big cluster with heavy deletion: > The problem still happened; the pending deletion blocks will sometimes be more > than ten million, and the size regularly becomes more than 1 million in huge > clusters. > So I think we should sleep for some time when too many deletion blocks are > pending. 
-- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-15878) RBF: Flaky test TestRouterWebHDFSContractCreate>AbstractContractCreateTest#testSyncable in Trunk
[ https://issues.apache.org/jira/browse/HDFS-15878?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17340822#comment-17340822 ] Xiaoqiao He commented on HDFS-15878: Hi [~fengnanli], Just found that this issue is marked as `resolved` but has no fix version and no attached patches or PR link. Would it be more proper to mark it as `Not A Problem`? Thanks. > RBF: Flaky test > TestRouterWebHDFSContractCreate>AbstractContractCreateTest#testSyncable in > Trunk > > > Key: HDFS-15878 > URL: https://issues.apache.org/jira/browse/HDFS-15878 > Project: Hadoop HDFS > Issue Type: Sub-task > Components: hdfs, rbf >Reporter: Renukaprasad C >Assignee: Fengnan Li >Priority: Major > > [ERROR] Tests run: 16, Failures: 0, Errors: 1, Skipped: 2, Time elapsed: > 24.627 s <<< FAILURE! - in > org.apache.hadoop.fs.contract.router.web.TestRouterWebHDFSContractCreate > [ERROR] > testSyncable(org.apache.hadoop.fs.contract.router.web.TestRouterWebHDFSContractCreate) > Time elapsed: 0.222 s <<< ERROR! > java.io.FileNotFoundException: File /test/testSyncable not found. 
> at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method) > at > sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62) > at > sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45) > at java.lang.reflect.Constructor.newInstance(Constructor.java:423) > at > org.apache.hadoop.ipc.RemoteException.instantiateException(RemoteException.java:121) > at > org.apache.hadoop.ipc.RemoteException.unwrapRemoteException(RemoteException.java:110) > at > org.apache.hadoop.hdfs.web.WebHdfsFileSystem.toIOException(WebHdfsFileSystem.java:576) > at > org.apache.hadoop.hdfs.web.WebHdfsFileSystem.access$900(WebHdfsFileSystem.java:146) > at > org.apache.hadoop.hdfs.web.WebHdfsFileSystem$AbstractRunner.shouldRetry(WebHdfsFileSystem.java:892) > at > org.apache.hadoop.hdfs.web.WebHdfsFileSystem$AbstractRunner.runWithRetry(WebHdfsFileSystem.java:858) > at > org.apache.hadoop.hdfs.web.WebHdfsFileSystem$AbstractRunner.access$100(WebHdfsFileSystem.java:652) > at > org.apache.hadoop.hdfs.web.WebHdfsFileSystem$AbstractRunner$1.run(WebHdfsFileSystem.java:690) > at java.security.AccessController.doPrivileged(Native Method) > at javax.security.auth.Subject.doAs(Subject.java:422) > at > org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1899) > at > org.apache.hadoop.hdfs.web.WebHdfsFileSystem$AbstractRunner.run(WebHdfsFileSystem.java:686) > at > org.apache.hadoop.hdfs.web.WebHdfsFileSystem$ReadRunner.getRedirectedUrl(WebHdfsFileSystem.java:2307) > at > org.apache.hadoop.hdfs.web.WebHdfsFileSystem$ReadRunner.<init>(WebHdfsFileSystem.java:2296) > at > org.apache.hadoop.hdfs.web.WebHdfsFileSystem$WebHdfsInputStream.<init>(WebHdfsFileSystem.java:2176) > at > org.apache.hadoop.hdfs.web.WebHdfsFileSystem.open(WebHdfsFileSystem.java:1610) > at org.apache.hadoop.fs.FileSystem.open(FileSystem.java:975) > at > 
org.apache.hadoop.fs.contract.AbstractContractCreateTest.validateSyncableSemantics(AbstractContractCreateTest.java:556) > at > org.apache.hadoop.fs.contract.AbstractContractCreateTest.testSyncable(AbstractContractCreateTest.java:459) > at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) > at > sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) > at > sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) > at java.lang.reflect.Method.invoke(Method.java:498) > at > org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:50) > at > org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12) > at > org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:47) > at > org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17) > at > org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:26) > at > org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:27) > at org.junit.rules.TestWatcher$1.evaluate(TestWatcher.java:55) > at > org.junit.internal.runners.statements.FailOnTimeout$CallableStatement.call(FailOnTimeout.java:298) > at > org.junit.internal.runners.statements.FailOnTimeout$CallableStatement.call(FailOnTimeout.java:292) > at java.util.concurrent.FutureTask.run(FutureTask.java:266) > at java.lang.Thread.run(Thread.java:748) > Caused by: >
[jira] [Commented] (HDFS-15968) Improve the log for The DecayRpcScheduler
[ https://issues.apache.org/jira/browse/HDFS-15968?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17340820#comment-17340820 ] Xiaoqiao He commented on HDFS-15968: +1 for v001. The failed unit test seems unrelated to this change. Thanks [~bpatel] for your work. > Improve the log for The DecayRpcScheduler > -- > > Key: HDFS-15968 > URL: https://issues.apache.org/jira/browse/HDFS-15968 > Project: Hadoop HDFS > Issue Type: Improvement >Reporter: Bhavik Patel >Assignee: Bhavik Patel >Priority: Minor > Attachments: HDFS-15968.001.patch > > > Improve the log for the DecayRpcScheduler to make use of the SLF4J logger > factory -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Issue Comment Deleted] (HDFS-14986) ReplicaCachingGetSpaceUsed throws ConcurrentModificationException
[ https://issues.apache.org/jira/browse/HDFS-14986?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] lei w updated HDFS-14986: - Comment: was deleted (was: Hi [~Aiphag0], could you briefly explain how you solved the deadlock problem?) > ReplicaCachingGetSpaceUsed throws ConcurrentModificationException > -- > > Key: HDFS-14986 > URL: https://issues.apache.org/jira/browse/HDFS-14986 > Project: Hadoop HDFS > Issue Type: Bug > Components: datanode, performance >Affects Versions: 2.10.0 >Reporter: Ryan Wu >Assignee: Aiphago >Priority: Major > Fix For: 3.3.0, 3.1.4, 2.10.1 > > Attachments: HDFS-14986.001.patch, HDFS-14986.002.patch, > HDFS-14986.003.patch, HDFS-14986.004.patch, HDFS-14986.005.patch, > HDFS-14986.006.patch > > > Running DU across lots of disks is very expensive. We applied the patch > HDFS-14313 to get used space from ReplicaInfo in memory. However, new du > threads throw the exception > {code:java} > // 2019-11-08 18:07:13,858 ERROR > [refreshUsed-/home/vipshop/hard_disk/7/dfs/dn/current/BP-1203969992--1450855658517] > > org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.ReplicaCachingGetSpaceUsed: > ReplicaCachingGetSpaceUsed refresh error > java.util.ConcurrentModificationException: Tree has been modified outside of > iterator > at > org.apache.hadoop.hdfs.util.FoldedTreeSet$TreeSetIterator.checkForModification(FoldedTreeSet.java:311) > > at > org.apache.hadoop.hdfs.util.FoldedTreeSet$TreeSetIterator.hasNext(FoldedTreeSet.java:256) > > at java.util.AbstractCollection.addAll(AbstractCollection.java:343) > at java.util.HashSet.<init>(HashSet.java:120) > at > org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl.deepCopyReplica(FsDatasetImpl.java:1052) > > at > org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.ReplicaCachingGetSpaceUsed.refresh(ReplicaCachingGetSpaceUsed.java:73) > > at > org.apache.hadoop.fs.CachingGetSpaceUsed$RefreshThread.run(CachingGetSpaceUsed.java:178) > > at java.lang.Thread.run(Thread.java:748) 
> {code} -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Resolved] (HDFS-16009) HDFS tiered storage support
[ https://issues.apache.org/jira/browse/HDFS-16009?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiaoqiao He resolved HDFS-16009. Resolution: Duplicate Closing this issue based on the above comments. Please feel free to reopen it if it needs further improvement. Thanks [~LeonG] and [~jianghuazhu]. > HDFS tiered storage support > --- > > Key: HDFS-16009 > URL: https://issues.apache.org/jira/browse/HDFS-16009 > Project: Hadoop HDFS > Issue Type: Improvement >Reporter: JiangHua Zhu >Priority: Major > > HDFS currently stores data blocks on different data nodes. These data nodes > are chosen randomly within the scope of the network topology (local rack/different > racks/node group). > When the NameNode allocates DataNodes to a Block, it selects them randomly, > that is, each DataNode is selected with equal probability. > We should add a tiering strategy. For example, DataNode1, DataNode2, > and DataNode3 are at one level, and DataNode4 and DataNode5 are at another. > Because the hardware between DataNodes may differ, it is necessary for us to do > tiered storage, so that DataNodes with weaker hardware can be limited to storing > 50%, 40%, or even less. This is conducive to cluster maintenance and stability. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-14986) ReplicaCachingGetSpaceUsed throws ConcurrentModificationException
[ https://issues.apache.org/jira/browse/HDFS-14986?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17340787#comment-17340787 ] lei w commented on HDFS-14986: -- Hi [~Aiphag0], could you briefly explain how you solved the deadlock problem? > ReplicaCachingGetSpaceUsed throws ConcurrentModificationException > -- > > Key: HDFS-14986 > URL: https://issues.apache.org/jira/browse/HDFS-14986 > Project: Hadoop HDFS > Issue Type: Bug > Components: datanode, performance >Affects Versions: 2.10.0 >Reporter: Ryan Wu >Assignee: Aiphago >Priority: Major > Fix For: 3.3.0, 3.1.4, 2.10.1 > > Attachments: HDFS-14986.001.patch, HDFS-14986.002.patch, > HDFS-14986.003.patch, HDFS-14986.004.patch, HDFS-14986.005.patch, > HDFS-14986.006.patch > > > Running DU across lots of disks is very expensive. We applied the patch > HDFS-14313 to get used space from ReplicaInfo in memory. However, new du > threads throw the exception > {code:java} > // 2019-11-08 18:07:13,858 ERROR > [refreshUsed-/home/vipshop/hard_disk/7/dfs/dn/current/BP-1203969992--1450855658517] > > org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.ReplicaCachingGetSpaceUsed: > ReplicaCachingGetSpaceUsed refresh error > java.util.ConcurrentModificationException: Tree has been modified outside of > iterator > at > org.apache.hadoop.hdfs.util.FoldedTreeSet$TreeSetIterator.checkForModification(FoldedTreeSet.java:311) > > at > org.apache.hadoop.hdfs.util.FoldedTreeSet$TreeSetIterator.hasNext(FoldedTreeSet.java:256) > > at java.util.AbstractCollection.addAll(AbstractCollection.java:343) > at java.util.HashSet.<init>(HashSet.java:120) > at > org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl.deepCopyReplica(FsDatasetImpl.java:1052) > > at > org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.ReplicaCachingGetSpaceUsed.refresh(ReplicaCachingGetSpaceUsed.java:73) > > at > org.apache.hadoop.fs.CachingGetSpaceUsed$RefreshThread.run(CachingGetSpaceUsed.java:178) > > at 
java.lang.Thread.run(Thread.java:748) > {code} -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
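The `ConcurrentModificationException` above is the classic symptom of iterating a collection while another thread mutates it: `deepCopyReplica` builds a `HashSet` from the replica set while the DataNode is concurrently updating that set. The general remedy is to take the snapshot while holding the dataset lock, so later mutations cannot break iteration of the copy. A minimal sketch of that pattern follows; the class and field names here are illustrative, not the actual `FsDatasetImpl` code.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;
import java.util.concurrent.locks.ReentrantReadWriteLock;

class ReplicaSnapshot {
    private final ReentrantReadWriteLock lock = new ReentrantReadWriteLock();
    private final Set<String> replicas = new HashSet<>();

    void add(String replica) {
        lock.writeLock().lock();
        try {
            replicas.add(replica);
        } finally {
            lock.writeLock().unlock();
        }
    }

    /** Copies the replica set under the read lock so callers can iterate safely. */
    List<String> deepCopyReplicas() {
        lock.readLock().lock();
        try {
            // Snapshot taken while the lock is held: concurrent adds after this
            // point affect only the live set, never the returned copy.
            return new ArrayList<>(replicas);
        } finally {
            lock.readLock().unlock();
        }
    }
}
```

Callers such as a du-refresh thread would iterate the returned list; the live set can keep changing without triggering a fail-fast iterator.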
[jira] [Created] (HDFS-16013) DirectoryScan operation holds dataset lock for long time
Renukaprasad C created HDFS-16013: - Summary: DirectoryScan operation holds dataset lock for long time Key: HDFS-16013 URL: https://issues.apache.org/jira/browse/HDFS-16013 Project: Hadoop HDFS Issue Type: Bug Reporter: Renukaprasad C Environment: 3-node cluster with around 2M files & the same number of blocks. All file operations are normal except during the directory scan, which takes more memory and causes some long GC pauses. This directory scan happens every 6H (default value), which causes slow responses to any file operations. The delay is around 5-8 seconds (in production this delay increased to 30+ seconds with 8M blocks). GC Configuration: -Xms6144M -Xmx12288M /8G -XX:NewSize=614M -XX:MaxNewSize=1228M -XX:MetaspaceSize=128M -XX:MaxMetaspaceSize=128M -XX:CMSFullGCsBeforeCompaction=1 -XX:MaxDirectMemorySize=1G -XX:+UseConcMarkSweepGC -XX:+CMSParallelRemarkEnabled -XX:+UseCMSCompactAtFullCollection -XX:CMSInitiatingOccupancyFraction=80 We also tried G1 GC, but couldn't find much difference in the result: -XX:+UseG1GC -XX:MaxGCPauseMillis=200 -XX:InitiatingHeapOccupancyPercent=45 -XX:G1ReservePercent=10 {code:java} 2021-05-07 16:32:23,508 INFO org.apache.hadoop.hdfs.server.datanode.DirectoryScanner: BlockPool BP-345634799--1619695417333 Total blocks: 2767211, missing metadata files: 22, missing block files: 22, missing blocks in memory: 0, mismatched blocks: 0 2021-05-07 16:32:23,508 WARN org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl: Lock held time above threshold: lock identifier: FsDatasetRWLock lockHeldTimeMs=7061 ms. Suppressed 0 lock warnings. 
The stack trace is: java.lang.Thread.getStackTrace(Thread.java:1559) org.apache.hadoop.util.StringUtils.getStackTrace(StringUtils.java:1032) org.apache.hadoop.util.InstrumentedLock.logWarning(InstrumentedLock.java:148) org.apache.hadoop.util.InstrumentedLock.check(InstrumentedLock.java:186) org.apache.hadoop.util.InstrumentedReadLock.unlock(InstrumentedReadLock.java:78) org.apache.hadoop.util.AutoCloseableLock.release(AutoCloseableLock.java:84) org.apache.hadoop.util.AutoCloseableLock.close(AutoCloseableLock.java:96) org.apache.hadoop.hdfs.server.datanode.DirectoryScanner.scan(DirectoryScanner.java:539) org.apache.hadoop.hdfs.server.datanode.DirectoryScanner.reconcile(DirectoryScanner.java:416) org.apache.hadoop.hdfs.server.datanode.DirectoryScanner.run(DirectoryScanner.java:359) java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) java.util.concurrent.FutureTask.runAndReset(FutureTask.java:308) java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(ScheduledThreadPoolExecutor.java:180) java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:294) java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) {code} We already have the following Jiras in our code, but are still facing long lock hold times: - https://issues.apache.org/jira/browse/HDFS-15621, https://issues.apache.org/jira/browse/HDFS-15150, https://issues.apache.org/jira/browse/HDFS-15160, https://issues.apache.org/jira/browse/HDFS-13947 cc: [~brahma] [~belugabehr] [~sodonnell] [~ayushsaxena] [~weichiu] -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
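The "Lock held time above threshold" warning in the log above comes from Hadoop's `InstrumentedLock`, which times how long the lock was held and logs when a threshold is exceeded. A simplified sketch of that mechanism is below; the threshold handling and reporting here are illustrative, not Hadoop's actual implementation, and `lastHeldMs` is a hypothetical accessor added only so the sketch is observable.

```java
import java.util.concurrent.locks.ReentrantLock;

class TimedLock {
    private final ReentrantLock lock = new ReentrantLock();
    private final long warnThresholdMs;
    private long acquiredAtNanos;          // written only while holding the lock
    private volatile long lastHeldMs;      // exposed for the sketch; real code logs instead

    TimedLock(long warnThresholdMs) {
        this.warnThresholdMs = warnThresholdMs;
    }

    void lock() {
        lock.lock();
        acquiredAtNanos = System.nanoTime();
    }

    void unlock() {
        long heldMs = (System.nanoTime() - acquiredAtNanos) / 1_000_000;
        lastHeldMs = heldMs;
        lock.unlock();
        if (heldMs > warnThresholdMs) {
            // Hadoop additionally captures the stack trace and suppresses
            // repeated warnings within a report interval.
            System.err.println("Lock held time above threshold: " + heldMs + " ms");
        }
    }

    long lastHeldMs() {
        return lastHeldMs;
    }
}
```

The 7061 ms figure in the report is exactly such a measurement taken at unlock time in `DirectoryScanner.scan()`.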
[jira] [Commented] (HDFS-15994) Deletion should sleep some time, when there are too many pending deletion blocks.
[ https://issues.apache.org/jira/browse/HDFS-15994?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17340786#comment-17340786 ] Xiaoqiao He commented on HDFS-15994: Thanks [~zhuqi] for your report and contribution. I am totally +1 for this proposal. I have seen the NameNode hang for a long time many times while processing delete requests for very large directories. IIRC, some other folks tried to improve this but did not push it forward completely. IMO it is the proper time to do that again. IIUC, there are two segments that hold the global lock for a long time and need improvement: a. traversing the whole sub-directory and collecting the pending deleted blocks; b. executing the block deletion. Using `release lock - sleep - acquire lock` to avoid the NameNode hanging for a long time could be one choice. I am not sure it is the best solution; deep discussion is welcome. For patch v001, I am not a fan of adding an extra configuration entry for every improvement, because there are already so many configuration options and they are more and more confusing to end users. I think we could use a static parameter in this case. Thanks [~zhuqi] again. > Deletion should sleep some time, when there are too many pending deletion > blocks. > - > > Key: HDFS-15994 > URL: https://issues.apache.org/jira/browse/HDFS-15994 > Project: Hadoop HDFS > Issue Type: Improvement >Reporter: Qi Zhu >Assignee: Qi Zhu >Priority: Major > Attachments: HDFS-15994.001.patch > > > HDFS-13831 realized that we can control the frequency at which other waiters get > the lock chance. > But actually in our big cluster with heavy deletion the problem still happened: > the pending deletion blocks will sometimes be more than ten million, and the size > regularly exceeds 1 million in huge clusters. > So I think we should sleep for some time when too many deletion blocks are pending. 
-- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Updated] (HDFS-16012) Improve DataXceiver#copyBlock() log printing
[ https://issues.apache.org/jira/browse/HDFS-16012?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated HDFS-16012: -- Labels: pull-request-available (was: ) > Improve DataXceiver#copyBlock() log printing > > > Key: HDFS-16012 > URL: https://issues.apache.org/jira/browse/HDFS-16012 > Project: Hadoop HDFS > Issue Type: Improvement >Reporter: JiangHua Zhu >Assignee: JiangHua Zhu >Priority: Major > Labels: pull-request-available > Attachments: image-2021-05-07-20-18-45-523.png > > Time Spent: 10m > Remaining Estimate: 0h > > When executing DataXceiver#copyBlock(), if an exception is encountered, the > following will be printed: > !image-2021-05-07-20-18-45-523.png! > The information printed here is not conducive to assist in troubleshooting. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Work logged] (HDFS-16007) Vulnerabilities found when serializing enum value
[ https://issues.apache.org/jira/browse/HDFS-16007?focusedWorklogId=593265=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-593265 ] ASF GitHub Bot logged work on HDFS-16007: - Author: ASF GitHub Bot Created on: 07/May/21 12:40 Start Date: 07/May/21 12:40 Worklog Time Spent: 10m Work Description: hadoop-yetus commented on pull request #2982: URL: https://github.com/apache/hadoop/pull/2982#issuecomment-834341268 :broken_heart: **-1 overall** | Vote | Subsystem | Runtime | Logfile | Comment | |::|--:|:|::|:---:| | +0 :ok: | reexec | 0m 33s | | Docker mode activated. | _ Prechecks _ | | +1 :green_heart: | dupname | 0m 0s | | No case conflicting files found. | | +0 :ok: | codespell | 0m 1s | | codespell was not available. | | +1 :green_heart: | @author | 0m 0s | | The patch does not contain any @author tags. | | +1 :green_heart: | test4tests | 0m 0s | | The patch appears to include 1 new or modified test files. | _ trunk Compile Tests _ | | +1 :green_heart: | mvninstall | 34m 39s | | trunk passed | | +1 :green_heart: | compile | 1m 23s | | trunk passed with JDK Ubuntu-11.0.10+9-Ubuntu-0ubuntu1.20.04 | | +1 :green_heart: | compile | 1m 16s | | trunk passed with JDK Private Build-1.8.0_282-8u282-b08-0ubuntu1~20.04-b08 | | +1 :green_heart: | checkstyle | 1m 1s | | trunk passed | | +1 :green_heart: | mvnsite | 1m 22s | | trunk passed | | +1 :green_heart: | javadoc | 0m 57s | | trunk passed with JDK Ubuntu-11.0.10+9-Ubuntu-0ubuntu1.20.04 | | +1 :green_heart: | javadoc | 1m 24s | | trunk passed with JDK Private Build-1.8.0_282-8u282-b08-0ubuntu1~20.04-b08 | | +1 :green_heart: | spotbugs | 3m 6s | | trunk passed | | +1 :green_heart: | shadedclient | 16m 13s | | branch has no errors when building and testing our client artifacts. 
| _ Patch Compile Tests _ | | +1 :green_heart: | mvninstall | 1m 12s | | the patch passed | | +1 :green_heart: | compile | 1m 16s | | the patch passed with JDK Ubuntu-11.0.10+9-Ubuntu-0ubuntu1.20.04 | | +1 :green_heart: | javac | 1m 16s | | the patch passed | | +1 :green_heart: | compile | 1m 10s | | the patch passed with JDK Private Build-1.8.0_282-8u282-b08-0ubuntu1~20.04-b08 | | +1 :green_heart: | javac | 1m 10s | | the patch passed | | +1 :green_heart: | blanks | 0m 0s | | The patch has no blanks issues. | | +1 :green_heart: | checkstyle | 0m 52s | | hadoop-hdfs-project/hadoop-hdfs: The patch generated 0 new + 40 unchanged - 1 fixed = 40 total (was 41) | | +1 :green_heart: | mvnsite | 1m 14s | | the patch passed | | +1 :green_heart: | javadoc | 0m 46s | | the patch passed with JDK Ubuntu-11.0.10+9-Ubuntu-0ubuntu1.20.04 | | +1 :green_heart: | javadoc | 1m 16s | | the patch passed with JDK Private Build-1.8.0_282-8u282-b08-0ubuntu1~20.04-b08 | | +1 :green_heart: | spotbugs | 3m 8s | | the patch passed | | +1 :green_heart: | shadedclient | 16m 12s | | patch has no errors when building and testing our client artifacts. | _ Other Tests _ | | -1 :x: | unit | 230m 55s | [/patch-unit-hadoop-hdfs-project_hadoop-hdfs.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-2982/5/artifact/out/patch-unit-hadoop-hdfs-project_hadoop-hdfs.txt) | hadoop-hdfs in the patch passed. | | +1 :green_heart: | asflicense | 0m 46s | | The patch does not generate ASF License warnings. 
| | | | 318m 35s | | | | Reason | Tests | |---:|:--| | Failed junit tests | hadoop.hdfs.server.blockmanagement.TestUnderReplicatedBlocks | | | hadoop.hdfs.qjournal.server.TestJournalNodeRespectsBindHostKeys | | | hadoop.hdfs.server.namenode.snapshot.TestNestedSnapshots | | | hadoop.hdfs.server.datanode.TestDirectoryScanner | | | hadoop.hdfs.tools.offlineEditsViewer.TestOfflineEditsViewer | | Subsystem | Report/Notes | |--:|:-| | Docker | ClientAPI=1.41 ServerAPI=1.41 base: https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-2982/5/artifact/out/Dockerfile | | GITHUB PR | https://github.com/apache/hadoop/pull/2982 | | Optional Tests | dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient spotbugs checkstyle codespell | | uname | Linux 32b7e03c7fee 4.15.0-60-generic #67-Ubuntu SMP Thu Aug 22 16:55:30 UTC 2019 x86_64 x86_64 x86_64 GNU/Linux | | Build tool | maven | | Personality | dev-support/bin/hadoop.sh | | git revision | trunk / 5014857973bfcbfdaf7d98978411f417e1be8d87 | | Default Java | Private Build-1.8.0_282-8u282-b08-0ubuntu1~20.04-b08 | | Multi-JDK versions
[jira] [Work logged] (HDFS-16012) Improve DataXceiver#copyBlock() log printing
[ https://issues.apache.org/jira/browse/HDFS-16012?focusedWorklogId=593266=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-593266 ] ASF GitHub Bot logged work on HDFS-16012: - Author: ASF GitHub Bot Created on: 07/May/21 12:40 Start Date: 07/May/21 12:40 Worklog Time Spent: 10m Work Description: jianghuazhu opened a new pull request #2988: URL: https://github.com/apache/hadoop/pull/2988 ## NOTICE Please create an issue in ASF JIRA before opening a pull request, and you need to set the title of the pull request which starts with the corresponding JIRA issue number. (e.g. HADOOP-X. Fix a typo in YYY.) For more details, please see https://cwiki.apache.org/confluence/display/HADOOP/How+To+Contribute -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 593266) Remaining Estimate: 0h Time Spent: 10m > Improve DataXceiver#copyBlock() log printing > > > Key: HDFS-16012 > URL: https://issues.apache.org/jira/browse/HDFS-16012 > Project: Hadoop HDFS > Issue Type: Improvement >Reporter: JiangHua Zhu >Assignee: JiangHua Zhu >Priority: Major > Attachments: image-2021-05-07-20-18-45-523.png > > Time Spent: 10m > Remaining Estimate: 0h > > When executing DataXceiver#copyBlock(), if an exception is encountered, the > following will be printed: > !image-2021-05-07-20-18-45-523.png! > The information printed here is not conducive to assist in troubleshooting. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Work logged] (HDFS-16011) HDFS: Support viewfs nested mount
[ https://issues.apache.org/jira/browse/HDFS-16011?focusedWorklogId=593258=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-593258 ] ASF GitHub Bot logged work on HDFS-16011: - Author: ASF GitHub Bot Created on: 07/May/21 12:28 Start Date: 07/May/21 12:28 Worklog Time Spent: 10m Work Description: zhuxiangyi opened a new pull request #2987: URL: https://github.com/apache/hadoop/pull/2987 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 593258) Remaining Estimate: 0h Time Spent: 10m > HDFS: Support viewfs nested mount > - > > Key: HDFS-16011 > URL: https://issues.apache.org/jira/browse/HDFS-16011 > Project: Hadoop HDFS > Issue Type: Improvement > Components: hdfs >Affects Versions: 3.2.2, 3.3.1 >Reporter: zhu >Assignee: zhu >Priority: Major > Time Spent: 10m > Remaining Estimate: 0h > > The current viewfs does not support nested mount points. > *E.g:* > 1./home/ => /home/ > 2./home/work => /home/work > If mount point 1 is loaded, mount point 2 cannot be added, and the following > exception will be thrown when loading 2. > {code:java} > throw new FileAlreadyExistsException("Path " + nextInode.fullPath + > " already exists as link"); > {code} -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Updated] (HDFS-16011) HDFS: Support viewfs nested mount
[ https://issues.apache.org/jira/browse/HDFS-16011?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated HDFS-16011: -- Labels: pull-request-available (was: ) > HDFS: Support viewfs nested mount > - > > Key: HDFS-16011 > URL: https://issues.apache.org/jira/browse/HDFS-16011 > Project: Hadoop HDFS > Issue Type: Improvement > Components: hdfs >Affects Versions: 3.2.2, 3.3.1 >Reporter: zhu >Assignee: zhu >Priority: Major > Labels: pull-request-available > Time Spent: 10m > Remaining Estimate: 0h > > The current viewfs does not support nested mount points. > *E.g:* > 1./home/ => /home/ > 2./home/work => /home/work > If mount point 1 is loaded, mount point 2 cannot be added, and the following > exception will be thrown when loading 2. > {code:java} > throw new FileAlreadyExistsException("Path " + nextInode.fullPath + > " already exists as link"); > {code} -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-16006) TestRouterFederationRename is flaky
[ https://issues.apache.org/jira/browse/HDFS-16006?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17340776#comment-17340776 ] Xiaoqiao He commented on HDFS-16006: [~LiJinglun] do you mind taking another look? Thanks. > TestRouterFederationRename is flaky > --- > > Key: HDFS-16006 > URL: https://issues.apache.org/jira/browse/HDFS-16006 > Project: Hadoop HDFS > Issue Type: Sub-task > Components: rbf >Reporter: Akira Ajisaka >Priority: Major > Attachments: patch-unit-hadoop-hdfs-project_hadoop-hdfs-rbf.txt > > > {quote} > [ERROR] Errors: > [ERROR] > TestRouterFederationRename.testCounter:440->Object.wait:502->Object.wait:-2 ? > TestTimedOut > [ERROR] TestRouterFederationRename.testSetup:145 ? Remote The directory > /src cannot be... > [ERROR] TestRouterFederationRename.testSetup:145 ? Remote The directory > /src cannot be... > {quote} > https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-2970/2/artifact/out/patch-unit-hadoop-hdfs-project_hadoop-hdfs-rbf.txt -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Assigned] (HDFS-16012) Improve DataXceiver#copyBlock() log printing
[ https://issues.apache.org/jira/browse/HDFS-16012?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] JiangHua Zhu reassigned HDFS-16012: --- Assignee: JiangHua Zhu > Improve DataXceiver#copyBlock() log printing > > > Key: HDFS-16012 > URL: https://issues.apache.org/jira/browse/HDFS-16012 > Project: Hadoop HDFS > Issue Type: Improvement >Reporter: JiangHua Zhu >Assignee: JiangHua Zhu >Priority: Major > Attachments: image-2021-05-07-20-18-45-523.png > > > When executing DataXceiver#copyBlock(), if an exception is encountered, the > following will be printed: > !image-2021-05-07-20-18-45-523.png! > The information printed here is not conducive to assist in troubleshooting. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Created] (HDFS-16012) Improve DataXceiver#copyBlock() log printing
JiangHua Zhu created HDFS-16012: --- Summary: Improve DataXceiver#copyBlock() log printing Key: HDFS-16012 URL: https://issues.apache.org/jira/browse/HDFS-16012 Project: Hadoop HDFS Issue Type: Improvement Reporter: JiangHua Zhu Attachments: image-2021-05-07-20-18-45-523.png When executing DataXceiver#copyBlock(), if an exception is encountered, the following will be printed: !image-2021-05-07-20-18-45-523.png! The information printed here is not conducive to assist in troubleshooting. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-15757) RBF: Improving Router Connection Management
[ https://issues.apache.org/jira/browse/HDFS-15757?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17340768#comment-17340768 ] Xiaoqiao He commented on HDFS-15757: Thanks [~fengnanli] for your pings, and sorry for the late response; I was on vacation last week. In my production env, I have created another proxyCache to replace the ConnectionManager. The result is as expected: the TP99 cost of each RPC dispatch is less than 10ms, which is far less than when using the ConnectionManager. I have to state that this improvement is based on our internal branch, which differs from the community branch in areas such as async RPC dispatch, and the request pattern could be very different from the common case (the ugi count is very large and the hit rate is very low when using the ConnectionManager). For this improvement, I think it is ready to enter trunk, based on our staging-env benchmark when backporting it into my internal branch. After tracing the history, I do not see the original motivation for introducing the ConnectionManager for the Router, so I wonder: is the ConnectionManager effective, and is it necessary for the Router? Please correct me if I missed something. Thanks. cc [~elgoiri],[~ayushtkn] > RBF: Improving Router Connection Management > --- > > Key: HDFS-15757 > URL: https://issues.apache.org/jira/browse/HDFS-15757 > Project: Hadoop HDFS > Issue Type: Improvement > Components: rbf >Reporter: Fengnan Li >Assignee: Fengnan Li >Priority: Major > Labels: pull-request-available > Attachments: RBF_ Improving Router Connection Management_v2.pdf, RBF_ > Improving Router Connection Management_v3.pdf, RBF_ Router Connection > Management.pdf > > Time Spent: 3h 10m > Remaining Estimate: 0h > > We have seen high number of connections from Router to namenodes, leaving > namenodes unstable. > This ticket is trying to reduce connections through some changes. Please take > a look at the design and leave comments. > Thanks! 
[jira] [Updated] (HDFS-16011) HDFS: Support viewfs nested mount
[ https://issues.apache.org/jira/browse/HDFS-16011?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] zhu updated HDFS-16011: --- Description: The current viewfs does not support nested mount points. *E.g:* 1./home/ => /home/ 2./home/work => /home/work If mount point 1 is loaded, mount point 2 cannot be added, and the following exception will be thrown when loading 2. {code:java} throw new FileAlreadyExistsException("Path " + nextInode.fullPath + " already exists as link"); {code} was: The current viewfs does not support nested mount points. The current viewfs does not support nested mount points. *E.g:* 1./home/ => /home/ 2./home/work => /home/work If mount point 1 is loaded, mount point 2 cannot be added, and the following exception will be thrown when loading 2 {code:java} throw new FileAlreadyExistsException("Path " + nextInode.fullPath + " already exists as link"); {code} > HDFS: Support viewfs nested mount > - > > Key: HDFS-16011 > URL: https://issues.apache.org/jira/browse/HDFS-16011 > Project: Hadoop HDFS > Issue Type: Improvement > Components: hdfs >Affects Versions: 3.2.2, 3.3.1 >Reporter: zhu >Assignee: zhu >Priority: Major > > The current viewfs does not support nested mount points. > *E.g:* > 1./home/ => /home/ > 2./home/work => /home/work > If mount point 1 is loaded, mount point 2 cannot be added, and the following > exception will be thrown when loading 2. > {code:java} > throw new FileAlreadyExistsException("Path " + nextInode.fullPath + > " already exists as link"); > {code}
[jira] [Created] (HDFS-16011) HDFS: Support viewfs nested mount
zhu created HDFS-16011: -- Summary: HDFS: Support viewfs nested mount Key: HDFS-16011 URL: https://issues.apache.org/jira/browse/HDFS-16011 Project: Hadoop HDFS Issue Type: Improvement Components: hdfs Affects Versions: 3.2.2, 3.3.1 Reporter: zhu Assignee: zhu The current viewfs does not support nested mount points. *E.g:* 1./home/ => /home/ 2./home/work => /home/work If mount point 1 is loaded, mount point 2 cannot be added, and the following exception will be thrown when loading 2: {code:java} throw new FileAlreadyExistsException("Path " + nextInode.fullPath + " already exists as link"); {code}
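A minimal sketch of why loading mount point 2 fails after mount point 1: walking the new link's path components hits an inode that is already a terminal link. This is a simplified stand-in for the ViewFs mount-tree logic, not the actual InodeTree implementation:

```java
import java.nio.file.FileAlreadyExistsException;
import java.util.HashMap;
import java.util.Map;

// Simplified mount tree: adding /home/work after /home fails because the
// /home inode is already a terminal link, so no children may hang below it.
public class MountTreeDemo {
    static class INode {
        final Map<String, INode> children = new HashMap<>();
        boolean isLink; // true once a mount target is attached at this node
    }

    private final INode root = new INode();

    void addLink(String path) throws FileAlreadyExistsException {
        INode cur = root;
        for (String comp : path.split("/")) {
            if (comp.isEmpty()) continue;
            if (cur.isLink) { // an ancestor already terminates as a link
                throw new FileAlreadyExistsException(
                        "Path " + path + " already exists as link");
            }
            cur = cur.children.computeIfAbsent(comp, k -> new INode());
        }
        cur.isLink = true;
    }

    public static void main(String[] args) throws Exception {
        MountTreeDemo tree = new MountTreeDemo();
        tree.addLink("/home");      // mount point 1 loads fine
        tree.addLink("/home/work"); // mount point 2 throws, as in the report
    }
}
```

Supporting nested mounts would mean allowing a link inode to also carry children and resolving the longest matching prefix at lookup time.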
[jira] [Issue Comment Deleted] (HDFS-14986) ReplicaCachingGetSpaceUsed throws ConcurrentModificationException
[ https://issues.apache.org/jira/browse/HDFS-14986?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] lei w updated HDFS-14986: - Comment: was deleted (was: hi [~Aiphag0] ,Will hadoop 3.1.2 also appear deadlock as you said above ? https://issues.apache.org/jira/browse/HDFS-14986?focusedCommentId=16973314=com.atlassian.jira.plugin.system.issuetabpanels%3Acomment-tabpanel#comment-16973314) > ReplicaCachingGetSpaceUsed throws ConcurrentModificationException > -- > > Key: HDFS-14986 > URL: https://issues.apache.org/jira/browse/HDFS-14986 > Project: Hadoop HDFS > Issue Type: Bug > Components: datanode, performance >Affects Versions: 2.10.0 >Reporter: Ryan Wu >Assignee: Aiphago >Priority: Major > Fix For: 3.3.0, 3.1.4, 2.10.1 > > Attachments: HDFS-14986.001.patch, HDFS-14986.002.patch, > HDFS-14986.003.patch, HDFS-14986.004.patch, HDFS-14986.005.patch, > HDFS-14986.006.patch > > > Running DU across lots of disks is very expensive . We applied the patch > HDFS-14313 to get used space from ReplicaInfo in memory.However, new du > threads throw the exception > {code:java} > // 2019-11-08 18:07:13,858 ERROR > [refreshUsed-/home/vipshop/hard_disk/7/dfs/dn/current/BP-1203969992--1450855658517] > > org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.ReplicaCachingGetSpaceUsed: > ReplicaCachingGetSpaceUsed refresh error > java.util.ConcurrentModificationException: Tree has been modified outside of > iterator > at > org.apache.hadoop.hdfs.util.FoldedTreeSet$TreeSetIterator.checkForModification(FoldedTreeSet.java:311) > > at > org.apache.hadoop.hdfs.util.FoldedTreeSet$TreeSetIterator.hasNext(FoldedTreeSet.java:256) > > at java.util.AbstractCollection.addAll(AbstractCollection.java:343) > at java.util.HashSet.(HashSet.java:120) > at > org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl.deepCopyReplica(FsDatasetImpl.java:1052) > > at > 
org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.ReplicaCachingGetSpaceUsed.refresh(ReplicaCachingGetSpaceUsed.java:73) > > at > org.apache.hadoop.fs.CachingGetSpaceUsed$RefreshThread.run(CachingGetSpaceUsed.java:178) > > at java.lang.Thread.run(Thread.java:748) > {code}
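The stack trace above arises because deepCopyReplica copies the replica set while writers mutate it, and FoldedTreeSet's iterator is fail-fast, like java.util.TreeSet's. A minimal sketch of the failure mode and of the snapshot pattern that avoids it (names illustrative; the actual fix takes the copy while holding the dataset lock):

```java
import java.util.ConcurrentModificationException;
import java.util.Set;
import java.util.TreeSet;

public class SnapshotDemo {
    // Unsafe pattern: mutating the replica set while a fail-fast iterator
    // walks it (a single-threaded stand-in for a concurrent writer).
    static boolean iterateWhileMutating(Set<String> replicas) {
        try {
            for (String r : replicas) {
                replicas.add(r + "_new"); // structural change invalidates the iterator
            }
            return false;
        } catch (ConcurrentModificationException e) {
            return true; // this is the refresh() failure in the stack trace
        }
    }

    // Safe pattern: take the copy atomically (under the dataset lock in the
    // real DataNode), then iterate the private snapshot.
    static boolean iterateSnapshot(Set<String> replicas) {
        Set<String> snapshot;
        synchronized (replicas) {
            snapshot = new TreeSet<>(replicas);
        }
        try {
            for (String r : snapshot) {
                replicas.add(r + "_new"); // mutating the original is now harmless
            }
            return false;
        } catch (ConcurrentModificationException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        Set<String> a = new TreeSet<>(Set.of("blk_1", "blk_2"));
        Set<String> b = new TreeSet<>(Set.of("blk_1", "blk_2"));
        System.out.println("unsafe iteration throws CME: " + iterateWhileMutating(a));
        System.out.println("snapshot iteration throws CME: " + iterateSnapshot(b));
    }
}
```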
[jira] [Issue Comment Deleted] (HDFS-14986) ReplicaCachingGetSpaceUsed throws ConcurrentModificationException
[ https://issues.apache.org/jira/browse/HDFS-14986?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] lei w updated HDFS-14986: - Comment: was deleted (was: hi [~Aiphag0] , Will hadoop 3.1.2 also appear deadlock as you said above ?[comment-16973314](https://issues.apache.org/jira/browse/HDFS-14986?focusedCommentId=16973314=com.atlassian.jira.plugin.system.issuetabpanels%3Acomment-tabpanel#comment-16973314)) > ReplicaCachingGetSpaceUsed throws ConcurrentModificationException > -- > > Key: HDFS-14986 > URL: https://issues.apache.org/jira/browse/HDFS-14986 > Project: Hadoop HDFS > Issue Type: Bug > Components: datanode, performance >Affects Versions: 2.10.0 >Reporter: Ryan Wu >Assignee: Aiphago >Priority: Major > Fix For: 3.3.0, 3.1.4, 2.10.1 > > Attachments: HDFS-14986.001.patch, HDFS-14986.002.patch, > HDFS-14986.003.patch, HDFS-14986.004.patch, HDFS-14986.005.patch, > HDFS-14986.006.patch > > > Running DU across lots of disks is very expensive . We applied the patch > HDFS-14313 to get used space from ReplicaInfo in memory.However, new du > threads throw the exception > {code:java} > // 2019-11-08 18:07:13,858 ERROR > [refreshUsed-/home/vipshop/hard_disk/7/dfs/dn/current/BP-1203969992--1450855658517] > > org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.ReplicaCachingGetSpaceUsed: > ReplicaCachingGetSpaceUsed refresh error > java.util.ConcurrentModificationException: Tree has been modified outside of > iterator > at > org.apache.hadoop.hdfs.util.FoldedTreeSet$TreeSetIterator.checkForModification(FoldedTreeSet.java:311) > > at > org.apache.hadoop.hdfs.util.FoldedTreeSet$TreeSetIterator.hasNext(FoldedTreeSet.java:256) > > at java.util.AbstractCollection.addAll(AbstractCollection.java:343) > at java.util.HashSet.(HashSet.java:120) > at > org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl.deepCopyReplica(FsDatasetImpl.java:1052) > > at > 
org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.ReplicaCachingGetSpaceUsed.refresh(ReplicaCachingGetSpaceUsed.java:73) > > at > org.apache.hadoop.fs.CachingGetSpaceUsed$RefreshThread.run(CachingGetSpaceUsed.java:178) > > at java.lang.Thread.run(Thread.java:748) > {code}
[jira] [Commented] (HDFS-14986) ReplicaCachingGetSpaceUsed throws ConcurrentModificationException
[ https://issues.apache.org/jira/browse/HDFS-14986?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17340751#comment-17340751 ] lei w commented on HDFS-14986: -- hi [~Aiphag0] ,Will hadoop 3.1.2 also appear deadlock as you said above ? https://issues.apache.org/jira/browse/HDFS-14986?focusedCommentId=16973314=com.atlassian.jira.plugin.system.issuetabpanels%3Acomment-tabpanel#comment-16973314 > ReplicaCachingGetSpaceUsed throws ConcurrentModificationException > -- > > Key: HDFS-14986 > URL: https://issues.apache.org/jira/browse/HDFS-14986 > Project: Hadoop HDFS > Issue Type: Bug > Components: datanode, performance >Affects Versions: 2.10.0 >Reporter: Ryan Wu >Assignee: Aiphago >Priority: Major > Fix For: 3.3.0, 3.1.4, 2.10.1 > > Attachments: HDFS-14986.001.patch, HDFS-14986.002.patch, > HDFS-14986.003.patch, HDFS-14986.004.patch, HDFS-14986.005.patch, > HDFS-14986.006.patch > > > Running DU across lots of disks is very expensive . We applied the patch > HDFS-14313 to get used space from ReplicaInfo in memory.However, new du > threads throw the exception > {code:java} > // 2019-11-08 18:07:13,858 ERROR > [refreshUsed-/home/vipshop/hard_disk/7/dfs/dn/current/BP-1203969992--1450855658517] > > org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.ReplicaCachingGetSpaceUsed: > ReplicaCachingGetSpaceUsed refresh error > java.util.ConcurrentModificationException: Tree has been modified outside of > iterator > at > org.apache.hadoop.hdfs.util.FoldedTreeSet$TreeSetIterator.checkForModification(FoldedTreeSet.java:311) > > at > org.apache.hadoop.hdfs.util.FoldedTreeSet$TreeSetIterator.hasNext(FoldedTreeSet.java:256) > > at java.util.AbstractCollection.addAll(AbstractCollection.java:343) > at java.util.HashSet.(HashSet.java:120) > at > org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl.deepCopyReplica(FsDatasetImpl.java:1052) > > at > 
org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.ReplicaCachingGetSpaceUsed.refresh(ReplicaCachingGetSpaceUsed.java:73) > > at > org.apache.hadoop.fs.CachingGetSpaceUsed$RefreshThread.run(CachingGetSpaceUsed.java:178) > > at java.lang.Thread.run(Thread.java:748) > {code}
[jira] [Commented] (HDFS-14986) ReplicaCachingGetSpaceUsed throws ConcurrentModificationException
[ https://issues.apache.org/jira/browse/HDFS-14986?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17340750#comment-17340750 ] lei w commented on HDFS-14986: -- hi [~Aiphag0] , Will hadoop 3.1.2 also appear deadlock as you said above ?[comment-16973314](https://issues.apache.org/jira/browse/HDFS-14986?focusedCommentId=16973314=com.atlassian.jira.plugin.system.issuetabpanels%3Acomment-tabpanel#comment-16973314) > ReplicaCachingGetSpaceUsed throws ConcurrentModificationException > -- > > Key: HDFS-14986 > URL: https://issues.apache.org/jira/browse/HDFS-14986 > Project: Hadoop HDFS > Issue Type: Bug > Components: datanode, performance >Affects Versions: 2.10.0 >Reporter: Ryan Wu >Assignee: Aiphago >Priority: Major > Fix For: 3.3.0, 3.1.4, 2.10.1 > > Attachments: HDFS-14986.001.patch, HDFS-14986.002.patch, > HDFS-14986.003.patch, HDFS-14986.004.patch, HDFS-14986.005.patch, > HDFS-14986.006.patch > > > Running DU across lots of disks is very expensive . We applied the patch > HDFS-14313 to get used space from ReplicaInfo in memory.However, new du > threads throw the exception > {code:java} > // 2019-11-08 18:07:13,858 ERROR > [refreshUsed-/home/vipshop/hard_disk/7/dfs/dn/current/BP-1203969992--1450855658517] > > org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.ReplicaCachingGetSpaceUsed: > ReplicaCachingGetSpaceUsed refresh error > java.util.ConcurrentModificationException: Tree has been modified outside of > iterator > at > org.apache.hadoop.hdfs.util.FoldedTreeSet$TreeSetIterator.checkForModification(FoldedTreeSet.java:311) > > at > org.apache.hadoop.hdfs.util.FoldedTreeSet$TreeSetIterator.hasNext(FoldedTreeSet.java:256) > > at java.util.AbstractCollection.addAll(AbstractCollection.java:343) > at java.util.HashSet.(HashSet.java:120) > at > org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl.deepCopyReplica(FsDatasetImpl.java:1052) > > at > 
org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.ReplicaCachingGetSpaceUsed.refresh(ReplicaCachingGetSpaceUsed.java:73) > > at > org.apache.hadoop.fs.CachingGetSpaceUsed$RefreshThread.run(CachingGetSpaceUsed.java:178) > > at java.lang.Thread.run(Thread.java:748) > {code}
[jira] [Commented] (HDFS-14986) ReplicaCachingGetSpaceUsed throws ConcurrentModificationException
[ https://issues.apache.org/jira/browse/HDFS-14986?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17340749#comment-17340749 ] lei w commented on HDFS-14986: -- hi [~Aiphag0] , hadoop 3.1.2 also appear deadlock as you said above [comment-16973314](https://issues.apache.org/jira/browse/HDFS-14986?focusedCommentId=16973314=com.atlassian.jira.plugin.system.issuetabpanels%3Acomment-tabpanel#comment-16973314) > ReplicaCachingGetSpaceUsed throws ConcurrentModificationException > -- > > Key: HDFS-14986 > URL: https://issues.apache.org/jira/browse/HDFS-14986 > Project: Hadoop HDFS > Issue Type: Bug > Components: datanode, performance >Affects Versions: 2.10.0 >Reporter: Ryan Wu >Assignee: Aiphago >Priority: Major > Fix For: 3.3.0, 3.1.4, 2.10.1 > > Attachments: HDFS-14986.001.patch, HDFS-14986.002.patch, > HDFS-14986.003.patch, HDFS-14986.004.patch, HDFS-14986.005.patch, > HDFS-14986.006.patch > > > Running DU across lots of disks is very expensive . We applied the patch > HDFS-14313 to get used space from ReplicaInfo in memory.However, new du > threads throw the exception > {code:java} > // 2019-11-08 18:07:13,858 ERROR > [refreshUsed-/home/vipshop/hard_disk/7/dfs/dn/current/BP-1203969992--1450855658517] > > org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.ReplicaCachingGetSpaceUsed: > ReplicaCachingGetSpaceUsed refresh error > java.util.ConcurrentModificationException: Tree has been modified outside of > iterator > at > org.apache.hadoop.hdfs.util.FoldedTreeSet$TreeSetIterator.checkForModification(FoldedTreeSet.java:311) > > at > org.apache.hadoop.hdfs.util.FoldedTreeSet$TreeSetIterator.hasNext(FoldedTreeSet.java:256) > > at java.util.AbstractCollection.addAll(AbstractCollection.java:343) > at java.util.HashSet.(HashSet.java:120) > at > org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl.deepCopyReplica(FsDatasetImpl.java:1052) > > at > 
org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.ReplicaCachingGetSpaceUsed.refresh(ReplicaCachingGetSpaceUsed.java:73) > > at > org.apache.hadoop.fs.CachingGetSpaceUsed$RefreshThread.run(CachingGetSpaceUsed.java:178) > > at java.lang.Thread.run(Thread.java:748) > {code}
[jira] [Issue Comment Deleted] (HDFS-14986) ReplicaCachingGetSpaceUsed throws ConcurrentModificationException
[ https://issues.apache.org/jira/browse/HDFS-14986?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] lei w updated HDFS-14986: - Comment: was deleted (was: hi [~Aiphag0] , hadoop 3.1.2 also appear deadlock as you said above [comment-16973314](https://issues.apache.org/jira/browse/HDFS-14986?focusedCommentId=16973314=com.atlassian.jira.plugin.system.issuetabpanels%3Acomment-tabpanel#comment-16973314)) > ReplicaCachingGetSpaceUsed throws ConcurrentModificationException > -- > > Key: HDFS-14986 > URL: https://issues.apache.org/jira/browse/HDFS-14986 > Project: Hadoop HDFS > Issue Type: Bug > Components: datanode, performance >Affects Versions: 2.10.0 >Reporter: Ryan Wu >Assignee: Aiphago >Priority: Major > Fix For: 3.3.0, 3.1.4, 2.10.1 > > Attachments: HDFS-14986.001.patch, HDFS-14986.002.patch, > HDFS-14986.003.patch, HDFS-14986.004.patch, HDFS-14986.005.patch, > HDFS-14986.006.patch > > > Running DU across lots of disks is very expensive . We applied the patch > HDFS-14313 to get used space from ReplicaInfo in memory.However, new du > threads throw the exception > {code:java} > // 2019-11-08 18:07:13,858 ERROR > [refreshUsed-/home/vipshop/hard_disk/7/dfs/dn/current/BP-1203969992--1450855658517] > > org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.ReplicaCachingGetSpaceUsed: > ReplicaCachingGetSpaceUsed refresh error > java.util.ConcurrentModificationException: Tree has been modified outside of > iterator > at > org.apache.hadoop.hdfs.util.FoldedTreeSet$TreeSetIterator.checkForModification(FoldedTreeSet.java:311) > > at > org.apache.hadoop.hdfs.util.FoldedTreeSet$TreeSetIterator.hasNext(FoldedTreeSet.java:256) > > at java.util.AbstractCollection.addAll(AbstractCollection.java:343) > at java.util.HashSet.(HashSet.java:120) > at > org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl.deepCopyReplica(FsDatasetImpl.java:1052) > > at > 
org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.ReplicaCachingGetSpaceUsed.refresh(ReplicaCachingGetSpaceUsed.java:73) > > at > org.apache.hadoop.fs.CachingGetSpaceUsed$RefreshThread.run(CachingGetSpaceUsed.java:178) > > at java.lang.Thread.run(Thread.java:748) > {code}
[jira] [Commented] (HDFS-16009) HDFS tiered storage support
[ https://issues.apache.org/jira/browse/HDFS-16009?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17340742#comment-17340742 ] JiangHua Zhu commented on HDFS-16009: - [~LeonG], hope we can communicate often. > HDFS tiered storage support > --- > > Key: HDFS-16009 > URL: https://issues.apache.org/jira/browse/HDFS-16009 > Project: Hadoop HDFS > Issue Type: Improvement >Reporter: JiangHua Zhu >Priority: Major > > HDFS currently stores data blocks on different DataNodes, chosen randomly > within the scope of the network topology (local rack/different racks/node group). > When the NameNode allocates DataNodes to a block, each DataNode is selected > with equal probability. > We should add a tiered strategy. For example, DataNode1, DataNode2, and > DataNode3 could form one tier, and DataNode4 and DataNode5 another. Because the > hardware may differ between DataNodes, tiered storage would let DataNodes with > weaker hardware be filled only up to 50%, 40%, or even lower. This is conducive > to cluster maintenance and stability.
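An illustrative sketch of the proposal above: cap how full a DataNode in a weaker-hardware tier may get before it stops being a placement target. The tier names, cap values, and selection rule are assumptions made for illustration, not an actual NameNode placement policy:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

public class TieredPlacementDemo {
    static class DataNode {
        final String name;
        final String tier;       // hardware tier, e.g. "fast" or "slow" (hypothetical)
        final double usedRatio;  // fraction of capacity already used
        DataNode(String name, String tier, double usedRatio) {
            this.name = name; this.tier = tier; this.usedRatio = usedRatio;
        }
    }

    // Hypothetical caps: weaker hardware may only fill to 50%, per the ticket.
    static final Map<String, Double> TIER_CAP = Map.of("fast", 0.90, "slow", 0.50);

    // Keep only DataNodes still below their tier's usage cap.
    static List<DataNode> eligibleTargets(List<DataNode> nodes) {
        List<DataNode> out = new ArrayList<>();
        for (DataNode dn : nodes) {
            if (dn.usedRatio < TIER_CAP.getOrDefault(dn.tier, 0.90)) {
                out.add(dn);
            }
        }
        return out;
    }

    public static void main(String[] args) {
        List<DataNode> nodes = List.of(
                new DataNode("dn1", "fast", 0.80),
                new DataNode("dn2", "slow", 0.60),   // over its 50% cap, excluded
                new DataNode("dn3", "slow", 0.30));
        for (DataNode dn : eligibleTargets(nodes)) {
            System.out.println(dn.name + " is an eligible target");
        }
    }
}
```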
[jira] [Issue Comment Deleted] (HDFS-14986) ReplicaCachingGetSpaceUsed throws ConcurrentModificationException
[ https://issues.apache.org/jira/browse/HDFS-14986?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] lei w updated HDFS-14986: - Comment: was deleted (was: hi Aiphag0 , deadlock will appear in hadoop3.x ?) > ReplicaCachingGetSpaceUsed throws ConcurrentModificationException > -- > > Key: HDFS-14986 > URL: https://issues.apache.org/jira/browse/HDFS-14986 > Project: Hadoop HDFS > Issue Type: Bug > Components: datanode, performance >Affects Versions: 2.10.0 >Reporter: Ryan Wu >Assignee: Aiphago >Priority: Major > Fix For: 3.3.0, 3.1.4, 2.10.1 > > Attachments: HDFS-14986.001.patch, HDFS-14986.002.patch, > HDFS-14986.003.patch, HDFS-14986.004.patch, HDFS-14986.005.patch, > HDFS-14986.006.patch > > > Running DU across lots of disks is very expensive . We applied the patch > HDFS-14313 to get used space from ReplicaInfo in memory.However, new du > threads throw the exception > {code:java} > // 2019-11-08 18:07:13,858 ERROR > [refreshUsed-/home/vipshop/hard_disk/7/dfs/dn/current/BP-1203969992--1450855658517] > > org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.ReplicaCachingGetSpaceUsed: > ReplicaCachingGetSpaceUsed refresh error > java.util.ConcurrentModificationException: Tree has been modified outside of > iterator > at > org.apache.hadoop.hdfs.util.FoldedTreeSet$TreeSetIterator.checkForModification(FoldedTreeSet.java:311) > > at > org.apache.hadoop.hdfs.util.FoldedTreeSet$TreeSetIterator.hasNext(FoldedTreeSet.java:256) > > at java.util.AbstractCollection.addAll(AbstractCollection.java:343) > at java.util.HashSet.(HashSet.java:120) > at > org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl.deepCopyReplica(FsDatasetImpl.java:1052) > > at > org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.ReplicaCachingGetSpaceUsed.refresh(ReplicaCachingGetSpaceUsed.java:73) > > at > org.apache.hadoop.fs.CachingGetSpaceUsed$RefreshThread.run(CachingGetSpaceUsed.java:178) > > at java.lang.Thread.run(Thread.java:748) > {code}
[jira] [Commented] (HDFS-14986) ReplicaCachingGetSpaceUsed throws ConcurrentModificationException
[ https://issues.apache.org/jira/browse/HDFS-14986?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17340727#comment-17340727 ] lei w commented on HDFS-14986: -- hi Aiphag0 , deadlock will appear in hadoop3.x ? > ReplicaCachingGetSpaceUsed throws ConcurrentModificationException > -- > > Key: HDFS-14986 > URL: https://issues.apache.org/jira/browse/HDFS-14986 > Project: Hadoop HDFS > Issue Type: Bug > Components: datanode, performance >Affects Versions: 2.10.0 >Reporter: Ryan Wu >Assignee: Aiphago >Priority: Major > Fix For: 3.3.0, 3.1.4, 2.10.1 > > Attachments: HDFS-14986.001.patch, HDFS-14986.002.patch, > HDFS-14986.003.patch, HDFS-14986.004.patch, HDFS-14986.005.patch, > HDFS-14986.006.patch > > > Running DU across lots of disks is very expensive . We applied the patch > HDFS-14313 to get used space from ReplicaInfo in memory.However, new du > threads throw the exception > {code:java} > // 2019-11-08 18:07:13,858 ERROR > [refreshUsed-/home/vipshop/hard_disk/7/dfs/dn/current/BP-1203969992--1450855658517] > > org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.ReplicaCachingGetSpaceUsed: > ReplicaCachingGetSpaceUsed refresh error > java.util.ConcurrentModificationException: Tree has been modified outside of > iterator > at > org.apache.hadoop.hdfs.util.FoldedTreeSet$TreeSetIterator.checkForModification(FoldedTreeSet.java:311) > > at > org.apache.hadoop.hdfs.util.FoldedTreeSet$TreeSetIterator.hasNext(FoldedTreeSet.java:256) > > at java.util.AbstractCollection.addAll(AbstractCollection.java:343) > at java.util.HashSet.(HashSet.java:120) > at > org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl.deepCopyReplica(FsDatasetImpl.java:1052) > > at > org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.ReplicaCachingGetSpaceUsed.refresh(ReplicaCachingGetSpaceUsed.java:73) > > at > org.apache.hadoop.fs.CachingGetSpaceUsed$RefreshThread.run(CachingGetSpaceUsed.java:178) > > at java.lang.Thread.run(Thread.java:748) > {code}
[jira] [Work logged] (HDFS-16008) RBF: Tool to initialize ViewFS Mapping to Router
[ https://issues.apache.org/jira/browse/HDFS-16008?focusedWorklogId=593213=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-593213 ] ASF GitHub Bot logged work on HDFS-16008: - Author: ASF GitHub Bot Created on: 07/May/21 09:57 Start Date: 07/May/21 09:57 Worklog Time Spent: 10m Work Description: hadoop-yetus commented on pull request #2981: URL: https://github.com/apache/hadoop/pull/2981#issuecomment-834228785 :confetti_ball: **+1 overall** | Vote | Subsystem | Runtime | Logfile | Comment | |::|--:|:|::|:---:| | +0 :ok: | reexec | 0m 50s | | Docker mode activated. | _ Prechecks _ | | +1 :green_heart: | dupname | 0m 0s | | No case conflicting files found. | | +0 :ok: | codespell | 0m 1s | | codespell was not available. | | +0 :ok: | markdownlint | 0m 1s | | markdownlint was not available. | | +1 :green_heart: | @author | 0m 0s | | The patch does not contain any @author tags. | | +1 :green_heart: | test4tests | 0m 0s | | The patch appears to include 1 new or modified test files. | _ trunk Compile Tests _ | | +1 :green_heart: | mvninstall | 33m 50s | | trunk passed | | +1 :green_heart: | compile | 0m 38s | | trunk passed with JDK Ubuntu-11.0.11+9-Ubuntu-0ubuntu2.20.04 | | +1 :green_heart: | compile | 0m 33s | | trunk passed with JDK Private Build-1.8.0_292-8u292-b10-0ubuntu1~20.04-b10 | | +1 :green_heart: | checkstyle | 0m 23s | | trunk passed | | +1 :green_heart: | mvnsite | 0m 38s | | trunk passed | | +1 :green_heart: | javadoc | 0m 37s | | trunk passed with JDK Ubuntu-11.0.11+9-Ubuntu-0ubuntu2.20.04 | | +1 :green_heart: | javadoc | 0m 51s | | trunk passed with JDK Private Build-1.8.0_292-8u292-b10-0ubuntu1~20.04-b10 | | +1 :green_heart: | spotbugs | 1m 14s | | trunk passed | | +1 :green_heart: | shadedclient | 17m 1s | | branch has no errors when building and testing our client artifacts. 
| _ Patch Compile Tests _ | | +1 :green_heart: | mvninstall | 0m 33s | | the patch passed | | +1 :green_heart: | compile | 0m 33s | | the patch passed with JDK Ubuntu-11.0.11+9-Ubuntu-0ubuntu2.20.04 | | +1 :green_heart: | javac | 0m 33s | | the patch passed | | +1 :green_heart: | compile | 0m 27s | | the patch passed with JDK Private Build-1.8.0_292-8u292-b10-0ubuntu1~20.04-b10 | | +1 :green_heart: | javac | 0m 27s | | the patch passed | | +1 :green_heart: | blanks | 0m 0s | | The patch has no blanks issues. | | -0 :warning: | checkstyle | 0m 16s | [/results-checkstyle-hadoop-hdfs-project_hadoop-hdfs-rbf.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-2981/4/artifact/out/results-checkstyle-hadoop-hdfs-project_hadoop-hdfs-rbf.txt) | hadoop-hdfs-project/hadoop-hdfs-rbf: The patch generated 1 new + 0 unchanged - 0 fixed = 1 total (was 0) | | +1 :green_heart: | mvnsite | 0m 31s | | the patch passed | | +1 :green_heart: | javadoc | 0m 30s | | the patch passed with JDK Ubuntu-11.0.11+9-Ubuntu-0ubuntu2.20.04 | | +1 :green_heart: | javadoc | 0m 45s | | the patch passed with JDK Private Build-1.8.0_292-8u292-b10-0ubuntu1~20.04-b10 | | +1 :green_heart: | spotbugs | 1m 17s | | the patch passed | | +1 :green_heart: | shadedclient | 16m 41s | | patch has no errors when building and testing our client artifacts. | _ Other Tests _ | | +1 :green_heart: | unit | 21m 57s | | hadoop-hdfs-rbf in the patch passed. | | +1 :green_heart: | asflicense | 0m 30s | | The patch does not generate ASF License warnings. 
| | | | 101m 58s | | | | Subsystem | Report/Notes | |--:|:-| | Docker | ClientAPI=1.41 ServerAPI=1.41 base: https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-2981/4/artifact/out/Dockerfile | | GITHUB PR | https://github.com/apache/hadoop/pull/2981 | | Optional Tests | dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient spotbugs checkstyle codespell markdownlint | | uname | Linux 411e1f12e5ff 4.15.0-128-generic #131-Ubuntu SMP Wed Dec 9 06:57:35 UTC 2020 x86_64 x86_64 x86_64 GNU/Linux | | Build tool | maven | | Personality | dev-support/bin/hadoop.sh | | git revision | trunk / 7fa083ac87f720663afb7c52953c830e45cefd4a | | Default Java | Private Build-1.8.0_292-8u292-b10-0ubuntu1~20.04-b10 | | Multi-JDK versions | /usr/lib/jvm/java-11-openjdk-amd64:Ubuntu-11.0.11+9-Ubuntu-0ubuntu2.20.04 /usr/lib/jvm/java-8-openjdk-amd64:Private Build-1.8.0_292-8u292-b10-0ubuntu1~20.04-b10 | | Test Results | https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-2981/4/testReport/ | | Max.
[jira] [Work logged] (HDFS-16008) RBF: Tool to initialize ViewFS Mapping to Router
[ https://issues.apache.org/jira/browse/HDFS-16008?focusedWorklogId=593188=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-593188 ] ASF GitHub Bot logged work on HDFS-16008: - Author: ASF GitHub Bot Created on: 07/May/21 08:06 Start Date: 07/May/21 08:06 Worklog Time Spent: 10m Work Description: hadoop-yetus commented on pull request #2981: URL: https://github.com/apache/hadoop/pull/2981#issuecomment-834155907 :broken_heart: **-1 overall** | Vote | Subsystem | Runtime | Logfile | Comment | |::|--:|:|::|:---:| | +0 :ok: | reexec | 0m 48s | | Docker mode activated. | _ Prechecks _ | | +1 :green_heart: | dupname | 0m 0s | | No case conflicting files found. | | +0 :ok: | codespell | 0m 0s | | codespell was not available. | | +0 :ok: | markdownlint | 0m 0s | | markdownlint was not available. | | +1 :green_heart: | @author | 0m 0s | | The patch does not contain any @author tags. | | +1 :green_heart: | test4tests | 0m 0s | | The patch appears to include 1 new or modified test files. | _ trunk Compile Tests _ | | +1 :green_heart: | mvninstall | 33m 38s | | trunk passed | | +1 :green_heart: | compile | 0m 41s | | trunk passed with JDK Ubuntu-11.0.11+9-Ubuntu-0ubuntu2.20.04 | | +1 :green_heart: | compile | 0m 33s | | trunk passed with JDK Private Build-1.8.0_292-8u292-b10-0ubuntu1~20.04-b10 | | +1 :green_heart: | checkstyle | 0m 22s | | trunk passed | | +1 :green_heart: | mvnsite | 0m 39s | | trunk passed | | +1 :green_heart: | javadoc | 0m 37s | | trunk passed with JDK Ubuntu-11.0.11+9-Ubuntu-0ubuntu2.20.04 | | +1 :green_heart: | javadoc | 0m 51s | | trunk passed with JDK Private Build-1.8.0_292-8u292-b10-0ubuntu1~20.04-b10 | | +1 :green_heart: | spotbugs | 1m 15s | | trunk passed | | +1 :green_heart: | shadedclient | 17m 2s | | branch has no errors when building and testing our client artifacts. 
| _ Patch Compile Tests _ | | +1 :green_heart: | mvninstall | 0m 32s | | the patch passed | | +1 :green_heart: | compile | 0m 34s | | the patch passed with JDK Ubuntu-11.0.11+9-Ubuntu-0ubuntu2.20.04 | | +1 :green_heart: | javac | 0m 34s | | the patch passed | | +1 :green_heart: | compile | 0m 28s | | the patch passed with JDK Private Build-1.8.0_292-8u292-b10-0ubuntu1~20.04-b10 | | +1 :green_heart: | javac | 0m 28s | | the patch passed | | -1 :x: | blanks | 0m 0s | [/blanks-eol.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-2981/3/artifact/out/blanks-eol.txt) | The patch has 1 line(s) that end in blanks. Use git apply --whitespace=fix <>. Refer https://git-scm.com/docs/git-apply | | -0 :warning: | checkstyle | 0m 16s | [/results-checkstyle-hadoop-hdfs-project_hadoop-hdfs-rbf.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-2981/3/artifact/out/results-checkstyle-hadoop-hdfs-project_hadoop-hdfs-rbf.txt) | hadoop-hdfs-project/hadoop-hdfs-rbf: The patch generated 1 new + 0 unchanged - 0 fixed = 1 total (was 0) | | +1 :green_heart: | mvnsite | 0m 31s | | the patch passed | | +1 :green_heart: | javadoc | 0m 31s | | the patch passed with JDK Ubuntu-11.0.11+9-Ubuntu-0ubuntu2.20.04 | | +1 :green_heart: | javadoc | 0m 48s | | the patch passed with JDK Private Build-1.8.0_292-8u292-b10-0ubuntu1~20.04-b10 | | +1 :green_heart: | spotbugs | 1m 19s | | the patch passed | | +1 :green_heart: | shadedclient | 16m 46s | | patch has no errors when building and testing our client artifacts. | _ Other Tests _ | | +1 :green_heart: | unit | 22m 5s | | hadoop-hdfs-rbf in the patch passed. | | +1 :green_heart: | asflicense | 0m 29s | | The patch does not generate ASF License warnings. 
| | | | 102m 1s | | | | Subsystem | Report/Notes | |--:|:-| | Docker | ClientAPI=1.41 ServerAPI=1.41 base: https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-2981/3/artifact/out/Dockerfile | | GITHUB PR | https://github.com/apache/hadoop/pull/2981 | | Optional Tests | dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient spotbugs checkstyle codespell markdownlint | | uname | Linux 19992cf6eb20 4.15.0-128-generic #131-Ubuntu SMP Wed Dec 9 06:57:35 UTC 2020 x86_64 x86_64 x86_64 GNU/Linux | | Build tool | maven | | Personality | dev-support/bin/hadoop.sh | | git revision | trunk / e0b806f234d7217010f9ac1d804f08b77025bc70 | | Default Java | Private Build-1.8.0_292-8u292-b10-0ubuntu1~20.04-b10 | | Multi-JDK versions | /usr/lib/jvm/java-11-openjdk-amd64:Ubuntu-11.0.11+9-Ubuntu-0ubuntu2.20.04
[jira] [Commented] (HDFS-15923) RBF: Authentication failed when rename across sub clusters
[ https://issues.apache.org/jira/browse/HDFS-15923?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17340649#comment-17340649 ]

Jinglun commented on HDFS-15923:
--------------------------------
+1 on v03. Waiting one day for further comments. After that I'll commit this.

> RBF: Authentication failed when rename across sub clusters
>
> Key: HDFS-15923
> URL: https://issues.apache.org/jira/browse/HDFS-15923
> Project: Hadoop HDFS
> Issue Type: Sub-task
> Components: rbf
> Reporter: zhuobin zheng
> Assignee: zhuobin zheng
> Priority: Major
> Labels: RBF, pull-request-available, rename
> Attachments: HDFS-15923.001.patch, HDFS-15923.002.patch, HDFS-15923.003.patch, HDFS-15923.stack-trace, hdfs-15923-fix-security-issue.patch
>
> Time Spent: 1h 20m
> Remaining Estimate: 0h
>
> Renaming across subclusters in an RBF plus Kerberos environment hits the following two errors:
> # Saving the object to the journal.
> # The precheck that tries to get the src file status.
> So we need to use the Router login UGI's doAs to create the DistcpProcedure and TrashProcedure and to submit the Job.
> Besides, we should check the user's permissions on the src and dst paths on the router side before doing the internal rename (HDFS-15973).
> First: saving the object to the journal.
> {code:java}
> // code placeholder
> 2021-03-23 14:01:16,233 WARN org.apache.hadoop.ipc.Client: Exception encountered while connecting to the server
> javax.security.sasl.SaslException: GSS initiate failed [Caused by GSSException: No valid credentials provided (Mechanism level: Failed to find any Kerberos tgt)]
>     at com.sun.security.sasl.gsskerb.GssKrb5Client.evaluateChallenge(GssKrb5Client.java:211)
>     at org.apache.hadoop.security.SaslRpcClient.saslConnect(SaslRpcClient.java:408)
>     at org.apache.hadoop.ipc.Client$Connection.setupSaslConnection(Client.java:622)
>     at org.apache.hadoop.ipc.Client$Connection.access$2300(Client.java:413)
>     at org.apache.hadoop.ipc.Client$Connection$2.run(Client.java:822)
>     at org.apache.hadoop.ipc.Client$Connection$2.run(Client.java:818)
>     at java.security.AccessController.doPrivileged(Native Method)
>     at javax.security.auth.Subject.doAs(Subject.java:422)
>     at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1762)
>     at org.apache.hadoop.ipc.Client$Connection.setupIOstreams(Client.java:818)
>     at org.apache.hadoop.ipc.Client$Connection.access$3800(Client.java:413)
>     at org.apache.hadoop.ipc.Client.getConnection(Client.java:1636)
>     at org.apache.hadoop.ipc.Client.call(Client.java:1452)
>     at org.apache.hadoop.ipc.Client.call(Client.java:1405)
>     at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:233)
>     at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:118)
>     at com.sun.proxy.$Proxy11.create(Unknown Source)
>     at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTranslatorPB.create(ClientNamenodeProtocolTranslatorPB.java:376)
>     at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>     at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>     at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>     at java.lang.reflect.Method.invoke(Method.java:498)
>     at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:422)
>     at org.apache.hadoop.io.retry.RetryInvocationHandler$Call.invokeMethod(RetryInvocationHandler.java:165)
>     at org.apache.hadoop.io.retry.RetryInvocationHandler$Call.invoke(RetryInvocationHandler.java:157)
>     at org.apache.hadoop.io.retry.RetryInvocationHandler$Call.invokeOnce(RetryInvocationHandler.java:95)
>     at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:359)
>     at com.sun.proxy.$Proxy12.create(Unknown Source)
>     at org.apache.hadoop.hdfs.DFSOutputStream.newStreamForCreate(DFSOutputStream.java:277)
>     at org.apache.hadoop.hdfs.DFSClient.create(DFSClient.java:1240)
>     at org.apache.hadoop.hdfs.DFSClient.create(DFSClient.java:1219)
>     at org.apache.hadoop.hdfs.DFSClient.create(DFSClient.java:1201)
>     at org.apache.hadoop.hdfs.DFSClient.create(DFSClient.java:1139)
>     at org.apache.hadoop.hdfs.DistributedFileSystem$8.doCall(DistributedFileSystem.java:533)
>     at org.apache.hadoop.hdfs.DistributedFileSystem$8.doCall(DistributedFileSystem.java:530)
>     at org.apache.hadoop.fs.FileSystemLinkResolver.resolve(FileSystemLinkResolver.java:81)
>     at
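The fix direction described above (run the journal write and procedure creation under the Router's login identity rather than the caller's proxy UGI, so the Kerberos TGT is available) can be sketched as follows. This is a minimal illustration only: `RouterDoAsSketch`, `saveToJournal`, and `runAsRouterLogin` are hypothetical names, and the real fix would use Hadoop's `UserGroupInformation.getLoginUser().doAs(...)` instead of invoking the action directly.

```java
import java.security.PrivilegedExceptionAction;

// Hedged sketch of the "run as Router login UGI" pattern. In actual
// Hadoop code the action would be passed to
// UserGroupInformation.getLoginUser().doAs(action); plain JDK types are
// used here so the sketch is self-contained.
public class RouterDoAsSketch {

    // Stand-in for work that needs the Router's Kerberos credentials,
    // e.g. persisting a DistcpProcedure to the BalanceJournal.
    static String saveToJournal(String procedure) {
        return "saved:" + procedure;
    }

    static String runAsRouterLogin(String procedure) throws Exception {
        PrivilegedExceptionAction<String> action = () -> saveToJournal(procedure);
        // Real code: return UserGroupInformation.getLoginUser().doAs(action);
        return action.run(); // simplified: run the action directly
    }

    public static void main(String[] args) throws Exception {
        System.out.println(runAsRouterLogin("DistcpProcedure"));
    }
}
```

The key design point is that the privileged block wraps only the operations that must use the Router's own credentials (journal writes, job submission), while the permission precheck on src/dst paths still happens as the calling user.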
[jira] [Commented] (HDFS-16009) HDFS tiered storage support
[ https://issues.apache.org/jira/browse/HDFS-16009?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17340623#comment-17340623 ]

Leon Gao commented on HDFS-16009:
---------------------------------
Cool, we have been running that in our production; please feel free to give it a try.

> HDFS tiered storage support
>
> Key: HDFS-16009
> URL: https://issues.apache.org/jira/browse/HDFS-16009
> Project: Hadoop HDFS
> Issue Type: Improvement
> Reporter: JiangHua Zhu
> Priority: Major
>
> HDFS currently stores data blocks on different DataNodes. These DataNodes are chosen randomly within the scope of the network topology (local rack / different racks / node group).
> When the NameNode allocates DataNodes for a block, the selection is random; that is, every DataNode is equally likely to be chosen.
> We should add a tiered strategy. For example, DataNode1, DataNode2, and DataNode3 form one tier, while DataNode4 and DataNode5 form another. Because hardware can differ between DataNodes, tiered storage would let DataNodes with weaker hardware store a smaller share, e.g. at most 50%, 40%, or even less. This is conducive to cluster maintenance and stability.

--
This message was sent by Atlassian Jira (v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
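The tiering idea above, weighting node selection by hardware capability instead of choosing uniformly, can be sketched as a weighted random choice. Everything here is illustrative: `TieredChooser`, the node names, and the weights are assumptions, not code from the proposal or from HDFS's BlockPlacementPolicy.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Random;

// Hedged sketch of weighted DataNode selection: a node with weight 0.5 is
// picked at roughly half the rate of a node with weight 1.0, so a
// weaker-hardware tier ends up storing a proportionally smaller share.
public class TieredChooser {

    static String chooseWeighted(Map<String, Double> weights, Random rnd) {
        double total = weights.values().stream().mapToDouble(Double::doubleValue).sum();
        double r = rnd.nextDouble() * total;
        for (Map.Entry<String, Double> e : weights.entrySet()) {
            r -= e.getValue();
            if (r <= 0) {
                return e.getKey();
            }
        }
        // Fallback for floating-point rounding at the boundary.
        return weights.keySet().iterator().next();
    }

    public static void main(String[] args) {
        Map<String, Double> weights = new LinkedHashMap<>();
        weights.put("datanode1", 1.0);  // strong-hardware tier
        weights.put("datanode4", 0.5);  // weak tier: half the selection rate
        System.out.println(chooseWeighted(weights, new Random()));
    }
}
```

A real implementation would plug into the NameNode's block placement policy and still have to respect rack awareness and replica constraints; the sketch only shows the weighting mechanism itself.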
[jira] [Work logged] (HDFS-16007) Vulnerabilities found when serializing enum value
[ https://issues.apache.org/jira/browse/HDFS-16007?focusedWorklogId=593160=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-593160 ]

ASF GitHub Bot logged work on HDFS-16007:
-----------------------------------------
Author: ASF GitHub Bot
Created on: 07/May/21 06:34
Start Date: 07/May/21 06:34
Worklog Time Spent: 10m

Work Description: virajjasani commented on pull request #2982:
URL: https://github.com/apache/hadoop/pull/2982#issuecomment-834106326

> I prefer Validate#validIndex in commons-lang3 rather than Guava. We should avoid Guava as much as possible.

Sure, let me make this change right away.

--
This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org

Issue Time Tracking
-------------------
Worklog Id: (was: 593160)
Time Spent: 0.5h (was: 20m)

> Vulnerabilities found when serializing enum value
>
> Key: HDFS-16007
> URL: https://issues.apache.org/jira/browse/HDFS-16007
> Project: Hadoop HDFS
> Issue Type: Improvement
> Reporter: junwen yang
> Assignee: Viraj Jasani
> Priority: Major
> Labels: pull-request-available
> Time Spent: 0.5h
> Remaining Estimate: 0h
>
> The ReplicaState enum uses its ordinal for serialization and deserialization, which is vulnerable to changes in declaration order and can cause issues similar to HDFS-15624.
> To avoid this, either add comments telling later developers not to reorder this enum, or add index checking in the read and getState functions to avoid index-out-of-bounds errors.
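The index check discussed above can be sketched as follows. The enum values mirror HDFS's `ReplicaState`, but `ReplicaStateSketch` and its `getState` method are illustrative stand-ins for the real class; the PR's version would use commons-lang3's `Validate.validIndex`, while a plain bounds check is shown here to keep the sketch dependency-free.

```java
// Hedged sketch of ordinal-safe enum lookup: an out-of-range ordinal read
// from a corrupt or version-skewed stream fails fast with a descriptive
// exception instead of an ArrayIndexOutOfBoundsException.
public class ReplicaStateSketch {

    enum ReplicaState { FINALIZED, RBW, RWR, RUR, TEMPORARY }

    static ReplicaState getState(int ordinal) {
        ReplicaState[] values = ReplicaState.values();
        if (ordinal < 0 || ordinal >= values.length) {
            // With commons-lang3: Validate.validIndex(values, ordinal, ...)
            throw new IllegalArgumentException(
                "Invalid ReplicaState ordinal: " + ordinal);
        }
        return values[ordinal];
    }

    public static void main(String[] args) {
        System.out.println(getState(1)); // prints RBW
    }
}
```

Note that the check guards only against out-of-range ordinals; it cannot detect a silent reordering of existing constants, which is why the JIRA also suggests a comment warning developers not to change the enum's order.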