[jira] [Work logged] (HDFS-16004) BackupNode and QJournal lack Permission check.

2021-05-07 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-16004?focusedWorklogId=593658&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-593658
 ]

ASF GitHub Bot logged work on HDFS-16004:
-

Author: ASF GitHub Bot
Created on: 08/May/21 05:55
Start Date: 08/May/21 05:55
Worklog Time Spent: 10m 
  Work Description: hadoop-yetus commented on pull request #2966:
URL: https://github.com/apache/hadoop/pull/2966#issuecomment-835124045


   :broken_heart: **-1 overall**
   
   
   
   
   
   
   | Vote | Subsystem | Runtime |  Logfile | Comment |
   |:----:|----------:|:--------|:--------:|:-------:|
   | +0 :ok: |  reexec  |   0m 42s |  |  Docker mode activated.  |
   |||| _ Prechecks _ |
   | +1 :green_heart: |  dupname  |   0m  0s |  |  No case conflicting files 
found.  |
   | +0 :ok: |  codespell  |   0m  0s |  |  codespell was not available.  |
   | +1 :green_heart: |  @author  |   0m  0s |  |  The patch does not contain 
any @author tags.  |
   | -1 :x: |  test4tests  |   0m  0s |  |  The patch doesn't appear to include 
any new or modified tests. Please justify why no new tests are needed for this 
patch. Also please list what manual steps were performed to verify this patch.  
|
   |||| _ trunk Compile Tests _ |
   | -1 :x: |  mvninstall  |   6m  8s | 
[/branch-mvninstall-root.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-2966/2/artifact/out/branch-mvninstall-root.txt)
 |  root in trunk failed.  |
   | -1 :x: |  compile  |   0m 23s | 
[/branch-compile-hadoop-hdfs-project_hadoop-hdfs-jdkUbuntu-11.0.10+9-Ubuntu-0ubuntu1.20.04.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-2966/2/artifact/out/branch-compile-hadoop-hdfs-project_hadoop-hdfs-jdkUbuntu-11.0.10+9-Ubuntu-0ubuntu1.20.04.txt)
 |  hadoop-hdfs in trunk failed with JDK 
Ubuntu-11.0.10+9-Ubuntu-0ubuntu1.20.04.  |
   | -1 :x: |  compile  |   0m 10s | 
[/branch-compile-hadoop-hdfs-project_hadoop-hdfs-jdkPrivateBuild-1.8.0_282-8u282-b08-0ubuntu1~20.04-b08.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-2966/2/artifact/out/branch-compile-hadoop-hdfs-project_hadoop-hdfs-jdkPrivateBuild-1.8.0_282-8u282-b08-0ubuntu1~20.04-b08.txt)
 |  hadoop-hdfs in trunk failed with JDK Private 
Build-1.8.0_282-8u282-b08-0ubuntu1~20.04-b08.  |
   | -0 :warning: |  checkstyle  |   0m 20s | 
[/buildtool-branch-checkstyle-hadoop-hdfs-project_hadoop-hdfs.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-2966/2/artifact/out/buildtool-branch-checkstyle-hadoop-hdfs-project_hadoop-hdfs.txt)
 |  The patch fails to run checkstyle in hadoop-hdfs  |
   | -1 :x: |  mvnsite  |   0m 23s | 
[/branch-mvnsite-hadoop-hdfs-project_hadoop-hdfs.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-2966/2/artifact/out/branch-mvnsite-hadoop-hdfs-project_hadoop-hdfs.txt)
 |  hadoop-hdfs in trunk failed.  |
   | -1 :x: |  javadoc  |   0m 23s | 
[/branch-javadoc-hadoop-hdfs-project_hadoop-hdfs-jdkUbuntu-11.0.10+9-Ubuntu-0ubuntu1.20.04.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-2966/2/artifact/out/branch-javadoc-hadoop-hdfs-project_hadoop-hdfs-jdkUbuntu-11.0.10+9-Ubuntu-0ubuntu1.20.04.txt)
 |  hadoop-hdfs in trunk failed with JDK 
Ubuntu-11.0.10+9-Ubuntu-0ubuntu1.20.04.  |
   | -1 :x: |  javadoc  |   0m 23s | 
[/branch-javadoc-hadoop-hdfs-project_hadoop-hdfs-jdkPrivateBuild-1.8.0_282-8u282-b08-0ubuntu1~20.04-b08.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-2966/2/artifact/out/branch-javadoc-hadoop-hdfs-project_hadoop-hdfs-jdkPrivateBuild-1.8.0_282-8u282-b08-0ubuntu1~20.04-b08.txt)
 |  hadoop-hdfs in trunk failed with JDK Private 
Build-1.8.0_282-8u282-b08-0ubuntu1~20.04-b08.  |
   | -1 :x: |  spotbugs  |   0m 23s | 
[/branch-spotbugs-hadoop-hdfs-project_hadoop-hdfs.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-2966/2/artifact/out/branch-spotbugs-hadoop-hdfs-project_hadoop-hdfs.txt)
 |  hadoop-hdfs in trunk failed.  |
   | +1 :green_heart: |  shadedclient  |   2m 16s |  |  branch has no errors 
when building and testing our client artifacts.  |
   |||| _ Patch Compile Tests _ |
   | -1 :x: |  mvninstall  |   0m 22s | 
[/patch-mvninstall-hadoop-hdfs-project_hadoop-hdfs.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-2966/2/artifact/out/patch-mvninstall-hadoop-hdfs-project_hadoop-hdfs.txt)
 |  hadoop-hdfs in the patch failed.  |
   | -1 :x: |  compile  |   0m 22s | 
[/patch-compile-hadoop-hdfs-project_hadoop-hdfs-jdkUbuntu-11.0.10+9-Ubuntu-0ubuntu1.20.04.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-2966/2/artifact/out/patch-compile-hadoop-hdfs-project_hadoop-hdfs-jdkUbuntu-11.0.10+9-Ubuntu-0ubuntu1.20.04.txt)
 |  hadoop-hdfs in the patch failed with JDK 
Ubuntu-11.0.10+9-Ubuntu-0ubuntu1.20.04.  |
   | -1 :x: |  javac  |   0m 22s | 

[jira] [Commented] (HDFS-16004) BackupNode and QJournal lack Permission check.

2021-05-07 Thread lujie (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-16004?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17341178#comment-17341178
 ] 

lujie commented on HDFS-16004:
--

Pushed the new commit. All the interfaces of QJournal are now checked.
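
For readers following the thread, the usual Hadoop idiom for such a check is to
resolve the calling user via UserGroupInformation at the top of the RPC handler
and reject non-privileged callers. A minimal, self-contained sketch of that
idiom (illustrative only; the class and field names below are invented, and
this is not the code from PR #2966):

{code:java}
import java.io.IOException;
import org.apache.hadoop.security.AccessControlException;
import org.apache.hadoop.security.UserGroupInformation;

// Sketch of the common permission-check idiom: resolve the caller and
// reject the operation unless the caller matches the privileged user.
class JournalPermissionCheck {
  // Hypothetical: however the daemon identifies its privileged principal.
  private final String superUser;

  JournalPermissionCheck(String superUser) {
    this.superUser = superUser;
  }

  void checkSuperuser(String operationName) throws IOException {
    UserGroupInformation ugi = UserGroupInformation.getCurrentUser();
    if (!superUser.equals(ugi.getShortUserName())) {
      throw new AccessControlException(operationName
          + " is not allowed for user " + ugi.getShortUserName());
    }
  }
}
{code}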

> BackupNode and QJournal lack Permission check.
> --
>
> Key: HDFS-16004
> URL: https://issues.apache.org/jira/browse/HDFS-16004
> Project: Hadoop HDFS
>  Issue Type: Bug
>Reporter: lujie
>Assignee: lujie
>Priority: Critical
>  Labels: pull-request-available
>  Time Spent: 40m
>  Remaining Estimate: 0h
>
> I have some doubts about configuring secure HDFS. I know we have Service
> Level Authorization for protocols like NamenodeProtocol, DatanodeProtocol,
> and so on.
> But I do not find such Authorization for JournalProtocol after reading the
> code in HDFSPolicyProvider. And if we do have it, how can I configure such
> Authorization?
>
> Besides, even though NamenodeProtocol has Service Level Authorization, its
> methods still have permission checks. Take startCheckpoint in
> NameNodeRpcServer, which implements NamenodeProtocol, for example:
>
> {code:java}
> public NamenodeCommand startCheckpoint(NamenodeRegistration registration)
>     throws IOException {
>   String operationName = "startCheckpoint";
>   checkNNStartup();
>   namesystem.checkSuperuserPrivilege(operationName); // <-- the permission check
>   ...
> {code}
>
> I found that the methods in BackupNodeRpcServer, which implements
> JournalProtocol, lack such a permission check. See below:
>
> {code:java}
> public void startLogSegment(JournalInfo journalInfo, long epoch,
>     long txid) throws IOException {
>   namesystem.checkOperation(OperationCategory.JOURNAL);
>   verifyJournalRequest(journalInfo);
>   getBNImage().namenodeStartedLogSegment(txid);
> }
>
> @Override
> public void journal(JournalInfo journalInfo, long epoch, long firstTxId,
>     int numTxns, byte[] records) throws IOException {
>   namesystem.checkOperation(OperationCategory.JOURNAL);
>   verifyJournalRequest(journalInfo);
>   getBNImage().journal(firstTxId, numTxns, records);
> }
> {code}
>
> Do we need to add a permission check for them?
>
> Please point out my mistakes if I am wrong or missed something.
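
For concreteness, the change the description asks about would amount to giving
the BackupNode methods the same guard that startCheckpoint already has. A
sketch under that assumption (it reuses the quoted method body; the added line
is hypothetical and may differ from the actual patch):

{code:java}
public void startLogSegment(JournalInfo journalInfo, long epoch,
    long txid) throws IOException {
  namesystem.checkOperation(OperationCategory.JOURNAL);
  // Hypothetical addition, mirroring startCheckpoint above:
  namesystem.checkSuperuserPrivilege("startLogSegment");
  verifyJournalRequest(journalInfo);
  getBNImage().namenodeStartedLogSegment(txid);
}
{code}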



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-16004) BackupNode and QJournal lack Permission check.

2021-05-07 Thread lujie (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-16004?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

lujie updated HDFS-16004:
-
Summary: BackupNode and QJournal lack Permission check.  (was: 
startLogSegment and journal in BackupNode lack Permission check.)

> BackupNode and QJournal lack Permission check.
> --
>
> Key: HDFS-16004
> URL: https://issues.apache.org/jira/browse/HDFS-16004
> Project: Hadoop HDFS
>  Issue Type: Bug
>Reporter: lujie
>Assignee: lujie
>Priority: Critical
>  Labels: pull-request-available
>  Time Spent: 40m
>  Remaining Estimate: 0h
>
> I have some doubts about configuring secure HDFS. I know we have Service
> Level Authorization for protocols like NamenodeProtocol, DatanodeProtocol,
> and so on.
> But I do not find such Authorization for JournalProtocol after reading the
> code in HDFSPolicyProvider. And if we do have it, how can I configure such
> Authorization?
>
> Besides, even though NamenodeProtocol has Service Level Authorization, its
> methods still have permission checks. Take startCheckpoint in
> NameNodeRpcServer, which implements NamenodeProtocol, for example:
>
> {code:java}
> public NamenodeCommand startCheckpoint(NamenodeRegistration registration)
>     throws IOException {
>   String operationName = "startCheckpoint";
>   checkNNStartup();
>   namesystem.checkSuperuserPrivilege(operationName); // <-- the permission check
>   ...
> {code}
>
> I found that the methods in BackupNodeRpcServer, which implements
> JournalProtocol, lack such a permission check. See below:
>
> {code:java}
> public void startLogSegment(JournalInfo journalInfo, long epoch,
>     long txid) throws IOException {
>   namesystem.checkOperation(OperationCategory.JOURNAL);
>   verifyJournalRequest(journalInfo);
>   getBNImage().namenodeStartedLogSegment(txid);
> }
>
> @Override
> public void journal(JournalInfo journalInfo, long epoch, long firstTxId,
>     int numTxns, byte[] records) throws IOException {
>   namesystem.checkOperation(OperationCategory.JOURNAL);
>   verifyJournalRequest(journalInfo);
>   getBNImage().journal(firstTxId, numTxns, records);
> }
> {code}
>
> Do we need to add a permission check for them?
>
> Please point out my mistakes if I am wrong or missed something.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Work logged] (HDFS-16001) TestOfflineEditsViewer.testStored() fails reading negative value of FSEditLogOpCodes

2021-05-07 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-16001?focusedWorklogId=593655&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-593655
 ]

ASF GitHub Bot logged work on HDFS-16001:
-

Author: ASF GitHub Bot
Created on: 08/May/21 05:32
Start Date: 08/May/21 05:32
Worklog Time Spent: 10m 
  Work Description: qizhu-lucas removed a comment on pull request #2989:
URL: https://github.com/apache/hadoop/pull/2989#issuecomment-835112687


   +1


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 593655)
Time Spent: 1h 10m  (was: 1h)

> TestOfflineEditsViewer.testStored() fails reading negative value of 
> FSEditLogOpCodes
> 
>
> Key: HDFS-16001
> URL: https://issues.apache.org/jira/browse/HDFS-16001
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: hdfs
>Reporter: Konstantin Shvachko
>Assignee: Akira Ajisaka
>Priority: Blocker
>  Labels: pull-request-available
> Fix For: 3.3.1, 3.4.0
>
>  Time Spent: 1h 10m
>  Remaining Estimate: 0h
>
> {{TestOfflineEditsViewer.testStored()}} fails consistently with an exception
> {noformat}
> java.io.IOException: Op -54 has size -1314247195, but the minimum op size is 
> 17
> {noformat}
> Seems like there is a corrupt record in {{editsStored}} file.
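
As background on that error message: edit-log opcodes are single bytes, and a
byte with its high bit set prints as a negative number once widened to an int,
which is how a corrupt opcode surfaces as "Op -54". A minimal illustration of
the widening (plain Java arithmetic, independent of the actual FSEditLogOp
parser):

{code:java}
public class SignedOpCode {
  public static void main(String[] args) {
    byte rawOpCode = (byte) 0xCA;    // 0xCA is 202 unsigned
    int widened = rawOpCode;         // sign-extends to -54
    int unsigned = rawOpCode & 0xFF; // masks back to 202
    System.out.println(widened);     // -54, the value in the test failure
    System.out.println(unsigned);    // 202
  }
}
{code}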



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Work logged] (HDFS-16001) TestOfflineEditsViewer.testStored() fails reading negative value of FSEditLogOpCodes

2021-05-07 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-16001?focusedWorklogId=593652&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-593652
 ]

ASF GitHub Bot logged work on HDFS-16001:
-

Author: ASF GitHub Bot
Created on: 08/May/21 05:30
Start Date: 08/May/21 05:30
Worklog Time Spent: 10m 
  Work Description: qizhu-lucas commented on pull request #2989:
URL: https://github.com/apache/hadoop/pull/2989#issuecomment-835112687


   +1


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 593652)
Time Spent: 1h  (was: 50m)

> TestOfflineEditsViewer.testStored() fails reading negative value of 
> FSEditLogOpCodes
> 
>
> Key: HDFS-16001
> URL: https://issues.apache.org/jira/browse/HDFS-16001
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: hdfs
>Reporter: Konstantin Shvachko
>Assignee: Akira Ajisaka
>Priority: Blocker
>  Labels: pull-request-available
> Fix For: 3.3.1, 3.4.0
>
>  Time Spent: 1h
>  Remaining Estimate: 0h
>
> {{TestOfflineEditsViewer.testStored()}} fails consistently with an exception
> {noformat}
> java.io.IOException: Op -54 has size -1314247195, but the minimum op size is 
> 17
> {noformat}
> Seems like there is a corrupt record in {{editsStored}} file.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Comment Edited] (HDFS-16004) startLogSegment and journal in BackupNode lack Permission check.

2021-05-07 Thread lujie (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-16004?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17341165#comment-17341165
 ] 

lujie edited comment on HDFS-16004 at 5/8/21, 4:42 AM:
---

[~hexiaoqiao] 

I also think we need to add permission checks for QJournal (including 
InterQJournal) and am working on a new patch!


was (Author: xiaoheipangzi):
[~hexiaoqiao] 

I also think we need to add permission checks for QJournal and am working on a 
new patch!

> startLogSegment and journal in BackupNode lack Permission check.
> 
>
> Key: HDFS-16004
> URL: https://issues.apache.org/jira/browse/HDFS-16004
> Project: Hadoop HDFS
>  Issue Type: Bug
>Reporter: lujie
>Assignee: lujie
>Priority: Critical
>  Labels: pull-request-available
>  Time Spent: 40m
>  Remaining Estimate: 0h
>
> I have some doubts about configuring secure HDFS. I know we have Service
> Level Authorization for protocols like NamenodeProtocol, DatanodeProtocol,
> and so on.
> But I do not find such Authorization for JournalProtocol after reading the
> code in HDFSPolicyProvider. And if we do have it, how can I configure such
> Authorization?
>
> Besides, even though NamenodeProtocol has Service Level Authorization, its
> methods still have permission checks. Take startCheckpoint in
> NameNodeRpcServer, which implements NamenodeProtocol, for example:
>
> {code:java}
> public NamenodeCommand startCheckpoint(NamenodeRegistration registration)
>     throws IOException {
>   String operationName = "startCheckpoint";
>   checkNNStartup();
>   namesystem.checkSuperuserPrivilege(operationName); // <-- the permission check
>   ...
> {code}
>
> I found that the methods in BackupNodeRpcServer, which implements
> JournalProtocol, lack such a permission check. See below:
>
> {code:java}
> public void startLogSegment(JournalInfo journalInfo, long epoch,
>     long txid) throws IOException {
>   namesystem.checkOperation(OperationCategory.JOURNAL);
>   verifyJournalRequest(journalInfo);
>   getBNImage().namenodeStartedLogSegment(txid);
> }
>
> @Override
> public void journal(JournalInfo journalInfo, long epoch, long firstTxId,
>     int numTxns, byte[] records) throws IOException {
>   namesystem.checkOperation(OperationCategory.JOURNAL);
>   verifyJournalRequest(journalInfo);
>   getBNImage().journal(firstTxId, numTxns, records);
> }
> {code}
>
> Do we need to add a permission check for them?
>
> Please point out my mistakes if I am wrong or missed something.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-16004) startLogSegment and journal in BackupNode lack Permission check.

2021-05-07 Thread lujie (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-16004?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17341165#comment-17341165
 ] 

lujie commented on HDFS-16004:
--

[~hexiaoqiao] 

I also think we need to add permission checks for QJournal and am working on a 
new patch!

> startLogSegment and journal in BackupNode lack Permission check.
> 
>
> Key: HDFS-16004
> URL: https://issues.apache.org/jira/browse/HDFS-16004
> Project: Hadoop HDFS
>  Issue Type: Bug
>Reporter: lujie
>Assignee: lujie
>Priority: Critical
>  Labels: pull-request-available
>  Time Spent: 40m
>  Remaining Estimate: 0h
>
> I have some doubts about configuring secure HDFS. I know we have Service
> Level Authorization for protocols like NamenodeProtocol, DatanodeProtocol,
> and so on.
> But I do not find such Authorization for JournalProtocol after reading the
> code in HDFSPolicyProvider. And if we do have it, how can I configure such
> Authorization?
>
> Besides, even though NamenodeProtocol has Service Level Authorization, its
> methods still have permission checks. Take startCheckpoint in
> NameNodeRpcServer, which implements NamenodeProtocol, for example:
>
> {code:java}
> public NamenodeCommand startCheckpoint(NamenodeRegistration registration)
>     throws IOException {
>   String operationName = "startCheckpoint";
>   checkNNStartup();
>   namesystem.checkSuperuserPrivilege(operationName); // <-- the permission check
>   ...
> {code}
>
> I found that the methods in BackupNodeRpcServer, which implements
> JournalProtocol, lack such a permission check. See below:
>
> {code:java}
> public void startLogSegment(JournalInfo journalInfo, long epoch,
>     long txid) throws IOException {
>   namesystem.checkOperation(OperationCategory.JOURNAL);
>   verifyJournalRequest(journalInfo);
>   getBNImage().namenodeStartedLogSegment(txid);
> }
>
> @Override
> public void journal(JournalInfo journalInfo, long epoch, long firstTxId,
>     int numTxns, byte[] records) throws IOException {
>   namesystem.checkOperation(OperationCategory.JOURNAL);
>   verifyJournalRequest(journalInfo);
>   getBNImage().journal(firstTxId, numTxns, records);
> }
> {code}
>
> Do we need to add a permission check for them?
>
> Please point out my mistakes if I am wrong or missed something.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-16004) startLogSegment and journal in BackupNode lack Permission check.

2021-05-07 Thread Xiaoqiao He (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-16004?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17341112#comment-17341112
 ] 

Xiaoqiao He commented on HDFS-16004:


Assign this issue to [~xiaoheipangzi].
Thanks [~xiaoheipangzi] for your report. It is a good catch in my opinion. BTW, 
I think we should cover every RPC interface of BackupNode and QJournal, because 
I just found that both of them have no permission checks. cc [~shv], do you 
mind giving it another check? Thanks.
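
For context on the Service Level Authorization side of this discussion:
protocols registered in HDFSPolicyProvider are gated by ACL entries in
hadoop-policy.xml once hadoop.security.authorization is set to true in
core-site.xml. A sketch of the entry format (the property name below is
illustrative only; the real key has to match whatever HDFSPolicyProvider
registers for the journal protocols):

{code:xml}
<!-- hadoop-policy.xml: ACL values take the form "user1,user2 group1,group2",
     or "*" to allow everyone. -->
<property>
  <!-- Illustrative key name only. -->
  <name>security.journal.protocol.acl</name>
  <value>hdfs</value>
</property>
{code}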

> startLogSegment and journal in BackupNode lack Permission check.
> 
>
> Key: HDFS-16004
> URL: https://issues.apache.org/jira/browse/HDFS-16004
> Project: Hadoop HDFS
>  Issue Type: Bug
>Reporter: lujie
>Assignee: lujie
>Priority: Critical
>  Labels: pull-request-available
>  Time Spent: 40m
>  Remaining Estimate: 0h
>
> I have some doubts about configuring secure HDFS. I know we have Service
> Level Authorization for protocols like NamenodeProtocol, DatanodeProtocol,
> and so on.
> But I do not find such Authorization for JournalProtocol after reading the
> code in HDFSPolicyProvider. And if we do have it, how can I configure such
> Authorization?
>
> Besides, even though NamenodeProtocol has Service Level Authorization, its
> methods still have permission checks. Take startCheckpoint in
> NameNodeRpcServer, which implements NamenodeProtocol, for example:
>
> {code:java}
> public NamenodeCommand startCheckpoint(NamenodeRegistration registration)
>     throws IOException {
>   String operationName = "startCheckpoint";
>   checkNNStartup();
>   namesystem.checkSuperuserPrivilege(operationName); // <-- the permission check
>   ...
> {code}
>
> I found that the methods in BackupNodeRpcServer, which implements
> JournalProtocol, lack such a permission check. See below:
>
> {code:java}
> public void startLogSegment(JournalInfo journalInfo, long epoch,
>     long txid) throws IOException {
>   namesystem.checkOperation(OperationCategory.JOURNAL);
>   verifyJournalRequest(journalInfo);
>   getBNImage().namenodeStartedLogSegment(txid);
> }
>
> @Override
> public void journal(JournalInfo journalInfo, long epoch, long firstTxId,
>     int numTxns, byte[] records) throws IOException {
>   namesystem.checkOperation(OperationCategory.JOURNAL);
>   verifyJournalRequest(journalInfo);
>   getBNImage().journal(firstTxId, numTxns, records);
> }
> {code}
>
> Do we need to add a permission check for them?
>
> Please point out my mistakes if I am wrong or missed something.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Assigned] (HDFS-16004) startLogSegment and journal in BackupNode lack Permission check.

2021-05-07 Thread Xiaoqiao He (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-16004?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xiaoqiao He reassigned HDFS-16004:
--

Assignee: lujie

> startLogSegment and journal in BackupNode lack Permission check.
> 
>
> Key: HDFS-16004
> URL: https://issues.apache.org/jira/browse/HDFS-16004
> Project: Hadoop HDFS
>  Issue Type: Bug
>Reporter: lujie
>Assignee: lujie
>Priority: Critical
>  Labels: pull-request-available
>  Time Spent: 40m
>  Remaining Estimate: 0h
>
> I have some doubts about configuring secure HDFS. I know we have Service
> Level Authorization for protocols like NamenodeProtocol, DatanodeProtocol,
> and so on.
> But I do not find such Authorization for JournalProtocol after reading the
> code in HDFSPolicyProvider. And if we do have it, how can I configure such
> Authorization?
>
> Besides, even though NamenodeProtocol has Service Level Authorization, its
> methods still have permission checks. Take startCheckpoint in
> NameNodeRpcServer, which implements NamenodeProtocol, for example:
>
> {code:java}
> public NamenodeCommand startCheckpoint(NamenodeRegistration registration)
>     throws IOException {
>   String operationName = "startCheckpoint";
>   checkNNStartup();
>   namesystem.checkSuperuserPrivilege(operationName); // <-- the permission check
>   ...
> {code}
>
> I found that the methods in BackupNodeRpcServer, which implements
> JournalProtocol, lack such a permission check. See below:
>
> {code:java}
> public void startLogSegment(JournalInfo journalInfo, long epoch,
>     long txid) throws IOException {
>   namesystem.checkOperation(OperationCategory.JOURNAL);
>   verifyJournalRequest(journalInfo);
>   getBNImage().namenodeStartedLogSegment(txid);
> }
>
> @Override
> public void journal(JournalInfo journalInfo, long epoch, long firstTxId,
>     int numTxns, byte[] records) throws IOException {
>   namesystem.checkOperation(OperationCategory.JOURNAL);
>   verifyJournalRequest(journalInfo);
>   getBNImage().journal(firstTxId, numTxns, records);
> }
> {code}
>
> Do we need to add a permission check for them?
>
> Please point out my mistakes if I am wrong or missed something.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-16011) HDFS: Support viewfs nested mount

2021-05-07 Thread zhu (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-16011?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17341109#comment-17341109
 ] 

zhu commented on HDFS-16011:


[~hexiaoqiao], [~umamaheswararao], [~ayushtkn], looking forward to your comments.

> HDFS: Support viewfs nested mount
> -
>
> Key: HDFS-16011
> URL: https://issues.apache.org/jira/browse/HDFS-16011
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: hdfs, viewfs
>Affects Versions: 3.2.2, 3.3.1
>Reporter: zhu
>Assignee: zhu
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> The current viewfs does not support nested mount points.
> *E.g.:*
>     1. /home/ => /home/
>     2. /home/work => /home/work
> If mount point 1 is loaded, mount point 2 cannot be added, and the following 
> exception will be thrown when loading 2.
> {code:java}
> throw new FileAlreadyExistsException("Path " + nextInode.fullPath +
>  " already exists as link");
> {code}
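
Spelled out as a mount-table configuration, the failing pair from the example
would look like the following (ClusterX and the NameNode URI are placeholders):

{code:xml}
<!-- Mount point 1: loads fine on its own. -->
<property>
  <name>fs.viewfs.mounttable.ClusterX.link./home</name>
  <value>hdfs://nn1.example.com:8020/home</value>
</property>
<!-- Mount point 2: nested under /home, so building the mount table
     currently fails with the FileAlreadyExistsException above. -->
<property>
  <name>fs.viewfs.mounttable.ClusterX.link./home/work</name>
  <value>hdfs://nn1.example.com:8020/home/work</value>
</property>
{code}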



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-16011) HDFS: Support viewfs nested mount

2021-05-07 Thread zhu (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-16011?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

zhu updated HDFS-16011:
---
Component/s: viewfs

> HDFS: Support viewfs nested mount
> -
>
> Key: HDFS-16011
> URL: https://issues.apache.org/jira/browse/HDFS-16011
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: hdfs, viewfs
>Affects Versions: 3.2.2, 3.3.1
>Reporter: zhu
>Assignee: zhu
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> The current viewfs does not support nested mount points.
> *E.g.:*
>     1. /home/ => /home/
>     2. /home/work => /home/work
> If mount point 1 is loaded, mount point 2 cannot be added, and the following 
> exception will be thrown when loading 2.
> {code:java}
> throw new FileAlreadyExistsException("Path " + nextInode.fullPath +
>  " already exists as link");
> {code}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-16001) TestOfflineEditsViewer.testStored() fails reading negative value of FSEditLogOpCodes

2021-05-07 Thread Xiaoqiao He (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-16001?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xiaoqiao He updated HDFS-16001:
---
Fix Version/s: 3.3.1
 Hadoop Flags: Reviewed
   Resolution: Fixed
   Status: Resolved  (was: Patch Available)

Committed to branch-3.3. Thanks [~aajisaka] for your work! Thanks [~shv] for 
your review!

> TestOfflineEditsViewer.testStored() fails reading negative value of 
> FSEditLogOpCodes
> 
>
> Key: HDFS-16001
> URL: https://issues.apache.org/jira/browse/HDFS-16001
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: hdfs
>Reporter: Konstantin Shvachko
>Assignee: Akira Ajisaka
>Priority: Blocker
>  Labels: pull-request-available
> Fix For: 3.3.1, 3.4.0
>
>  Time Spent: 50m
>  Remaining Estimate: 0h
>
> {{TestOfflineEditsViewer.testStored()}} fails consistently with an exception
> {noformat}
> java.io.IOException: Op -54 has size -1314247195, but the minimum op size is 
> 17
> {noformat}
> Seems like there is a corrupt record in {{editsStored}} file.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Work logged] (HDFS-16001) TestOfflineEditsViewer.testStored() fails reading negative value of FSEditLogOpCodes

2021-05-07 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-16001?focusedWorklogId=593621&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-593621
 ]

ASF GitHub Bot logged work on HDFS-16001:
-

Author: ASF GitHub Bot
Created on: 08/May/21 02:49
Start Date: 08/May/21 02:49
Worklog Time Spent: 10m 
  Work Description: Hexiaoqiao merged pull request #2989:
URL: https://github.com/apache/hadoop/pull/2989


   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 593621)
Time Spent: 50m  (was: 40m)

> TestOfflineEditsViewer.testStored() fails reading negative value of 
> FSEditLogOpCodes
> 
>
> Key: HDFS-16001
> URL: https://issues.apache.org/jira/browse/HDFS-16001
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: hdfs
>Reporter: Konstantin Shvachko
>Assignee: Akira Ajisaka
>Priority: Blocker
>  Labels: pull-request-available
> Fix For: 3.4.0
>
>  Time Spent: 50m
>  Remaining Estimate: 0h
>
> {{TestOfflineEditsViewer.testStored()}} fails consistently with an exception
> {noformat}
> java.io.IOException: Op -54 has size -1314247195, but the minimum op size is 
> 17
> {noformat}
> Seems like there is a corrupt record in {{editsStored}} file.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Work logged] (HDFS-16008) RBF: Tool to initialize ViewFS Mapping to Router

2021-05-07 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-16008?focusedWorklogId=593614&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-593614
 ]

ASF GitHub Bot logged work on HDFS-16008:
-

Author: ASF GitHub Bot
Created on: 08/May/21 02:01
Start Date: 08/May/21 02:01
Worklog Time Spent: 10m 
  Work Description: zhuxiangyi commented on a change in pull request #2981:
URL: https://github.com/apache/hadoop/pull/2981#discussion_r628667273



##
File path: 
hadoop-hdfs-project/hadoop-hdfs-rbf/src/site/markdown/HDFSRouterFederation.md
##
@@ -241,6 +241,22 @@ Mount table permission can be set by following command:
 
 The option mode is UNIX-style permissions for the mount table. Permissions are 
specified in octal, e.g. 0755. By default, this is set to 0755.
 
+ Init ViewFs To Router
+Router supports initializing the ViewFS mount point to the Router. The mapping 
directory protocol of ViewFS must be HDFS, and the initializer only supports 
one-to-one mapping.
+
+For example, use the following viewfs to configure the initial mount table to 
the router.
+
+<configuration>
+  <property>
+    <name>fs.viewfs.mounttable.ClusterX.link./data</name>
+    <value>hdfs://nn1-clusterx.example.com:8020/data</value>
+  </property>
+</configuration>
+
+The ViewFS mount table can be initialized to the Router by using the following 
command:
+
+[hdfs]$ $HADOOP_HOME/bin/hdfs dfsrouteradmin -initViewFsToMountTable 
ClusterX

Review comment:
   Do you mean that when there are mount points for multiple clusters, the 
user can import all of them without specifying the cluster name? For example, 
the following configuration:

   <configuration>
     <property>
       <name>fs.viewfs.mounttable.ClusterX.link./data</name>
       <value>hdfs://nn1-clusterx.example.com:8020/data</value>
     </property>
     <property>
       <name>fs.viewfs.mounttable.ClusterY.link./project</name>
       <value>hdfs://nn2-clusterx.example.com:8020/project</value>
     </property>
   </configuration>




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 593614)
Time Spent: 1h 40m  (was: 1.5h)

> RBF: Tool to initialize ViewFS Mapping to Router
> 
>
> Key: HDFS-16008
> URL: https://issues.apache.org/jira/browse/HDFS-16008
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: rbf
>Affects Versions: 3.3.1
>Reporter: zhu
>Assignee: zhu
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 1h 40m
>  Remaining Estimate: 0h
>
> This is a tool for initializing ViewFS Mapping to Router.
> Some companies are currently migrating from viewfs to router, I think they 
> need this tool.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Work logged] (HDFS-16008) RBF: Tool to initialize ViewFS Mapping to Router

2021-05-07 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-16008?focusedWorklogId=593612&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-593612
 ]

ASF GitHub Bot logged work on HDFS-16008:
-

Author: ASF GitHub Bot
Created on: 08/May/21 01:58
Start Date: 08/May/21 01:58
Worklog Time Spent: 10m 
  Work Description: zhuxiangyi commented on a change in pull request #2981:
URL: https://github.com/apache/hadoop/pull/2981#discussion_r628667273



##
File path: 
hadoop-hdfs-project/hadoop-hdfs-rbf/src/site/markdown/HDFSRouterFederation.md
##
@@ -241,6 +241,22 @@ Mount table permission can be set by following command:
 
 The option mode is UNIX-style permissions for the mount table. Permissions are 
specified in octal, e.g. 0755. By default, this is set to 0755.
 
+ Init ViewFs To Router
+Router supports initializing the ViewFS mount point to the Router. The mapping 
directory protocol of ViewFS must be HDFS, and the initializer only supports 
one-to-one mapping.
+
+For example, use the following viewfs to configure the initial mount table to 
the router.
+
+
+  
+fs.viewfs.mounttable.ClusterX.link./data
+hdfs://nn1-clusterx.example.com:8020/data
+  
+
+
+The ViewFS mount table can be initialized to the Router by using the following 
command:
+
+[hdfs]$ $HADOOP_HOME/bin/hdfs dfsrouteradmin -initViewFsToMountTable 
ClusterX

Review comment:
   Do you mean that when there are mount points for multiple clusters, the 
user can import all of them without specifying the cluster name? For example, 
the following configuration:

   <configuration>
     <property>
       <name>fs.viewfs.mounttable.ClusterX.link./data</name>
       <value>hdfs://nn1-clusterx.example.com:8020/data</value>
     </property>
     <property>
       <name>fs.viewfs.mounttable.ClusterY.link./project</name>
       <value>hdfs://nn1-clusterx.example.com:8020/project</value>
     </property>
   </configuration>




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 593612)
Time Spent: 1h 20m  (was: 1h 10m)

> RBF: Tool to initialize ViewFS Mapping to Router
> 
>
> Key: HDFS-16008
> URL: https://issues.apache.org/jira/browse/HDFS-16008
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: rbf
>Affects Versions: 3.3.1
>Reporter: zhu
>Assignee: zhu
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 1h 20m
>  Remaining Estimate: 0h
>
> This is a tool for initializing ViewFS Mapping to Router.
> Some companies are currently migrating from viewfs to router, I think they 
> need this tool.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Work logged] (HDFS-16008) RBF: Tool to initialize ViewFS Mapping to Router

2021-05-07 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-16008?focusedWorklogId=593613&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-593613
 ]

ASF GitHub Bot logged work on HDFS-16008:
-

Author: ASF GitHub Bot
Created on: 08/May/21 01:58
Start Date: 08/May/21 01:58
Worklog Time Spent: 10m 
  Work Description: zhuxiangyi commented on a change in pull request #2981:
URL: https://github.com/apache/hadoop/pull/2981#discussion_r628667273



##
File path: 
hadoop-hdfs-project/hadoop-hdfs-rbf/src/site/markdown/HDFSRouterFederation.md
##
@@ -241,6 +241,22 @@ Mount table permission can be set by following command:
 
 The option mode is UNIX-style permissions for the mount table. Permissions are 
specified in octal, e.g. 0755. By default, this is set to 0755.
 
+ Init ViewFs To Router
+Router supports initializing the ViewFS mount point to the Router. The mapping 
directory protocol of ViewFS must be HDFS, and the initializer only supports 
one-to-one mapping.
+
+For example, use the following viewfs to configure the initial mount table to 
the router.
+
+
+  
+fs.viewfs.mounttable.ClusterX.link./data
+hdfs://nn1-clusterx.example.com:8020/data
+  
+
+
+The ViewFS mount table can be initialized to the Router by using the following 
command:
+
+[hdfs]$ $HADOOP_HOME/bin/hdfs dfsrouteradmin -initViewFsToMountTable 
ClusterX

Review comment:
   Do you mean that when there are mount points for multiple clusters, the 
user can import all of them without specifying the cluster name? For example, 
the following configuration:

   <configuration>
     <property>
       <name>fs.viewfs.mounttable.ClusterX.link./data</name>
       <value>hdfs://nn1-clusterx.example.com:8020/data</value>
     </property>
     <property>
       <name>fs.viewfs.mounttable.ClusterY.link./project</name>
       <value>hdfs://nn1-clusterx.example.com:8020/project</value>
     </property>
   </configuration>




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 593613)
Time Spent: 1.5h  (was: 1h 20m)

> RBF: Tool to initialize ViewFS Mapping to Router
> 
>
> Key: HDFS-16008
> URL: https://issues.apache.org/jira/browse/HDFS-16008
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: rbf
>Affects Versions: 3.3.1
>Reporter: zhu
>Assignee: zhu
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 1.5h
>  Remaining Estimate: 0h
>
> This is a tool for initializing ViewFS Mapping to Router.
> Some companies are currently migrating from viewfs to router, I think they 
> need this tool.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Comment Edited] (HDFS-14703) NameNode Fine-Grained Locking via Metadata Partitioning

2021-05-07 Thread Konstantin Shvachko (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-14703?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17341089#comment-17341089
 ] 

Konstantin Shvachko edited comment on HDFS-14703 at 5/8/21, 1:05 AM:
-

Updated the POC patches to current trunk. There were indeed some missing parts 
in the first patch.
 See [^003-partitioned-inodeMap-POC.tar.gz].


was (Author: shv):
Updated the POC patches. There were indeed some missing parts in the first 
patch.
See 
[003-partitioned-inodeMap-POC.tar.gz|https://issues.apache.org/jira/secure/attachment/13025177/003-partitioned-inodeMap-POC.tar.gz].

> NameNode Fine-Grained Locking via Metadata Partitioning
> ---
>
> Key: HDFS-14703
> URL: https://issues.apache.org/jira/browse/HDFS-14703
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: hdfs, namenode
>Reporter: Konstantin Shvachko
>Priority: Major
> Attachments: 001-partitioned-inodeMap-POC.tar.gz, 
> 002-partitioned-inodeMap-POC.tar.gz, 003-partitioned-inodeMap-POC.tar.gz, 
> NameNode Fine-Grained Locking.pdf, NameNode Fine-Grained Locking.pdf
>
>
> We target to enable fine-grained locking by splitting the in-memory namespace 
> into multiple partitions each having a separate lock. Intended to improve 
> performance of NameNode write operations.
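
To make the partitioning idea concrete, here is a generic lock-striping sketch
(the textbook technique only, not code from the attached POC patches; the class
and method names are invented for illustration):

{code:java}
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.locks.ReentrantReadWriteLock;

// Generic lock striping: split one big map into N partitions, each with
// its own read-write lock, so operations on different partitions no
// longer contend on a single global namespace lock.
class PartitionedInodeMap<V> {
  private final int partitions;
  private final Map<Long, V>[] maps;
  private final ReentrantReadWriteLock[] locks;

  @SuppressWarnings("unchecked")
  PartitionedInodeMap(int partitions) {
    this.partitions = partitions;
    this.maps = new Map[partitions];
    this.locks = new ReentrantReadWriteLock[partitions];
    for (int i = 0; i < partitions; i++) {
      maps[i] = new HashMap<>();
      locks[i] = new ReentrantReadWriteLock();
    }
  }

  private int partitionOf(long inodeId) {
    return (Long.hashCode(inodeId) & 0x7fffffff) % partitions;
  }

  V get(long inodeId) {
    int p = partitionOf(inodeId);
    locks[p].readLock().lock();
    try {
      return maps[p].get(inodeId);
    } finally {
      locks[p].readLock().unlock();
    }
  }

  void put(long inodeId, V inode) {
    int p = partitionOf(inodeId);
    locks[p].writeLock().lock();
    try {
      maps[p].put(inodeId, inode);
    } finally {
      locks[p].writeLock().unlock();
    }
  }
}
{code}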



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Comment Edited] (HDFS-14703) NameNode Fine-Grained Locking via Metadata Partitioning

2021-05-07 Thread Konstantin Shvachko (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-14703?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17341089#comment-17341089
 ] 

Konstantin Shvachko edited comment on HDFS-14703 at 5/8/21, 1:04 AM:
-

Updated the POC patches. There were indeed some missing parts in the first 
patch.
See 
[003-partitioned-inodeMap-POC.tar.gz|https://issues.apache.org/jira/secure/attachment/13025177/003-partitioned-inodeMap-POC.tar.gz].


was (Author: shv):
Updated the POC patches. There were indeed some missing parts in the first 
patch. See 
[https://issues.apache.org/jira/secure/attachment/13025177/003-partitioned-inodeMap-POC.tar.gz|https://issues.apache.org/jira/secure/attachment/13025177/003-partitioned-inodeMap-POC.tar.gz].

> NameNode Fine-Grained Locking via Metadata Partitioning
> ---
>
> Key: HDFS-14703
> URL: https://issues.apache.org/jira/browse/HDFS-14703
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: hdfs, namenode
>Reporter: Konstantin Shvachko
>Priority: Major
> Attachments: 001-partitioned-inodeMap-POC.tar.gz, 
> 002-partitioned-inodeMap-POC.tar.gz, 003-partitioned-inodeMap-POC.tar.gz, 
> NameNode Fine-Grained Locking.pdf, NameNode Fine-Grained Locking.pdf
>
>
> We target to enable fine-grained locking by splitting the in-memory namespace 
> into multiple partitions each having a separate lock. Intended to improve 
> performance of NameNode write operations.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-14703) NameNode Fine-Grained Locking via Metadata Partitioning

2021-05-07 Thread Konstantin Shvachko (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-14703?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17341089#comment-17341089
 ] 

Konstantin Shvachko commented on HDFS-14703:


Updated the POC patches. There were indeed some missing parts in the first 
patch. See 
[https://issues.apache.org/jira/secure/attachment/13025177/003-partitioned-inodeMap-POC.tar.gz|https://issues.apache.org/jira/secure/attachment/13025177/003-partitioned-inodeMap-POC.tar.gz].

> NameNode Fine-Grained Locking via Metadata Partitioning
> ---
>
> Key: HDFS-14703
> URL: https://issues.apache.org/jira/browse/HDFS-14703
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: hdfs, namenode
>Reporter: Konstantin Shvachko
>Priority: Major
> Attachments: 001-partitioned-inodeMap-POC.tar.gz, 
> 002-partitioned-inodeMap-POC.tar.gz, 003-partitioned-inodeMap-POC.tar.gz, 
> NameNode Fine-Grained Locking.pdf, NameNode Fine-Grained Locking.pdf
>
>
> We target to enable fine-grained locking by splitting the in-memory namespace 
> into multiple partitions each having a separate lock. Intended to improve 
> performance of NameNode write operations.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-14703) NameNode Fine-Grained Locking via Metadata Partitioning

2021-05-07 Thread Konstantin Shvachko (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-14703?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Konstantin Shvachko updated HDFS-14703:
---
Attachment: 003-partitioned-inodeMap-POC.tar.gz

> NameNode Fine-Grained Locking via Metadata Partitioning
> ---
>
> Key: HDFS-14703
> URL: https://issues.apache.org/jira/browse/HDFS-14703
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: hdfs, namenode
>Reporter: Konstantin Shvachko
>Priority: Major
> Attachments: 001-partitioned-inodeMap-POC.tar.gz, 
> 002-partitioned-inodeMap-POC.tar.gz, 003-partitioned-inodeMap-POC.tar.gz, 
> NameNode Fine-Grained Locking.pdf, NameNode Fine-Grained Locking.pdf
>
>
> We target to enable fine-grained locking by splitting the in-memory namespace 
> into multiple partitions each having a separate lock. Intended to improve 
> performance of NameNode write operations.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-16001) TestOfflineEditsViewer.testStored() fails reading negative value of FSEditLogOpCodes

2021-05-07 Thread Konstantin Shvachko (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-16001?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17341046#comment-17341046
 ] 

Konstantin Shvachko commented on HDFS-16001:


Checked the test. This fixes it.
+1 thanks [~aajisaka]

> TestOfflineEditsViewer.testStored() fails reading negative value of 
> FSEditLogOpCodes
> 
>
> Key: HDFS-16001
> URL: https://issues.apache.org/jira/browse/HDFS-16001
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: hdfs
>Reporter: Konstantin Shvachko
>Assignee: Akira Ajisaka
>Priority: Blocker
>  Labels: pull-request-available
> Fix For: 3.4.0
>
>  Time Spent: 40m
>  Remaining Estimate: 0h
>
> {{TestOfflineEditsViewer.testStored()}} fails consistently with an exception
> {noformat}
> java.io.IOException: Op -54 has size -1314247195, but the minimum op size is 
> 17
> {noformat}
> Seems like there is a corrupt record in {{editsStored}} file.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Work logged] (HDFS-16012) Improve DataXceiver#copyBlock() log printing

2021-05-07 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-16012?focusedWorklogId=593537&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-593537
 ]

ASF GitHub Bot logged work on HDFS-16012:
-

Author: ASF GitHub Bot
Created on: 07/May/21 21:13
Start Date: 07/May/21 21:13
Worklog Time Spent: 10m 
  Work Description: hadoop-yetus commented on pull request #2988:
URL: https://github.com/apache/hadoop/pull/2988#issuecomment-834784136


   :broken_heart: **-1 overall**
   
   
   
   
   
   
   | Vote | Subsystem | Runtime |  Logfile | Comment |
   |:----:|----------:|:--------|:--------:|:-------:|
   | +0 :ok: |  reexec  |   0m 55s |  |  Docker mode activated.  |
   |||| _ Prechecks _ |
   | +1 :green_heart: |  dupname  |   0m  0s |  |  No case conflicting files 
found.  |
   | +0 :ok: |  codespell  |   0m  1s |  |  codespell was not available.  |
   | +1 :green_heart: |  @author  |   0m  0s |  |  The patch does not contain 
any @author tags.  |
   | -1 :x: |  test4tests  |   0m  0s |  |  The patch doesn't appear to include 
any new or modified tests. Please justify why no new tests are needed for this 
patch. Also please list what manual steps were performed to verify this patch.  
|
   |||| _ trunk Compile Tests _ |
   | +1 :green_heart: |  mvninstall  |  35m 10s |  |  trunk passed  |
   | +1 :green_heart: |  compile  |   1m 23s |  |  trunk passed with JDK 
Ubuntu-11.0.10+9-Ubuntu-0ubuntu1.20.04  |
   | +1 :green_heart: |  compile  |   1m 16s |  |  trunk passed with JDK 
Private Build-1.8.0_282-8u282-b08-0ubuntu1~20.04-b08  |
   | +1 :green_heart: |  checkstyle  |   1m  4s |  |  trunk passed  |
   | +1 :green_heart: |  mvnsite  |   1m 27s |  |  trunk passed  |
   | +1 :green_heart: |  javadoc  |   0m 57s |  |  trunk passed with JDK 
Ubuntu-11.0.10+9-Ubuntu-0ubuntu1.20.04  |
   | +1 :green_heart: |  javadoc  |   1m 32s |  |  trunk passed with JDK 
Private Build-1.8.0_282-8u282-b08-0ubuntu1~20.04-b08  |
   | +1 :green_heart: |  spotbugs  |   3m 54s |  |  trunk passed  |
   | +1 :green_heart: |  shadedclient  |  17m 34s |  |  branch has no errors 
when building and testing our client artifacts.  |
   |||| _ Patch Compile Tests _ |
   | +1 :green_heart: |  mvninstall  |   1m 16s |  |  the patch passed  |
   | +1 :green_heart: |  compile  |   1m 19s |  |  the patch passed with JDK 
Ubuntu-11.0.10+9-Ubuntu-0ubuntu1.20.04  |
   | +1 :green_heart: |  javac  |   1m 19s |  |  the patch passed  |
   | +1 :green_heart: |  compile  |   1m 10s |  |  the patch passed with JDK 
Private Build-1.8.0_282-8u282-b08-0ubuntu1~20.04-b08  |
   | +1 :green_heart: |  javac  |   1m 10s |  |  the patch passed  |
   | +1 :green_heart: |  blanks  |   0m  0s |  |  The patch has no blanks 
issues.  |
   | +1 :green_heart: |  checkstyle  |   0m 51s |  |  the patch passed  |
   | +1 :green_heart: |  mvnsite  |   1m 13s |  |  the patch passed  |
   | +1 :green_heart: |  javadoc  |   0m 48s |  |  the patch passed with JDK 
Ubuntu-11.0.10+9-Ubuntu-0ubuntu1.20.04  |
   | +1 :green_heart: |  javadoc  |   1m 20s |  |  the patch passed with JDK 
Private Build-1.8.0_282-8u282-b08-0ubuntu1~20.04-b08  |
   | +1 :green_heart: |  spotbugs  |   3m 11s |  |  the patch passed  |
   | +1 :green_heart: |  shadedclient  |  16m 27s |  |  patch has no errors 
when building and testing our client artifacts.  |
   |||| _ Other Tests _ |
   | -1 :x: |  unit  | 421m  3s | 
[/patch-unit-hadoop-hdfs-project_hadoop-hdfs.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-2988/1/artifact/out/patch-unit-hadoop-hdfs-project_hadoop-hdfs.txt)
 |  hadoop-hdfs in the patch passed.  |
   | +1 :green_heart: |  asflicense  |   0m 44s |  |  The patch does not 
generate ASF License warnings.  |
   |  |   | 511m 39s |  |  |
   
   
   | Reason | Tests |
   |---:|:--|
   | Failed junit tests | hadoop.hdfs.server.namenode.TestDecommissioningStatus 
|
   |   | hadoop.hdfs.TestViewDistributedFileSystemContract |
   |   | hadoop.hdfs.TestSnapshotCommands |
   |   | hadoop.hdfs.TestPersistBlocks |
   |   | hadoop.hdfs.TestDFSShell |
   |   | hadoop.hdfs.server.datanode.fsdataset.impl.TestFsVolumeList |
   |   | hadoop.hdfs.tools.offlineEditsViewer.TestOfflineEditsViewer |
   |   | hadoop.hdfs.TestMaintenanceState |
   |   | hadoop.hdfs.TestStateAlignmentContextWithHA |
   |   | hadoop.hdfs.server.namenode.ha.TestSeveralNameNodes |
   |   | hadoop.fs.viewfs.TestViewFSOverloadSchemeWithMountTableConfigInHDFS |
   |   | 
hadoop.hdfs.server.namenode.TestDecommissioningStatusWithBackoffMonitor |
   |   | hadoop.hdfs.server.blockmanagement.TestBlockTokenWithDFSStriped |
   |   | hadoop.hdfs.server.namenode.ha.TestEditLogTailer |
   |   | hadoop.hdfs.server.datanode.TestBlockScanner |
   |   | hadoop.hdfs.server.datanode.TestIncrementalBrVariations |
   |   | hadoop.hdfs.server.diskbalancer.command.TestDiskBalancerCommand |
   |   | 

[jira] [Work logged] (HDFS-16001) TestOfflineEditsViewer.testStored() fails reading negative value of FSEditLogOpCodes

2021-05-07 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-16001?focusedWorklogId=593447&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-593447
 ]

ASF GitHub Bot logged work on HDFS-16001:
-

Author: ASF GitHub Bot
Created on: 07/May/21 18:03
Start Date: 07/May/21 18:03
Worklog Time Spent: 10m 
  Work Description: hadoop-yetus commented on pull request #2989:
URL: https://github.com/apache/hadoop/pull/2989#issuecomment-834660569


   :confetti_ball: **+1 overall**
   
   
   
   
   
   
   | Vote | Subsystem | Runtime |  Logfile | Comment |
   |:----:|----------:|:--------|:--------:|:-------:|
   | +0 :ok: |  reexec  |  21m  5s |  |  Docker mode activated.  |
   |||| _ Prechecks _ |
   | +1 :green_heart: |  dupname  |   0m  0s |  |  No case conflicting files 
found.  |
   | +0 :ok: |  codespell  |   0m  0s |  |  codespell was not available.  |
   | +0 :ok: |  shelldocs  |   0m  0s |  |  Shelldocs was not available.  |
   | +1 :green_heart: |  @author  |   0m  0s |  |  The patch does not contain 
any @author tags.  |
   | +1 :green_heart: |  test4tests  |   0m  0s |  |  The patch appears to 
include 2 new or modified test files.  |
   |||| _ branch-3.3 Compile Tests _ |
   | +1 :green_heart: |  mvninstall  |  30m 21s |  |  branch-3.3 passed  |
   | +1 :green_heart: |  shadedclient  |  15m 21s |  |  branch has no errors 
when building and testing our client artifacts.  |
   |||| _ Patch Compile Tests _ |
   | +1 :green_heart: |  mvninstall  |   1m 13s |  |  the patch passed  |
   | +1 :green_heart: |  blanks  |   0m  0s |  |  The patch has no blanks 
issues.  |
   | +1 :green_heart: |  shellcheck  |   0m  0s |  |  No new issues.  |
   | +1 :green_heart: |  xml  |   0m  1s |  |  The patch has no ill-formed XML 
file.  |
   | +1 :green_heart: |  shadedclient  |  15m  8s |  |  patch has no errors 
when building and testing our client artifacts.  |
   |||| _ Other Tests _ |
   | +1 :green_heart: |  unit  |   1m 14s |  |  hadoop-hdfs in the patch 
passed.  |
   | +1 :green_heart: |  asflicense  |   0m 38s |  |  The patch does not 
generate ASF License warnings.  |
   |  |   |  88m  4s |  |  |
   
   
   | Subsystem | Report/Notes |
   |--:|:-|
   | Docker | ClientAPI=1.41 ServerAPI=1.41 base: 
https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-2989/1/artifact/out/Dockerfile
 |
   | GITHUB PR | https://github.com/apache/hadoop/pull/2989 |
   | Optional Tests | dupname asflicense unit codespell shellcheck shelldocs 
xml |
   | uname | Linux d37ad0da4914 4.15.0-112-generic #113-Ubuntu SMP Thu Jul 9 
23:41:39 UTC 2020 x86_64 x86_64 x86_64 GNU/Linux |
   | Build tool | maven |
   | Personality | dev-support/bin/hadoop.sh |
   | git revision | branch-3.3 / 8046991fb77b53f555d63c2f51f9d1f8037dad9b |
   |  Test Results | 
https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-2989/1/testReport/ |
   | Max. process+thread count | 631 (vs. ulimit of 5500) |
   | modules | C: hadoop-hdfs-project/hadoop-hdfs U: 
hadoop-hdfs-project/hadoop-hdfs |
   | Console output | 
https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-2989/1/console |
   | versions | git=2.17.1 maven=3.6.0 shellcheck=0.4.6 |
   | Powered by | Apache Yetus 0.14.0-SNAPSHOT https://yetus.apache.org |
   
   
   This message was automatically generated.
   
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 593447)
Time Spent: 40m  (was: 0.5h)

> TestOfflineEditsViewer.testStored() fails reading negative value of 
> FSEditLogOpCodes
> 
>
> Key: HDFS-16001
> URL: https://issues.apache.org/jira/browse/HDFS-16001
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: hdfs
>Reporter: Konstantin Shvachko
>Assignee: Akira Ajisaka
>Priority: Blocker
>  Labels: pull-request-available
> Fix For: 3.4.0
>
>  Time Spent: 40m
>  Remaining Estimate: 0h
>
> {{TestOfflineEditsViewer.testStored()}} fails consistently with an exception
> {noformat}
> java.io.IOException: Op -54 has size -1314247195, but the minimum op size is 
> 17
> {noformat}
> Seems like there is a corrupt record in {{editsStored}} file.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Resolved] (HDFS-15878) RBF: Flaky test TestRouterWebHDFSContractCreate>AbstractContractCreateTest#testSyncable in Trunk

2021-05-07 Thread Fengnan Li (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-15878?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Fengnan Li resolved HDFS-15878.
---
Resolution: Not A Problem

> RBF: Flaky test 
> TestRouterWebHDFSContractCreate>AbstractContractCreateTest#testSyncable in 
> Trunk
> 
>
> Key: HDFS-15878
> URL: https://issues.apache.org/jira/browse/HDFS-15878
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: hdfs, rbf
>Reporter: Renukaprasad C
>Assignee: Fengnan Li
>Priority: Major
>
> [ERROR] Tests run: 16, Failures: 0, Errors: 1, Skipped: 2, Time elapsed: 
> 24.627 s <<< FAILURE! - in 
> org.apache.hadoop.fs.contract.router.web.TestRouterWebHDFSContractCreate
> [ERROR] 
> testSyncable(org.apache.hadoop.fs.contract.router.web.TestRouterWebHDFSContractCreate)
>   Time elapsed: 0.222 s  <<< ERROR!
> java.io.FileNotFoundException: File /test/testSyncable not found.
>   at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
>   at 
> sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
>   at 
> sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
>   at java.lang.reflect.Constructor.newInstance(Constructor.java:423)
>   at 
> org.apache.hadoop.ipc.RemoteException.instantiateException(RemoteException.java:121)
>   at 
> org.apache.hadoop.ipc.RemoteException.unwrapRemoteException(RemoteException.java:110)
>   at 
> org.apache.hadoop.hdfs.web.WebHdfsFileSystem.toIOException(WebHdfsFileSystem.java:576)
>   at 
> org.apache.hadoop.hdfs.web.WebHdfsFileSystem.access$900(WebHdfsFileSystem.java:146)
>   at 
> org.apache.hadoop.hdfs.web.WebHdfsFileSystem$AbstractRunner.shouldRetry(WebHdfsFileSystem.java:892)
>   at 
> org.apache.hadoop.hdfs.web.WebHdfsFileSystem$AbstractRunner.runWithRetry(WebHdfsFileSystem.java:858)
>   at 
> org.apache.hadoop.hdfs.web.WebHdfsFileSystem$AbstractRunner.access$100(WebHdfsFileSystem.java:652)
>   at 
> org.apache.hadoop.hdfs.web.WebHdfsFileSystem$AbstractRunner$1.run(WebHdfsFileSystem.java:690)
>   at java.security.AccessController.doPrivileged(Native Method)
>   at javax.security.auth.Subject.doAs(Subject.java:422)
>   at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1899)
>   at 
> org.apache.hadoop.hdfs.web.WebHdfsFileSystem$AbstractRunner.run(WebHdfsFileSystem.java:686)
>   at 
> org.apache.hadoop.hdfs.web.WebHdfsFileSystem$ReadRunner.getRedirectedUrl(WebHdfsFileSystem.java:2307)
>   at 
> org.apache.hadoop.hdfs.web.WebHdfsFileSystem$ReadRunner.<init>(WebHdfsFileSystem.java:2296)
>   at 
> org.apache.hadoop.hdfs.web.WebHdfsFileSystem$WebHdfsInputStream.<init>(WebHdfsFileSystem.java:2176)
>   at 
> org.apache.hadoop.hdfs.web.WebHdfsFileSystem.open(WebHdfsFileSystem.java:1610)
>   at org.apache.hadoop.fs.FileSystem.open(FileSystem.java:975)
>   at 
> org.apache.hadoop.fs.contract.AbstractContractCreateTest.validateSyncableSemantics(AbstractContractCreateTest.java:556)
>   at 
> org.apache.hadoop.fs.contract.AbstractContractCreateTest.testSyncable(AbstractContractCreateTest.java:459)
>   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>   at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>   at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>   at java.lang.reflect.Method.invoke(Method.java:498)
>   at 
> org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:50)
>   at 
> org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
>   at 
> org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:47)
>   at 
> org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
>   at 
> org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:26)
>   at 
> org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:27)
>   at org.junit.rules.TestWatcher$1.evaluate(TestWatcher.java:55)
>   at 
> org.junit.internal.runners.statements.FailOnTimeout$CallableStatement.call(FailOnTimeout.java:298)
>   at 
> org.junit.internal.runners.statements.FailOnTimeout$CallableStatement.call(FailOnTimeout.java:292)
>   at java.util.concurrent.FutureTask.run(FutureTask.java:266)
>   at java.lang.Thread.run(Thread.java:748)
> Caused by: 
> org.apache.hadoop.ipc.RemoteException(java.io.FileNotFoundException): File 
> /test/testSyncable not found.
>   at 
> org.apache.hadoop.hdfs.web.JsonUtilClient.toRemoteException(JsonUtilClient.java:90)
>   at 
> 

[jira] [Reopened] (HDFS-15878) RBF: Flaky test TestRouterWebHDFSContractCreate>AbstractContractCreateTest#testSyncable in Trunk

2021-05-07 Thread Fengnan Li (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-15878?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Fengnan Li reopened HDFS-15878:
---

Reopening to change the closing status.

> RBF: Flaky test 
> TestRouterWebHDFSContractCreate>AbstractContractCreateTest#testSyncable in 
> Trunk
> 
>
> Key: HDFS-15878
> URL: https://issues.apache.org/jira/browse/HDFS-15878
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: hdfs, rbf
>Reporter: Renukaprasad C
>Assignee: Fengnan Li
>Priority: Major
>
> [ERROR] Tests run: 16, Failures: 0, Errors: 1, Skipped: 2, Time elapsed: 
> 24.627 s <<< FAILURE! - in 
> org.apache.hadoop.fs.contract.router.web.TestRouterWebHDFSContractCreate
> [ERROR] 
> testSyncable(org.apache.hadoop.fs.contract.router.web.TestRouterWebHDFSContractCreate)
>   Time elapsed: 0.222 s  <<< ERROR!
> java.io.FileNotFoundException: File /test/testSyncable not found.
>   at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
>   at 
> sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
>   at 
> sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
>   at java.lang.reflect.Constructor.newInstance(Constructor.java:423)
>   at 
> org.apache.hadoop.ipc.RemoteException.instantiateException(RemoteException.java:121)
>   at 
> org.apache.hadoop.ipc.RemoteException.unwrapRemoteException(RemoteException.java:110)
>   at 
> org.apache.hadoop.hdfs.web.WebHdfsFileSystem.toIOException(WebHdfsFileSystem.java:576)
>   at 
> org.apache.hadoop.hdfs.web.WebHdfsFileSystem.access$900(WebHdfsFileSystem.java:146)
>   at 
> org.apache.hadoop.hdfs.web.WebHdfsFileSystem$AbstractRunner.shouldRetry(WebHdfsFileSystem.java:892)
>   at 
> org.apache.hadoop.hdfs.web.WebHdfsFileSystem$AbstractRunner.runWithRetry(WebHdfsFileSystem.java:858)
>   at 
> org.apache.hadoop.hdfs.web.WebHdfsFileSystem$AbstractRunner.access$100(WebHdfsFileSystem.java:652)
>   at 
> org.apache.hadoop.hdfs.web.WebHdfsFileSystem$AbstractRunner$1.run(WebHdfsFileSystem.java:690)
>   at java.security.AccessController.doPrivileged(Native Method)
>   at javax.security.auth.Subject.doAs(Subject.java:422)
>   at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1899)
>   at 
> org.apache.hadoop.hdfs.web.WebHdfsFileSystem$AbstractRunner.run(WebHdfsFileSystem.java:686)
>   at 
> org.apache.hadoop.hdfs.web.WebHdfsFileSystem$ReadRunner.getRedirectedUrl(WebHdfsFileSystem.java:2307)
>   at 
> org.apache.hadoop.hdfs.web.WebHdfsFileSystem$ReadRunner.<init>(WebHdfsFileSystem.java:2296)
>   at 
> org.apache.hadoop.hdfs.web.WebHdfsFileSystem$WebHdfsInputStream.<init>(WebHdfsFileSystem.java:2176)
>   at 
> org.apache.hadoop.hdfs.web.WebHdfsFileSystem.open(WebHdfsFileSystem.java:1610)
>   at org.apache.hadoop.fs.FileSystem.open(FileSystem.java:975)
>   at 
> org.apache.hadoop.fs.contract.AbstractContractCreateTest.validateSyncableSemantics(AbstractContractCreateTest.java:556)
>   at 
> org.apache.hadoop.fs.contract.AbstractContractCreateTest.testSyncable(AbstractContractCreateTest.java:459)
>   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>   at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>   at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>   at java.lang.reflect.Method.invoke(Method.java:498)
>   at 
> org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:50)
>   at 
> org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
>   at 
> org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:47)
>   at 
> org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
>   at 
> org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:26)
>   at 
> org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:27)
>   at org.junit.rules.TestWatcher$1.evaluate(TestWatcher.java:55)
>   at 
> org.junit.internal.runners.statements.FailOnTimeout$CallableStatement.call(FailOnTimeout.java:298)
>   at 
> org.junit.internal.runners.statements.FailOnTimeout$CallableStatement.call(FailOnTimeout.java:292)
>   at java.util.concurrent.FutureTask.run(FutureTask.java:266)
>   at java.lang.Thread.run(Thread.java:748)
> Caused by: 
> org.apache.hadoop.ipc.RemoteException(java.io.FileNotFoundException): File 
> /test/testSyncable not found.
>   at 
> org.apache.hadoop.hdfs.web.JsonUtilClient.toRemoteException(JsonUtilClient.java:90)
>   at 
> 

[jira] [Commented] (HDFS-15878) RBF: Flaky test TestRouterWebHDFSContractCreate>AbstractContractCreateTest#testSyncable in Trunk

2021-05-07 Thread Fengnan Li (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-15878?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17340967#comment-17340967
 ] 

Fengnan Li commented on HDFS-15878:
---

[~hexiaoqiao] [~ayushtkn]  Updated the status as suggested. Thanks.

> RBF: Flaky test 
> TestRouterWebHDFSContractCreate>AbstractContractCreateTest#testSyncable in 
> Trunk
> 
>
> Key: HDFS-15878
> URL: https://issues.apache.org/jira/browse/HDFS-15878
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: hdfs, rbf
>Reporter: Renukaprasad C
>Assignee: Fengnan Li
>Priority: Major
>
> [ERROR] Tests run: 16, Failures: 0, Errors: 1, Skipped: 2, Time elapsed: 
> 24.627 s <<< FAILURE! - in 
> org.apache.hadoop.fs.contract.router.web.TestRouterWebHDFSContractCreate
> [ERROR] 
> testSyncable(org.apache.hadoop.fs.contract.router.web.TestRouterWebHDFSContractCreate)
>   Time elapsed: 0.222 s  <<< ERROR!
> java.io.FileNotFoundException: File /test/testSyncable not found.
>   at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
>   at 
> sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
>   at 
> sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
>   at java.lang.reflect.Constructor.newInstance(Constructor.java:423)
>   at 
> org.apache.hadoop.ipc.RemoteException.instantiateException(RemoteException.java:121)
>   at 
> org.apache.hadoop.ipc.RemoteException.unwrapRemoteException(RemoteException.java:110)
>   at 
> org.apache.hadoop.hdfs.web.WebHdfsFileSystem.toIOException(WebHdfsFileSystem.java:576)
>   at 
> org.apache.hadoop.hdfs.web.WebHdfsFileSystem.access$900(WebHdfsFileSystem.java:146)
>   at 
> org.apache.hadoop.hdfs.web.WebHdfsFileSystem$AbstractRunner.shouldRetry(WebHdfsFileSystem.java:892)
>   at 
> org.apache.hadoop.hdfs.web.WebHdfsFileSystem$AbstractRunner.runWithRetry(WebHdfsFileSystem.java:858)
>   at 
> org.apache.hadoop.hdfs.web.WebHdfsFileSystem$AbstractRunner.access$100(WebHdfsFileSystem.java:652)
>   at 
> org.apache.hadoop.hdfs.web.WebHdfsFileSystem$AbstractRunner$1.run(WebHdfsFileSystem.java:690)
>   at java.security.AccessController.doPrivileged(Native Method)
>   at javax.security.auth.Subject.doAs(Subject.java:422)
>   at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1899)
>   at 
> org.apache.hadoop.hdfs.web.WebHdfsFileSystem$AbstractRunner.run(WebHdfsFileSystem.java:686)
>   at 
> org.apache.hadoop.hdfs.web.WebHdfsFileSystem$ReadRunner.getRedirectedUrl(WebHdfsFileSystem.java:2307)
>   at 
> org.apache.hadoop.hdfs.web.WebHdfsFileSystem$ReadRunner.<init>(WebHdfsFileSystem.java:2296)
>   at 
> org.apache.hadoop.hdfs.web.WebHdfsFileSystem$WebHdfsInputStream.<init>(WebHdfsFileSystem.java:2176)
>   at 
> org.apache.hadoop.hdfs.web.WebHdfsFileSystem.open(WebHdfsFileSystem.java:1610)
>   at org.apache.hadoop.fs.FileSystem.open(FileSystem.java:975)
>   at 
> org.apache.hadoop.fs.contract.AbstractContractCreateTest.validateSyncableSemantics(AbstractContractCreateTest.java:556)
>   at 
> org.apache.hadoop.fs.contract.AbstractContractCreateTest.testSyncable(AbstractContractCreateTest.java:459)
>   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>   at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>   at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>   at java.lang.reflect.Method.invoke(Method.java:498)
>   at 
> org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:50)
>   at 
> org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
>   at 
> org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:47)
>   at 
> org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
>   at 
> org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:26)
>   at 
> org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:27)
>   at org.junit.rules.TestWatcher$1.evaluate(TestWatcher.java:55)
>   at 
> org.junit.internal.runners.statements.FailOnTimeout$CallableStatement.call(FailOnTimeout.java:298)
>   at 
> org.junit.internal.runners.statements.FailOnTimeout$CallableStatement.call(FailOnTimeout.java:292)
>   at java.util.concurrent.FutureTask.run(FutureTask.java:266)
>   at java.lang.Thread.run(Thread.java:748)
> Caused by: 
> org.apache.hadoop.ipc.RemoteException(java.io.FileNotFoundException): File 
> /test/testSyncable not found.
>   at 
> 

[jira] [Commented] (HDFS-16013) DirectoryScan operation holds dataset lock for long time

2021-05-07 Thread Renukaprasad C (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-16013?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17340954#comment-17340954
 ] 

Renukaprasad C commented on HDFS-16013:
---

Thanks [~sodonnell] for the review & quick update on the issue.
We already had all of the patches (HDFS-15406, HDFS-15574, HDFS-14476, 
HDFS-14751 and HDFS-15048) except HDFS-15415. After testing with the 
HDFS-15415 changes, the delay has come down by almost 60% (the initial delay 
of 4-8 seconds has dropped to 1-3 seconds).
We will continue to test and will post any further updates. 

> DirectoryScan operation holds dataset lock for long time
> 
>
> Key: HDFS-16013
> URL: https://issues.apache.org/jira/browse/HDFS-16013
> Project: Hadoop HDFS
>  Issue Type: Bug
>Reporter: Renukaprasad C
>Priority: Critical
>
> Environment: 3-node cluster with around 2M files & the same number of blocks.
> All file operations are normal except during the directory scan, which takes 
> more memory and causes some long GC pauses. This directory scan happens every 
> 6h (the default value) and causes slow responses to file operations. The 
> delay is around 5-8 seconds (in production this delay increased to 30+ 
> seconds with 8M blocks).
> GC Configuration:
> -Xms6144M 
> -Xmx12288M /8G
> -XX:NewSize=614M 
> -XX:MaxNewSize=1228M 
> -XX:MetaspaceSize=128M 
> -XX:MaxMetaspaceSize=128M 
> -XX:CMSFullGCsBeforeCompaction=1 
> -XX:MaxDirectMemorySize=1G 
> -XX:+UseConcMarkSweepGC 
> -XX:+CMSParallelRemarkEnabled 
> -XX:+UseCMSCompactAtFullCollection 
> -XX:CMSInitiatingOccupancyFraction=80 
> We also tried G1 GC, but couldn't find much difference in the result.
> -XX:+UseG1GC
> -XX:MaxGCPauseMillis=200
> -XX:InitiatingHeapOccupancyPercent=45
> -XX:G1ReservePercent=10
> {code:java}
> 2021-05-07 16:32:23,508 INFO 
> org.apache.hadoop.hdfs.server.datanode.DirectoryScanner: BlockPool 
> BP-345634799--1619695417333 Total blocks: 2767211, missing metadata 
> files: 22, missing block files: 22, missing blocks in memory: 0, mismatched 
> blocks: 0
> 2021-05-07 16:32:23,508 WARN 
> org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl: Lock 
> held time above threshold: lock identifier: FsDatasetRWLock 
> lockHeldTimeMs=7061 ms. Suppressed 0 lock warnings. The stack trace is: 
> java.lang.Thread.getStackTrace(Thread.java:1559)
> org.apache.hadoop.util.StringUtils.getStackTrace(StringUtils.java:1032)
> org.apache.hadoop.util.InstrumentedLock.logWarning(InstrumentedLock.java:148)
> org.apache.hadoop.util.InstrumentedLock.check(InstrumentedLock.java:186)
> org.apache.hadoop.util.InstrumentedReadLock.unlock(InstrumentedReadLock.java:78)
> org.apache.hadoop.util.AutoCloseableLock.release(AutoCloseableLock.java:84)
> org.apache.hadoop.util.AutoCloseableLock.close(AutoCloseableLock.java:96)
> org.apache.hadoop.hdfs.server.datanode.DirectoryScanner.scan(DirectoryScanner.java:539)
> org.apache.hadoop.hdfs.server.datanode.DirectoryScanner.reconcile(DirectoryScanner.java:416)
> org.apache.hadoop.hdfs.server.datanode.DirectoryScanner.run(DirectoryScanner.java:359)
> java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
> java.util.concurrent.FutureTask.runAndReset(FutureTask.java:308)
> java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(ScheduledThreadPoolExecutor.java:180)
> java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:294)
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
> {code}
> We have the following Jiras in our code already, but are still facing long 
> lock hold times: https://issues.apache.org/jira/browse/HDFS-15621, 
> https://issues.apache.org/jira/browse/HDFS-15150, 
> https://issues.apache.org/jira/browse/HDFS-15160, 
> https://issues.apache.org/jira/browse/HDFS-13947
> cc: [~brahma] [~belugabehr] [~sodonnell] [~ayushsaxena]  [~weichiu] 



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-16001) TestOfflineEditsViewer.testStored() fails reading negative value of FSEditLogOpCodes

2021-05-07 Thread Akira Ajisaka (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-16001?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Akira Ajisaka updated HDFS-16001:
-
Fix Version/s: 3.4.0

Merged PR #2980 into trunk. Opened #2989 for branch-3.3.

> TestOfflineEditsViewer.testStored() fails reading negative value of 
> FSEditLogOpCodes
> 
>
> Key: HDFS-16001
> URL: https://issues.apache.org/jira/browse/HDFS-16001
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: hdfs
>Reporter: Konstantin Shvachko
>Assignee: Akira Ajisaka
>Priority: Blocker
>  Labels: pull-request-available
> Fix For: 3.4.0
>
>  Time Spent: 0.5h
>  Remaining Estimate: 0h
>
> {{TestOfflineEditsViewer.testStored()}} fails consistently with an exception
> {noformat}
> java.io.IOException: Op -54 has size -1314247195, but the minimum op size is 
> 17
> {noformat}
> Seems like there is a corrupt record in {{editsStored}} file.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Work logged] (HDFS-16008) RBF: Tool to initialize ViewFS Mapping to Router

2021-05-07 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-16008?focusedWorklogId=593412&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-593412
 ]

ASF GitHub Bot logged work on HDFS-16008:
-

Author: ASF GitHub Bot
Created on: 07/May/21 16:34
Start Date: 07/May/21 16:34
Worklog Time Spent: 10m 
  Work Description: goiri commented on a change in pull request #2981:
URL: https://github.com/apache/hadoop/pull/2981#discussion_r628356897



##
File path: 
hadoop-hdfs-project/hadoop-hdfs-rbf/src/site/markdown/HDFSRouterFederation.md
##
@@ -241,6 +241,22 @@ Mount table permission can be set by following command:
 
 The option mode is UNIX-style permissions for the mount table. Permissions are 
specified in octal, e.g. 0755. By default, this is set to 0755.
 
+ Init ViewFs To Router
+The Router supports initializing ViewFS mount points into the Router. The 
mapped directory protocol of ViewFS must be HDFS, and the initializer only 
supports one-to-one mappings.
+
+For example, use the following ViewFS configuration to initialize the mount 
table into the Router.
+
+<configuration>
+  <property>
+    <name>fs.viewfs.mounttable.ClusterX.link./data</name>
+    <value>hdfs://nn1-clusterx.example.com:8020/data</value>
+  </property>
+</configuration>
+
+The ViewFS mount table can be initialized to the Router by using the following 
command:
+
+[hdfs]$ $HADOOP_HOME/bin/hdfs dfsrouteradmin -initViewFsToMountTable 
ClusterX

Review comment:
   Can we also support an option to import the full table from ViewFs?

##
File path: 
hadoop-hdfs-project/hadoop-hdfs-rbf/src/test/java/org/apache/hadoop/hdfs/server/federation/router/TestRouterAdminCLI.java
##
@@ -852,6 +908,7 @@ public void testInvalidArgumentMessage() throws Exception {
 + " ]\n"
 + "\t[-clrQuota ]\n"
 + "\t[-clrStorageTypeQuota ]\n"
++"\t[-initViewFsToMountTable ]\n"

Review comment:
   Space after +

##
File path: 
hadoop-hdfs-project/hadoop-hdfs-rbf/src/test/java/org/apache/hadoop/hdfs/server/federation/router/TestRouterAdminCLI.java
##
@@ -700,6 +706,49 @@ public void testAddMountTableIfParentExist() throws 
Exception {
 }
   }
 
+  @Test
+  public void testInitViewFsToMountTable() throws Exception {
+// re-set system out for testing
+System.setOut(new PrintStream(out));
+stateStore.loadCache(MountTableStoreImpl.class, true);
+String nnAddress = cluster.getRandomNamenode().
+getNamenode().getHostAndPort();
+
+String src = "/data";
+Path destPath = new Path("hdfs://" + nnAddress + "/data");
+String user = "user1";
+String group = "group1";
+String clusterName = "ClusterX";
+
+// 0.mkdir destPath
+hdfs.mkdirs(destPath);
+// 1.set owner
+hdfs.setOwner(destPath, user, group);
+// 2.set viewFs mapping
+admin.getConf().set(
+"fs.viewfs.mounttable.ClusterX.link." + src, destPath.toString());
+// 3.run initialization
+String[] argv = new String[]{"-initViewFsToMountTable", clusterName};
+assertEquals(0, ToolRunner.run(admin, argv));
+// 4.gets the mount point entries
+stateStore.loadCache(MountTableStoreImpl.class, true);
+GetMountTableEntriesRequest getRequest = GetMountTableEntriesRequest
+.newInstance(src);
+GetMountTableEntriesResponse getResponse = client.getMountTableManager()
+.getMountTableEntries(getRequest);
+List<MountTable> mountTables = getResponse.getEntries();
+// 5.check
+assertEquals(1, mountTables.size());
+MountTable mountTable = mountTables.get(0);
+assertEquals(user, mountTable.getOwnerName());
+assertEquals(group, mountTable.getGroupName());
+assertEquals(destPath.toUri().getPath(), mountTable.

Review comment:
   Let's assert the size of mountTable.getDestinations() and extract the 
get(0)

##
File path: 
hadoop-hdfs-project/hadoop-hdfs-rbf/src/site/markdown/HDFSRouterFederation.md
##
@@ -241,6 +241,22 @@ Mount table permission can be set by following command:
 
 The option mode is UNIX-style permissions for the mount table. Permissions are 
specified in octal, e.g. 0755. By default, this is set to 0755.
 
+ Init ViewFs To Router
+The Router supports initializing ViewFS mount points into the Router. The 
mapped directory protocol of ViewFS must be HDFS, and the initializer only 
supports one-to-one mappings.

Review comment:
   Pointer to ViewFs doc?




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 593412)
Time Spent: 1h 10m  (was: 1h)

> RBF: Tool to initialize ViewFS Mapping to Router
> 
>
> Key: HDFS-16008

[jira] [Work logged] (HDFS-16001) TestOfflineEditsViewer.testStored() fails reading negative value of FSEditLogOpCodes

2021-05-07 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-16001?focusedWorklogId=593411&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-593411
 ]

ASF GitHub Bot logged work on HDFS-16001:
-

Author: ASF GitHub Bot
Created on: 07/May/21 16:33
Start Date: 07/May/21 16:33
Worklog Time Spent: 10m 
  Work Description: aajisaka opened a new pull request #2989:
URL: https://github.com/apache/hadoop/pull/2989


   JIRA: HDFS-16001
   
   Backport of #2980


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 593411)
Time Spent: 0.5h  (was: 20m)

> TestOfflineEditsViewer.testStored() fails reading negative value of 
> FSEditLogOpCodes
> 
>
> Key: HDFS-16001
> URL: https://issues.apache.org/jira/browse/HDFS-16001
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: hdfs
>Reporter: Konstantin Shvachko
>Assignee: Akira Ajisaka
>Priority: Blocker
>  Labels: pull-request-available
>  Time Spent: 0.5h
>  Remaining Estimate: 0h
>
> {{TestOfflineEditsViewer.testStored()}} fails consistently with an exception
> {noformat}
> java.io.IOException: Op -54 has size -1314247195, but the minimum op size is 
> 17
> {noformat}
> Seems like there is a corrupt record in {{editsStored}} file.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-16006) TestRouterFederationRename is flaky

2021-05-07 Thread Jira


[ 
https://issues.apache.org/jira/browse/HDFS-16006?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17340922#comment-17340922
 ] 

Íñigo Goiri commented on HDFS-16006:


HDFS-15923 is adjusting the timeout for this test; does that fix this issue?

> TestRouterFederationRename is flaky
> ---
>
> Key: HDFS-16006
> URL: https://issues.apache.org/jira/browse/HDFS-16006
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: rbf
>Reporter: Akira Ajisaka
>Priority: Major
> Attachments: patch-unit-hadoop-hdfs-project_hadoop-hdfs-rbf.txt
>
>
> {quote}
> [ERROR] Errors: 
> [ERROR]   
> TestRouterFederationRename.testCounter:440->Object.wait:502->Object.wait:-2 ? 
> TestTimedOut
> [ERROR]   TestRouterFederationRename.testSetup:145 ? Remote The directory 
> /src cannot be...
> [ERROR]   TestRouterFederationRename.testSetup:145 ? Remote The directory 
> /src cannot be...
> {quote}
> https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-2970/2/artifact/out/patch-unit-hadoop-hdfs-project_hadoop-hdfs-rbf.txt



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Work logged] (HDFS-16011) HDFS: Support viewfs nested mount

2021-05-07 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-16011?focusedWorklogId=593368&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-593368
 ]

ASF GitHub Bot logged work on HDFS-16011:
-

Author: ASF GitHub Bot
Created on: 07/May/21 15:32
Start Date: 07/May/21 15:32
Worklog Time Spent: 10m 
  Work Description: hadoop-yetus commented on pull request #2987:
URL: https://github.com/apache/hadoop/pull/2987#issuecomment-834531151


   :broken_heart: **-1 overall**
   
   
   
   
   
   
   | Vote | Subsystem | Runtime |  Logfile | Comment |
   |::|--:|:|::|:---:|
   | +0 :ok: |  reexec  |   0m 50s |  |  Docker mode activated.  |
    _ Prechecks _ |
   | +1 :green_heart: |  dupname  |   0m  0s |  |  No case conflicting files 
found.  |
   | +0 :ok: |  codespell  |   0m  0s |  |  codespell was not available.  |
   | +1 :green_heart: |  @author  |   0m  0s |  |  The patch does not contain 
any @author tags.  |
   | -1 :x: |  test4tests  |   0m  0s |  |  The patch doesn't appear to include 
any new or modified tests. Please justify why no new tests are needed for this 
patch. Also please list what manual steps were performed to verify this patch.  
|
    _ trunk Compile Tests _ |
   | +1 :green_heart: |  mvninstall  |  35m 29s |  |  trunk passed  |
   | +1 :green_heart: |  compile  |  21m 47s |  |  trunk passed with JDK 
Ubuntu-11.0.10+9-Ubuntu-0ubuntu1.20.04  |
   | +1 :green_heart: |  compile  |  19m  9s |  |  trunk passed with JDK 
Private Build-1.8.0_282-8u282-b08-0ubuntu1~20.04-b08  |
   | +1 :green_heart: |  checkstyle  |   1m  5s |  |  trunk passed  |
   | +1 :green_heart: |  mvnsite  |   1m 34s |  |  trunk passed  |
   | +1 :green_heart: |  javadoc  |   1m  2s |  |  trunk passed with JDK 
Ubuntu-11.0.10+9-Ubuntu-0ubuntu1.20.04  |
   | +1 :green_heart: |  javadoc  |   1m 32s |  |  trunk passed with JDK 
Private Build-1.8.0_282-8u282-b08-0ubuntu1~20.04-b08  |
   | +1 :green_heart: |  spotbugs  |   2m 24s |  |  trunk passed  |
   | +1 :green_heart: |  shadedclient  |  16m 30s |  |  branch has no errors 
when building and testing our client artifacts.  |
    _ Patch Compile Tests _ |
   | +1 :green_heart: |  mvninstall  |   0m 56s |  |  the patch passed  |
   | +1 :green_heart: |  compile  |  20m  4s |  |  the patch passed with JDK 
Ubuntu-11.0.10+9-Ubuntu-0ubuntu1.20.04  |
   | +1 :green_heart: |  javac  |  20m  4s |  |  the patch passed  |
   | +1 :green_heart: |  compile  |  18m  9s |  |  the patch passed with JDK 
Private Build-1.8.0_282-8u282-b08-0ubuntu1~20.04-b08  |
   | +1 :green_heart: |  javac  |  18m  9s |  |  the patch passed  |
   | +1 :green_heart: |  blanks  |   0m  0s |  |  The patch has no blanks 
issues.  |
   | -0 :warning: |  checkstyle  |   1m  4s | 
[/results-checkstyle-hadoop-common-project_hadoop-common.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-2987/1/artifact/out/results-checkstyle-hadoop-common-project_hadoop-common.txt)
 |  hadoop-common-project/hadoop-common: The patch generated 13 new + 11 
unchanged - 3 fixed = 24 total (was 14)  |
   | +1 :green_heart: |  mvnsite  |   1m 30s |  |  the patch passed  |
   | +1 :green_heart: |  javadoc  |   1m  3s |  |  the patch passed with JDK 
Ubuntu-11.0.10+9-Ubuntu-0ubuntu1.20.04  |
   | +1 :green_heart: |  javadoc  |   1m 38s |  |  the patch passed with JDK 
Private Build-1.8.0_282-8u282-b08-0ubuntu1~20.04-b08  |
   | +1 :green_heart: |  spotbugs  |   2m 31s |  |  the patch passed  |
   | +1 :green_heart: |  shadedclient  |  15m 52s |  |  patch has no errors 
when building and testing our client artifacts.  |
    _ Other Tests _ |
   | -1 :x: |  unit  |  16m 59s | 
[/patch-unit-hadoop-common-project_hadoop-common.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-2987/1/artifact/out/patch-unit-hadoop-common-project_hadoop-common.txt)
 |  hadoop-common in the patch passed.  |
   | +1 :green_heart: |  asflicense  |   0m 55s |  |  The patch does not 
generate ASF License warnings.  |
   |  |   | 182m 32s |  |  |
   
   
   | Reason | Tests |
   |---:|:--|
   | Failed junit tests | hadoop.fs.viewfs.TestViewFsConfig |
   
   
   | Subsystem | Report/Notes |
   |--:|:-|
   | Docker | ClientAPI=1.41 ServerAPI=1.41 base: 
https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-2987/1/artifact/out/Dockerfile
 |
   | GITHUB PR | https://github.com/apache/hadoop/pull/2987 |
   | Optional Tests | dupname asflicense compile javac javadoc mvninstall 
mvnsite unit shadedclient spotbugs checkstyle codespell |
   | uname | Linux a4ca4648d84b 4.15.0-112-generic #113-Ubuntu SMP Thu Jul 9 
23:41:39 UTC 2020 x86_64 x86_64 x86_64 GNU/Linux |
   | Build tool | maven |
   | Personality | dev-support/bin/hadoop.sh |
   | git revision | trunk / b78668be9a82e5fd252b2ed4b2e926446ff228d6 |
   | Default Java | Private 

[jira] [Work logged] (HDFS-16001) TestOfflineEditsViewer.testStored() fails reading negative value of FSEditLogOpCodes

2021-05-07 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-16001?focusedWorklogId=593363&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-593363
 ]

ASF GitHub Bot logged work on HDFS-16001:
-

Author: ASF GitHub Bot
Created on: 07/May/21 15:24
Start Date: 07/May/21 15:24
Worklog Time Spent: 10m 
  Work Description: aajisaka commented on pull request #2980:
URL: https://github.com/apache/hadoop/pull/2980#issuecomment-834519639


   Merged. Thank you @Hexiaoqiao for your review.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 593363)
Time Spent: 20m  (was: 10m)

> TestOfflineEditsViewer.testStored() fails reading negative value of 
> FSEditLogOpCodes
> 
>
> Key: HDFS-16001
> URL: https://issues.apache.org/jira/browse/HDFS-16001
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: hdfs
>Reporter: Konstantin Shvachko
>Assignee: Akira Ajisaka
>Priority: Blocker
>  Labels: pull-request-available
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> {{TestOfflineEditsViewer.testStored()}} fails consistently with an exception
> {noformat}
> java.io.IOException: Op -54 has size -1314247195, but the minimum op size is 
> 17
> {noformat}
> Seems like there is a corrupt record in {{editsStored}} file.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-16001) TestOfflineEditsViewer.testStored() fails reading negative value of FSEditLogOpCodes

2021-05-07 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-16001?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated HDFS-16001:
--
Labels: pull-request-available  (was: )

> TestOfflineEditsViewer.testStored() fails reading negative value of 
> FSEditLogOpCodes
> 
>
> Key: HDFS-16001
> URL: https://issues.apache.org/jira/browse/HDFS-16001
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: hdfs
>Reporter: Konstantin Shvachko
>Assignee: Akira Ajisaka
>Priority: Blocker
>  Labels: pull-request-available
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> {{TestOfflineEditsViewer.testStored()}} fails consistently with an exception
> {noformat}
> java.io.IOException: Op -54 has size -1314247195, but the minimum op size is 
> 17
> {noformat}
> Seems like there is a corrupt record in {{editsStored}} file.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Work logged] (HDFS-16001) TestOfflineEditsViewer.testStored() fails reading negative value of FSEditLogOpCodes

2021-05-07 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-16001?focusedWorklogId=593362&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-593362
 ]

ASF GitHub Bot logged work on HDFS-16001:
-

Author: ASF GitHub Bot
Created on: 07/May/21 15:24
Start Date: 07/May/21 15:24
Worklog Time Spent: 10m 
  Work Description: aajisaka merged pull request #2980:
URL: https://github.com/apache/hadoop/pull/2980


   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 593362)
Remaining Estimate: 0h
Time Spent: 10m

> TestOfflineEditsViewer.testStored() fails reading negative value of 
> FSEditLogOpCodes
> 
>
> Key: HDFS-16001
> URL: https://issues.apache.org/jira/browse/HDFS-16001
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: hdfs
>Reporter: Konstantin Shvachko
>Assignee: Akira Ajisaka
>Priority: Blocker
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> {{TestOfflineEditsViewer.testStored()}} fails consistently with an exception
> {noformat}
> java.io.IOException: Op -54 has size -1314247195, but the minimum op size is 
> 17
> {noformat}
> Seems like there is a corrupt record in {{editsStored}} file.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-16013) DirectoryScan operation holds dataset lock for long time

2021-05-07 Thread Stephen O'Donnell (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-16013?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17340866#comment-17340866
 ] 

Stephen O'Donnell commented on HDFS-16013:
--

You want these ones:

HDFS-15406. Improve the speed of Datanode Block Scan. Contributed by 
hemanthboyina
HDFS-15574. Remove unnecessary sort of block list in DirectoryScanner. 
Contributed by Stephen O'Donnell.
HDFS-15583. Backport DirectoryScanner improvements HDFS-14476, HDFS-14751 and 
HDFS-15048 to branch 3.2 and 3.1. Contributed by Stephen O'Donnell
HDFS-15415. Reduce locking in Datanode DirectoryScanner. Contributed by Stephen 
O'Donnell

The last one is key, as it removes the lock completely, but the first one 
above makes a big difference to the speed too.
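For readers following along, here is a minimal sketch of the lock-held-time warning pattern visible in the stack trace above (an editorial illustration, not Hadoop's real org.apache.hadoop.util.InstrumentedLock, which additionally suppresses repeated warnings and records the stack trace, as the log shows):

{code:java}
import java.util.concurrent.locks.ReentrantLock;

// Sketch: wrap a lock so that holding it past a threshold logs a warning.
public class TimedLock {
  private final ReentrantLock lock = new ReentrantLock();
  private final long warnThresholdMs;
  private long acquiredAtMs;   // written and read while holding the lock

  public TimedLock(long warnThresholdMs) {
    this.warnThresholdMs = warnThresholdMs;
  }

  public void lock() {
    lock.lock();
    acquiredAtMs = System.currentTimeMillis();
  }

  public void unlock() {
    long heldMs = System.currentTimeMillis() - acquiredAtMs;  // measure first
    lock.unlock();
    if (heldMs > warnThresholdMs) {
      System.err.println("Lock held time above threshold: " + heldMs + " ms");
    }
  }
}
{code}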

> DirectoryScan operation holds dataset lock for long time
> 
>
> Key: HDFS-16013
> URL: https://issues.apache.org/jira/browse/HDFS-16013
> Project: Hadoop HDFS
>  Issue Type: Bug
>Reporter: Renukaprasad C
>Priority: Critical
>
> Environment: 3-node cluster with around 2M files & the same number of blocks.
> All file operations are normal except during the directory scan, which takes 
> more memory and causes some long GC pauses. This directory scan happens every 
> 6h (the default value) and causes slow responses to file operations. The 
> delay is around 5-8 seconds (in production this delay increased to 30+ 
> seconds with 8M blocks).
> GC Configuration:
> -Xms6144M 
> -Xmx12288M /8G
> -XX:NewSize=614M 
> -XX:MaxNewSize=1228M 
> -XX:MetaspaceSize=128M 
> -XX:MaxMetaspaceSize=128M 
> -XX:CMSFullGCsBeforeCompaction=1 
> -XX:MaxDirectMemorySize=1G 
> -XX:+UseConcMarkSweepGC 
> -XX:+CMSParallelRemarkEnabled 
> -XX:+UseCMSCompactAtFullCollection 
> -XX:CMSInitiatingOccupancyFraction=80 
> We also tried G1 GC, but couldn't find much difference in the result.
> -XX:+UseG1GC
> -XX:MaxGCPauseMillis=200
> -XX:InitiatingHeapOccupancyPercent=45
> -XX:G1ReservePercent=10
> {code:java}
> 2021-05-07 16:32:23,508 INFO 
> org.apache.hadoop.hdfs.server.datanode.DirectoryScanner: BlockPool 
> BP-345634799--1619695417333 Total blocks: 2767211, missing metadata 
> files: 22, missing block files: 22, missing blocks in memory: 0, mismatched 
> blocks: 0
> 2021-05-07 16:32:23,508 WARN 
> org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl: Lock 
> held time above threshold: lock identifier: FsDatasetRWLock 
> lockHeldTimeMs=7061 ms. Suppressed 0 lock warnings. The stack trace is: 
> java.lang.Thread.getStackTrace(Thread.java:1559)
> org.apache.hadoop.util.StringUtils.getStackTrace(StringUtils.java:1032)
> org.apache.hadoop.util.InstrumentedLock.logWarning(InstrumentedLock.java:148)
> org.apache.hadoop.util.InstrumentedLock.check(InstrumentedLock.java:186)
> org.apache.hadoop.util.InstrumentedReadLock.unlock(InstrumentedReadLock.java:78)
> org.apache.hadoop.util.AutoCloseableLock.release(AutoCloseableLock.java:84)
> org.apache.hadoop.util.AutoCloseableLock.close(AutoCloseableLock.java:96)
> org.apache.hadoop.hdfs.server.datanode.DirectoryScanner.scan(DirectoryScanner.java:539)
> org.apache.hadoop.hdfs.server.datanode.DirectoryScanner.reconcile(DirectoryScanner.java:416)
> org.apache.hadoop.hdfs.server.datanode.DirectoryScanner.run(DirectoryScanner.java:359)
> java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
> java.util.concurrent.FutureTask.runAndReset(FutureTask.java:308)
> java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(ScheduledThreadPoolExecutor.java:180)
> java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:294)
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
> {code}
> We have the following Jiras in our code already, but are still facing long 
> lock hold times: https://issues.apache.org/jira/browse/HDFS-15621, 
> https://issues.apache.org/jira/browse/HDFS-15150, 
> https://issues.apache.org/jira/browse/HDFS-15160, 
> https://issues.apache.org/jira/browse/HDFS-13947
> cc: [~brahma] [~belugabehr] [~sodonnell] [~ayushsaxena]  [~weichiu] 



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Work logged] (HDFS-16007) Vulnerabilities found when serializing enum value

2021-05-07 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-16007?focusedWorklogId=593322&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-593322
 ]

ASF GitHub Bot logged work on HDFS-16007:
-

Author: ASF GitHub Bot
Created on: 07/May/21 14:15
Start Date: 07/May/21 14:15
Worklog Time Spent: 10m 
  Work Description: virajjasani commented on pull request #2982:
URL: https://github.com/apache/hadoop/pull/2982#issuecomment-834440718


Thanks for the review @aajisaka. All the test failures seem to be flaky.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 593322)
Time Spent: 50m  (was: 40m)

> Vulnerabilities found when serializing enum value
> -
>
> Key: HDFS-16007
> URL: https://issues.apache.org/jira/browse/HDFS-16007
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Reporter: junwen yang
>Assignee: Viraj Jasani
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 50m
>  Remaining Estimate: 0h
>
> The ReplicaState enum uses its ordinal for serialization and 
> deserialization, which is vulnerable to reordering and can cause issues 
> similar to HDFS-15624.
> To avoid this, either add comments telling later developers not to reorder 
> this enum, or add index checking in the read and getState functions to avoid 
> index-out-of-bounds errors. 
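A minimal sketch of the suggested bounds check (an editorial illustration, not the actual patch; the constants here merely mirror the HDFS ReplicaState names):

{code:java}
// Sketch: guard ordinal-based deserialization with a bounds check.
public enum ReplicaStateDemo {
  FINALIZED, RBW, RWR, RUR, TEMPORARY;

  // Cache values() once; each call otherwise allocates a fresh array.
  private static final ReplicaStateDemo[] CACHED = values();

  public static ReplicaStateDemo getState(int ordinal) {
    if (ordinal < 0 || ordinal >= CACHED.length) {
      throw new IllegalArgumentException("Invalid replica state: " + ordinal);
    }
    return CACHED[ordinal];
  }
}
{code}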



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Comment Edited] (HDFS-15994) Deletion should sleep some time, when there are too many pending deletion blocks.

2021-05-07 Thread Qi Zhu (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-15994?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17340857#comment-17340857
 ] 

Qi Zhu edited comment on HDFS-15994 at 5/7/21, 2:08 PM:


Thanks a lot [~hexiaoqiao] for the reply.

The deletion process:

1) Namespace delete: remove the file-related metadata from the inode tree.

2) Remove block: remove the blocks from the BlockMap, and add the blocks to 
InvalidateBlocks.

3) Wait for the ReplicationMonitor to trigger the delete work and send a 
heartbeat to the DN for deleting.

Step 2 dominates about 90% of the RPC handler time for deletion; step 3 is 
async and will not affect the RPC handlers.

About using `release lock - sleep - acquire lock` to avoid the NameNode 
hanging for a long time: I am not sure how to avoid the lock being too busy, 
so I just offered this option of releasing it.

And since multi-threaded deletion may use too many handlers, we can make the 
deletion async to release the handlers in a follow-up Jira.

For this Jira, we can discuss how to avoid the lock being too busy besides 
the `release lock - sleep - acquire lock` option.

cc [~weichiu] [~sodonnell] [~ayushtkn]

What are your opinions?

Thanks.

 

 


was (Author: zhuqi):
Thanks a lot [~hexiaoqiao] for reply.

The deletion process:

1)Namespace Delete:remove the file related meta from Inode Tree;

2)Remove Block: remove the blocks from BlockMap, and add the blocks to 
InvalidateBlocks. 

3)Waiting ReplicationMonitor to trigger Delete Work,send heartbeat to DN for 
deleting。

 

The 2 step dominant about 90% RPC handler for deletion, and the 3 step are 
async, will not affect the RPC handler. 

About the using `release lock - sleep - acquire lock` to avoid NameNode hang 
for long time, i am not sure how to avoid lock too busy so just give this 
choice to release.

And about the multi-thread deletion may use too many handler, we can make the 
deletion async to release the handler in a following Jira.

For this jira, we can discuss how to avoid the lock too busy beside the  
`release lock - sleep - acquire lock` choice.

cc [~weichiu]  [~sodonnell] [~ayushtkn]  

What's your opinions ?  

Thanks.

 

 

> Deletion should sleep some time, when there are too many pending deletion 
> blocks.
> -
>
> Key: HDFS-15994
> URL: https://issues.apache.org/jira/browse/HDFS-15994
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Reporter: Qi Zhu
>Assignee: Qi Zhu
>Priority: Major
> Attachments: HDFS-15994.001.patch
>
>
> HDFS-13831 realized that we can control how frequently other waiters get a 
> chance at the lock.
> But in our big cluster with heavy deletion the problem still happened: the 
> pending deletion blocks are sometimes more than ten million, and regularly 
> more than one million in huge clusters.
> So I think we should sleep for some time when too many deletion blocks are 
> pending. 



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-15994) Deletion should sleep some time, when there are too many pending deletion blocks.

2021-05-07 Thread Qi Zhu (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-15994?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17340857#comment-17340857
 ] 

Qi Zhu commented on HDFS-15994:
---

Thanks a lot [~hexiaoqiao] for the reply.

The deletion process:

1) Namespace delete: remove the file-related metadata from the inode tree.

2) Remove block: remove the blocks from the BlockMap, and add the blocks to 
InvalidateBlocks.

3) Wait for the ReplicationMonitor to trigger the delete work and send a 
heartbeat to the DN for deleting.

Step 2 dominates about 90% of the RPC handler time for deletion; step 3 is 
async and will not affect the RPC handlers.

About using `release lock - sleep - acquire lock` to avoid the NameNode 
hanging for a long time: I am not sure how to avoid the lock being too busy, 
so I just offered this option of releasing it (a sketch of the pattern 
follows below).

And since multi-threaded deletion may use too many handlers, we can make the 
deletion async to release the handlers in a follow-up Jira.

For this Jira, we can discuss how to avoid the lock being too busy besides 
the `release lock - sleep - acquire lock` option.

cc [~weichiu] [~sodonnell] [~ayushtkn]

What are your opinions?

Thanks.
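To make the `release lock - sleep - acquire lock` idea concrete, here is a minimal sketch of the pattern (an editorial illustration, not NameNode code; the batch size and sleep interval are made-up values):

{code:java}
import java.util.Iterator;
import java.util.concurrent.locks.ReentrantReadWriteLock;

// Sketch: do deletion work in batches, dropping the write lock and sleeping
// between batches so other lock waiters get a chance to run.
public class BatchedDeleter {
  private static final int BATCH_SIZE = 1000;  // assumed batch size
  private static final long SLEEP_MS = 10;     // assumed back-off interval

  private final ReentrantReadWriteLock lock = new ReentrantReadWriteLock();

  public void deleteAll(Iterator<Long> blockIds) throws InterruptedException {
    while (blockIds.hasNext()) {
      lock.writeLock().lock();                 // acquire lock
      try {
        for (int i = 0; i < BATCH_SIZE && blockIds.hasNext(); i++) {
          invalidate(blockIds.next());         // work done under the lock
        }
      } finally {
        lock.writeLock().unlock();             // release lock between batches
      }
      Thread.sleep(SLEEP_MS);                  // sleep so waiters can acquire
    }
  }

  private void invalidate(long blockId) {
    // placeholder for adding the block to InvalidateBlocks
  }
}
{code}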

 

 

> Deletion should sleep some time, when there are too many pending deletion 
> blocks.
> -
>
> Key: HDFS-15994
> URL: https://issues.apache.org/jira/browse/HDFS-15994
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Reporter: Qi Zhu
>Assignee: Qi Zhu
>Priority: Major
> Attachments: HDFS-15994.001.patch
>
>
> HDFS-13831 realized that we can control how frequently other waiters get a 
> chance at the lock.
> But in our big cluster with heavy deletion the problem still happened: the 
> pending deletion blocks are sometimes more than ten million, and regularly 
> more than one million in huge clusters.
> So I think we should sleep for some time when too many deletion blocks are 
> pending. 



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-15878) RBF: Flaky test TestRouterWebHDFSContractCreate>AbstractContractCreateTest#testSyncable in Trunk

2021-05-07 Thread Xiaoqiao He (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-15878?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17340822#comment-17340822
 ] 

Xiaoqiao He commented on HDFS-15878:


Hi [~fengnanli], I just noticed that this issue is marked as `resolved` but 
has no fix version and no attached patches or PR link. Would it be more 
proper to mark it as `Not A Problem`? Thanks.

> RBF: Flaky test 
> TestRouterWebHDFSContractCreate>AbstractContractCreateTest#testSyncable in 
> Trunk
> 
>
> Key: HDFS-15878
> URL: https://issues.apache.org/jira/browse/HDFS-15878
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: hdfs, rbf
>Reporter: Renukaprasad C
>Assignee: Fengnan Li
>Priority: Major
>
> [ERROR] Tests run: 16, Failures: 0, Errors: 1, Skipped: 2, Time elapsed: 
> 24.627 s <<< FAILURE! - in 
> org.apache.hadoop.fs.contract.router.web.TestRouterWebHDFSContractCreate
> [ERROR] 
> testSyncable(org.apache.hadoop.fs.contract.router.web.TestRouterWebHDFSContractCreate)
>   Time elapsed: 0.222 s  <<< ERROR!
> java.io.FileNotFoundException: File /test/testSyncable not found.
>   at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
>   at 
> sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
>   at 
> sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
>   at java.lang.reflect.Constructor.newInstance(Constructor.java:423)
>   at 
> org.apache.hadoop.ipc.RemoteException.instantiateException(RemoteException.java:121)
>   at 
> org.apache.hadoop.ipc.RemoteException.unwrapRemoteException(RemoteException.java:110)
>   at 
> org.apache.hadoop.hdfs.web.WebHdfsFileSystem.toIOException(WebHdfsFileSystem.java:576)
>   at 
> org.apache.hadoop.hdfs.web.WebHdfsFileSystem.access$900(WebHdfsFileSystem.java:146)
>   at 
> org.apache.hadoop.hdfs.web.WebHdfsFileSystem$AbstractRunner.shouldRetry(WebHdfsFileSystem.java:892)
>   at 
> org.apache.hadoop.hdfs.web.WebHdfsFileSystem$AbstractRunner.runWithRetry(WebHdfsFileSystem.java:858)
>   at 
> org.apache.hadoop.hdfs.web.WebHdfsFileSystem$AbstractRunner.access$100(WebHdfsFileSystem.java:652)
>   at 
> org.apache.hadoop.hdfs.web.WebHdfsFileSystem$AbstractRunner$1.run(WebHdfsFileSystem.java:690)
>   at java.security.AccessController.doPrivileged(Native Method)
>   at javax.security.auth.Subject.doAs(Subject.java:422)
>   at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1899)
>   at 
> org.apache.hadoop.hdfs.web.WebHdfsFileSystem$AbstractRunner.run(WebHdfsFileSystem.java:686)
>   at 
> org.apache.hadoop.hdfs.web.WebHdfsFileSystem$ReadRunner.getRedirectedUrl(WebHdfsFileSystem.java:2307)
>   at 
> org.apache.hadoop.hdfs.web.WebHdfsFileSystem$ReadRunner.<init>(WebHdfsFileSystem.java:2296)
>   at 
> org.apache.hadoop.hdfs.web.WebHdfsFileSystem$WebHdfsInputStream.<init>(WebHdfsFileSystem.java:2176)
>   at 
> org.apache.hadoop.hdfs.web.WebHdfsFileSystem.open(WebHdfsFileSystem.java:1610)
>   at org.apache.hadoop.fs.FileSystem.open(FileSystem.java:975)
>   at 
> org.apache.hadoop.fs.contract.AbstractContractCreateTest.validateSyncableSemantics(AbstractContractCreateTest.java:556)
>   at 
> org.apache.hadoop.fs.contract.AbstractContractCreateTest.testSyncable(AbstractContractCreateTest.java:459)
>   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>   at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>   at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>   at java.lang.reflect.Method.invoke(Method.java:498)
>   at 
> org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:50)
>   at 
> org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
>   at 
> org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:47)
>   at 
> org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
>   at 
> org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:26)
>   at 
> org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:27)
>   at org.junit.rules.TestWatcher$1.evaluate(TestWatcher.java:55)
>   at 
> org.junit.internal.runners.statements.FailOnTimeout$CallableStatement.call(FailOnTimeout.java:298)
>   at 
> org.junit.internal.runners.statements.FailOnTimeout$CallableStatement.call(FailOnTimeout.java:292)
>   at java.util.concurrent.FutureTask.run(FutureTask.java:266)
>   at java.lang.Thread.run(Thread.java:748)
> Caused by: 
> 

[jira] [Commented] (HDFS-15968) Improve the log for The DecayRpcScheduler

2021-05-07 Thread Xiaoqiao He (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-15968?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17340820#comment-17340820
 ] 

Xiaoqiao He commented on HDFS-15968:


+1 for v001. The failed unit test seems unrelated to this change. Thanks 
[~bpatel] for your work.

> Improve the log for The DecayRpcScheduler 
> --
>
> Key: HDFS-15968
> URL: https://issues.apache.org/jira/browse/HDFS-15968
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Reporter: Bhavik Patel
>Assignee: Bhavik Patel
>Priority: Minor
> Attachments: HDFS-15968.001.patch
>
>
> Improve the logging in the DecayRpcScheduler to make use of the SLF4J logger 
> factory.
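(Editorial aside.) A minimal sketch of the SLF4J pattern the summary refers to, using the standard org.slf4j API (this is not the actual patch):

{code:java}
import org.slf4j.Logger;
import org.slf4j.LoggerFactory;

// Sketch: obtain a logger from the SLF4J LoggerFactory and use
// parameterized logging so messages are only built when the level is enabled.
public class DecaySchedulerLoggingDemo {
  private static final Logger LOG =
      LoggerFactory.getLogger(DecaySchedulerLoggingDemo.class);

  public void afterDecay(long totalDecayedCount) {
    LOG.debug("After decay, totalDecayedCount = {}", totalDecayedCount);
  }
}
{code}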



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Issue Comment Deleted] (HDFS-14986) ReplicaCachingGetSpaceUsed throws ConcurrentModificationException

2021-05-07 Thread lei w (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-14986?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

lei w updated HDFS-14986:
-
Comment: was deleted

(was: hi [~Aiphag0], could you briefly explain how you solved the deadlock 
problem?)

> ReplicaCachingGetSpaceUsed throws  ConcurrentModificationException
> --
>
> Key: HDFS-14986
> URL: https://issues.apache.org/jira/browse/HDFS-14986
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: datanode, performance
>Affects Versions: 2.10.0
>Reporter: Ryan Wu
>Assignee: Aiphago
>Priority: Major
> Fix For: 3.3.0, 3.1.4, 2.10.1
>
> Attachments: HDFS-14986.001.patch, HDFS-14986.002.patch, 
> HDFS-14986.003.patch, HDFS-14986.004.patch, HDFS-14986.005.patch, 
> HDFS-14986.006.patch
>
>
> Running DU across lots of disks is very expensive. We applied the patch 
> HDFS-14313 to get used space from ReplicaInfo in memory. However, new du 
> threads throw the exception
> {code:java}
> // 2019-11-08 18:07:13,858 ERROR 
> [refreshUsed-/home/vipshop/hard_disk/7/dfs/dn/current/BP-1203969992--1450855658517]
>  
> org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.ReplicaCachingGetSpaceUsed:
>  ReplicaCachingGetSpaceUsed refresh error
> java.util.ConcurrentModificationException: Tree has been modified outside of 
> iterator
> at 
> org.apache.hadoop.hdfs.util.FoldedTreeSet$TreeSetIterator.checkForModification(FoldedTreeSet.java:311)
> 
> at 
> org.apache.hadoop.hdfs.util.FoldedTreeSet$TreeSetIterator.hasNext(FoldedTreeSet.java:256)
> 
> at java.util.AbstractCollection.addAll(AbstractCollection.java:343)
> at java.util.HashSet.<init>(HashSet.java:120)
> at 
> org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl.deepCopyReplica(FsDatasetImpl.java:1052)
> 
> at 
> org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.ReplicaCachingGetSpaceUsed.refresh(ReplicaCachingGetSpaceUsed.java:73)
> 
> at 
> org.apache.hadoop.fs.CachingGetSpaceUsed$RefreshThread.run(CachingGetSpaceUsed.java:178)
>    
> at java.lang.Thread.run(Thread.java:748)
> {code}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Resolved] (HDFS-16009) HDFS tiered storage support

2021-05-07 Thread Xiaoqiao He (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-16009?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xiaoqiao He resolved HDFS-16009.

Resolution: Duplicate

Closing this issue based on the above comments. Please feel free to reopen it 
if it needs further improvement. Thanks [~LeonG] and [~jianghuazhu].

> HDFS tiered storage support
> ---
>
> Key: HDFS-16009
> URL: https://issues.apache.org/jira/browse/HDFS-16009
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Reporter: JiangHua Zhu
>Priority: Major
>
> HDFS currently stores data blocks on different DataNodes. These DataNodes 
> are chosen randomly within the scope of the network topology (local 
> rack/different racks/node group).
> When the NameNode allocates DataNodes to a block, the selection is random; 
> that is, every DataNode is equally likely to be chosen.
> We should provide a tiered strategy. For example, DataNode1, DataNode2, 
> and DataNode3 are on one tier, and DataNode4 and DataNode5 are on another. 
> Because the hardware may differ between DataNodes, it is worth doing tiered 
> storage so that DataNodes with weaker hardware store at most 50%, 40%, or 
> even less. This is conducive to cluster maintenance and stability.
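As a point of reference, HDFS already exposes a pluggable placement hook that such a tiered strategy could slot into. The sketch below only shows the wiring; TieredPlacementPolicy is a hypothetical class name, not part of this proposal.
{code:java}
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hdfs.DFSConfigKeys;

// Wiring sketch: swap the NameNode's block placement policy. Only the
// config key is real; the policy class named here is hypothetical.
public class TieredPlacementWiring {
  public static Configuration configure() {
    Configuration conf = new Configuration();
    conf.set(DFSConfigKeys.DFS_BLOCK_REPLICATOR_CLASSNAME_KEY,
        "org.example.TieredPlacementPolicy"); // hypothetical implementation
    return conf;
  }
}
{code}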



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-14986) ReplicaCachingGetSpaceUsed throws ConcurrentModificationException

2021-05-07 Thread lei w (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-14986?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17340787#comment-17340787
 ] 

lei w commented on HDFS-14986:
--

hi [~Aiphag0], could you briefly explain how you solved the deadlock problem?

> ReplicaCachingGetSpaceUsed throws  ConcurrentModificationException
> --
>
> Key: HDFS-14986
> URL: https://issues.apache.org/jira/browse/HDFS-14986
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: datanode, performance
>Affects Versions: 2.10.0
>Reporter: Ryan Wu
>Assignee: Aiphago
>Priority: Major
> Fix For: 3.3.0, 3.1.4, 2.10.1
>
> Attachments: HDFS-14986.001.patch, HDFS-14986.002.patch, 
> HDFS-14986.003.patch, HDFS-14986.004.patch, HDFS-14986.005.patch, 
> HDFS-14986.006.patch
>
>
> Running DU across lots of disks is very expensive. We applied the patch 
> HDFS-14313 to get used space from ReplicaInfo in memory. However, new du 
> threads throw the exception:
> {code:java}
> // 2019-11-08 18:07:13,858 ERROR 
> [refreshUsed-/home/vipshop/hard_disk/7/dfs/dn/current/BP-1203969992--1450855658517]
>  
> org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.ReplicaCachingGetSpaceUsed:
>  ReplicaCachingGetSpaceUsed refresh error
> java.util.ConcurrentModificationException: Tree has been modified outside of 
> iterator
> at 
> org.apache.hadoop.hdfs.util.FoldedTreeSet$TreeSetIterator.checkForModification(FoldedTreeSet.java:311)
> 
> at 
> org.apache.hadoop.hdfs.util.FoldedTreeSet$TreeSetIterator.hasNext(FoldedTreeSet.java:256)
> 
> at java.util.AbstractCollection.addAll(AbstractCollection.java:343)
> at java.util.HashSet.<init>(HashSet.java:120)
> at 
> org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl.deepCopyReplica(FsDatasetImpl.java:1052)
> 
> at 
> org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.ReplicaCachingGetSpaceUsed.refresh(ReplicaCachingGetSpaceUsed.java:73)
> 
> at 
> org.apache.hadoop.fs.CachingGetSpaceUsed$RefreshThread.run(CachingGetSpaceUsed.java:178)
>    
> at java.lang.Thread.run(Thread.java:748)
> {code}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Created] (HDFS-16013) DirectoryScan operation holds dataset lock for long time

2021-05-07 Thread Renukaprasad C (Jira)
Renukaprasad C created HDFS-16013:
-

 Summary: DirectoryScan operation holds dataset lock for long time
 Key: HDFS-16013
 URL: https://issues.apache.org/jira/browse/HDFS-16013
 Project: Hadoop HDFS
  Issue Type: Bug
Reporter: Renukaprasad C


Environment: 3-node cluster with around 2M files and the same number of blocks.

All file operations are normal except during the directory scan, which takes 
more memory and causes some long GC pauses. This directory scan happens every 
6 hours (the default value) and causes slow responses to file operations. The 
delay is around 5-8 seconds (in production this delay grew to 30+ seconds with 
8M blocks).

GC Configuration:
-Xms6144M 
-Xmx12288M /8G
-XX:NewSize=614M 
-XX:MaxNewSize=1228M 
-XX:MetaspaceSize=128M 
-XX:MaxMetaspaceSize=128M 
-XX:CMSFullGCsBeforeCompaction=1 
-XX:MaxDirectMemorySize=1G 
-XX:+UseConcMarkSweepGC 
-XX:+CMSParallelRemarkEnabled 
-XX:+UseCMSCompactAtFullCollection 
-XX:CMSInitiatingOccupancyFraction=80 

We also tried G1 GC, but couldn't find much difference in the result.
-XX:+UseG1GC
-XX:MaxGCPauseMillis=200
-XX:InitiatingHeapOccupancyPercent=45
-XX:G1ReservePercent=10


{code:java}
2021-05-07 16:32:23,508 INFO 
org.apache.hadoop.hdfs.server.datanode.DirectoryScanner: BlockPool 
BP-345634799--1619695417333 Total blocks: 2767211, missing metadata files: 
22, missing block files: 22, missing blocks in memory: 0, mismatched blocks: 0
2021-05-07 16:32:23,508 WARN 
org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl: Lock held 
time above threshold: lock identifier: FsDatasetRWLock lockHeldTimeMs=7061 ms. 
Suppressed 0 lock warnings. The stack trace is: 
java.lang.Thread.getStackTrace(Thread.java:1559)
org.apache.hadoop.util.StringUtils.getStackTrace(StringUtils.java:1032)
org.apache.hadoop.util.InstrumentedLock.logWarning(InstrumentedLock.java:148)
org.apache.hadoop.util.InstrumentedLock.check(InstrumentedLock.java:186)
org.apache.hadoop.util.InstrumentedReadLock.unlock(InstrumentedReadLock.java:78)
org.apache.hadoop.util.AutoCloseableLock.release(AutoCloseableLock.java:84)
org.apache.hadoop.util.AutoCloseableLock.close(AutoCloseableLock.java:96)
org.apache.hadoop.hdfs.server.datanode.DirectoryScanner.scan(DirectoryScanner.java:539)
org.apache.hadoop.hdfs.server.datanode.DirectoryScanner.reconcile(DirectoryScanner.java:416)
org.apache.hadoop.hdfs.server.datanode.DirectoryScanner.run(DirectoryScanner.java:359)
java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
java.util.concurrent.FutureTask.runAndReset(FutureTask.java:308)
java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(ScheduledThreadPoolExecutor.java:180)
java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:294)
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
{code}

We already have the following Jiras in our code, but we are still facing long 
lock hold times (a generic sketch of the batching idea follows the list): 
- https://issues.apache.org/jira/browse/HDFS-15621, 
https://issues.apache.org/jira/browse/HDFS-15150, 
https://issues.apache.org/jira/browse/HDFS-15160, 
https://issues.apache.org/jira/browse/HDFS-13947
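A generic sketch of the batching idea referenced above, bounding how long the dataset lock is held per pass; the names (BATCH_SIZE, scanBatch) are illustrative, not DirectoryScanner code:
{code:java}
import java.util.List;
import java.util.concurrent.locks.ReentrantReadWriteLock;

class BatchedScanSketch {
  private final ReentrantReadWriteLock datasetLock =
      new ReentrantReadWriteLock();
  private static final int BATCH_SIZE = 1000; // hypothetical tuning knob

  void scan(List<String> blockIds) {
    for (int i = 0; i < blockIds.size(); i += BATCH_SIZE) {
      int end = Math.min(i + BATCH_SIZE, blockIds.size());
      datasetLock.readLock().lock();
      try {
        scanBatch(blockIds.subList(i, end)); // hypothetical per-batch work
      } finally {
        datasetLock.readLock().unlock();
      }
      // The lock is released between batches, so foreground file
      // operations are not starved for the duration of the whole scan.
    }
  }

  private void scanBatch(List<String> batch) { /* reconcile one batch */ }
}
{code}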


cc: [~brahma] [~belugabehr] [~sodonnell] [~ayushsaxena]  [~weichiu] 



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-15994) Deletion should sleep some time, when there are too many pending deletion blocks.

2021-05-07 Thread Xiaoqiao He (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-15994?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17340786#comment-17340786
 ] 

Xiaoqiao He commented on HDFS-15994:


Thanks [~zhuqi] for your report and contributions. I am totally +1 for this 
proposal. I have seen the NameNode hang for a long time, many times, while 
processing delete requests for very large directories.
IIRC, some others have tried to improve this but did not push it forward 
completely. IMO it is the proper time to do that again.
IIUC, there are two segments that hold the global lock for a long time and 
need improvement:
a. traversing the whole sub-directory and collecting the pending deleted blocks.
b. executing the block deletions.
One option is to use `release lock - sleep - acquire lock` to keep the 
NameNode from hanging for a long time. I am not sure it is the best solution; 
deeper discussion is welcome.
For patch v001, I am not a fan of adding an extra configuration entry for 
every improvement, because there are already so many configurations and they 
are more and more confusing to end users. I think we could use a static 
parameter in this case.
Thanks [~zhuqi] again.
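A minimal sketch of the `release lock - sleep - acquire lock` pattern discussed above, with static parameters instead of new config keys as suggested; all names and values here are illustrative assumptions, not patch v001:
{code:java}
import java.util.Iterator;
import java.util.List;
import java.util.concurrent.locks.ReentrantReadWriteLock;

class IncrementalDeleteSketch {
  private final ReentrantReadWriteLock fsLock = new ReentrantReadWriteLock();
  // Static parameters rather than new configuration entries; values assumed.
  private static final int BLOCKS_PER_HOLD = 10_000;
  private static final long SLEEP_MS = 10;

  void deleteBlocks(List<Long> pendingBlocks) throws InterruptedException {
    Iterator<Long> it = pendingBlocks.iterator();
    while (it.hasNext()) {
      fsLock.writeLock().lock();                  // acquire
      try {
        for (int i = 0; i < BLOCKS_PER_HOLD && it.hasNext(); i++) {
          removeBlock(it.next());                 // hypothetical helper
        }
      } finally {
        fsLock.writeLock().unlock();              // release
      }
      Thread.sleep(SLEEP_MS);                     // sleep, letting waiters in
    }
  }

  private void removeBlock(long blockId) { /* ... */ }
}
{code}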

> Deletion should sleep some time, when there are too many pending deletion 
> blocks.
> -
>
> Key: HDFS-15994
> URL: https://issues.apache.org/jira/browse/HDFS-15994
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Reporter: Qi Zhu
>Assignee: Qi Zhu
>Priority: Major
> Attachments: HDFS-15994.001.patch
>
>
> HDFS-13831 realized that we can control how frequently other waiters get a 
> chance at the lock.
> But in our big cluster with heavy deletion the problem still happened: the 
> pending deletion blocks sometimes exceed ten million, and regularly exceed 
> one million in huge clusters.
> So I think we should sleep for some time when too many deletion blocks are 
> pending.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-16012) Improve DataXceiver#copyBlock() log printing

2021-05-07 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-16012?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated HDFS-16012:
--
Labels: pull-request-available  (was: )

> Improve DataXceiver#copyBlock() log printing
> 
>
> Key: HDFS-16012
> URL: https://issues.apache.org/jira/browse/HDFS-16012
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Reporter: JiangHua Zhu
>Assignee: JiangHua Zhu
>Priority: Major
>  Labels: pull-request-available
> Attachments: image-2021-05-07-20-18-45-523.png
>
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> When executing DataXceiver#copyBlock(), if an exception is encountered, the 
> following will be printed:
>  !image-2021-05-07-20-18-45-523.png! 
> The information printed here does not help with troubleshooting.
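A hedged sketch of the kind of log line the improvement presumably aims for; the message format and the doCopy delegate are assumptions, while ExtendedBlock, Peer, and getRemoteAddressString() are existing Hadoop types and calls:
{code:java}
import java.io.IOException;
import org.apache.hadoop.hdfs.net.Peer;
import org.apache.hadoop.hdfs.protocol.ExtendedBlock;
import org.slf4j.Logger;
import org.slf4j.LoggerFactory;

class CopyBlockLoggingSketch {
  private static final Logger LOG =
      LoggerFactory.getLogger(CopyBlockLoggingSketch.class);

  void copyBlock(ExtendedBlock block, Peer peer) throws IOException {
    try {
      doCopy(block, peer); // hypothetical delegate doing the real transfer
    } catch (IOException ioe) {
      // Name the block and the remote peer so the failing replica and
      // client can be located from this line alone.
      LOG.error("copyBlock failed: block={}, remote={}",
          block, peer.getRemoteAddressString(), ioe);
      throw ioe;
    }
  }

  private void doCopy(ExtendedBlock block, Peer peer) throws IOException { }
}
{code}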



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Work logged] (HDFS-16007) Vulnerabilities found when serializing enum value

2021-05-07 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-16007?focusedWorklogId=593265=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-593265
 ]

ASF GitHub Bot logged work on HDFS-16007:
-

Author: ASF GitHub Bot
Created on: 07/May/21 12:40
Start Date: 07/May/21 12:40
Worklog Time Spent: 10m 
  Work Description: hadoop-yetus commented on pull request #2982:
URL: https://github.com/apache/hadoop/pull/2982#issuecomment-834341268


   :broken_heart: **-1 overall**
   
   
   
   
   
   
   | Vote | Subsystem | Runtime |  Logfile | Comment |
   |::|--:|:|::|:---:|
   | +0 :ok: |  reexec  |   0m 33s |  |  Docker mode activated.  |
    _ Prechecks _ |
   | +1 :green_heart: |  dupname  |   0m  0s |  |  No case conflicting files 
found.  |
   | +0 :ok: |  codespell  |   0m  1s |  |  codespell was not available.  |
   | +1 :green_heart: |  @author  |   0m  0s |  |  The patch does not contain 
any @author tags.  |
   | +1 :green_heart: |  test4tests  |   0m  0s |  |  The patch appears to 
include 1 new or modified test files.  |
    _ trunk Compile Tests _ |
   | +1 :green_heart: |  mvninstall  |  34m 39s |  |  trunk passed  |
   | +1 :green_heart: |  compile  |   1m 23s |  |  trunk passed with JDK 
Ubuntu-11.0.10+9-Ubuntu-0ubuntu1.20.04  |
   | +1 :green_heart: |  compile  |   1m 16s |  |  trunk passed with JDK 
Private Build-1.8.0_282-8u282-b08-0ubuntu1~20.04-b08  |
   | +1 :green_heart: |  checkstyle  |   1m  1s |  |  trunk passed  |
   | +1 :green_heart: |  mvnsite  |   1m 22s |  |  trunk passed  |
   | +1 :green_heart: |  javadoc  |   0m 57s |  |  trunk passed with JDK 
Ubuntu-11.0.10+9-Ubuntu-0ubuntu1.20.04  |
   | +1 :green_heart: |  javadoc  |   1m 24s |  |  trunk passed with JDK 
Private Build-1.8.0_282-8u282-b08-0ubuntu1~20.04-b08  |
   | +1 :green_heart: |  spotbugs  |   3m  6s |  |  trunk passed  |
   | +1 :green_heart: |  shadedclient  |  16m 13s |  |  branch has no errors 
when building and testing our client artifacts.  |
    _ Patch Compile Tests _ |
   | +1 :green_heart: |  mvninstall  |   1m 12s |  |  the patch passed  |
   | +1 :green_heart: |  compile  |   1m 16s |  |  the patch passed with JDK 
Ubuntu-11.0.10+9-Ubuntu-0ubuntu1.20.04  |
   | +1 :green_heart: |  javac  |   1m 16s |  |  the patch passed  |
   | +1 :green_heart: |  compile  |   1m 10s |  |  the patch passed with JDK 
Private Build-1.8.0_282-8u282-b08-0ubuntu1~20.04-b08  |
   | +1 :green_heart: |  javac  |   1m 10s |  |  the patch passed  |
   | +1 :green_heart: |  blanks  |   0m  0s |  |  The patch has no blanks 
issues.  |
   | +1 :green_heart: |  checkstyle  |   0m 52s |  |  
hadoop-hdfs-project/hadoop-hdfs: The patch generated 0 new + 40 unchanged - 1 
fixed = 40 total (was 41)  |
   | +1 :green_heart: |  mvnsite  |   1m 14s |  |  the patch passed  |
   | +1 :green_heart: |  javadoc  |   0m 46s |  |  the patch passed with JDK 
Ubuntu-11.0.10+9-Ubuntu-0ubuntu1.20.04  |
   | +1 :green_heart: |  javadoc  |   1m 16s |  |  the patch passed with JDK 
Private Build-1.8.0_282-8u282-b08-0ubuntu1~20.04-b08  |
   | +1 :green_heart: |  spotbugs  |   3m  8s |  |  the patch passed  |
   | +1 :green_heart: |  shadedclient  |  16m 12s |  |  patch has no errors 
when building and testing our client artifacts.  |
    _ Other Tests _ |
   | -1 :x: |  unit  | 230m 55s | 
[/patch-unit-hadoop-hdfs-project_hadoop-hdfs.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-2982/5/artifact/out/patch-unit-hadoop-hdfs-project_hadoop-hdfs.txt)
 |  hadoop-hdfs in the patch passed.  |
   | +1 :green_heart: |  asflicense  |   0m 46s |  |  The patch does not 
generate ASF License warnings.  |
   |  |   | 318m 35s |  |  |
   
   
   | Reason | Tests |
   |---:|:--|
   | Failed junit tests | 
hadoop.hdfs.server.blockmanagement.TestUnderReplicatedBlocks |
   |   | hadoop.hdfs.qjournal.server.TestJournalNodeRespectsBindHostKeys |
   |   | hadoop.hdfs.server.namenode.snapshot.TestNestedSnapshots |
   |   | hadoop.hdfs.server.datanode.TestDirectoryScanner |
   |   | hadoop.hdfs.tools.offlineEditsViewer.TestOfflineEditsViewer |
   
   
   | Subsystem | Report/Notes |
   |--:|:-|
   | Docker | ClientAPI=1.41 ServerAPI=1.41 base: 
https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-2982/5/artifact/out/Dockerfile
 |
   | GITHUB PR | https://github.com/apache/hadoop/pull/2982 |
   | Optional Tests | dupname asflicense compile javac javadoc mvninstall 
mvnsite unit shadedclient spotbugs checkstyle codespell |
   | uname | Linux 32b7e03c7fee 4.15.0-60-generic #67-Ubuntu SMP Thu Aug 22 
16:55:30 UTC 2019 x86_64 x86_64 x86_64 GNU/Linux |
   | Build tool | maven |
   | Personality | dev-support/bin/hadoop.sh |
   | git revision | trunk / 5014857973bfcbfdaf7d98978411f417e1be8d87 |
   | Default Java | Private Build-1.8.0_282-8u282-b08-0ubuntu1~20.04-b08 |
   | Multi-JDK versions 

[jira] [Work logged] (HDFS-16012) Improve DataXceiver#copyBlock() log printing

2021-05-07 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-16012?focusedWorklogId=593266=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-593266
 ]

ASF GitHub Bot logged work on HDFS-16012:
-

Author: ASF GitHub Bot
Created on: 07/May/21 12:40
Start Date: 07/May/21 12:40
Worklog Time Spent: 10m 
  Work Description: jianghuazhu opened a new pull request #2988:
URL: https://github.com/apache/hadoop/pull/2988


   ## NOTICE
   
   Please create an issue in ASF JIRA before opening a pull request,
   and you need to set the title of the pull request which starts with
   the corresponding JIRA issue number. (e.g. HADOOP-X. Fix a typo in YYY.)
   For more details, please see 
https://cwiki.apache.org/confluence/display/HADOOP/How+To+Contribute
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 593266)
Remaining Estimate: 0h
Time Spent: 10m

> Improve DataXceiver#copyBlock() log printing
> 
>
> Key: HDFS-16012
> URL: https://issues.apache.org/jira/browse/HDFS-16012
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Reporter: JiangHua Zhu
>Assignee: JiangHua Zhu
>Priority: Major
> Attachments: image-2021-05-07-20-18-45-523.png
>
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> When executing DataXceiver#copyBlock(), if an exception is encountered, the 
> following will be printed:
>  !image-2021-05-07-20-18-45-523.png! 
> The information printed here does not help with troubleshooting.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Work logged] (HDFS-16011) HDFS: Support viewfs nested mount

2021-05-07 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-16011?focusedWorklogId=593258=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-593258
 ]

ASF GitHub Bot logged work on HDFS-16011:
-

Author: ASF GitHub Bot
Created on: 07/May/21 12:28
Start Date: 07/May/21 12:28
Worklog Time Spent: 10m 
  Work Description: zhuxiangyi opened a new pull request #2987:
URL: https://github.com/apache/hadoop/pull/2987


   
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 593258)
Remaining Estimate: 0h
Time Spent: 10m

> HDFS: Support viewfs nested mount
> -
>
> Key: HDFS-16011
> URL: https://issues.apache.org/jira/browse/HDFS-16011
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: hdfs
>Affects Versions: 3.2.2, 3.3.1
>Reporter: zhu
>Assignee: zhu
>Priority: Major
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> The current viewfs does not support nested mount points. 
> *E.g:*
>     1./home/ => /home/
>     2./home/work => /home/work
> If mount point 1 is loaded, mount point 2 cannot be added, and the following 
> exception will be thrown when loading 2.
> {code:java}
> throw new FileAlreadyExistsException("Path " + nextInode.fullPath +
>  " already exists as link");
> {code}
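For concreteness, a runnable sketch of declaring the two conflicting links with the real org.apache.hadoop.fs.viewfs.ConfigUtil helper; the namespace URIs are illustrative:
{code:java}
import java.net.URI;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.viewfs.ConfigUtil;

public class NestedMountSketch {
  public static void main(String[] args) {
    Configuration conf = new Configuration();
    // Mount point 1: /home -> hdfs://ns1/home (URI is illustrative)
    ConfigUtil.addLink(conf, "/home", URI.create("hdfs://ns1/home"));
    // Mount point 2 nests under 1; building the mount table then fails
    // with FileAlreadyExistsException("Path ... already exists as link").
    ConfigUtil.addLink(conf, "/home/work", URI.create("hdfs://ns2/work"));
  }
}
{code}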



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-16011) HDFS: Support viewfs nested mount

2021-05-07 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-16011?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated HDFS-16011:
--
Labels: pull-request-available  (was: )

> HDFS: Support viewfs nested mount
> -
>
> Key: HDFS-16011
> URL: https://issues.apache.org/jira/browse/HDFS-16011
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: hdfs
>Affects Versions: 3.2.2, 3.3.1
>Reporter: zhu
>Assignee: zhu
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> The current viewfs does not support nested mount points. 
> *E.g:*
>     1./home/ => /home/
>     2./home/work => /home/work
> If mount point 1 is loaded, mount point 2 cannot be added, and the following 
> exception will be thrown when loading 2.
> {code:java}
> throw new FileAlreadyExistsException("Path " + nextInode.fullPath +
>  " already exists as link");
> {code}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-16006) TestRouterFederationRename is flaky

2021-05-07 Thread Xiaoqiao He (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-16006?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17340776#comment-17340776
 ] 

Xiaoqiao He commented on HDFS-16006:


[~LiJinglun], do you mind taking another look? Thanks.

> TestRouterFederationRename is flaky
> ---
>
> Key: HDFS-16006
> URL: https://issues.apache.org/jira/browse/HDFS-16006
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: rbf
>Reporter: Akira Ajisaka
>Priority: Major
> Attachments: patch-unit-hadoop-hdfs-project_hadoop-hdfs-rbf.txt
>
>
> {quote}
> [ERROR] Errors: 
> [ERROR]   
> TestRouterFederationRename.testCounter:440->Object.wait:502->Object.wait:-2 ? 
> TestTimedOut
> [ERROR]   TestRouterFederationRename.testSetup:145 ? Remote The directory 
> /src cannot be...
> [ERROR]   TestRouterFederationRename.testSetup:145 ? Remote The directory 
> /src cannot be...
> {quote}
> https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-2970/2/artifact/out/patch-unit-hadoop-hdfs-project_hadoop-hdfs-rbf.txt



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Assigned] (HDFS-16012) Improve DataXceiver#copyBlock() log printing

2021-05-07 Thread JiangHua Zhu (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-16012?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

JiangHua Zhu reassigned HDFS-16012:
---

Assignee: JiangHua Zhu

> Improve DataXceiver#copyBlock() log printing
> 
>
> Key: HDFS-16012
> URL: https://issues.apache.org/jira/browse/HDFS-16012
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Reporter: JiangHua Zhu
>Assignee: JiangHua Zhu
>Priority: Major
> Attachments: image-2021-05-07-20-18-45-523.png
>
>
> When executing DataXceiver#copyBlock(), if an exception is encountered, the 
> following will be printed:
>  !image-2021-05-07-20-18-45-523.png! 
> The information printed here does not help with troubleshooting.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Created] (HDFS-16012) Improve DataXceiver#copyBlock() log printing

2021-05-07 Thread JiangHua Zhu (Jira)
JiangHua Zhu created HDFS-16012:
---

 Summary: Improve DataXceiver#copyBlock() log printing
 Key: HDFS-16012
 URL: https://issues.apache.org/jira/browse/HDFS-16012
 Project: Hadoop HDFS
  Issue Type: Improvement
Reporter: JiangHua Zhu
 Attachments: image-2021-05-07-20-18-45-523.png

When executing DataXceiver#copyBlock(), if an exception is encountered, the 
following will be printed:
 !image-2021-05-07-20-18-45-523.png! 

The information printed here does not help with troubleshooting.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-15757) RBF: Improving Router Connection Management

2021-05-07 Thread Xiaoqiao He (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-15757?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17340768#comment-17340768
 ] 

Xiaoqiao He commented on HDFS-15757:


Thanks [~fengnanli] for your pings, and sorry for the late response; I was on 
vacation last week.
In my production env, I have created another proxyCache to replace 
ConnectionManager. The result is as expected: the TP99 cost of each RPC 
dispatch is less than 10ms, which is far less than when using 
ConnectionManager. I have to note that this improvement is based on our 
internal branch, which differs from the community branch (async RPC dispatch, 
etc.), and the request pattern could be very different from the common case 
(the ugi count is very large and the hit rate is very low when using 
ConnectionManager).
For this improvement, I think it is ready to enter trunk, based on our staging 
env benchmark when backporting it into my internal branch.
Tracing the history, I could not find the original rationale for introducing 
ConnectionManager in the Router, so I wonder whether ConnectionManager is 
effective and whether it is necessary for the Router. Please correct me if I 
missed something. Thanks. cc 
[~elgoiri],[~ayushtkn]
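A minimal sketch of the proxyCache idea described above: a per-UGI cache in front of proxy creation. The Guava cache API is real; the bounds and the loader's factory are assumptions, not the internal implementation:
{code:java}
import java.util.concurrent.TimeUnit;
import com.google.common.cache.CacheBuilder;
import com.google.common.cache.CacheLoader;
import com.google.common.cache.LoadingCache;
import org.apache.hadoop.security.UserGroupInformation;

class ProxyCacheSketch<PROXY> {
  private final LoadingCache<UserGroupInformation, PROXY> cache;

  ProxyCacheSketch(CacheLoader<UserGroupInformation, PROXY> proxyFactory) {
    this.cache = CacheBuilder.newBuilder()
        .maximumSize(10_000)                     // assumed bound
        .expireAfterAccess(10, TimeUnit.MINUTES) // assumed idle eviction
        .build(proxyFactory);                    // wraps proxy creation
  }

  PROXY get(UserGroupInformation ugi) {
    // One cached proxy per UGI, instead of pooling connections per call.
    return cache.getUnchecked(ugi);
  }
}
{code}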

> RBF: Improving Router Connection Management
> ---
>
> Key: HDFS-15757
> URL: https://issues.apache.org/jira/browse/HDFS-15757
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: rbf
>Reporter: Fengnan Li
>Assignee: Fengnan Li
>Priority: Major
>  Labels: pull-request-available
> Attachments: RBF_ Improving Router Connection Management_v2.pdf, RBF_ 
> Improving Router Connection Management_v3.pdf, RBF_ Router Connection 
> Management.pdf
>
>  Time Spent: 3h 10m
>  Remaining Estimate: 0h
>
> We have seen high number of connections from Router to namenodes, leaving 
> namenodes unstable.
> This ticket is trying to reduce connections through some changes. Please take 
> a look at the design and leave comments. 
> Thanks!



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-16011) HDFS: Support viewfs nested mount

2021-05-07 Thread zhu (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-16011?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

zhu updated HDFS-16011:
---
Description: 
The current viewfs does not support nested mount points. 

*E.g:*

    1./home/ => /home/

    2./home/work => /home/work

If mount point 1 is loaded, mount point 2 cannot be added, and the following 
exception will be thrown when loading 2.
{code:java}
throw new FileAlreadyExistsException("Path " + nextInode.fullPath +
 " already exists as link");
{code}

  was:
The current viewfs does not support nested mount points. 

*E.g:* 

    1./home/ => /home/

    2./home/work => /home/work

If mount point 1 is loaded, mount point 2 cannot be added, and the following 
exception will be thrown when loading 2
{code:java}
throw new FileAlreadyExistsException("Path " + nextInode.fullPath +
 " already exists as link");
{code}


> HDFS: Support viewfs nested mount
> -
>
> Key: HDFS-16011
> URL: https://issues.apache.org/jira/browse/HDFS-16011
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: hdfs
>Affects Versions: 3.2.2, 3.3.1
>Reporter: zhu
>Assignee: zhu
>Priority: Major
>
> The current viewfs does not support nested mount points. 
> *E.g:*
>     1./home/ => /home/
>     2./home/work => /home/work
> If mount point 1 is loaded, mount point 2 cannot be added, and the following 
> exception will be thrown when loading 2.
> {code:java}
> throw new FileAlreadyExistsException("Path " + nextInode.fullPath +
>  " already exists as link");
> {code}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Created] (HDFS-16011) HDFS: Support viewfs nested mount

2021-05-07 Thread zhu (Jira)
zhu created HDFS-16011:
--

 Summary: HDFS: Support viewfs nested mount
 Key: HDFS-16011
 URL: https://issues.apache.org/jira/browse/HDFS-16011
 Project: Hadoop HDFS
  Issue Type: Improvement
  Components: hdfs
Affects Versions: 3.2.2, 3.3.1
Reporter: zhu
Assignee: zhu


The current viewfs does not support nested mount points. 

*E.g:* 

    1./home/ => /home/

    2./home/work => /home/work

If mount point 1 is loaded, mount point 2 cannot be added, and the following 
exception will be thrown when loading 2
{code:java}
throw new FileAlreadyExistsException("Path " + nextInode.fullPath +
 " already exists as link");
{code}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Issue Comment Deleted] (HDFS-14986) ReplicaCachingGetSpaceUsed throws ConcurrentModificationException

2021-05-07 Thread lei w (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-14986?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

lei w updated HDFS-14986:
-
Comment: was deleted

(was: hi [~Aiphag0], will the deadlock you described above also appear in Hadoop 3.1.2? 
https://issues.apache.org/jira/browse/HDFS-14986?focusedCommentId=16973314=com.atlassian.jira.plugin.system.issuetabpanels%3Acomment-tabpanel#comment-16973314)

> ReplicaCachingGetSpaceUsed throws  ConcurrentModificationException
> --
>
> Key: HDFS-14986
> URL: https://issues.apache.org/jira/browse/HDFS-14986
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: datanode, performance
>Affects Versions: 2.10.0
>Reporter: Ryan Wu
>Assignee: Aiphago
>Priority: Major
> Fix For: 3.3.0, 3.1.4, 2.10.1
>
> Attachments: HDFS-14986.001.patch, HDFS-14986.002.patch, 
> HDFS-14986.003.patch, HDFS-14986.004.patch, HDFS-14986.005.patch, 
> HDFS-14986.006.patch
>
>
> Running DU across lots of disks is very expensive. We applied the patch 
> HDFS-14313 to get used space from ReplicaInfo in memory. However, new du 
> threads throw the exception:
> {code:java}
> // 2019-11-08 18:07:13,858 ERROR 
> [refreshUsed-/home/vipshop/hard_disk/7/dfs/dn/current/BP-1203969992--1450855658517]
>  
> org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.ReplicaCachingGetSpaceUsed:
>  ReplicaCachingGetSpaceUsed refresh error
> java.util.ConcurrentModificationException: Tree has been modified outside of 
> iterator
> at 
> org.apache.hadoop.hdfs.util.FoldedTreeSet$TreeSetIterator.checkForModification(FoldedTreeSet.java:311)
> 
> at 
> org.apache.hadoop.hdfs.util.FoldedTreeSet$TreeSetIterator.hasNext(FoldedTreeSet.java:256)
> 
> at java.util.AbstractCollection.addAll(AbstractCollection.java:343)
> at java.util.HashSet.<init>(HashSet.java:120)
> at 
> org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl.deepCopyReplica(FsDatasetImpl.java:1052)
> 
> at 
> org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.ReplicaCachingGetSpaceUsed.refresh(ReplicaCachingGetSpaceUsed.java:73)
> 
> at 
> org.apache.hadoop.fs.CachingGetSpaceUsed$RefreshThread.run(CachingGetSpaceUsed.java:178)
>    
> at java.lang.Thread.run(Thread.java:748)
> {code}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Issue Comment Deleted] (HDFS-14986) ReplicaCachingGetSpaceUsed throws ConcurrentModificationException

2021-05-07 Thread lei w (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-14986?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

lei w updated HDFS-14986:
-
Comment: was deleted

(was: hi [~Aiphag0], will the deadlock you described above also appear in Hadoop 
3.1.2? [comment-16973314](https://issues.apache.org/jira/browse/HDFS-14986?focusedCommentId=16973314=com.atlassian.jira.plugin.system.issuetabpanels%3Acomment-tabpanel#comment-16973314))

> ReplicaCachingGetSpaceUsed throws  ConcurrentModificationException
> --
>
> Key: HDFS-14986
> URL: https://issues.apache.org/jira/browse/HDFS-14986
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: datanode, performance
>Affects Versions: 2.10.0
>Reporter: Ryan Wu
>Assignee: Aiphago
>Priority: Major
> Fix For: 3.3.0, 3.1.4, 2.10.1
>
> Attachments: HDFS-14986.001.patch, HDFS-14986.002.patch, 
> HDFS-14986.003.patch, HDFS-14986.004.patch, HDFS-14986.005.patch, 
> HDFS-14986.006.patch
>
>
> Running DU across lots of disks is very expensive. We applied the patch 
> HDFS-14313 to get used space from ReplicaInfo in memory. However, new du 
> threads throw the exception:
> {code:java}
> // 2019-11-08 18:07:13,858 ERROR 
> [refreshUsed-/home/vipshop/hard_disk/7/dfs/dn/current/BP-1203969992--1450855658517]
>  
> org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.ReplicaCachingGetSpaceUsed:
>  ReplicaCachingGetSpaceUsed refresh error
> java.util.ConcurrentModificationException: Tree has been modified outside of 
> iterator
> at 
> org.apache.hadoop.hdfs.util.FoldedTreeSet$TreeSetIterator.checkForModification(FoldedTreeSet.java:311)
> 
> at 
> org.apache.hadoop.hdfs.util.FoldedTreeSet$TreeSetIterator.hasNext(FoldedTreeSet.java:256)
> 
> at java.util.AbstractCollection.addAll(AbstractCollection.java:343)
> at java.util.HashSet.<init>(HashSet.java:120)
> at 
> org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl.deepCopyReplica(FsDatasetImpl.java:1052)
> 
> at 
> org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.ReplicaCachingGetSpaceUsed.refresh(ReplicaCachingGetSpaceUsed.java:73)
> 
> at 
> org.apache.hadoop.fs.CachingGetSpaceUsed$RefreshThread.run(CachingGetSpaceUsed.java:178)
>    
> at java.lang.Thread.run(Thread.java:748)
> {code}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-14986) ReplicaCachingGetSpaceUsed throws ConcurrentModificationException

2021-05-07 Thread lei w (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-14986?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17340751#comment-17340751
 ] 

lei w commented on HDFS-14986:
--

hi [~Aiphag0], will the deadlock you described above also appear in Hadoop 3.1.2? 
https://issues.apache.org/jira/browse/HDFS-14986?focusedCommentId=16973314=com.atlassian.jira.plugin.system.issuetabpanels%3Acomment-tabpanel#comment-16973314

> ReplicaCachingGetSpaceUsed throws  ConcurrentModificationException
> --
>
> Key: HDFS-14986
> URL: https://issues.apache.org/jira/browse/HDFS-14986
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: datanode, performance
>Affects Versions: 2.10.0
>Reporter: Ryan Wu
>Assignee: Aiphago
>Priority: Major
> Fix For: 3.3.0, 3.1.4, 2.10.1
>
> Attachments: HDFS-14986.001.patch, HDFS-14986.002.patch, 
> HDFS-14986.003.patch, HDFS-14986.004.patch, HDFS-14986.005.patch, 
> HDFS-14986.006.patch
>
>
> Running DU across lots of disks is very expensive. We applied the patch 
> HDFS-14313 to get used space from ReplicaInfo in memory. However, new du 
> threads throw the exception:
> {code:java}
> // 2019-11-08 18:07:13,858 ERROR 
> [refreshUsed-/home/vipshop/hard_disk/7/dfs/dn/current/BP-1203969992--1450855658517]
>  
> org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.ReplicaCachingGetSpaceUsed:
>  ReplicaCachingGetSpaceUsed refresh error
> java.util.ConcurrentModificationException: Tree has been modified outside of 
> iterator
> at 
> org.apache.hadoop.hdfs.util.FoldedTreeSet$TreeSetIterator.checkForModification(FoldedTreeSet.java:311)
> 
> at 
> org.apache.hadoop.hdfs.util.FoldedTreeSet$TreeSetIterator.hasNext(FoldedTreeSet.java:256)
> 
> at java.util.AbstractCollection.addAll(AbstractCollection.java:343)
> at java.util.HashSet.<init>(HashSet.java:120)
> at 
> org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl.deepCopyReplica(FsDatasetImpl.java:1052)
> 
> at 
> org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.ReplicaCachingGetSpaceUsed.refresh(ReplicaCachingGetSpaceUsed.java:73)
> 
> at 
> org.apache.hadoop.fs.CachingGetSpaceUsed$RefreshThread.run(CachingGetSpaceUsed.java:178)
>    
> at java.lang.Thread.run(Thread.java:748)
> {code}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-14986) ReplicaCachingGetSpaceUsed throws ConcurrentModificationException

2021-05-07 Thread lei w (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-14986?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17340750#comment-17340750
 ] 

lei w commented on HDFS-14986:
--

hi [~Aiphag0], will the deadlock you described above also appear in Hadoop 3.1.2? 
[comment-16973314](https://issues.apache.org/jira/browse/HDFS-14986?focusedCommentId=16973314=com.atlassian.jira.plugin.system.issuetabpanels%3Acomment-tabpanel#comment-16973314)

> ReplicaCachingGetSpaceUsed throws  ConcurrentModificationException
> --
>
> Key: HDFS-14986
> URL: https://issues.apache.org/jira/browse/HDFS-14986
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: datanode, performance
>Affects Versions: 2.10.0
>Reporter: Ryan Wu
>Assignee: Aiphago
>Priority: Major
> Fix For: 3.3.0, 3.1.4, 2.10.1
>
> Attachments: HDFS-14986.001.patch, HDFS-14986.002.patch, 
> HDFS-14986.003.patch, HDFS-14986.004.patch, HDFS-14986.005.patch, 
> HDFS-14986.006.patch
>
>
> Running DU across lots of disks is very expensive. We applied the patch 
> HDFS-14313 to get used space from ReplicaInfo in memory. However, new du 
> threads throw the exception:
> {code:java}
> // 2019-11-08 18:07:13,858 ERROR 
> [refreshUsed-/home/vipshop/hard_disk/7/dfs/dn/current/BP-1203969992--1450855658517]
>  
> org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.ReplicaCachingGetSpaceUsed:
>  ReplicaCachingGetSpaceUsed refresh error
> java.util.ConcurrentModificationException: Tree has been modified outside of 
> iterator
> at 
> org.apache.hadoop.hdfs.util.FoldedTreeSet$TreeSetIterator.checkForModification(FoldedTreeSet.java:311)
> 
> at 
> org.apache.hadoop.hdfs.util.FoldedTreeSet$TreeSetIterator.hasNext(FoldedTreeSet.java:256)
> 
> at java.util.AbstractCollection.addAll(AbstractCollection.java:343)
> at java.util.HashSet.<init>(HashSet.java:120)
> at 
> org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl.deepCopyReplica(FsDatasetImpl.java:1052)
> 
> at 
> org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.ReplicaCachingGetSpaceUsed.refresh(ReplicaCachingGetSpaceUsed.java:73)
> 
> at 
> org.apache.hadoop.fs.CachingGetSpaceUsed$RefreshThread.run(CachingGetSpaceUsed.java:178)
>    
> at java.lang.Thread.run(Thread.java:748)
> {code}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-14986) ReplicaCachingGetSpaceUsed throws ConcurrentModificationException

2021-05-07 Thread lei w (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-14986?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17340749#comment-17340749
 ] 

lei w commented on HDFS-14986:
--

hi [~Aiphag0], does the deadlock you described above also appear in Hadoop 3.1.2? 
[comment-16973314](https://issues.apache.org/jira/browse/HDFS-14986?focusedCommentId=16973314=com.atlassian.jira.plugin.system.issuetabpanels%3Acomment-tabpanel#comment-16973314)

> ReplicaCachingGetSpaceUsed throws  ConcurrentModificationException
> --
>
> Key: HDFS-14986
> URL: https://issues.apache.org/jira/browse/HDFS-14986
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: datanode, performance
>Affects Versions: 2.10.0
>Reporter: Ryan Wu
>Assignee: Aiphago
>Priority: Major
> Fix For: 3.3.0, 3.1.4, 2.10.1
>
> Attachments: HDFS-14986.001.patch, HDFS-14986.002.patch, 
> HDFS-14986.003.patch, HDFS-14986.004.patch, HDFS-14986.005.patch, 
> HDFS-14986.006.patch
>
>
> Running DU across lots of disks is very expensive. We applied the patch 
> HDFS-14313 to get used space from ReplicaInfo in memory. However, new du 
> threads throw the exception:
> {code:java}
> // 2019-11-08 18:07:13,858 ERROR 
> [refreshUsed-/home/vipshop/hard_disk/7/dfs/dn/current/BP-1203969992--1450855658517]
>  
> org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.ReplicaCachingGetSpaceUsed:
>  ReplicaCachingGetSpaceUsed refresh error
> java.util.ConcurrentModificationException: Tree has been modified outside of 
> iterator
> at 
> org.apache.hadoop.hdfs.util.FoldedTreeSet$TreeSetIterator.checkForModification(FoldedTreeSet.java:311)
> 
> at 
> org.apache.hadoop.hdfs.util.FoldedTreeSet$TreeSetIterator.hasNext(FoldedTreeSet.java:256)
> 
> at java.util.AbstractCollection.addAll(AbstractCollection.java:343)
> at java.util.HashSet.<init>(HashSet.java:120)
> at 
> org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl.deepCopyReplica(FsDatasetImpl.java:1052)
> 
> at 
> org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.ReplicaCachingGetSpaceUsed.refresh(ReplicaCachingGetSpaceUsed.java:73)
> 
> at 
> org.apache.hadoop.fs.CachingGetSpaceUsed$RefreshThread.run(CachingGetSpaceUsed.java:178)
>    
> at java.lang.Thread.run(Thread.java:748)
> {code}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Issue Comment Deleted] (HDFS-14986) ReplicaCachingGetSpaceUsed throws ConcurrentModificationException

2021-05-07 Thread lei w (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-14986?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

lei w updated HDFS-14986:
-
Comment: was deleted

(was: hi [~Aiphag0], does the deadlock you described above also appear in Hadoop 3.1.2? 
[comment-16973314](https://issues.apache.org/jira/browse/HDFS-14986?focusedCommentId=16973314=com.atlassian.jira.plugin.system.issuetabpanels%3Acomment-tabpanel#comment-16973314))

> ReplicaCachingGetSpaceUsed throws  ConcurrentModificationException
> --
>
> Key: HDFS-14986
> URL: https://issues.apache.org/jira/browse/HDFS-14986
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: datanode, performance
>Affects Versions: 2.10.0
>Reporter: Ryan Wu
>Assignee: Aiphago
>Priority: Major
> Fix For: 3.3.0, 3.1.4, 2.10.1
>
> Attachments: HDFS-14986.001.patch, HDFS-14986.002.patch, 
> HDFS-14986.003.patch, HDFS-14986.004.patch, HDFS-14986.005.patch, 
> HDFS-14986.006.patch
>
>
> Running DU across lots of disks is very expensive. We applied the patch 
> HDFS-14313 to get used space from ReplicaInfo in memory. However, new du 
> threads throw the exception:
> {code:java}
> // 2019-11-08 18:07:13,858 ERROR 
> [refreshUsed-/home/vipshop/hard_disk/7/dfs/dn/current/BP-1203969992--1450855658517]
>  
> org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.ReplicaCachingGetSpaceUsed:
>  ReplicaCachingGetSpaceUsed refresh error
> java.util.ConcurrentModificationException: Tree has been modified outside of 
> iterator
> at 
> org.apache.hadoop.hdfs.util.FoldedTreeSet$TreeSetIterator.checkForModification(FoldedTreeSet.java:311)
> 
> at 
> org.apache.hadoop.hdfs.util.FoldedTreeSet$TreeSetIterator.hasNext(FoldedTreeSet.java:256)
> 
> at java.util.AbstractCollection.addAll(AbstractCollection.java:343)
> at java.util.HashSet.<init>(HashSet.java:120)
> at 
> org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl.deepCopyReplica(FsDatasetImpl.java:1052)
> 
> at 
> org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.ReplicaCachingGetSpaceUsed.refresh(ReplicaCachingGetSpaceUsed.java:73)
> 
> at 
> org.apache.hadoop.fs.CachingGetSpaceUsed$RefreshThread.run(CachingGetSpaceUsed.java:178)
>    
> at java.lang.Thread.run(Thread.java:748)
> {code}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-16009) HDFS tiered storage support

2021-05-07 Thread JiangHua Zhu (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-16009?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17340742#comment-17340742
 ] 

JiangHua Zhu commented on HDFS-16009:
-

[~LeonG], I hope we can communicate often.


> HDFS tiered storage support
> ---
>
> Key: HDFS-16009
> URL: https://issues.apache.org/jira/browse/HDFS-16009
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Reporter: JiangHua Zhu
>Priority: Major
>
> HDFS currently stores data blocks on different DataNodes. These DataNodes 
> are chosen randomly within the scope of the network topology (local 
> rack/different racks/node group).
> When the NameNode allocates DataNodes to a block, the selection is random; 
> that is, every DataNode is equally likely to be chosen.
> We should provide a tiered strategy. For example, DataNode1, DataNode2, 
> and DataNode3 are on one tier, and DataNode4 and DataNode5 are on another. 
> Because the hardware may differ between DataNodes, it is worth doing tiered 
> storage so that DataNodes with weaker hardware store at most 50%, 40%, or 
> even less. This is conducive to cluster maintenance and stability.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Issue Comment Deleted] (HDFS-14986) ReplicaCachingGetSpaceUsed throws ConcurrentModificationException

2021-05-07 Thread lei w (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-14986?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

lei w updated HDFS-14986:
-
Comment: was deleted

(was: hi Aiphag0, will the deadlock appear in Hadoop 3.x?)

> ReplicaCachingGetSpaceUsed throws  ConcurrentModificationException
> --
>
> Key: HDFS-14986
> URL: https://issues.apache.org/jira/browse/HDFS-14986
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: datanode, performance
>Affects Versions: 2.10.0
>Reporter: Ryan Wu
>Assignee: Aiphago
>Priority: Major
> Fix For: 3.3.0, 3.1.4, 2.10.1
>
> Attachments: HDFS-14986.001.patch, HDFS-14986.002.patch, 
> HDFS-14986.003.patch, HDFS-14986.004.patch, HDFS-14986.005.patch, 
> HDFS-14986.006.patch
>
>
> Running DU across lots of disks is very expensive. We applied the patch 
> HDFS-14313 to get used space from ReplicaInfo in memory. However, new du 
> threads throw the exception:
> {code:java}
> // 2019-11-08 18:07:13,858 ERROR 
> [refreshUsed-/home/vipshop/hard_disk/7/dfs/dn/current/BP-1203969992--1450855658517]
>  
> org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.ReplicaCachingGetSpaceUsed:
>  ReplicaCachingGetSpaceUsed refresh error
> java.util.ConcurrentModificationException: Tree has been modified outside of 
> iterator
> at 
> org.apache.hadoop.hdfs.util.FoldedTreeSet$TreeSetIterator.checkForModification(FoldedTreeSet.java:311)
> 
> at 
> org.apache.hadoop.hdfs.util.FoldedTreeSet$TreeSetIterator.hasNext(FoldedTreeSet.java:256)
> 
> at java.util.AbstractCollection.addAll(AbstractCollection.java:343)
> at java.util.HashSet.<init>(HashSet.java:120)
> at 
> org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl.deepCopyReplica(FsDatasetImpl.java:1052)
> 
> at 
> org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.ReplicaCachingGetSpaceUsed.refresh(ReplicaCachingGetSpaceUsed.java:73)
> 
> at 
> org.apache.hadoop.fs.CachingGetSpaceUsed$RefreshThread.run(CachingGetSpaceUsed.java:178)
>    
> at java.lang.Thread.run(Thread.java:748)
> {code}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-14986) ReplicaCachingGetSpaceUsed throws ConcurrentModificationException

2021-05-07 Thread lei w (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-14986?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17340727#comment-17340727
 ] 

lei w commented on HDFS-14986:
--

hi Aiphag0, will the deadlock appear in Hadoop 3.x?

> ReplicaCachingGetSpaceUsed throws  ConcurrentModificationException
> --
>
> Key: HDFS-14986
> URL: https://issues.apache.org/jira/browse/HDFS-14986
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: datanode, performance
>Affects Versions: 2.10.0
>Reporter: Ryan Wu
>Assignee: Aiphago
>Priority: Major
> Fix For: 3.3.0, 3.1.4, 2.10.1
>
> Attachments: HDFS-14986.001.patch, HDFS-14986.002.patch, 
> HDFS-14986.003.patch, HDFS-14986.004.patch, HDFS-14986.005.patch, 
> HDFS-14986.006.patch
>
>
> Running DU across lots of disks is very expensive. We applied the patch 
> HDFS-14313 to get used space from ReplicaInfo in memory. However, new du 
> threads throw the exception:
> {code:java}
> // 2019-11-08 18:07:13,858 ERROR 
> [refreshUsed-/home/vipshop/hard_disk/7/dfs/dn/current/BP-1203969992--1450855658517]
>  
> org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.ReplicaCachingGetSpaceUsed:
>  ReplicaCachingGetSpaceUsed refresh error
> java.util.ConcurrentModificationException: Tree has been modified outside of 
> iterator
> at 
> org.apache.hadoop.hdfs.util.FoldedTreeSet$TreeSetIterator.checkForModification(FoldedTreeSet.java:311)
> 
> at 
> org.apache.hadoop.hdfs.util.FoldedTreeSet$TreeSetIterator.hasNext(FoldedTreeSet.java:256)
> 
> at java.util.AbstractCollection.addAll(AbstractCollection.java:343)
> at java.util.HashSet.<init>(HashSet.java:120)
> at 
> org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl.deepCopyReplica(FsDatasetImpl.java:1052)
> 
> at 
> org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.ReplicaCachingGetSpaceUsed.refresh(ReplicaCachingGetSpaceUsed.java:73)
> 
> at 
> org.apache.hadoop.fs.CachingGetSpaceUsed$RefreshThread.run(CachingGetSpaceUsed.java:178)
>    
> at java.lang.Thread.run(Thread.java:748)
> {code}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Work logged] (HDFS-16008) RBF: Tool to initialize ViewFS Mapping to Router

2021-05-07 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-16008?focusedWorklogId=593213=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-593213
 ]

ASF GitHub Bot logged work on HDFS-16008:
-

Author: ASF GitHub Bot
Created on: 07/May/21 09:57
Start Date: 07/May/21 09:57
Worklog Time Spent: 10m 
  Work Description: hadoop-yetus commented on pull request #2981:
URL: https://github.com/apache/hadoop/pull/2981#issuecomment-834228785


   :confetti_ball: **+1 overall**
   
   
   
   
   
   
   | Vote | Subsystem | Runtime |  Logfile | Comment |
   |::|--:|:|::|:---:|
   | +0 :ok: |  reexec  |   0m 50s |  |  Docker mode activated.  |
    _ Prechecks _ |
   | +1 :green_heart: |  dupname  |   0m  0s |  |  No case conflicting files 
found.  |
   | +0 :ok: |  codespell  |   0m  1s |  |  codespell was not available.  |
   | +0 :ok: |  markdownlint  |   0m  1s |  |  markdownlint was not available.  
|
   | +1 :green_heart: |  @author  |   0m  0s |  |  The patch does not contain 
any @author tags.  |
   | +1 :green_heart: |  test4tests  |   0m  0s |  |  The patch appears to 
include 1 new or modified test files.  |
    _ trunk Compile Tests _ |
   | +1 :green_heart: |  mvninstall  |  33m 50s |  |  trunk passed  |
   | +1 :green_heart: |  compile  |   0m 38s |  |  trunk passed with JDK 
Ubuntu-11.0.11+9-Ubuntu-0ubuntu2.20.04  |
   | +1 :green_heart: |  compile  |   0m 33s |  |  trunk passed with JDK 
Private Build-1.8.0_292-8u292-b10-0ubuntu1~20.04-b10  |
   | +1 :green_heart: |  checkstyle  |   0m 23s |  |  trunk passed  |
   | +1 :green_heart: |  mvnsite  |   0m 38s |  |  trunk passed  |
   | +1 :green_heart: |  javadoc  |   0m 37s |  |  trunk passed with JDK 
Ubuntu-11.0.11+9-Ubuntu-0ubuntu2.20.04  |
   | +1 :green_heart: |  javadoc  |   0m 51s |  |  trunk passed with JDK 
Private Build-1.8.0_292-8u292-b10-0ubuntu1~20.04-b10  |
   | +1 :green_heart: |  spotbugs  |   1m 14s |  |  trunk passed  |
   | +1 :green_heart: |  shadedclient  |  17m  1s |  |  branch has no errors 
when building and testing our client artifacts.  |
    _ Patch Compile Tests _ |
   | +1 :green_heart: |  mvninstall  |   0m 33s |  |  the patch passed  |
   | +1 :green_heart: |  compile  |   0m 33s |  |  the patch passed with JDK 
Ubuntu-11.0.11+9-Ubuntu-0ubuntu2.20.04  |
   | +1 :green_heart: |  javac  |   0m 33s |  |  the patch passed  |
   | +1 :green_heart: |  compile  |   0m 27s |  |  the patch passed with JDK 
Private Build-1.8.0_292-8u292-b10-0ubuntu1~20.04-b10  |
   | +1 :green_heart: |  javac  |   0m 27s |  |  the patch passed  |
   | +1 :green_heart: |  blanks  |   0m  0s |  |  The patch has no blanks 
issues.  |
   | -0 :warning: |  checkstyle  |   0m 16s | 
[/results-checkstyle-hadoop-hdfs-project_hadoop-hdfs-rbf.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-2981/4/artifact/out/results-checkstyle-hadoop-hdfs-project_hadoop-hdfs-rbf.txt)
 |  hadoop-hdfs-project/hadoop-hdfs-rbf: The patch generated 1 new + 0 
unchanged - 0 fixed = 1 total (was 0)  |
   | +1 :green_heart: |  mvnsite  |   0m 31s |  |  the patch passed  |
   | +1 :green_heart: |  javadoc  |   0m 30s |  |  the patch passed with JDK 
Ubuntu-11.0.11+9-Ubuntu-0ubuntu2.20.04  |
   | +1 :green_heart: |  javadoc  |   0m 45s |  |  the patch passed with JDK 
Private Build-1.8.0_292-8u292-b10-0ubuntu1~20.04-b10  |
   | +1 :green_heart: |  spotbugs  |   1m 17s |  |  the patch passed  |
   | +1 :green_heart: |  shadedclient  |  16m 41s |  |  patch has no errors 
when building and testing our client artifacts.  |
    _ Other Tests _ |
   | +1 :green_heart: |  unit  |  21m 57s |  |  hadoop-hdfs-rbf in the patch 
passed.  |
   | +1 :green_heart: |  asflicense  |   0m 30s |  |  The patch does not 
generate ASF License warnings.  |
   |  |   | 101m 58s |  |  |
   
   
   | Subsystem | Report/Notes |
   |--:|:-|
   | Docker | ClientAPI=1.41 ServerAPI=1.41 base: 
https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-2981/4/artifact/out/Dockerfile
 |
   | GITHUB PR | https://github.com/apache/hadoop/pull/2981 |
   | Optional Tests | dupname asflicense compile javac javadoc mvninstall 
mvnsite unit shadedclient spotbugs checkstyle codespell markdownlint |
   | uname | Linux 411e1f12e5ff 4.15.0-128-generic #131-Ubuntu SMP Wed Dec 9 
06:57:35 UTC 2020 x86_64 x86_64 x86_64 GNU/Linux |
   | Build tool | maven |
   | Personality | dev-support/bin/hadoop.sh |
   | git revision | trunk / 7fa083ac87f720663afb7c52953c830e45cefd4a |
   | Default Java | Private Build-1.8.0_292-8u292-b10-0ubuntu1~20.04-b10 |
   | Multi-JDK versions | 
/usr/lib/jvm/java-11-openjdk-amd64:Ubuntu-11.0.11+9-Ubuntu-0ubuntu2.20.04 
/usr/lib/jvm/java-8-openjdk-amd64:Private 
Build-1.8.0_292-8u292-b10-0ubuntu1~20.04-b10 |
   |  Test Results | 
https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-2981/4/testReport/ |
   | Max. 

[jira] [Work logged] (HDFS-16008) RBF: Tool to initialize ViewFS Mapping to Router

2021-05-07 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-16008?focusedWorklogId=593188=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-593188
 ]

ASF GitHub Bot logged work on HDFS-16008:
-

Author: ASF GitHub Bot
Created on: 07/May/21 08:06
Start Date: 07/May/21 08:06
Worklog Time Spent: 10m 
  Work Description: hadoop-yetus commented on pull request #2981:
URL: https://github.com/apache/hadoop/pull/2981#issuecomment-834155907


   :broken_heart: **-1 overall**
   
   
   
   
   
   
   | Vote | Subsystem | Runtime |  Logfile | Comment |
   |::|--:|:|::|:---:|
   | +0 :ok: |  reexec  |   0m 48s |  |  Docker mode activated.  |
    _ Prechecks _ |
   | +1 :green_heart: |  dupname  |   0m  0s |  |  No case conflicting files 
found.  |
   | +0 :ok: |  codespell  |   0m  0s |  |  codespell was not available.  |
   | +0 :ok: |  markdownlint  |   0m  0s |  |  markdownlint was not available.  
|
   | +1 :green_heart: |  @author  |   0m  0s |  |  The patch does not contain 
any @author tags.  |
   | +1 :green_heart: |  test4tests  |   0m  0s |  |  The patch appears to 
include 1 new or modified test files.  |
    _ trunk Compile Tests _ |
   | +1 :green_heart: |  mvninstall  |  33m 38s |  |  trunk passed  |
   | +1 :green_heart: |  compile  |   0m 41s |  |  trunk passed with JDK 
Ubuntu-11.0.11+9-Ubuntu-0ubuntu2.20.04  |
   | +1 :green_heart: |  compile  |   0m 33s |  |  trunk passed with JDK 
Private Build-1.8.0_292-8u292-b10-0ubuntu1~20.04-b10  |
   | +1 :green_heart: |  checkstyle  |   0m 22s |  |  trunk passed  |
   | +1 :green_heart: |  mvnsite  |   0m 39s |  |  trunk passed  |
   | +1 :green_heart: |  javadoc  |   0m 37s |  |  trunk passed with JDK 
Ubuntu-11.0.11+9-Ubuntu-0ubuntu2.20.04  |
   | +1 :green_heart: |  javadoc  |   0m 51s |  |  trunk passed with JDK 
Private Build-1.8.0_292-8u292-b10-0ubuntu1~20.04-b10  |
   | +1 :green_heart: |  spotbugs  |   1m 15s |  |  trunk passed  |
   | +1 :green_heart: |  shadedclient  |  17m  2s |  |  branch has no errors 
when building and testing our client artifacts.  |
    _ Patch Compile Tests _ |
   | +1 :green_heart: |  mvninstall  |   0m 32s |  |  the patch passed  |
   | +1 :green_heart: |  compile  |   0m 34s |  |  the patch passed with JDK 
Ubuntu-11.0.11+9-Ubuntu-0ubuntu2.20.04  |
   | +1 :green_heart: |  javac  |   0m 34s |  |  the patch passed  |
   | +1 :green_heart: |  compile  |   0m 28s |  |  the patch passed with JDK 
Private Build-1.8.0_292-8u292-b10-0ubuntu1~20.04-b10  |
   | +1 :green_heart: |  javac  |   0m 28s |  |  the patch passed  |
   | -1 :x: |  blanks  |   0m  0s | 
[/blanks-eol.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-2981/3/artifact/out/blanks-eol.txt)
 |  The patch has 1 line(s) that end in blanks. Use git apply --whitespace=fix 
<>. Refer https://git-scm.com/docs/git-apply  |
   | -0 :warning: |  checkstyle  |   0m 16s | 
[/results-checkstyle-hadoop-hdfs-project_hadoop-hdfs-rbf.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-2981/3/artifact/out/results-checkstyle-hadoop-hdfs-project_hadoop-hdfs-rbf.txt)
 |  hadoop-hdfs-project/hadoop-hdfs-rbf: The patch generated 1 new + 0 
unchanged - 0 fixed = 1 total (was 0)  |
   | +1 :green_heart: |  mvnsite  |   0m 31s |  |  the patch passed  |
   | +1 :green_heart: |  javadoc  |   0m 31s |  |  the patch passed with JDK 
Ubuntu-11.0.11+9-Ubuntu-0ubuntu2.20.04  |
   | +1 :green_heart: |  javadoc  |   0m 48s |  |  the patch passed with JDK 
Private Build-1.8.0_292-8u292-b10-0ubuntu1~20.04-b10  |
   | +1 :green_heart: |  spotbugs  |   1m 19s |  |  the patch passed  |
   | +1 :green_heart: |  shadedclient  |  16m 46s |  |  patch has no errors 
when building and testing our client artifacts.  |
    _ Other Tests _ |
   | +1 :green_heart: |  unit  |  22m  5s |  |  hadoop-hdfs-rbf in the patch 
passed.  |
   | +1 :green_heart: |  asflicense  |   0m 29s |  |  The patch does not 
generate ASF License warnings.  |
   |  |   | 102m  1s |  |  |
   
   
   | Subsystem | Report/Notes |
   |--:|:-|
   | Docker | ClientAPI=1.41 ServerAPI=1.41 base: 
https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-2981/3/artifact/out/Dockerfile
 |
   | GITHUB PR | https://github.com/apache/hadoop/pull/2981 |
   | Optional Tests | dupname asflicense compile javac javadoc mvninstall 
mvnsite unit shadedclient spotbugs checkstyle codespell markdownlint |
   | uname | Linux 19992cf6eb20 4.15.0-128-generic #131-Ubuntu SMP Wed Dec 9 
06:57:35 UTC 2020 x86_64 x86_64 x86_64 GNU/Linux |
   | Build tool | maven |
   | Personality | dev-support/bin/hadoop.sh |
   | git revision | trunk / e0b806f234d7217010f9ac1d804f08b77025bc70 |
   | Default Java | Private Build-1.8.0_292-8u292-b10-0ubuntu1~20.04-b10 |
   | Multi-JDK versions | 
/usr/lib/jvm/java-11-openjdk-amd64:Ubuntu-11.0.11+9-Ubuntu-0ubuntu2.20.04 

[jira] [Commented] (HDFS-15923) RBF: Authentication failed when rename across sub clusters

2021-05-07 Thread Jinglun (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-15923?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17340649#comment-17340649
 ] 

Jinglun commented on HDFS-15923:


+1 on v03. I'll wait one day for further comments; after that I'll commit this.

> RBF: Authentication failed when rename across sub clusters
> 
>
> Key: HDFS-15923
> URL: https://issues.apache.org/jira/browse/HDFS-15923
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: rbf
>Reporter: zhuobin zheng
>Assignee: zhuobin zheng
>Priority: Major
>  Labels: RBF, pull-request-available, rename
> Attachments: HDFS-15923.001.patch, HDFS-15923.002.patch, 
> HDFS-15923.003.patch, HDFS-15923.stack-trace, 
> hdfs-15923-fix-security-issue.patch
>
>  Time Spent: 1h 20m
>  Remaining Estimate: 0h
>
> Renaming across subclusters in an RBF and Kerberos environment hits the 
> following two errors:
>  # When saving the object to the journal.
>  # When the precheck tries to get the src file status.
> So we need to create the DistcpProcedure and TrashProcedure, and submit the 
> job, under the Router login UGI via doAs (a sketch follows below).
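> A minimal sketch of that doAs approach, assuming a hypothetical 
> submitRenameJob() helper standing in for procedure creation and job 
> submission:
> {code:java}
> import java.security.PrivilegedExceptionAction;
> import org.apache.hadoop.security.UserGroupInformation;
>
> public class RouterRenameUgiSketch {
>   // Hypothetical stand-in for creating the DistcpProcedure/TrashProcedure
>   // and submitting the balance job.
>   static void submitRenameJob() { }
>
>   public static void main(String[] args) throws Exception {
>     // Run as the router's login user, whose Kerberos TGT lives in the
>     // router process, rather than as the RPC caller's UGI, which holds no
>     // TGT on the router host.
>     UserGroupInformation routerUgi = UserGroupInformation.getLoginUser();
>     routerUgi.doAs((PrivilegedExceptionAction<Void>) () -> {
>       submitRenameJob();
>       return null;
>     });
>   }
> }
> {code}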
>  
> Besides, we should check the user's permissions for the src and dst paths 
> on the router side before doing the internal rename. (HDFS-15973)
> First error: saving the object to the journal.
> {code:java}
> // code placeholder
> 2021-03-23 14:01:16,233 WARN org.apache.hadoop.ipc.Client: Exception 
> encountered while connecting to the server 
> javax.security.sasl.SaslException: GSS initiate failed [Caused by 
> GSSException: No valid credentials provided (Mechanism level: Failed to find 
> any Kerberos tgt)]
> at 
> com.sun.security.sasl.gsskerb.GssKrb5Client.evaluateChallenge(GssKrb5Client.java:211)
> at 
> org.apache.hadoop.security.SaslRpcClient.saslConnect(SaslRpcClient.java:408)
> at 
> org.apache.hadoop.ipc.Client$Connection.setupSaslConnection(Client.java:622)
> at 
> org.apache.hadoop.ipc.Client$Connection.access$2300(Client.java:413)
> at org.apache.hadoop.ipc.Client$Connection$2.run(Client.java:822)
> at org.apache.hadoop.ipc.Client$Connection$2.run(Client.java:818)
> at java.security.AccessController.doPrivileged(Native Method)
> at javax.security.auth.Subject.doAs(Subject.java:422)
> at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1762)
> at 
> org.apache.hadoop.ipc.Client$Connection.setupIOstreams(Client.java:818)
> at 
> org.apache.hadoop.ipc.Client$Connection.access$3800(Client.java:413)
> at org.apache.hadoop.ipc.Client.getConnection(Client.java:1636)
> at org.apache.hadoop.ipc.Client.call(Client.java:1452)
> at org.apache.hadoop.ipc.Client.call(Client.java:1405)
> at 
> org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:233)
> at 
> org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:118)
> at com.sun.proxy.$Proxy11.create(Unknown Source)
> at 
> org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTranslatorPB.create(ClientNamenodeProtocolTranslatorPB.java:376)
> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
> at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> at java.lang.reflect.Method.invoke(Method.java:498)
> at 
> org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:422)
> at 
> org.apache.hadoop.io.retry.RetryInvocationHandler$Call.invokeMethod(RetryInvocationHandler.java:165)
> at 
> org.apache.hadoop.io.retry.RetryInvocationHandler$Call.invoke(RetryInvocationHandler.java:157)
> at 
> org.apache.hadoop.io.retry.RetryInvocationHandler$Call.invokeOnce(RetryInvocationHandler.java:95)
> at 
> org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:359)
> at com.sun.proxy.$Proxy12.create(Unknown Source)
> at 
> org.apache.hadoop.hdfs.DFSOutputStream.newStreamForCreate(DFSOutputStream.java:277)
> at org.apache.hadoop.hdfs.DFSClient.create(DFSClient.java:1240)
> at org.apache.hadoop.hdfs.DFSClient.create(DFSClient.java:1219)
> at org.apache.hadoop.hdfs.DFSClient.create(DFSClient.java:1201)
> at org.apache.hadoop.hdfs.DFSClient.create(DFSClient.java:1139)
> at 
> org.apache.hadoop.hdfs.DistributedFileSystem$8.doCall(DistributedFileSystem.java:533)
> at 
> org.apache.hadoop.hdfs.DistributedFileSystem$8.doCall(DistributedFileSystem.java:530)
> at 
> org.apache.hadoop.fs.FileSystemLinkResolver.resolve(FileSystemLinkResolver.java:81)
> at 
> 

[jira] [Commented] (HDFS-16009) HDFS tiered storage support

2021-05-07 Thread Leon Gao (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-16009?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17340623#comment-17340623
 ] 

Leon Gao commented on HDFS-16009:
-

Cool, we have been running that in our production; please feel free to give it 
a try.

> HDFS tiered storage support
> ---
>
> Key: HDFS-16009
> URL: https://issues.apache.org/jira/browse/HDFS-16009
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Reporter: JiangHua Zhu
>Priority: Major
>
> HDFS currently stores data blocks on different DataNodes. These DataNodes 
> are chosen randomly within the scope of the network topology (local 
> rack/different racks/node group).
> When the NameNode allocates DataNodes for a block, the selection is random, 
> that is, every DataNode is equally likely to be chosen.
> We should add a tiered strategy. For example, DataNode1, DataNode2, and 
> DataNode3 could form one tier, and DataNode4 and DataNode5 another. Because 
> the hardware may differ between DataNodes, tiered storage would let 
> DataNodes with weaker hardware be capped at 50%, 40%, or even less. This is 
> conducive to cluster maintenance and stability. (A toy weighted-selection 
> sketch follows below.)
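> A toy sketch of tier-weighted DataNode selection (hypothetical names, not 
> an existing HDFS API; a real implementation would live in a pluggable 
> BlockPlacementPolicy):
> {code:java}
> import java.util.ArrayList;
> import java.util.List;
> import java.util.Random;
>
> public class TieredChoiceSketch {
>   static class Node {
>     final String name;
>     final double weight; // smaller weight = weaker tier, picked less often
>     Node(String name, double weight) { this.name = name; this.weight = weight; }
>   }
>
>   // Weighted random choice: a node is picked with probability
>   // weight / totalWeight instead of uniformly.
>   static Node choose(List<Node> nodes, Random rnd) {
>     double total = 0;
>     for (Node n : nodes) {
>       total += n.weight;
>     }
>     double r = rnd.nextDouble() * total;
>     for (Node n : nodes) {
>       r -= n.weight;
>       if (r <= 0) {
>         return n;
>       }
>     }
>     return nodes.get(nodes.size() - 1); // guard against rounding drift
>   }
>
>   public static void main(String[] args) {
>     List<Node> nodes = new ArrayList<>();
>     nodes.add(new Node("DataNode1", 1.0)); // stronger tier
>     nodes.add(new Node("DataNode2", 1.0));
>     nodes.add(new Node("DataNode3", 1.0));
>     nodes.add(new Node("DataNode4", 0.5)); // weaker tier, capped lower
>     nodes.add(new Node("DataNode5", 0.4));
>     System.out.println(choose(nodes, new Random()).name);
>   }
> }
> {code}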



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Work logged] (HDFS-16007) Vulnerabilities found when serializing enum value

2021-05-07 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-16007?focusedWorklogId=593160=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-593160
 ]

ASF GitHub Bot logged work on HDFS-16007:
-

Author: ASF GitHub Bot
Created on: 07/May/21 06:34
Start Date: 07/May/21 06:34
Worklog Time Spent: 10m 
  Work Description: virajjasani commented on pull request #2982:
URL: https://github.com/apache/hadoop/pull/2982#issuecomment-834106326


   > I prefer Validate#validIndex in commons-lang3 rather than Guava. We should 
   avoid Guava where possible.
   
   Sure, let me make this change right away.
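   
   For reference, a hedged side-by-side of the two options discussed (the 
   array and index here are arbitrary, not from the patch):
   {code:java}
   import com.google.common.base.Preconditions;
   import org.apache.commons.lang3.Validate;
   
   public class IndexCheckDemo {
     public static void main(String[] args) {
       String[] values = {"FINALIZED", "RBW", "RWR"};
       int i = 2;
       // commons-lang3 (preferred here): throws IndexOutOfBoundsException
       // with a descriptive message unless 0 <= i < values.length.
       Validate.validIndex(values, i);
       // Guava equivalent being avoided:
       Preconditions.checkElementIndex(i, values.length);
       System.out.println(values[i]);
     }
   }
   {code}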


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 593160)
Time Spent: 0.5h  (was: 20m)

> Vulnerabilities found when serializing enum value
> -
>
> Key: HDFS-16007
> URL: https://issues.apache.org/jira/browse/HDFS-16007
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Reporter: junwen yang
>Assignee: Viraj Jasani
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 0.5h
>  Remaining Estimate: 0h
>
> The ReplicaState enum uses its ordinal for serialization and 
> deserialization, which is sensitive to the declaration order and can cause 
> issues similar to HDFS-15624.
> To avoid this, either add comments telling later developers not to reorder 
> this enum, or add index checking in the read and getState functions to 
> avoid an index-out-of-bounds error (a sketch follows below).
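> A hedged sketch of that bounds check (the enum constants mirror 
> ReplicaState; the helper shape is illustrative, not the actual patch):
> {code:java}
> import org.apache.commons.lang3.Validate;
>
> enum ReplicaState {
>   FINALIZED, RBW, RWR, RUR, TEMPORARY;
>
>   // Cache values() once; each call otherwise clones the backing array.
>   private static final ReplicaState[] CACHED_VALUES = values();
>
>   // Reject out-of-range ordinals with a clear error instead of an
>   // ArrayIndexOutOfBoundsException during deserialization.
>   static ReplicaState getState(int v) {
>     Validate.validIndex(CACHED_VALUES, v, "Invalid replica state index: %d", v);
>     return CACHED_VALUES[v];
>   }
> }
> {code}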



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org