[jira] [Work logged] (HDFS-16107) Split RPC configuration to isolate RPC

2021-07-19 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-16107?focusedWorklogId=624759&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-624759
 ]

ASF GitHub Bot logged work on HDFS-16107:
-

Author: ASF GitHub Bot
Created on: 20/Jul/21 05:52
Start Date: 20/Jul/21 05:52
Worklog Time Spent: 10m 
  Work Description: jianghuazhu commented on pull request #3170:
URL: https://github.com/apache/hadoop/pull/3170#issuecomment-883087348


   @jojochuang, I have submitted some new code; could you please review it?
   Thank you very much.
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 624759)
Time Spent: 1h 10m  (was: 1h)

> Split RPC configuration to isolate RPC
> --
>
> Key: HDFS-16107
> URL: https://issues.apache.org/jira/browse/HDFS-16107
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Reporter: JiangHua Zhu
>Assignee: JiangHua Zhu
>Priority: Minor
>  Labels: pull-request-available
>  Time Spent: 1h 10m
>  Remaining Estimate: 0h
>
> For RPC servers on different ports, there are some common configurations, 
> such as:
> ipc.server.read.threadpool.size
> ipc.server.read.connection-queue.size
> ipc.server.handler.queue.size
> Once we configure these values, they affect all requests (both client 
> requests and requests within the cluster).
> We should split these configurations so that they can be set per port, for 
> example:
> ipc.8020.server.read.threadpool.size
> ipc.8021.server.read.threadpool.size
> ipc.8020.server.read.connection-queue.size
> ipc.8021.server.read.connection-queue.size
> The advantage of this is that each RPC server is isolated and can deal with 
> request pressure from its own side without affecting the others.
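
A minimal sketch of what the proposed per-port override could look like in hdfs-site.xml, assuming the ipc.<port>.* naming from the description above (the exact key names and defaults in the eventual patch may differ):

{code:xml}
<!-- Global default: applies to every RPC server. -->
<property>
  <name>ipc.server.handler.queue.size</name>
  <value>100</value>
</property>

<!-- Hypothetical per-port override for the RPC server on port 8020 only. -->
<property>
  <name>ipc.8020.server.handler.queue.size</name>
  <value>200</value>
</property>
{code}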



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Work logged] (HDFS-16107) Split RPC configuration to isolate RPC

2021-07-19 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-16107?focusedWorklogId=624751&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-624751
 ]

ASF GitHub Bot logged work on HDFS-16107:
-

Author: ASF GitHub Bot
Created on: 20/Jul/21 05:30
Start Date: 20/Jul/21 05:30
Worklog Time Spent: 10m 
  Work Description: hadoop-yetus commented on pull request #3170:
URL: https://github.com/apache/hadoop/pull/3170#issuecomment-883071565


   :broken_heart: **-1 overall**
   
   
   
   
   
   
   | Vote | Subsystem | Runtime |  Logfile | Comment |
   |::|--:|:|::|:---:|
   | +0 :ok: |  reexec  |   0m 56s |  |  Docker mode activated.  |
    _ Prechecks _ |
   | +1 :green_heart: |  dupname  |   0m  0s |  |  No case conflicting files 
found.  |
   | +0 :ok: |  codespell  |   0m  0s |  |  codespell was not available.  |
   | +1 :green_heart: |  @author  |   0m  0s |  |  The patch does not contain 
any @author tags.  |
   | +1 :green_heart: |  test4tests  |   0m  0s |  |  The patch appears to 
include 1 new or modified test files.  |
    _ trunk Compile Tests _ |
   | +1 :green_heart: |  mvninstall  |  30m 50s |  |  trunk passed  |
   | +1 :green_heart: |  compile  |  21m  9s |  |  trunk passed with JDK 
Ubuntu-11.0.11+9-Ubuntu-0ubuntu2.20.04  |
   | +1 :green_heart: |  compile  |  18m 16s |  |  trunk passed with JDK 
Private Build-1.8.0_292-8u292-b10-0ubuntu1~20.04-b10  |
   | +1 :green_heart: |  checkstyle  |   1m 10s |  |  trunk passed  |
   | +1 :green_heart: |  mvnsite  |   1m 36s |  |  trunk passed  |
   | +1 :green_heart: |  javadoc  |   1m  8s |  |  trunk passed with JDK 
Ubuntu-11.0.11+9-Ubuntu-0ubuntu2.20.04  |
   | +1 :green_heart: |  javadoc  |   1m 41s |  |  trunk passed with JDK 
Private Build-1.8.0_292-8u292-b10-0ubuntu1~20.04-b10  |
   | +1 :green_heart: |  spotbugs  |   2m 25s |  |  trunk passed  |
   | +1 :green_heart: |  shadedclient  |  15m 44s |  |  branch has no errors 
when building and testing our client artifacts.  |
    _ Patch Compile Tests _ |
   | +1 :green_heart: |  mvninstall  |   0m 55s |  |  the patch passed  |
   | +1 :green_heart: |  compile  |  20m  8s |  |  the patch passed with JDK 
Ubuntu-11.0.11+9-Ubuntu-0ubuntu2.20.04  |
   | +1 :green_heart: |  javac  |  20m  8s |  |  the patch passed  |
   | +1 :green_heart: |  compile  |  18m 38s |  |  the patch passed with JDK 
Private Build-1.8.0_292-8u292-b10-0ubuntu1~20.04-b10  |
   | +1 :green_heart: |  javac  |  18m 38s |  |  the patch passed  |
   | +1 :green_heart: |  blanks  |   0m  0s |  |  The patch has no blanks 
issues.  |
   | -0 :warning: |  checkstyle  |   1m  2s | 
[/results-checkstyle-hadoop-common-project_hadoop-common.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-3170/5/artifact/out/results-checkstyle-hadoop-common-project_hadoop-common.txt)
 |  hadoop-common-project/hadoop-common: The patch generated 7 new + 300 
unchanged - 1 fixed = 307 total (was 301)  |
   | +1 :green_heart: |  mvnsite  |   1m 41s |  |  the patch passed  |
   | +1 :green_heart: |  xml  |   0m  2s |  |  The patch has no ill-formed XML 
file.  |
   | +1 :green_heart: |  javadoc  |   1m  7s |  |  the patch passed with JDK 
Ubuntu-11.0.11+9-Ubuntu-0ubuntu2.20.04  |
   | +1 :green_heart: |  javadoc  |   1m 32s |  |  the patch passed with JDK 
Private Build-1.8.0_292-8u292-b10-0ubuntu1~20.04-b10  |
   | +1 :green_heart: |  spotbugs  |   2m 54s |  |  the patch passed  |
   | +1 :green_heart: |  shadedclient  |  17m  1s |  |  patch has no errors 
when building and testing our client artifacts.  |
    _ Other Tests _ |
   | -1 :x: |  unit  |  17m 23s | 
[/patch-unit-hadoop-common-project_hadoop-common.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-3170/5/artifact/out/patch-unit-hadoop-common-project_hadoop-common.txt)
 |  hadoop-common in the patch passed.  |
   | +1 :green_heart: |  asflicense  |   0m 53s |  |  The patch does not 
generate ASF License warnings.  |
   |  |   | 178m  0s |  |  |
   
   
   | Reason | Tests |
   |---:|:--|
   | Failed junit tests | hadoop.fs.viewfs.TestViewFileSystemLocalFileSystem |
   |   | hadoop.conf.TestCommonConfigurationFields |
   
   
   | Subsystem | Report/Notes |
   |--:|:-|
   | Docker | ClientAPI=1.41 ServerAPI=1.41 base: 
https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-3170/5/artifact/out/Dockerfile
 |
   | GITHUB PR | https://github.com/apache/hadoop/pull/3170 |
   | Optional Tests | dupname asflicense compile javac javadoc mvninstall 
mvnsite unit shadedclient spotbugs checkstyle codespell xml |
   | uname | Linux 2cdb0d4be557 4.15.0-65-generic #74-Ubuntu SMP Tue Sep 17 
17:06:04 UTC 2019 x86_64 x86_64 x86_64 GNU/Linux |
   | Build tool | maven |
   | Personality | dev-support/bin/hadoop.sh |
   | git revision | trunk / 6a709d57093a49c23e74dfa426aa19f7d29817e6 |
   | Default Java | 

[jira] [Work logged] (HDFS-16119) start balancer with parameters -hotBlockTimeInterval xxx is invalid

2021-07-19 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-16119?focusedWorklogId=624721&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-624721
 ]

ASF GitHub Bot logged work on HDFS-16119:
-

Author: ASF GitHub Bot
Created on: 20/Jul/21 02:22
Start Date: 20/Jul/21 02:22
Worklog Time Spent: 10m 
  Work Description: JiaguodongF commented on pull request #3185:
URL: https://github.com/apache/hadoop/pull/3185#issuecomment-883001218


   Hi @hemanthboyina
   Could you please review this PR?
   
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 624721)
Time Spent: 1h 50m  (was: 1h 40m)

> start balancer with parameters -hotBlockTimeInterval xxx is invalid
> ---
>
> Key: HDFS-16119
> URL: https://issues.apache.org/jira/browse/HDFS-16119
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Reporter: jiaguodong
>Priority: Minor
>  Labels: pull-request-available
>  Time Spent: 1h 50m
>  Remaining Estimate: 0h
>
>  
> When the balancer is started with the parameter -hotBlockTimeInterval xxx, 
> the value is ignored, but setting it in hdfs-site.xml works:
> <property>
>   <name>dfs.balancer.getBlocks.hot-time-interval</name>
>   <value>3600</value>
> </property>
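
For reference, the two ways of setting the interval that the report compares can be written as below. The flag name comes from the issue title; the exact unit and accepted syntax are assumptions here, not confirmed behavior:

{code}
# Command-line flag (reported above as having no effect):
hdfs balancer -hotBlockTimeInterval 3600

# hdfs-site.xml equivalent (reported as working):
#   dfs.balancer.getBlocks.hot-time-interval = 3600
{code}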



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Work logged] (HDFS-16087) RBF balance process is stuck at DisableWrite stage

2021-07-19 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-16087?focusedWorklogId=624720&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-624720
 ]

ASF GitHub Bot logged work on HDFS-16087:
-

Author: ASF GitHub Bot
Created on: 20/Jul/21 02:18
Start Date: 20/Jul/21 02:18
Worklog Time Spent: 10m 
  Work Description: lipp commented on pull request #3141:
URL: https://github.com/apache/hadoop/pull/3141#issuecomment-88233


   Thanks @Hexiaoqiao for your review and approval.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 624720)
Time Spent: 3h  (was: 2h 50m)

> RBF balance process is stuck at DisableWrite stage
> --
>
> Key: HDFS-16087
> URL: https://issues.apache.org/jira/browse/HDFS-16087
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: rbf
>Affects Versions: 3.4.0
>Reporter: Eric Yin
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 3h
>  Remaining Estimate: 0h
>
> The balance process gets stuck at the DisableWrite stage when running the 
> rbfbalance command.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Work logged] (HDFS-16119) start balancer with parameters -hotBlockTimeInterval xxx is invalid

2021-07-19 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-16119?focusedWorklogId=624397&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-624397
 ]

ASF GitHub Bot logged work on HDFS-16119:
-

Author: ASF GitHub Bot
Created on: 19/Jul/21 14:32
Start Date: 19/Jul/21 14:32
Worklog Time Spent: 10m 
  Work Description: hadoop-yetus commented on pull request #3185:
URL: https://github.com/apache/hadoop/pull/3185#issuecomment-882598356


   :broken_heart: **-1 overall**
   
   
   
   
   
   
   | Vote | Subsystem | Runtime |  Logfile | Comment |
   |::|--:|:|::|:---:|
   | +0 :ok: |  reexec  |   0m 52s |  |  Docker mode activated.  |
    _ Prechecks _ |
   | +1 :green_heart: |  dupname  |   0m  1s |  |  No case conflicting files 
found.  |
   | +0 :ok: |  codespell  |   0m  0s |  |  codespell was not available.  |
   | +1 :green_heart: |  @author  |   0m  0s |  |  The patch does not contain 
any @author tags.  |
   | +1 :green_heart: |  test4tests  |   0m  0s |  |  The patch appears to 
include 1 new or modified test files.  |
    _ trunk Compile Tests _ |
   | +1 :green_heart: |  mvninstall  |  33m  8s |  |  trunk passed  |
   | +1 :green_heart: |  compile  |   1m 24s |  |  trunk passed with JDK 
Ubuntu-11.0.11+9-Ubuntu-0ubuntu2.20.04  |
   | +1 :green_heart: |  compile  |   1m 17s |  |  trunk passed with JDK 
Private Build-1.8.0_292-8u292-b10-0ubuntu1~20.04-b10  |
   | +1 :green_heart: |  checkstyle  |   1m  3s |  |  trunk passed  |
   | +1 :green_heart: |  mvnsite  |   1m 30s |  |  trunk passed  |
   | +1 :green_heart: |  javadoc  |   0m 57s |  |  trunk passed with JDK 
Ubuntu-11.0.11+9-Ubuntu-0ubuntu2.20.04  |
   | +1 :green_heart: |  javadoc  |   1m 25s |  |  trunk passed with JDK 
Private Build-1.8.0_292-8u292-b10-0ubuntu1~20.04-b10  |
   | +1 :green_heart: |  spotbugs  |   3m 29s |  |  trunk passed  |
   | +1 :green_heart: |  shadedclient  |  19m 49s |  |  branch has no errors 
when building and testing our client artifacts.  |
    _ Patch Compile Tests _ |
   | +1 :green_heart: |  mvninstall  |   1m 16s |  |  the patch passed  |
   | +1 :green_heart: |  compile  |   1m 22s |  |  the patch passed with JDK 
Ubuntu-11.0.11+9-Ubuntu-0ubuntu2.20.04  |
   | +1 :green_heart: |  javac  |   1m 22s |  |  the patch passed  |
   | +1 :green_heart: |  compile  |   1m 13s |  |  the patch passed with JDK 
Private Build-1.8.0_292-8u292-b10-0ubuntu1~20.04-b10  |
   | +1 :green_heart: |  javac  |   1m 13s |  |  the patch passed  |
   | +1 :green_heart: |  blanks  |   0m  1s |  |  The patch has no blanks 
issues.  |
   | +1 :green_heart: |  checkstyle  |   0m 56s |  |  the patch passed  |
   | +1 :green_heart: |  mvnsite  |   1m 18s |  |  the patch passed  |
   | +1 :green_heart: |  javadoc  |   0m 49s |  |  the patch passed with JDK 
Ubuntu-11.0.11+9-Ubuntu-0ubuntu2.20.04  |
   | +1 :green_heart: |  javadoc  |   1m 19s |  |  the patch passed with JDK 
Private Build-1.8.0_292-8u292-b10-0ubuntu1~20.04-b10  |
   | +1 :green_heart: |  spotbugs  |   3m 31s |  |  the patch passed  |
   | +1 :green_heart: |  shadedclient  |  19m 33s |  |  patch has no errors 
when building and testing our client artifacts.  |
    _ Other Tests _ |
   | -1 :x: |  unit  | 404m 41s | 
[/patch-unit-hadoop-hdfs-project_hadoop-hdfs.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-3185/5/artifact/out/patch-unit-hadoop-hdfs-project_hadoop-hdfs.txt)
 |  hadoop-hdfs in the patch passed.  |
   | +1 :green_heart: |  asflicense  |   1m 18s |  |  The patch does not 
generate ASF License warnings.  |
   |  |   | 499m 11s |  |  |
   
   
   | Reason | Tests |
   |---:|:--|
   | Failed junit tests | 
hadoop.hdfs.server.namenode.TestDecommissioningStatusWithBackoffMonitor |
   |   | hadoop.hdfs.TestHDFSFileSystemContract |
   |   | hadoop.hdfs.server.namenode.TestDecommissioningStatus |
   |   | hadoop.hdfs.web.TestWebHdfsFileSystemContract |
   
   
   | Subsystem | Report/Notes |
   |--:|:-|
   | Docker | ClientAPI=1.41 ServerAPI=1.41 base: 
https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-3185/5/artifact/out/Dockerfile
 |
   | GITHUB PR | https://github.com/apache/hadoop/pull/3185 |
   | Optional Tests | dupname asflicense compile javac javadoc mvninstall 
mvnsite unit shadedclient spotbugs checkstyle codespell |
   | uname | Linux 97bb16146408 4.15.0-128-generic #131-Ubuntu SMP Wed Dec 9 
06:57:35 UTC 2020 x86_64 x86_64 x86_64 GNU/Linux |
   | Build tool | maven |
   | Personality | dev-support/bin/hadoop.sh |
   | git revision | trunk / b56848c846f33853406dc33b632941f5b6f4cf24 |
   | Default Java | Private Build-1.8.0_292-8u292-b10-0ubuntu1~20.04-b10 |
   | Multi-JDK versions | 
/usr/lib/jvm/java-11-openjdk-amd64:Ubuntu-11.0.11+9-Ubuntu-0ubuntu2.20.04 
/usr/lib/jvm/java-8-openjdk-amd64:Private 
Build-1.8.0_292-8u292-b10-0ubuntu1~20.04-b10 |
   |  Test 

[jira] [Updated] (HDFS-16132) SnapshotDiff report fails with invalid path assertion with external Attribute provider

2021-07-19 Thread Shashikant Banerjee (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-16132?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Shashikant Banerjee updated HDFS-16132:
---
Description: 
The issue can be reproduced with the below unit test:
{code:java}
diff --git 
a/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/namenode/TestINodeAttributeProvider.java
 
b/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/namenode/TestINodeAttributeProvider.java
index 512d1029835..27b80882766 100644
--- 
a/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/namenode/TestINodeAttributeProvider.java
+++ 
b/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/namenode/TestINodeAttributeProvider.java
@@ -36,6 +36,7 @@
 import org.apache.hadoop.hdfs.DistributedFileSystem;
 import org.apache.hadoop.hdfs.HdfsConfiguration;
 import org.apache.hadoop.hdfs.MiniDFSCluster;
+import org.apache.hadoop.hdfs.DFSTestUtil;
 import org.apache.hadoop.security.AccessControlException;
 import org.apache.hadoop.security.UserGroupInformation;
 import org.apache.hadoop.util.Lists;
@@ -89,7 +90,7 @@ public void checkPermissionWithContext(
           AuthorizationContext authzContext) throws AccessControlException {
         if (authzContext.getAncestorIndex() > 1
             && authzContext.getInodes()[1].getLocalName().equals("user")
-            && authzContext.getInodes()[2].getLocalName().equals("acl")) {
+            && authzContext.getInodes()[2].getLocalName().equals("acl") || 
runPermissionCheck) {
           this.ace.checkPermissionWithContext(authzContext);
         }
         CALLED.add("checkPermission|" + authzContext.getAncestorAccess()
@@ -598,6 +599,55 @@ public Void run() throws Exception {
         return null;
       }
     });
+  }
 
+  @Test
+  public void testAttrProviderSeesResolvedSnapahotPaths1() throws Exception {
+    runPermissionCheck = true;
+    FileSystem fs = FileSystem.get(miniDFS.getConfiguration(0));
+    DistributedFileSystem hdfs = miniDFS.getFileSystem();
+    final Path parent = new Path("/user");
+    hdfs.mkdirs(parent);
+    fs.setPermission(parent, new FsPermission(HDFS_PERMISSION));
+    final Path sub1 = new Path(parent, "sub1");
+    final Path sub1foo = new Path(sub1, "foo");
+    hdfs.mkdirs(sub1);
+    hdfs.mkdirs(sub1foo);
+    Path f = new Path(sub1foo, "file0");
+    DFSTestUtil.createFile(hdfs, f, 0, (short) 1, 0);
+    hdfs.allowSnapshot(parent);
+    hdfs.createSnapshot(parent, "s0");
+
+    f = new Path(sub1foo, "file1");
+    DFSTestUtil.createFile(hdfs, f, 0, (short) 1, 0);
+    f = new Path(sub1foo, "file2");
+    DFSTestUtil.createFile(hdfs, f, 0, (short) 1, 0);
+
+    final Path sub2 = new Path(parent, "sub2");
+    hdfs.mkdirs(sub2);
+    final Path sub2foo = new Path(sub2, "foo");
+    // mv /parent/sub1/foo to /parent/sub2/foo
+    hdfs.rename(sub1foo, sub2foo);
+
+    hdfs.createSnapshot(parent, "s1");
+    hdfs.createSnapshot(parent, "s2");
+
+    final Path sub3 = new Path(parent, "sub3");
+    hdfs.mkdirs(sub3);
+    // mv /parent/sub2/foo to /parent/sub3/foo
+    hdfs.rename(sub2foo, sub3);
+
+    hdfs.delete(sub3, true);
+    UserGroupInformation ugi =
+        UserGroupInformation.createUserForTesting("u1", new String[] { "g1" });
+    ugi.doAs(new PrivilegedExceptionAction<Void>() {
+      @Override
+      public Void run() throws Exception {
+        FileSystem fs = FileSystem.get(miniDFS.getConfiguration(0));
+        ((DistributedFileSystem)fs).getSnapshotDiffReport(parent, "s1", "s2");
+        CALLED.clear();
+        return null;
+      }
+    });
   }
 }
{code}
It fails with the below error when executed:
{code:java}
org.apache.hadoop.ipc.RemoteException(java.lang.AssertionError): Absolute path required, but got 'foo'
org.apache.hadoop.ipc.RemoteException(java.lang.AssertionError): Absolute path required, but got 'foo'
	at org.apache.hadoop.hdfs.server.namenode.INode.checkAbsolutePath(INode.java:838)
	at org.apache.hadoop.hdfs.server.namenode.INode.getPathComponents(INode.java:813)
	at org.apache.hadoop.hdfs.server.namenode.INodesInPath.resolveFromRoot(INodesInPath.java:154)
	at org.apache.hadoop.hdfs.server.namenode.FSPermissionChecker.getINodeAttrs(FSPermissionChecker.java:447)
	at org.apache.hadoop.hdfs.server.namenode.FSPermissionChecker.checkSubAccess(FSPermissionChecker.java:507)
	at org.apache.hadoop.hdfs.server.namenode.FSPermissionChecker.checkPermission(FSPermissionChecker.java:403)
	at org.apache.hadoop.hdfs.server.namenode.FSPermissionChecker.checkPermissionWithContext(FSPermissionChecker.java:417)
	at org.apache.hadoop.hdfs.server.namenode.TestINodeAttributeProvider$MyAuthorizationProvider$MyAccessControlEnforcer.checkPermissionWithContext(TestINodeAttributeProvider.java:94)
	at org.apache.hadoop.hdfs.server.namenode.FSPermissionChecker.checkPermission(FSPermissionChecker.java:297)
	at 

[jira] [Created] (HDFS-16132) SnapshotDiff report fails with invalid path assertion with external Attribute provider

2021-07-19 Thread Shashikant Banerjee (Jira)
Shashikant Banerjee created HDFS-16132:
--

 Summary: SnapshotDiff report fails with invalid path assertion 
with external Attribute provider
 Key: HDFS-16132
 URL: https://issues.apache.org/jira/browse/HDFS-16132
 Project: Hadoop HDFS
  Issue Type: Bug
Reporter: Shashikant Banerjee
Assignee: Shashikant Banerjee


The issue can be reproduced with the below unit test:
{code:java}
diff --git a/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/namenode/TestINodeAttributeProvider.java b/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/namenode/TestINodeAttributeProvider.java
index 512d1029835..27b80882766 100644
--- a/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/namenode/TestINodeAttributeProvider.java
+++ b/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/namenode/TestINodeAttributeProvider.java
@@ -36,6 +36,7 @@
 import org.apache.hadoop.hdfs.DistributedFileSystem;
 import org.apache.hadoop.hdfs.HdfsConfiguration;
 import org.apache.hadoop.hdfs.MiniDFSCluster;
+import org.apache.hadoop.hdfs.DFSTestUtil;
 import org.apache.hadoop.security.AccessControlException;
 import org.apache.hadoop.security.UserGroupInformation;
 import org.apache.hadoop.util.Lists;
@@ -89,7 +90,7 @@ public void checkPermissionWithContext(
           AuthorizationContext authzContext) throws AccessControlException {
         if (authzContext.getAncestorIndex() > 1
             && authzContext.getInodes()[1].getLocalName().equals("user")
-            && authzContext.getInodes()[2].getLocalName().equals("acl")) {
+            && authzContext.getInodes()[2].getLocalName().equals("acl") || runPermissionCheck) {
           this.ace.checkPermissionWithContext(authzContext);
         }
         CALLED.add("checkPermission|" + authzContext.getAncestorAccess()
@@ -598,6 +599,55 @@ public Void run() throws Exception {
         return null;
       }
     });
+  }
 
+  @Test
+  public void testAttrProviderSeesResolvedSnapahotPaths1() throws Exception {
+    runPermissionCheck = true;
+    FileSystem fs = FileSystem.get(miniDFS.getConfiguration(0));
+    DistributedFileSystem hdfs = miniDFS.getFileSystem();
+    final Path parent = new Path("/user");
+    hdfs.mkdirs(parent);
+    fs.setPermission(parent, new FsPermission(HDFS_PERMISSION));
+    final Path sub1 = new Path(parent, "sub1");
+    final Path sub1foo = new Path(sub1, "foo");
+    hdfs.mkdirs(sub1);
+    hdfs.mkdirs(sub1foo);
+    Path f = new Path(sub1foo, "file0");
+    DFSTestUtil.createFile(hdfs, f, 0, (short) 1, 0);
+    hdfs.allowSnapshot(parent);
+    hdfs.createSnapshot(parent, "s0");
+
+    f = new Path(sub1foo, "file1");
+    DFSTestUtil.createFile(hdfs, f, 0, (short) 1, 0);
+    f = new Path(sub1foo, "file2");
+    DFSTestUtil.createFile(hdfs, f, 0, (short) 1, 0);
+
+    final Path sub2 = new Path(parent, "sub2");
+    hdfs.mkdirs(sub2);
+    final Path sub2foo = new Path(sub2, "foo");
+    // mv /parent/sub1/foo to /parent/sub2/foo
+    hdfs.rename(sub1foo, sub2foo);
+
+    hdfs.createSnapshot(parent, "s1");
+    hdfs.createSnapshot(parent, "s2");
+
+    final Path sub3 = new Path(parent, "sub3");
+    hdfs.mkdirs(sub3);
+    // mv /parent/sub2/foo to /parent/sub3/foo
+    hdfs.rename(sub2foo, sub3);
+
+    hdfs.delete(sub3, true);
+    UserGroupInformation ugi =
+        UserGroupInformation.createUserForTesting("u1", new String[] { "g1" });
+    ugi.doAs(new PrivilegedExceptionAction<Void>() {
+      @Override
+      public Void run() throws Exception {
+        FileSystem fs = FileSystem.get(miniDFS.getConfiguration(0));
+        ((DistributedFileSystem)fs).getSnapshotDiffReport(parent, "s1", "s2");
+        CALLED.clear();
+        return null;
+      }
+    });
   }
 }
{code}
It fails with the below error when executed:
{code:java}
org.apache.hadoop.ipc.RemoteException(java.lang.AssertionError): Absolute path required, but got 'foo'
org.apache.hadoop.ipc.RemoteException(java.lang.AssertionError): Absolute path required, but got 'foo'
	at org.apache.hadoop.hdfs.server.namenode.INode.checkAbsolutePath(INode.java:838)
	at org.apache.hadoop.hdfs.server.namenode.INode.getPathComponents(INode.java:813)
	at org.apache.hadoop.hdfs.server.namenode.INodesInPath.resolveFromRoot(INodesInPath.java:154)
	at org.apache.hadoop.hdfs.server.namenode.FSPermissionChecker.getINodeAttrs(FSPermissionChecker.java:447)
	at org.apache.hadoop.hdfs.server.namenode.FSPermissionChecker.checkSubAccess(FSPermissionChecker.java:507)
	at org.apache.hadoop.hdfs.server.namenode.FSPermissionChecker.checkPermission(FSPermissionChecker.java:403)
	at org.apache.hadoop.hdfs.server.namenode.FSPermissionChecker.checkPermissionWithContext(FSPermissionChecker.java:417)
	at 

[jira] [Commented] (HDFS-13697) DFSClient should instantiate and cache KMSClientProvider using UGI at creation time for consistent UGI handling

2021-07-19 Thread Hadoop QA (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-13697?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17383287#comment-17383287
 ] 

Hadoop QA commented on HDFS-13697:
--

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime ||  Logfile || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m  
0s{color} | {color:blue}{color} | {color:blue} Docker mode activated. {color} |
| {color:red}-1{color} | {color:red} patch {color} | {color:red}  0m 11s{color} 
| {color:red}{color} | {color:red} HDFS-13697 does not apply to trunk. Rebase 
required? Wrong Branch? See 
https://cwiki.apache.org/confluence/display/HADOOP/How+To+Contribute for help. 
{color} |
\\
\\
|| Subsystem || Report/Notes ||
| JIRA Issue | HDFS-13697 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12939137/HDFS-13697.12.patch |
| Console output | 
https://ci-hadoop.apache.org/job/PreCommit-HDFS-Build/685/console |
| versions | git=2.17.1 |
| Powered by | Apache Yetus 0.13.0-SNAPSHOT https://yetus.apache.org |


This message was automatically generated.



> DFSClient should instantiate and cache KMSClientProvider using UGI at 
> creation time for consistent UGI handling
> ---
>
> Key: HDFS-13697
> URL: https://issues.apache.org/jira/browse/HDFS-13697
> Project: Hadoop HDFS
>  Issue Type: Bug
>Reporter: Zsolt Venczel
>Assignee: Zsolt Venczel
>Priority: Major
> Attachments: HDFS-13697.01.patch, HDFS-13697.02.patch, 
> HDFS-13697.03.patch, HDFS-13697.04.patch, HDFS-13697.05.patch, 
> HDFS-13697.06.patch, HDFS-13697.07.patch, HDFS-13697.08.patch, 
> HDFS-13697.09.patch, HDFS-13697.10.patch, HDFS-13697.11.patch, 
> HDFS-13697.12.patch, HDFS-13697.prelim.patch
>
>
> While calling KeyProviderCryptoExtension decryptEncryptedKey, the call stack 
> might not have a doAs privileged execution call (in the DFSClient, for 
> example). This results in losing the proxy user from the UGI, as 
> UGI.getCurrentUser finds no AccessControllerContext and does a re-login for 
> the login user only (see the sketch at the end of this entry).
> This can cause the following, for example: if we have set up the oozie user 
> to be entitled to perform actions on behalf of example_user, but oozie is 
> forbidden to decrypt any EDEK (for security reasons), then due to the above 
> issue the example_user entitlements are lost from the UGI and the following 
> error is reported:
> {code}
> [0] 
> SERVER[xxx] USER[example_user] GROUP[-] TOKEN[] APP[Test_EAR] 
> JOB[0020905-180313191552532-oozie-oozi-W] 
> ACTION[0020905-180313191552532-oozie-oozi-W@polling_dir_path] Error starting 
> action [polling_dir_path]. ErrorType [ERROR], ErrorCode [FS014], Message 
> [FS014: User [oozie] is not authorized to perform [DECRYPT_EEK] on key with 
> ACL name [encrypted_key]!!]
> org.apache.oozie.action.ActionExecutorException: FS014: User [oozie] is not 
> authorized to perform [DECRYPT_EEK] on key with ACL name [encrypted_key]!!
>  at 
> org.apache.oozie.action.ActionExecutor.convertExceptionHelper(ActionExecutor.java:463)
>  at 
> org.apache.oozie.action.ActionExecutor.convertException(ActionExecutor.java:441)
>  at 
> org.apache.oozie.action.hadoop.FsActionExecutor.touchz(FsActionExecutor.java:523)
>  at 
> org.apache.oozie.action.hadoop.FsActionExecutor.doOperations(FsActionExecutor.java:199)
>  at 
> org.apache.oozie.action.hadoop.FsActionExecutor.start(FsActionExecutor.java:563)
>  at 
> org.apache.oozie.command.wf.ActionStartXCommand.execute(ActionStartXCommand.java:232)
>  at 
> org.apache.oozie.command.wf.ActionStartXCommand.execute(ActionStartXCommand.java:63)
>  at org.apache.oozie.command.XCommand.call(XCommand.java:286)
>  at 
> org.apache.oozie.service.CallableQueueService$CompositeCallable.call(CallableQueueService.java:332)
>  at 
> org.apache.oozie.service.CallableQueueService$CompositeCallable.call(CallableQueueService.java:261)
>  at java.util.concurrent.FutureTask.run(FutureTask.java:262)
>  at 
> org.apache.oozie.service.CallableQueueService$CallableWrapper.run(CallableQueueService.java:179)
>  at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
>  at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
>  at java.lang.Thread.run(Thread.java:744)
> Caused by: org.apache.hadoop.security.authorize.AuthorizationException: User 
> [oozie] is not authorized to perform [DECRYPT_EEK] on key with ACL name 
> [encrypted_key]!!
>  at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
>  at 
> sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:57)
>  at 
> sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
>  at 
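
The fix direction named in the issue title (instantiate and cache the provider using the UGI at creation time) can be sketched as below. This is an illustrative sketch only; the class and method names (CachedUgiKmsHelper, decryptWithCachedUgi) are hypothetical and not from the attached patches:

{code:java}
import java.io.IOException;
import java.security.PrivilegedExceptionAction;

import org.apache.hadoop.crypto.key.KeyProvider.KeyVersion;
import org.apache.hadoop.crypto.key.KeyProviderCryptoExtension;
import org.apache.hadoop.crypto.key.KeyProviderCryptoExtension.EncryptedKeyVersion;
import org.apache.hadoop.security.UserGroupInformation;

class CachedUgiKmsHelper {
  private final UserGroupInformation cachedUgi;      // UGI captured at client creation time
  private final KeyProviderCryptoExtension provider; // cached KMS provider

  CachedUgiKmsHelper(KeyProviderCryptoExtension provider) throws IOException {
    // At creation time the proxy-user context is still on the stack, so the
    // captured UGI keeps the example_user entitlements.
    this.cachedUgi = UserGroupInformation.getCurrentUser();
    this.provider = provider;
  }

  KeyVersion decryptWithCachedUgi(final EncryptedKeyVersion ekv) throws Exception {
    // Replaying the captured UGI restores the proxy-user context even when the
    // caller's stack no longer contains a doAs frame.
    return cachedUgi.doAs((PrivilegedExceptionAction<KeyVersion>) () ->
        provider.decryptEncryptedKey(ekv));
  }
}
{code}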

[jira] [Commented] (HDFS-15942) Increase Quota initialization threads

2021-07-19 Thread Xiaoqiao He (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-15942?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17383264#comment-17383264
 ] 

Xiaoqiao He commented on HDFS-15942:


Cherry-picked to branch-3.2.

> Increase Quota initialization threads
> -
>
> Key: HDFS-15942
> URL: https://issues.apache.org/jira/browse/HDFS-15942
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: namenode
>Reporter: Stephen O'Donnell
>Assignee: Stephen O'Donnell
>Priority: Major
> Fix For: 3.3.1, 3.4.0, 3.2.3
>
> Attachments: HDFS-15942.001.patch
>
>
> On large namespaces, quota initialization at startup can take a long time 
> with the default 4 threads. Also, on NN failover, the quota often needs to be 
> calculated before the failover can complete, delaying the failover.
> I performed some benchmarks some time back on a large image (316M inodes, 
> 35GB on disk); the quota load takes:
> {code}
> quota - 4  threads 39 seconds
> quota - 8  threads 23 seconds
> quota - 12 threads 20 seconds
> quota - 16 threads 15 seconds
> {code}
> As the quota is calculated while the NN is starting up (and hence doing no 
> other work) or at failover time before the new standby becomes active, I 
> think the quota calculation should use as many threads as possible.
> I propose we change the default to 8 or 12 on at least trunk and branch-3.3 
> so we have a better default going forward.
> Has anyone got any other thoughts?
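
The setting under discussion is dfs.namenode.quota.init-threads (its default of 4 matches the description above). A sketch of the hdfs-site.xml override, with 12 chosen per the proposal:

{code:xml}
<property>
  <name>dfs.namenode.quota.init-threads</name>
  <value>12</value>
</property>
{code}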



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-15942) Increase Quota initialization threads

2021-07-19 Thread Xiaoqiao He (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-15942?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xiaoqiao He updated HDFS-15942:
---
Fix Version/s: 3.2.3

> Increase Quota initialization threads
> -
>
> Key: HDFS-15942
> URL: https://issues.apache.org/jira/browse/HDFS-15942
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: namenode
>Reporter: Stephen O'Donnell
>Assignee: Stephen O'Donnell
>Priority: Major
> Fix For: 3.3.1, 3.4.0, 3.2.3
>
> Attachments: HDFS-15942.001.patch
>
>
> On large namespaces, quota initialization at startup can take a long time 
> with the default 4 threads. Also, on NN failover, the quota often needs to be 
> calculated before the failover can complete, delaying the failover.
> I performed some benchmarks some time back on a large image (316M inodes, 
> 35GB on disk); the quota load takes:
> {code}
> quota - 4  threads 39 seconds
> quota - 8  threads 23 seconds
> quota - 12 threads 20 seconds
> quota - 16 threads 15 seconds
> {code}
> As the quota is calculated while the NN is starting up (and hence doing no 
> other work) or at failover time before the new standby becomes active, I 
> think the quota calculation should use as many threads as possible.
> I propose we change the default to 8 or 12 on at least trunk and branch-3.3 
> so we have a better default going forward.
> Has anyone got any other thoughts?



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-15651) Client could not obtain block when DN CommandProcessingThread exit

2021-07-19 Thread Xiaoqiao He (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-15651?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17383261#comment-17383261
 ] 

Xiaoqiao He commented on HDFS-15651:


Thanks [~hemanthboyina] for the information. I just found that HDFS-14997 has 
also not been backported to branch-3.2. I will try to backport this feature 
shortly. Thanks.

> Client could not obtain block when DN CommandProcessingThread exit
> --
>
> Key: HDFS-15651
> URL: https://issues.apache.org/jira/browse/HDFS-15651
> Project: Hadoop HDFS
>  Issue Type: Bug
>Reporter: Yiqun Lin
>Assignee: Aiphago
>Priority: Major
> Fix For: 3.3.1, 3.4.0
>
> Attachments: HDFS-15651.001.patch, HDFS-15651.002.patch, 
> HDFS-15651.patch
>
>
> In our cluster, we applied the HDFS-14997 improvement.
>  We found one case where the CommandProcessingThread exits due to an OOM 
> error. The OOM error was caused by an abnormal application running on this DN 
> node.
> {noformat}
> 2020-10-18 10:27:12,604 ERROR 
> org.apache.hadoop.hdfs.server.datanode.DataNode: Command processor 
> encountered fatal exception and exit.
> java.lang.OutOfMemoryError: unable to create new native thread
> at java.lang.Thread.start0(Native Method)
> at java.lang.Thread.start(Thread.java:717)
> at 
> java.util.concurrent.ThreadPoolExecutor.addWorker(ThreadPoolExecutor.java:957)
> at 
> java.util.concurrent.ThreadPoolExecutor.execute(ThreadPoolExecutor.java:1367)
> at 
> org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetAsyncDiskService.execute(FsDatasetAsyncDiskService.java:173)
> at 
> org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetAsyncDiskService.deleteAsync(FsDatasetAsyncDiskService.java:222)
> at 
> org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl.invalidate(FsDatasetImpl.java:2005)
> at 
> org.apache.hadoop.hdfs.server.datanode.BPOfferService.processCommandFromActive(BPOfferService.java:671)
> at 
> org.apache.hadoop.hdfs.server.datanode.BPOfferService.processCommandFromActor(BPOfferService.java:617)
> at 
> org.apache.hadoop.hdfs.server.datanode.BPServiceActor$CommandProcessingThread.processCommand(BPServiceActor.java:1247)
> at 
> org.apache.hadoop.hdfs.server.datanode.BPServiceActor$CommandProcessingThread.access$1000(BPServiceActor.java:1194)
> at 
> org.apache.hadoop.hdfs.server.datanode.BPServiceActor$CommandProcessingThread$3.run(BPServiceActor.java:1299)
> at 
> org.apache.hadoop.hdfs.server.datanode.BPServiceActor$CommandProcessingThread.processQueue(BPServiceActor.java:1221)
> at 
> org.apache.hadoop.hdfs.server.datanode.BPServiceActor$CommandProcessingThread.run(BPServiceActor.java:1208)
> {noformat}
> The main point here is that a crashed CommandProcessingThread has a very bad 
> impact: none of the NN response commands get processed on the DN side.
> We enabled block tokens for data access, but the DN command 
> DNA_ACCESSKEYUPDATE was not processed in time by the DN, so we then see lots 
> of SASL errors due to key expiration in the DN log:
> {noformat}
> javax.security.sasl.SaslException: DIGEST-MD5: IO error acquiring password 
> [Caused by org.apache.hadoop.security.token.SecretManager$InvalidToken: Can't 
> re-compute password for block_token_identifier (expiryDate=xxx, keyId=xx, 
> userId=xxx, blockPoolId=, blockId=xxx, access modes=[READ]), since the 
> required block key (keyID=xxx) doesn't exist.]
> {noformat}
>  
> On the client side, our users receive lots of 'could not obtain block' 
> errors with BlockMissingException.
> CommandProcessingThread is a critical thread; it should always be running.
> {code:java}
>   /**
>* CommandProcessingThread that process commands asynchronously.
>*/
>   class CommandProcessingThread extends Thread {
> private final BPServiceActor actor;
> private final BlockingQueue<Runnable> queue;
> ...
> @Override
> public void run() {
>   try {
> processQueue();
>   } catch (Throwable t) {
> LOG.error("{} encountered fatal exception and exit.", getName(), t);  
>  <=== should not exit this thread
>   }
> }
> {code}
> Once an unexpected error happens, better handling would be to either:
>  * catch the exception, deal with the error appropriately, and let 
> processQueue continue to run (as sketched below),
>  or
>  * exit the DN process so an admin user can investigate.
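
A sketch of the first option (catch, log, and keep the thread alive). This is illustrative only, not the committed fix; the class and its queue are simplified stand-ins for the quoted BPServiceActor code:

{code:java}
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;

class ResilientCommandProcessor extends Thread {
  private final BlockingQueue<Runnable> queue = new LinkedBlockingQueue<>();
  private volatile boolean shouldRun = true;

  @Override
  public void run() {
    while (shouldRun) {
      try {
        // Block for the next command and execute it.
        queue.take().run();
      } catch (InterruptedException ie) {
        Thread.currentThread().interrupt();
        return; // normal shutdown path
      } catch (Throwable t) {
        // Keep the critical thread alive: log the failure and move on,
        // rather than letting the whole command processor die.
        System.err.println(getName() + " command failed, continuing: " + t);
      }
    }
  }
}
{code}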



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-13697) DFSClient should instantiate and cache KMSClientProvider using UGI at creation time for consistent UGI handling

2021-07-19 Thread Akira Ajisaka (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-13697?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17383223#comment-17383223
 ] 

Akira Ajisaka commented on HDFS-13697:
--

We recently hit this issue while experimenting with Hadoop KMS. Hi [~zvenczel], 
what is the status of this issue?

> DFSClient should instantiate and cache KMSClientProvider using UGI at 
> creation time for consistent UGI handling
> ---
>
> Key: HDFS-13697
> URL: https://issues.apache.org/jira/browse/HDFS-13697
> Project: Hadoop HDFS
>  Issue Type: Bug
>Reporter: Zsolt Venczel
>Assignee: Zsolt Venczel
>Priority: Major
> Attachments: HDFS-13697.01.patch, HDFS-13697.02.patch, 
> HDFS-13697.03.patch, HDFS-13697.04.patch, HDFS-13697.05.patch, 
> HDFS-13697.06.patch, HDFS-13697.07.patch, HDFS-13697.08.patch, 
> HDFS-13697.09.patch, HDFS-13697.10.patch, HDFS-13697.11.patch, 
> HDFS-13697.12.patch, HDFS-13697.prelim.patch
>
>
> While calling KeyProviderCryptoExtension decryptEncryptedKey, the call stack 
> might not have a doAs privileged execution call (in the DFSClient, for 
> example). This results in losing the proxy user from the UGI, as 
> UGI.getCurrentUser finds no AccessControllerContext and does a re-login for 
> the login user only.
> This can cause the following, for example: if we have set up the oozie user 
> to be entitled to perform actions on behalf of example_user, but oozie is 
> forbidden to decrypt any EDEK (for security reasons), then due to the above 
> issue the example_user entitlements are lost from the UGI and the following 
> error is reported:
> {code}
> [0] 
> SERVER[xxx] USER[example_user] GROUP[-] TOKEN[] APP[Test_EAR] 
> JOB[0020905-180313191552532-oozie-oozi-W] 
> ACTION[0020905-180313191552532-oozie-oozi-W@polling_dir_path] Error starting 
> action [polling_dir_path]. ErrorType [ERROR], ErrorCode [FS014], Message 
> [FS014: User [oozie] is not authorized to perform [DECRYPT_EEK] on key with 
> ACL name [encrypted_key]!!]
> org.apache.oozie.action.ActionExecutorException: FS014: User [oozie] is not 
> authorized to perform [DECRYPT_EEK] on key with ACL name [encrypted_key]!!
>  at 
> org.apache.oozie.action.ActionExecutor.convertExceptionHelper(ActionExecutor.java:463)
>  at 
> org.apache.oozie.action.ActionExecutor.convertException(ActionExecutor.java:441)
>  at 
> org.apache.oozie.action.hadoop.FsActionExecutor.touchz(FsActionExecutor.java:523)
>  at 
> org.apache.oozie.action.hadoop.FsActionExecutor.doOperations(FsActionExecutor.java:199)
>  at 
> org.apache.oozie.action.hadoop.FsActionExecutor.start(FsActionExecutor.java:563)
>  at 
> org.apache.oozie.command.wf.ActionStartXCommand.execute(ActionStartXCommand.java:232)
>  at 
> org.apache.oozie.command.wf.ActionStartXCommand.execute(ActionStartXCommand.java:63)
>  at org.apache.oozie.command.XCommand.call(XCommand.java:286)
>  at 
> org.apache.oozie.service.CallableQueueService$CompositeCallable.call(CallableQueueService.java:332)
>  at 
> org.apache.oozie.service.CallableQueueService$CompositeCallable.call(CallableQueueService.java:261)
>  at java.util.concurrent.FutureTask.run(FutureTask.java:262)
>  at 
> org.apache.oozie.service.CallableQueueService$CallableWrapper.run(CallableQueueService.java:179)
>  at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
>  at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
>  at java.lang.Thread.run(Thread.java:744)
> Caused by: org.apache.hadoop.security.authorize.AuthorizationException: User 
> [oozie] is not authorized to perform [DECRYPT_EEK] on key with ACL name 
> [encrypted_key]!!
>  at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
>  at 
> sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:57)
>  at 
> sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
>  at java.lang.reflect.Constructor.newInstance(Constructor.java:526)
>  at 
> org.apache.hadoop.util.HttpExceptionUtils.validateResponse(HttpExceptionUtils.java:157)
>  at 
> org.apache.hadoop.crypto.key.kms.KMSClientProvider.call(KMSClientProvider.java:607)
>  at 
> org.apache.hadoop.crypto.key.kms.KMSClientProvider.call(KMSClientProvider.java:565)
>  at 
> org.apache.hadoop.crypto.key.kms.KMSClientProvider.decryptEncryptedKey(KMSClientProvider.java:832)
>  at 
> org.apache.hadoop.crypto.key.kms.LoadBalancingKMSClientProvider$5.call(LoadBalancingKMSClientProvider.java:209)
>  at 
> org.apache.hadoop.crypto.key.kms.LoadBalancingKMSClientProvider$5.call(LoadBalancingKMSClientProvider.java:205)
>  at 
> org.apache.hadoop.crypto.key.kms.LoadBalancingKMSClientProvider.doOp(LoadBalancingKMSClientProvider.java:94)
>  at 
> 

[jira] [Commented] (HDFS-15651) Client could not obtain block when DN CommandProcessingThread exit

2021-07-19 Thread Hemanth Boyina (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-15651?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17383210#comment-17383210
 ] 

Hemanth Boyina commented on HDFS-15651:
---

[~Aiphag0] [~hexiaoqiao] can this Jira be cherry-picked to branch-3.2?

> Client could not obtain block when DN CommandProcessingThread exit
> --
>
> Key: HDFS-15651
> URL: https://issues.apache.org/jira/browse/HDFS-15651
> Project: Hadoop HDFS
>  Issue Type: Bug
>Reporter: Yiqun Lin
>Assignee: Aiphago
>Priority: Major
> Fix For: 3.3.1, 3.4.0
>
> Attachments: HDFS-15651.001.patch, HDFS-15651.002.patch, 
> HDFS-15651.patch
>
>
> In our cluster, we applied the HDFS-14997 improvement.
>  We found one case where the CommandProcessingThread exits due to an OOM 
> error. The OOM error was caused by an abnormal application running on this DN 
> node.
> {noformat}
> 2020-10-18 10:27:12,604 ERROR 
> org.apache.hadoop.hdfs.server.datanode.DataNode: Command processor 
> encountered fatal exception and exit.
> java.lang.OutOfMemoryError: unable to create new native thread
> at java.lang.Thread.start0(Native Method)
> at java.lang.Thread.start(Thread.java:717)
> at 
> java.util.concurrent.ThreadPoolExecutor.addWorker(ThreadPoolExecutor.java:957)
> at 
> java.util.concurrent.ThreadPoolExecutor.execute(ThreadPoolExecutor.java:1367)
> at 
> org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetAsyncDiskService.execute(FsDatasetAsyncDiskService.java:173)
> at 
> org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetAsyncDiskService.deleteAsync(FsDatasetAsyncDiskService.java:222)
> at 
> org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl.invalidate(FsDatasetImpl.java:2005)
> at 
> org.apache.hadoop.hdfs.server.datanode.BPOfferService.processCommandFromActive(BPOfferService.java:671)
> at 
> org.apache.hadoop.hdfs.server.datanode.BPOfferService.processCommandFromActor(BPOfferService.java:617)
> at 
> org.apache.hadoop.hdfs.server.datanode.BPServiceActor$CommandProcessingThread.processCommand(BPServiceActor.java:1247)
> at 
> org.apache.hadoop.hdfs.server.datanode.BPServiceActor$CommandProcessingThread.access$1000(BPServiceActor.java:1194)
> at 
> org.apache.hadoop.hdfs.server.datanode.BPServiceActor$CommandProcessingThread$3.run(BPServiceActor.java:1299)
> at 
> org.apache.hadoop.hdfs.server.datanode.BPServiceActor$CommandProcessingThread.processQueue(BPServiceActor.java:1221)
> at 
> org.apache.hadoop.hdfs.server.datanode.BPServiceActor$CommandProcessingThread.run(BPServiceActor.java:1208)
> {noformat}
> The main point here is that a crashed CommandProcessingThread has a very bad 
> impact: none of the NN response commands get processed on the DN side.
> We enabled block tokens for data access, but the DN command 
> DNA_ACCESSKEYUPDATE was not processed in time by the DN, so we then see lots 
> of SASL errors due to key expiration in the DN log:
> {noformat}
> javax.security.sasl.SaslException: DIGEST-MD5: IO error acquiring password 
> [Caused by org.apache.hadoop.security.token.SecretManager$InvalidToken: Can't 
> re-compute password for block_token_identifier (expiryDate=xxx, keyId=xx, 
> userId=xxx, blockPoolId=, blockId=xxx, access modes=[READ]), since the 
> required block key (keyID=xxx) doesn't exist.]
> {noformat}
>  
> On the client side, our users receive lots of 'could not obtain block' 
> errors with BlockMissingException.
> CommandProcessingThread is a critical thread; it should always be running.
> {code:java}
>   /**
>* CommandProcessingThread that process commands asynchronously.
>*/
>   class CommandProcessingThread extends Thread {
> private final BPServiceActor actor;
> private final BlockingQueue<Runnable> queue;
> ...
> @Override
> public void run() {
>   try {
> processQueue();
>   } catch (Throwable t) {
> LOG.error("{} encountered fatal exception and exit.", getName(), t);  
>  <=== should not exit this thread
>   }
> }
> {code}
> Once an unexpected error happens, better handling would be to either:
>  * catch the exception, deal with the error appropriately, and let 
> processQueue continue to run,
>  or
>  * exit the DN process so an admin user can investigate.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-15942) Increase Quota initialization threads

2021-07-19 Thread Hemanth Boyina (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-15942?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17383183#comment-17383183
 ] 

Hemanth Boyina commented on HDFS-15942:
---

[~sodonnell] [~hexiaoqiao] can this Jira be cherry-picked to branch-3.2? Thanks

> Increase Quota initialization threads
> -
>
> Key: HDFS-15942
> URL: https://issues.apache.org/jira/browse/HDFS-15942
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: namenode
>Reporter: Stephen O'Donnell
>Assignee: Stephen O'Donnell
>Priority: Major
> Fix For: 3.3.1, 3.4.0
>
> Attachments: HDFS-15942.001.patch
>
>
> On large namespaces, quota initialization at startup can take a long time 
> with the default 4 threads. Also, on NN failover, the quota often needs to be 
> calculated before the failover can complete, delaying the failover.
> I performed some benchmarks some time back on a large image (316M inodes, 
> 35GB on disk); the quota load takes:
> {code}
> quota - 4  threads 39 seconds
> quota - 8  threads 23 seconds
> quota - 12 threads 20 seconds
> quota - 16 threads 15 seconds
> {code}
> As the quota is calculated while the NN is starting up (and hence doing no 
> other work) or at failover time before the new standby becomes active, I 
> think the quota calculation should use as many threads as possible.
> I propose we change the default to 8 or 12 on at least trunk and branch-3.3 
> so we have a better default going forward.
> Has anyone got any other thoughts?



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Work logged] (HDFS-16119) start balancer with parameters -hotBlockTimeInterval xxx is invalid

2021-07-19 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-16119?focusedWorklogId=624210&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-624210
 ]

ASF GitHub Bot logged work on HDFS-16119:
-

Author: ASF GitHub Bot
Created on: 19/Jul/21 08:03
Start Date: 19/Jul/21 08:03
Worklog Time Spent: 10m 
  Work Description: hadoop-yetus commented on pull request #3185:
URL: https://github.com/apache/hadoop/pull/3185#issuecomment-882334294


   :confetti_ball: **+1 overall**
   
   
   
   
   
   
   | Vote | Subsystem | Runtime |  Logfile | Comment |
   |::|--:|:|::|:---:|
   | +0 :ok: |  reexec  |   0m 35s |  |  Docker mode activated.  |
    _ Prechecks _ |
   | +1 :green_heart: |  dupname  |   0m  0s |  |  No case conflicting files 
found.  |
   | +0 :ok: |  codespell  |   0m  1s |  |  codespell was not available.  |
   | +1 :green_heart: |  @author  |   0m  0s |  |  The patch does not contain 
any @author tags.  |
   | +1 :green_heart: |  test4tests  |   0m  0s |  |  The patch appears to 
include 1 new or modified test files.  |
    _ trunk Compile Tests _ |
   | +1 :green_heart: |  mvninstall  |  30m 52s |  |  trunk passed  |
   | +1 :green_heart: |  compile  |   1m 22s |  |  trunk passed with JDK 
Ubuntu-11.0.11+9-Ubuntu-0ubuntu2.20.04  |
   | +1 :green_heart: |  compile  |   1m 15s |  |  trunk passed with JDK 
Private Build-1.8.0_292-8u292-b10-0ubuntu1~20.04-b10  |
   | +1 :green_heart: |  checkstyle  |   1m  6s |  |  trunk passed  |
   | +1 :green_heart: |  mvnsite  |   1m 24s |  |  trunk passed  |
   | +1 :green_heart: |  javadoc  |   0m 59s |  |  trunk passed with JDK 
Ubuntu-11.0.11+9-Ubuntu-0ubuntu2.20.04  |
   | +1 :green_heart: |  javadoc  |   1m 28s |  |  trunk passed with JDK 
Private Build-1.8.0_292-8u292-b10-0ubuntu1~20.04-b10  |
   | +1 :green_heart: |  spotbugs  |   3m  7s |  |  trunk passed  |
   | +1 :green_heart: |  shadedclient  |  16m 15s |  |  branch has no errors 
when building and testing our client artifacts.  |
    _ Patch Compile Tests _ |
   | +1 :green_heart: |  mvninstall  |   1m 10s |  |  the patch passed  |
   | +1 :green_heart: |  compile  |   1m 14s |  |  the patch passed with JDK 
Ubuntu-11.0.11+9-Ubuntu-0ubuntu2.20.04  |
   | +1 :green_heart: |  javac  |   1m 14s |  |  the patch passed  |
   | +1 :green_heart: |  compile  |   1m  8s |  |  the patch passed with JDK 
Private Build-1.8.0_292-8u292-b10-0ubuntu1~20.04-b10  |
   | +1 :green_heart: |  javac  |   1m  8s |  |  the patch passed  |
   | +1 :green_heart: |  blanks  |   0m  0s |  |  The patch has no blanks 
issues.  |
   | -0 :warning: |  checkstyle  |   0m 56s | 
[/results-checkstyle-hadoop-hdfs-project_hadoop-hdfs.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-3185/4/artifact/out/results-checkstyle-hadoop-hdfs-project_hadoop-hdfs.txt)
 |  hadoop-hdfs-project/hadoop-hdfs: The patch generated 4 new + 637 unchanged 
- 0 fixed = 641 total (was 637)  |
   | +1 :green_heart: |  mvnsite  |   1m 12s |  |  the patch passed  |
   | +1 :green_heart: |  javadoc  |   0m 48s |  |  the patch passed with JDK 
Ubuntu-11.0.11+9-Ubuntu-0ubuntu2.20.04  |
   | +1 :green_heart: |  javadoc  |   1m 20s |  |  the patch passed with JDK 
Private Build-1.8.0_292-8u292-b10-0ubuntu1~20.04-b10  |
   | +1 :green_heart: |  spotbugs  |   3m 10s |  |  the patch passed  |
   | +1 :green_heart: |  shadedclient  |  15m 56s |  |  patch has no errors 
when building and testing our client artifacts.  |
    _ Other Tests _ |
   | +1 :green_heart: |  unit  | 235m 28s |  |  hadoop-hdfs in the patch 
passed.  |
   | +1 :green_heart: |  asflicense  |   0m 47s |  |  The patch does not 
generate ASF License warnings.  |
   |  |   | 319m 33s |  |  |
   
   
   | Subsystem | Report/Notes |
   |--:|:-|
   | Docker | ClientAPI=1.41 ServerAPI=1.41 base: 
https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-3185/4/artifact/out/Dockerfile
 |
   | GITHUB PR | https://github.com/apache/hadoop/pull/3185 |
   | Optional Tests | dupname asflicense compile javac javadoc mvninstall 
mvnsite unit shadedclient spotbugs checkstyle codespell |
   | uname | Linux 10adb1af224b 4.15.0-58-generic #64-Ubuntu SMP Tue Aug 6 
11:12:41 UTC 2019 x86_64 x86_64 x86_64 GNU/Linux |
   | Build tool | maven |
   | Personality | dev-support/bin/hadoop.sh |
   | git revision | trunk / dd9f6f2b8b5856e42c778ab7db556ac21c499876 |
   | Default Java | Private Build-1.8.0_292-8u292-b10-0ubuntu1~20.04-b10 |
   | Multi-JDK versions | 
/usr/lib/jvm/java-11-openjdk-amd64:Ubuntu-11.0.11+9-Ubuntu-0ubuntu2.20.04 
/usr/lib/jvm/java-8-openjdk-amd64:Private 
Build-1.8.0_292-8u292-b10-0ubuntu1~20.04-b10 |
   |  Test Results | 
https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-3185/4/testReport/ |
   | Max. process+thread count | 3614 (vs. ulimit of 5500) |
   | modules | C: hadoop-hdfs-project/hadoop-hdfs U: