[jira] [Commented] (HDFS-16853) The UT TestLeaseRecovery2#testHardLeaseRecoveryAfterNameNodeRestart failed because HADOOP-18324

2023-02-23 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-16853?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17693059#comment-17693059
 ] 

ASF GitHub Bot commented on HDFS-16853:
---

ZanderXu closed pull request #5162: HDFS-16853. BugFix HADOOP-18324 caused UT 
TestLeaseRecovery2#testHardLeaseRecoveryAfterNameNodeRestart failed
URL: https://github.com/apache/hadoop/pull/5162




> The UT TestLeaseRecovery2#testHardLeaseRecoveryAfterNameNodeRestart failed 
> because HADOOP-18324
> ---
>
> Key: HDFS-16853
> URL: https://issues.apache.org/jira/browse/HDFS-16853
> Project: Hadoop HDFS
>  Issue Type: Bug
>Affects Versions: 3.3.5
>Reporter: ZanderXu
>Assignee: ZanderXu
>Priority: Blocker
>  Labels: pull-request-available
> Fix For: 3.3.5
>
>
> The UT TestLeaseRecovery2#testHardLeaseRecoveryAfterNameNodeRestart failed 
> with the error message "Waiting for cluster to become active". The blocking 
> jstack is as follows:
> {code:java}
> "BP-1618793397-192.168.3.4-1669198559828 heartbeating to localhost/127.0.0.1:54673" #260 daemon prio=5 os_prio=31 tid=0x7fc1108fa000 nid=0x19303 waiting on condition [0x700017884000]
>    java.lang.Thread.State: WAITING (parking)
>         at sun.misc.Unsafe.park(Native Method)
>         - parking to wait for  <0x0007430a9ec0> (a java.util.concurrent.SynchronousQueue$TransferQueue)
>         at java.util.concurrent.locks.LockSupport.park(LockSupport.java:175)
>         at java.util.concurrent.SynchronousQueue$TransferQueue.awaitFulfill(SynchronousQueue.java:762)
>         at java.util.concurrent.SynchronousQueue$TransferQueue.transfer(SynchronousQueue.java:695)
>         at java.util.concurrent.SynchronousQueue.put(SynchronousQueue.java:877)
>         at org.apache.hadoop.ipc.Client$Connection.sendRpcRequest(Client.java:1186)
>         at org.apache.hadoop.ipc.Client.call(Client.java:1482)
>         at org.apache.hadoop.ipc.Client.call(Client.java:1429)
>         at org.apache.hadoop.ipc.ProtobufRpcEngine2$Invoker.invoke(ProtobufRpcEngine2.java:258)
>         at org.apache.hadoop.ipc.ProtobufRpcEngine2$Invoker.invoke(ProtobufRpcEngine2.java:139)
>         at com.sun.proxy.$Proxy23.sendHeartbeat(Unknown Source)
>         at org.apache.hadoop.hdfs.protocolPB.DatanodeProtocolClientSideTranslatorPB.sendHeartbeat(DatanodeProtocolClientSideTranslatorPB.java:168)
>         at org.apache.hadoop.hdfs.server.datanode.BPServiceActor.sendHeartBeat(BPServiceActor.java:570)
>         at org.apache.hadoop.hdfs.server.datanode.BPServiceActor.offerService(BPServiceActor.java:714)
>         at org.apache.hadoop.hdfs.server.datanode.BPServiceActor.run(BPServiceActor.java:915)
>         at java.lang.Thread.run(Thread.java:748)
> {code}
> After looking into the code, I found that this bug was introduced by 
> HADOOP-18324: RpcRequestSender exited without cleaning up the 
> rpcRequestQueue, which left BPServiceActor blocked while sending a request.
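
The blocking seen in the jstack can be reproduced outside Hadoop. The sketch below is a minimal, self-contained illustration and not the HADOOP-18324 code or the fix in the pull request: `HandoffDemo` and its methods are hypothetical names, and the queue only stands in for rpcRequestQueue. It shows that a hand-off on a `SynchronousQueue` completes only while a consumer is taking, and that a timed `offer` returns a failure where `put` would park the caller indefinitely once the consumer thread has exited.

```java
import java.util.concurrent.SynchronousQueue;
import java.util.concurrent.TimeUnit;

// Hypothetical demo class for illustration; this is not Hadoop code.
class HandoffDemo {

  // Tries to hand one request to a consumer, giving up after timeoutMs.
  // put() would instead park the caller until a consumer arrives.
  static boolean tryHandOff(SynchronousQueue<String> queue, String request, long timeoutMs) {
    try {
      return queue.offer(request, timeoutMs, TimeUnit.MILLISECONDS);
    } catch (InterruptedException e) {
      Thread.currentThread().interrupt();
      return false;
    }
  }

  // A live consumer thread takes the request, so the hand-off succeeds.
  static boolean handOffWithConsumer(String request) {
    SynchronousQueue<String> queue = new SynchronousQueue<>(true);
    Thread consumer = new Thread(() -> {
      try {
        queue.take();
      } catch (InterruptedException e) {
        Thread.currentThread().interrupt();
      }
    });
    consumer.setDaemon(true);
    consumer.start();
    return tryHandOff(queue, request, 5000);
  }

  // No consumer exists (as if the sender thread had already exited), so the
  // timed offer gives up and reports failure instead of blocking.
  static boolean handOffWithoutConsumer(String request) {
    SynchronousQueue<String> queue = new SynchronousQueue<>(true);
    return tryHandOff(queue, request, 100);
  }

  public static void main(String[] args) {
    System.out.println("with consumer: " + handOffWithConsumer("heartbeat"));
    System.out.println("without consumer: " + handOffWithoutConsumer("heartbeat"));
  }
}
```

A timed hand-off is only one possible mitigation; draining or failing pending requests when the sender thread exits, as the description suggests, addresses the cause more directly.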



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-16853) The UT TestLeaseRecovery2#testHardLeaseRecoveryAfterNameNodeRestart failed because HADOOP-18324

2023-02-23 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-16853?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17693050#comment-17693050
 ] 

ASF GitHub Bot commented on HDFS-16853:
---

ZanderXu closed pull request #5368: HDFS-16853. The UT 
TestLeaseRecovery2#testHardLeaseRecoveryAfterNameNodeRestart failed because 
HADOOP-18324
URL: https://github.com/apache/hadoop/pull/5368










[jira] [Commented] (HDFS-16923) The getListing RPC will throw NPE if the path does not exist

2023-02-23 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-16923?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17693047#comment-17693047
 ] 

ASF GitHub Bot commented on HDFS-16923:
---

ZanderXu commented on PR #5400:
URL: https://github.com/apache/hadoop/pull/5400#issuecomment-1442943465

   @simbadzina Master, sorry for the late response. 
   
   > I think we also need to do the same thing for the block added in getFileInfo() (null guard on stat), WDYT?
   
   `stat instanceof HdfsLocatedFileStatus` returns false if stat is null, so 
there is no bug in getFileInfo().
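
The point about `instanceof` can be checked in isolation. The sketch below uses hypothetical stand-in classes (`FileStatus`, `LocatedFileStatus`) in place of the real HDFS types, so it only demonstrates the Java language rule that `instanceof` evaluates to false for a null reference and therefore already acts as a null guard.

```java
// Hypothetical stand-ins for the HDFS status types; not Hadoop code.
class InstanceofNullDemo {
  static class FileStatus {}
  static class LocatedFileStatus extends FileStatus {}

  // Returns true only for a non-null LocatedFileStatus.
  // instanceof is false for null, so no explicit null check is needed.
  static boolean isLocated(FileStatus stat) {
    return stat instanceof LocatedFileStatus;
  }

  public static void main(String[] args) {
    System.out.println(isLocated(null));
    System.out.println(isLocated(new FileStatus()));
    System.out.println(isLocated(new LocatedFileStatus()));
  }
}
```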




> The getListing RPC will throw NPE if the path does not exist
> 
>
> Key: HDFS-16923
> URL: https://issues.apache.org/jira/browse/HDFS-16923
> Project: Hadoop HDFS
>  Issue Type: Bug
>Reporter: ZanderXu
>Assignee: ZanderXu
>Priority: Major
>  Labels: pull-request-available
>
> The getListing RPC throws an NPE if the path does not exist. The stack 
> trace is as follows:
> {code:java}
> org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.ipc.RemoteException): org.apache.hadoop.ipc.RemoteException(java.lang.NullPointerException): java.lang.NullPointerException
>     at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getListing(FSNamesystem.java:4195)
>     at org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.getListing(NameNodeRpcServer.java:1421)
>     at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.getListing(ClientNamenodeProtocolServerSideTranslatorPB.java:783)
>     at org.apache.hadoop.ipc.ProtobufRpcEngine2$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine2.java:622)
>     at org.apache.hadoop.ipc.ProtobufRpcEngine2$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine2.java:590)
>     at org.apache.hadoop.ipc.ProtobufRpcEngine2$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine2.java:574)
> {code}






[jira] [Commented] (HDFS-16917) Add transfer rate quantile metrics for DataNode reads

2023-02-23 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-16917?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17693018#comment-17693018
 ] 

ASF GitHub Bot commented on HDFS-16917:
---

hadoop-yetus commented on PR #5397:
URL: https://github.com/apache/hadoop/pull/5397#issuecomment-1442866247

   :confetti_ball: **+1 overall**

   | Vote | Subsystem | Runtime | Logfile | Comment |
   |:----:|----------:|--------:|:--------:|:-------:|
   | +0 :ok: | reexec | 0m 42s | | Docker mode activated. |
   |||| _ Prechecks _ |
   | +1 :green_heart: | dupname | 0m 1s | | No case conflicting files found. |
   | +0 :ok: | codespell | 0m 1s | | codespell was not available. |
   | +0 :ok: | detsecrets | 0m 1s | | detect-secrets was not available. |
   | +0 :ok: | markdownlint | 0m 1s | | markdownlint was not available. |
   | +1 :green_heart: | @author | 0m 0s | | The patch does not contain any @author tags. |
   | +1 :green_heart: | test4tests | 0m 0s | | The patch appears to include 2 new or modified test files. |
   |||| _ trunk Compile Tests _ |
   | +0 :ok: | mvndep | 16m 23s | | Maven dependency ordering for branch |
   | +1 :green_heart: | mvninstall | 25m 53s | | trunk passed |
   | +1 :green_heart: | compile | 23m 20s | | trunk passed with JDK Ubuntu-11.0.17+8-post-Ubuntu-1ubuntu220.04 |
   | +1 :green_heart: | compile | 20m 33s | | trunk passed with JDK Private Build-1.8.0_352-8u352-ga-1~20.04-b08 |
   | +1 :green_heart: | checkstyle | 4m 16s | | trunk passed |
   | +1 :green_heart: | mvnsite | 3m 33s | | trunk passed |
   | +1 :green_heart: | javadoc | 2m 30s | | trunk passed with JDK Ubuntu-11.0.17+8-post-Ubuntu-1ubuntu220.04 |
   | +1 :green_heart: | javadoc | 2m 39s | | trunk passed with JDK Private Build-1.8.0_352-8u352-ga-1~20.04-b08 |
   | +1 :green_heart: | spotbugs | 6m 13s | | trunk passed |
   | +1 :green_heart: | shadedclient | 22m 58s | | branch has no errors when building and testing our client artifacts. |
   |||| _ Patch Compile Tests _ |
   | +0 :ok: | mvndep | 0m 28s | | Maven dependency ordering for patch |
   | +1 :green_heart: | mvninstall | 2m 18s | | the patch passed |
   | +1 :green_heart: | compile | 22m 32s | | the patch passed with JDK Ubuntu-11.0.17+8-post-Ubuntu-1ubuntu220.04 |
   | +1 :green_heart: | javac | 22m 32s | | the patch passed |
   | +1 :green_heart: | compile | 20m 36s | | the patch passed with JDK Private Build-1.8.0_352-8u352-ga-1~20.04-b08 |
   | +1 :green_heart: | javac | 20m 36s | | the patch passed |
   | +1 :green_heart: | blanks | 0m 0s | | The patch has no blanks issues. |
   | -0 :warning: | checkstyle | 3m 39s | [/results-checkstyle-root.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-5397/16/artifact/out/results-checkstyle-root.txt) | root: The patch generated 3 new + 205 unchanged - 0 fixed = 208 total (was 205) |
   | +1 :green_heart: | mvnsite | 3m 29s | | the patch passed |
   | +1 :green_heart: | javadoc | 2m 20s | | the patch passed with JDK Ubuntu-11.0.17+8-post-Ubuntu-1ubuntu220.04 |
   | +1 :green_heart: | javadoc | 2m 34s | | the patch passed with JDK Private Build-1.8.0_352-8u352-ga-1~20.04-b08 |
   | +1 :green_heart: | spotbugs | 6m 24s | | the patch passed |
   | +1 :green_heart: | shadedclient | 23m 21s | | patch has no errors when building and testing our client artifacts. |
   |||| _ Other Tests _ |
   | +1 :green_heart: | unit | 18m 28s | | hadoop-common in the patch passed. |
   | +1 :green_heart: | unit | 204m 38s | | hadoop-hdfs in the patch passed. |
   | +1 :green_heart: | asflicense | 1m 8s | | The patch does not generate ASF License warnings. |
   | | | 440m 38s | | |

   | Subsystem | Report/Notes |
   |----------:|:-------------|
   | Docker | ClientAPI=1.42 ServerAPI=1.42 base: https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-5397/16/artifact/out/Dockerfile |
   | GITHUB PR | https://github.com/apache/hadoop/pull/5397 |
   | Optional Tests | dupname asflicense mvnsite codespell detsecrets markdownlint compile javac javadoc mvninstall unit shadedclient spotbugs checkstyle |
   | uname | Linux 5d9dae387567 4.15.0-200-generic #211-Ubuntu SMP Thu Nov 24 18:16:04 UTC 2022 x86_64 x86_64 x86_64 GNU/Linux |
   | Build tool | maven |
   | Personality | dev-support/bin/hadoop.sh |
   | git revision | trunk / 30bd77593a5f3f5c5ec110f1bbc9767f37478e5d |
   | Default Java | Private Build-1.8.0_352-8u352-ga-1~20.04-b08 |
   | Multi-JDK versions | /usr/lib/jvm/java-11-openjdk-amd64:Ubuntu-11.0.17+8-post-Ubuntu-1ubuntu220.04 /usr/lib/jvm/java-8-openjdk-amd64:Private Build-1.8.0_352-8u352-ga-1~20.04-b08 |
   |  Test Results | 

[jira] [Commented] (HDFS-16917) Add transfer rate quantile metrics for DataNode reads

2023-02-23 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-16917?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17692956#comment-17692956
 ] 

ASF GitHub Bot commented on HDFS-16917:
---

hadoop-yetus commented on PR #5397:
URL: https://github.com/apache/hadoop/pull/5397#issuecomment-1442614458

   :confetti_ball: **+1 overall**

   | Vote | Subsystem | Runtime | Logfile | Comment |
   |:----:|----------:|--------:|:--------:|:-------:|
   | +0 :ok: | reexec | 0m 52s | | Docker mode activated. |
   |||| _ Prechecks _ |
   | +1 :green_heart: | dupname | 0m 0s | | No case conflicting files found. |
   | +0 :ok: | codespell | 0m 0s | | codespell was not available. |
   | +0 :ok: | detsecrets | 0m 0s | | detect-secrets was not available. |
   | +0 :ok: | markdownlint | 0m 0s | | markdownlint was not available. |
   | +1 :green_heart: | @author | 0m 0s | | The patch does not contain any @author tags. |
   | +1 :green_heart: | test4tests | 0m 0s | | The patch appears to include 2 new or modified test files. |
   |||| _ trunk Compile Tests _ |
   | +0 :ok: | mvndep | 25m 24s | | Maven dependency ordering for branch |
   | +1 :green_heart: | mvninstall | 25m 45s | | trunk passed |
   | +1 :green_heart: | compile | 23m 9s | | trunk passed with JDK Ubuntu-11.0.17+8-post-Ubuntu-1ubuntu220.04 |
   | +1 :green_heart: | compile | 20m 33s | | trunk passed with JDK Private Build-1.8.0_352-8u352-ga-1~20.04-b08 |
   | +1 :green_heart: | checkstyle | 3m 48s | | trunk passed |
   | +1 :green_heart: | mvnsite | 3m 23s | | trunk passed |
   | +1 :green_heart: | javadoc | 2m 27s | | trunk passed with JDK Ubuntu-11.0.17+8-post-Ubuntu-1ubuntu220.04 |
   | +1 :green_heart: | javadoc | 2m 41s | | trunk passed with JDK Private Build-1.8.0_352-8u352-ga-1~20.04-b08 |
   | +1 :green_heart: | spotbugs | 6m 11s | | trunk passed |
   | +1 :green_heart: | shadedclient | 23m 15s | | branch has no errors when building and testing our client artifacts. |
   |||| _ Patch Compile Tests _ |
   | +0 :ok: | mvndep | 0m 29s | | Maven dependency ordering for patch |
   | +1 :green_heart: | mvninstall | 2m 18s | | the patch passed |
   | +1 :green_heart: | compile | 22m 26s | | the patch passed with JDK Ubuntu-11.0.17+8-post-Ubuntu-1ubuntu220.04 |
   | +1 :green_heart: | javac | 22m 26s | | the patch passed |
   | +1 :green_heart: | compile | 20m 30s | | the patch passed with JDK Private Build-1.8.0_352-8u352-ga-1~20.04-b08 |
   | +1 :green_heart: | javac | 20m 30s | | the patch passed |
   | +1 :green_heart: | blanks | 0m 0s | | The patch has no blanks issues. |
   | -0 :warning: | checkstyle | 3m 34s | [/results-checkstyle-root.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-5397/15/artifact/out/results-checkstyle-root.txt) | root: The patch generated 1 new + 206 unchanged - 0 fixed = 207 total (was 206) |
   | +1 :green_heart: | mvnsite | 3m 25s | | the patch passed |
   | +1 :green_heart: | javadoc | 2m 18s | | the patch passed with JDK Ubuntu-11.0.17+8-post-Ubuntu-1ubuntu220.04 |
   | +1 :green_heart: | javadoc | 2m 42s | | the patch passed with JDK Private Build-1.8.0_352-8u352-ga-1~20.04-b08 |
   | +1 :green_heart: | spotbugs | 6m 18s | | the patch passed |
   | +1 :green_heart: | shadedclient | 23m 20s | | patch has no errors when building and testing our client artifacts. |
   |||| _ Other Tests _ |
   | +1 :green_heart: | unit | 18m 12s | | hadoop-common in the patch passed. |
   | +1 :green_heart: | unit | 203m 19s | | hadoop-hdfs in the patch passed. |
   | +1 :green_heart: | asflicense | 1m 14s | | The patch does not generate ASF License warnings. |
   | | | 448m 58s | | |

   | Subsystem | Report/Notes |
   |----------:|:-------------|
   | Docker | ClientAPI=1.42 ServerAPI=1.42 base: https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-5397/15/artifact/out/Dockerfile |
   | GITHUB PR | https://github.com/apache/hadoop/pull/5397 |
   | Optional Tests | dupname asflicense mvnsite codespell detsecrets markdownlint compile javac javadoc mvninstall unit shadedclient spotbugs checkstyle |
   | uname | Linux 2c9a92b916fd 4.15.0-200-generic #211-Ubuntu SMP Thu Nov 24 18:16:04 UTC 2022 x86_64 x86_64 x86_64 GNU/Linux |
   | Build tool | maven |
   | Personality | dev-support/bin/hadoop.sh |
   | git revision | trunk / 1200eaf675f270bdcdd960adc2a8ddd3a8809260 |
   | Default Java | Private Build-1.8.0_352-8u352-ga-1~20.04-b08 |
   | Multi-JDK versions | /usr/lib/jvm/java-11-openjdk-amd64:Ubuntu-11.0.17+8-post-Ubuntu-1ubuntu220.04 /usr/lib/jvm/java-8-openjdk-amd64:Private Build-1.8.0_352-8u352-ga-1~20.04-b08 |
   |  Test Results | 

[jira] [Commented] (HDFS-16917) Add transfer rate quantile metrics for DataNode reads

2023-02-23 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-16917?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17692949#comment-17692949
 ] 

ASF GitHub Bot commented on HDFS-16917:
---

rdingankar commented on code in PR #5397:
URL: https://github.com/apache/hadoop/pull/5397#discussion_r1116351959


##
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/datanode/DataXceiver.java:
##
@@ -632,6 +632,11 @@ public void readBlock(final ExtendedBlock block,
   datanode.metrics.incrBytesRead((int) read);
   datanode.metrics.incrBlocksRead();
   datanode.metrics.incrTotalReadTime(duration);
+  if (read < 0 || duration <= 0) {

Review Comment:
   moved to a Utils method





> Add transfer rate quantile metrics for DataNode reads
> -
>
> Key: HDFS-16917
> URL: https://issues.apache.org/jira/browse/HDFS-16917
> Project: Hadoop HDFS
>  Issue Type: Task
>  Components: datanode
>Reporter: Ravindra Dingankar
>Priority: Minor
>  Labels: pull-request-available
>
> Currently we have the following metrics for datanode reads.
> |BytesRead|Total number of bytes read from DataNode|
> |BlocksRead|Total number of blocks read from DataNode|
> |TotalReadTime|Total number of milliseconds spent on read operation|
> We would like to add a new quantile metric calculating the transfer rate for 
> datanode reads.
> This will give us a distribution across a window of the read transfer rate 
> for each datanode.
> Quantiles for transfer rate per host will help in identifying issues such as 
> hotspotting of datasets, as well as finding repeatedly slow datanodes.
>  






[jira] [Commented] (HDFS-16917) Add transfer rate quantile metrics for DataNode reads

2023-02-23 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-16917?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17692920#comment-17692920
 ] 

ASF GitHub Bot commented on HDFS-16917:
---

mkuchenbecker commented on code in PR #5397:
URL: https://github.com/apache/hadoop/pull/5397#discussion_r1116262461


##
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/datanode/DataXceiver.java:
##
@@ -632,6 +632,11 @@ public void readBlock(final ExtendedBlock block,
   datanode.metrics.incrBytesRead((int) read);
   datanode.metrics.incrBlocksRead();
   datanode.metrics.incrTotalReadTime(duration);
+  if (read < 0 || duration <= 0) {

Review Comment:
   You duplicated this logic. I would put it behind a private helper so people 
don't forget this check.











[jira] [Commented] (HDFS-16453) Upgrade okhttp from 2.7.5 to 4.9.3

2023-02-23 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-16453?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17692868#comment-17692868
 ] 

ASF GitHub Bot commented on HDFS-16453:
---

steveloughran commented on PR #4229:
URL: https://github.com/apache/hadoop/pull/4229#issuecomment-1442310489

   was this the PR which added kotlin as a dependency?
   
   if so: please remember to validate license then update the LICENSE-binary 
file once you have verified it is compatible.




> Upgrade okhttp from 2.7.5 to 4.9.3
> --
>
> Key: HDFS-16453
> URL: https://issues.apache.org/jira/browse/HDFS-16453
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: hdfs-client
>Affects Versions: 3.3.1
>Reporter: Ivan Viaznikov
>Assignee: Ashutosh Gupta
>Priority: Major
>  Labels: pull-request-available
> Fix For: 3.4.0, 3.3.4
>
>  Time Spent: 1h 20m
>  Remaining Estimate: 0h
>
> {{org.apache.hadoop:hadoop-hdfs-client}} comes with 
> {{com.squareup.okhttp:okhttp:2.7.5}} as a dependency, which is vulnerable to 
> an information disclosure issue due to how the contents of sensitive headers, 
> such as the {{Authorization}} header, can be logged when an 
> {{IllegalArgumentException}} is thrown.
> This issue could allow an attacker or malicious user who has access to the 
> logs to obtain the sensitive contents of the affected headers which could 
> facilitate further attacks.
> Fixed in {{5.0.0-alpha3}} by 
> [this|https://github.com/square/okhttp/commit/dcc6483b7dc6d9c0b8e03ff7c30c13f3c75264a5]
>  commit. The fix was cherry-picked and backported into {{4.9.2}} with 
> [this|https://github.com/square/okhttp/commit/1fd7c0afdc2cee9ba982b07d49662af7f60e1518]
>  commit.
> Requesting you to clarify if this dependency will be updated to a fixed 
> version in the following releases






[jira] [Commented] (HDFS-16453) Upgrade okhttp from 2.7.5 to 4.9.3

2023-02-23 Thread Steve Loughran (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-16453?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17692865#comment-17692865
 ] 

Steve Loughran commented on HDFS-16453:
---

[~chengpan] it's a license update too. But no, not AFAIK.







[jira] [Comment Edited] (HDFS-16901) RBF: Routers should propagate the real user in the UGI via the caller context

2023-02-23 Thread Simbarashe Dzinamarira (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-16901?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17692815#comment-17692815
 ] 

Simbarashe Dzinamarira edited comment on HDFS-16901 at 2/23/23 5:56 PM:


I believe this is because HDFS-16756 hasn't been backported to branch-3.3.

The proxyUser that has the routerUser as its realUser is created here 
[https://github.com/apache/hadoop/blob/trunk/hadoop-hdfs-project/hadoop-hdfs-rbf/src/main/java/org/apache/hadoop/hdfs/server/federation/router/RouterRpcClient.java#L392]

In TestRouterRPC.java I added extra configs that make _*this.enableProxyUser*_ 
be true. I did this for simplicity rather than using a secure MiniDFSCluster so 
that _*UserGroupInformation.isSecurityEnabled()*_ is true.


was (Author: simbadzina):
I believe this is because HDFS-16756 hasn't been backported to branch-3.3.

The proxyUser that has the routerUser as its realUser is created here 
[https://github.com/apache/hadoop/blob/trunk/hadoop-hdfs-project/hadoop-hdfs-rbf/src/main/java/org/apache/hadoop/hdfs/server/federation/router/RouterRpcClient.java#L392]

In TestRouterRPC I added extra configs that make `this.enableProxyUser` be true. 
I did this for simplicity rather than using a secure MiniDFSCluster so that 
UserGroupInformation.isSecurityEnabled() is true.

> RBF: Routers should propagate the real user in the UGI via the caller context
> -
>
> Key: HDFS-16901
> URL: https://issues.apache.org/jira/browse/HDFS-16901
> Project: Hadoop HDFS
>  Issue Type: Task
>Reporter: Simbarashe Dzinamarira
>Assignee: Simbarashe Dzinamarira
>Priority: Major
>  Labels: pull-request-available
> Fix For: 3.4.0, 3.3.6
>
>
> If the router receives an operation from a proxyUser, it drops the realUser 
> in the UGI and makes the routerUser the realUser for the operation that goes 
> to the namenode.
> In the namenode UGI logs, we'd like the ability to know the original realUser.
> The router should propagate the realUser from the client call as part of the 
> callerContext.
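
The idea in the description, carrying the original realUser along in the caller context, can be illustrated with plain string handling. The sketch below is not the RBF implementation: the class name, the `realUser` key, and the `,` and `:` separators are all assumptions chosen for the example.

```java
// Hypothetical sketch; the key name and separators are assumptions.
class CallerContextSketch {

  // Appends a "realUser:<name>" entry to an existing caller context string.
  // Returns the context unchanged when there is no real user to propagate.
  static String withRealUser(String context, String realUser) {
    if (realUser == null || realUser.isEmpty()) {
      return context;
    }
    String entry = "realUser:" + realUser;
    if (context == null || context.isEmpty()) {
      return entry;
    }
    return context + "," + entry;
  }

  public static void main(String[] args) {
    System.out.println(withRealUser("clientIp:10.0.0.1", "alice"));
  }
}
```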






[jira] [Comment Edited] (HDFS-16901) RBF: Routers should propagate the real user in the UGI via the caller context

2023-02-23 Thread Simbarashe Dzinamarira (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-16901?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17692815#comment-17692815
 ] 

Simbarashe Dzinamarira edited comment on HDFS-16901 at 2/23/23 5:56 PM:


I believe this is because HDFS-16756 hasn't been backported to branch-3.3.

The proxyUser that has the routerUser as its realUser is created here 
[https://github.com/apache/hadoop/blob/trunk/hadoop-hdfs-project/hadoop-hdfs-rbf/src/main/java/org/apache/hadoop/hdfs/server/federation/router/RouterRpcClient.java#L392]

In TestRouterRPC.java I added extra configs that make _*this.enableProxyUser*_ 
be true. I did this for simplicity rather than using a secure MiniDFSCluster.


was (Author: simbadzina):
I believe this is because HDFS-16756 hasn't been backported to branch-3.3.

The proxyUser that has the routerUser as its realUser is created here 
[https://github.com/apache/hadoop/blob/trunk/hadoop-hdfs-project/hadoop-hdfs-rbf/src/main/java/org/apache/hadoop/hdfs/server/federation/router/RouterRpcClient.java#L392]

In TestRouterRPC.java I added extra configs that make _*this.enableProxyUser*_ 
be true. I did this for simplicity rather than using a secure MiniDFSCluster so 
that _*UserGroupInformation.isSecurityEnabled()*_ is true.







[jira] [Commented] (HDFS-16901) RBF: Routers should propagate the real user in the UGI via the caller context

2023-02-23 Thread Simbarashe Dzinamarira (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-16901?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17692815#comment-17692815
 ] 

Simbarashe Dzinamarira commented on HDFS-16901:
---

I believe this is because HDFS-16756 hasn't been backported to branch-3.3.

The proxyUser that has the routerUser as its realUser is created here 
[https://github.com/apache/hadoop/blob/trunk/hadoop-hdfs-project/hadoop-hdfs-rbf/src/main/java/org/apache/hadoop/hdfs/server/federation/router/RouterRpcClient.java#L392]

In TestRouterRPC I added extra configs that make `this.enableProxyUser` true. 
I did this for simplicity rather than using a secure MiniDFSCluster, in which 
UserGroupInformation.isSecurityEnabled() is true.
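
As a rough mental model of the behaviour described above, here is a toy sketch. It does not use Hadoop classes: ProxyUserSketch, Ugi and connectionUgi are hypothetical names, and the condition "security enabled or enableProxyUser" is an assumption inferred from this comment, not a copy of RouterRpcClient.

```java
/** Toy model only; hypothetical types, not Hadoop's UserGroupInformation. */
final class ProxyUserSketch {

  /** Minimal stand-in for a UGI: an effective user plus an optional real user. */
  static final class Ugi {
    final String user;
    final Ugi realUser;

    Ugi(String user, Ugi realUser) {
      this.user = user;
      this.realUser = realUser;
    }

    @Override
    public String toString() {
      return realUser == null ? user : user + " via " + realUser;
    }
  }

  /**
   * Assumed rule: when security is on or the proxy-user flag is set, the
   * router replaces the caller's real user with its own login user.
   */
  static Ugi connectionUgi(Ugi caller, Ugi routerLogin,
      boolean securityEnabled, boolean enableProxyUser) {
    if (securityEnabled || enableProxyUser) {
      return new Ugi(caller.user, routerLogin);
    }
    return caller;
  }

  public static void main(String[] args) {
    Ugi caller = new Ugi("testProxyUser", new Ugi("testRealUser", null));
    Ugi router = new Ugi("oomalley", null);
    System.out.println(connectionUgi(caller, router, false, true));
    System.out.println(connectionUgi(caller, router, false, false));
  }
}
```

Under these assumptions the first call gives "testProxyUser via oomalley" and the second "testProxyUser via testRealUser", which would be consistent with the flag having no effect on a branch that lacks HDFS-16756.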







[jira] [Commented] (HDFS-16901) RBF: Routers should propagate the real user in the UGI via the caller context

2023-02-23 Thread Owen O'Malley (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-16901?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17692793#comment-17692793
 ] 

Owen O'Malley commented on HDFS-16901:
--

My trial backport is here - 
https://github.com/omalley/hadoop/tree/HDFS-16901-3.3







[jira] [Commented] (HDFS-16901) RBF: Routers should propagate the real user in the UGI via the caller context

2023-02-23 Thread Owen O'Malley (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-16901?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17692786#comment-17692786
 ] 

Owen O'Malley commented on HDFS-16901:
--

Simba, when I backport this to branch-3.3 I get a test failure. Basically, the 
new test expects 'oomalley' (the login user) as the real user, but the audit 
log shows testRealUser.

 
{code:java}
2023-02-22 17:03:36,169 [IPC Server handler 5 on default port 49453] INFO  
FSNamesystem.audit (FSNamesystem.java:logAuditMessage(8574)) - allowed=true    
ugi=testProxyUser (auth:PROXY) via testRealUser (auth:SIMPLE)       
ip=/127.0.0.1   cmd=listStatus  src=/   dst=null        perm=null           
proto=rpc       
callerContext=clientIp:172.25.204.192,clientPort:49519,realUser:testRealUser
{code}
What the test is looking for is:
{code:java}
ugi=testProxyUser (auth:PROXY) via oomalley (auth:SIMPLE)
{code}
The test works correctly on trunk.
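
To make the mismatch easier to see, the following self-contained sketch pulls the real user out of both places in such an audit line: the ugi field and the callerContext field. The class AuditLineSketch and its regular expressions are hypothetical and only assume the layout of the line quoted above, with the wrapped log line joined into a single string.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Toy parser for the quoted audit line; hypothetical, not part of Hadoop. */
final class AuditLineSketch {

  // Matches: ugi=<effective user> (auth:PROXY) via <real user> (auth:<method>)
  private static final Pattern UGI = Pattern.compile(
      "ugi=(\\S+)\\s+\\(auth:PROXY\\)\\s+via\\s+(\\S+)\\s+\\(auth:\\w+\\)");

  // Matches the realUser entry inside the comma-separated callerContext field.
  private static final Pattern CTX_REAL_USER = Pattern.compile(
      "callerContext=\\S*?realUser:([^,\\s]+)");

  /** Returns the real user named in the ugi field, or null if absent. */
  static String realUserFromUgi(String line) {
    Matcher m = UGI.matcher(line);
    return m.find() ? m.group(2) : null;
  }

  /** Returns the real user named in the callerContext field, or null if absent. */
  static String realUserFromCallerContext(String line) {
    Matcher m = CTX_REAL_USER.matcher(line);
    return m.find() ? m.group(1) : null;
  }

  public static void main(String[] args) {
    String line = "allowed=true ugi=testProxyUser (auth:PROXY) via testRealUser "
        + "(auth:SIMPLE) ip=/127.0.0.1 cmd=listStatus src=/ dst=null perm=null "
        + "proto=rpc callerContext=clientIp:172.25.204.192,clientPort:49519,"
        + "realUser:testRealUser";
    System.out.println("ugi real user:           " + realUserFromUgi(line));
    System.out.println("callerContext real user: " + realUserFromCallerContext(line));
  }
}
```

On the branch-3.3 line both values come out as testRealUser, whereas the test expects the ugi field to name the login user while only the callerContext keeps testRealUser.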







[jira] [Commented] (HDFS-16310) RBF: Add client port to CallerContext for Router

2023-02-23 Thread Takanobu Asanuma (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-16310?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17692558#comment-17692558
 ] 

Takanobu Asanuma commented on HDFS-16310:
-

Hi [~omalley], I have found several JIRAs that you cherry-picked to 
branch-3.3 without updating their fix versions. Please take care to set the 
correct fix versions; otherwise the Hadoop changelog will be affected.

> RBF: Add client port to CallerContext for Router
> 
>
> Key: HDFS-16310
> URL: https://issues.apache.org/jira/browse/HDFS-16310
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Reporter: Tao Li
>Assignee: Tao Li
>Priority: Major
>  Labels: pull-request-available
> Fix For: 3.4.0, 3.3.5
>
>  Time Spent: 3h 20m
>  Remaining Estimate: 0h
>
> As mentioned in [HDFS-16266|https://issues.apache.org/jira/browse/HDFS-16266], 
> we should add the client port to the CallerContext of the Router.






[jira] [Updated] (HDFS-16310) RBF: Add client port to CallerContext for Router

2023-02-23 Thread Takanobu Asanuma (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-16310?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Takanobu Asanuma updated HDFS-16310:

Fix Version/s: 3.3.5







[jira] [Commented] (HDFS-16310) RBF: Add client port to CallerContext for Router

2023-02-23 Thread Takanobu Asanuma (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-16310?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17692555#comment-17692555
 ] 

Takanobu Asanuma commented on HDFS-16310:
-

I added 3.3.5 to fix versions.







[jira] [Updated] (HDFS-16266) Add remote port information to HDFS audit log

2023-02-23 Thread Takanobu Asanuma (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-16266?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Takanobu Asanuma updated HDFS-16266:

Fix Version/s: 3.3.5

> Add remote port information to HDFS audit log
> -
>
> Key: HDFS-16266
> URL: https://issues.apache.org/jira/browse/HDFS-16266
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Reporter: Tao Li
>Assignee: Tao Li
>Priority: Major
>  Labels: pull-request-available
> Fix For: 3.4.0, 3.3.5
>
>  Time Spent: 9h 10m
>  Remaining Estimate: 0h
>
> In our production environment, we occasionally encounter a problem where a 
> user submits an abnormal computation task, causing a sudden flood of 
> requests, which causes the queueTime and processingTime of the Namenode to 
> rise very high, causing a large backlog of tasks.
> We usually locate and kill specific Spark, Flink, or MapReduce tasks based on 
> metrics and audit logs. Currently, IP and UGI are recorded in audit logs, but 
> there is no port information, so it is difficult to locate specific processes 
> sometimes. Therefore, I propose that we add the port information to the audit 
> log, so that we can easily track the upstream process.
> Currently, some projects contain port information in audit logs, such as 
> Hbase and Alluxio. I think it is also necessary to add port information for 
> HDFS audit logs.






[jira] [Commented] (HDFS-16266) Add remote port information to HDFS audit log

2023-02-23 Thread Takanobu Asanuma (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-16266?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17692553#comment-17692553
 ] 

Takanobu Asanuma commented on HDFS-16266:
-

I added 3.3.5 to fix versions.



