[jira] [Work logged] (HDFS-16521) DFS API to retrieve slow datanodes

2022-05-18 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-16521?focusedWorklogId=772188&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-772188
 ]

ASF GitHub Bot logged work on HDFS-16521:
-

Author: ASF GitHub Bot
Created on: 18/May/22 23:53
Start Date: 18/May/22 23:53
Worklog Time Spent: 10m 
  Work Description: virajjasani commented on PR #4107:
URL: https://github.com/apache/hadoop/pull/4107#issuecomment-1130767525

   Follow-up PR to expose the latency of slow nodes, as perceived by their 
reporting nodes: #4323 




Issue Time Tracking
---

Worklog Id: (was: 772188)
Time Spent: 7h 10m  (was: 7h)

> DFS API to retrieve slow datanodes
> --
>
> Key: HDFS-16521
> URL: https://issues.apache.org/jira/browse/HDFS-16521
> Project: Hadoop HDFS
> Issue Type: New Feature
> Reporter: Viraj Jasani
> Assignee: Viraj Jasani
> Priority: Major
> Labels: pull-request-available
> Fix For: 3.4.0, 3.3.4
>
> Time Spent: 7h 10m
> Remaining Estimate: 0h
>
> Providing a DFS API to retrieve slow nodes would make it possible to add an 
> option to "dfsadmin -report" that lists slow datanode info for operators to 
> review, a filter that is especially useful on larger clusters.
> The other purpose of such an API is to let HDFS downstreamers without direct 
> access to the namenode HTTP port (only the RPC port is accessible) retrieve 
> slow nodes.
> Moreover, 
> [FanOutOneBlockAsyncDFSOutput|https://github.com/apache/hbase/blob/master/hbase-asyncfs/src/main/java/org/apache/hadoop/hbase/io/asyncfs/FanOutOneBlockAsyncDFSOutput.java]
>  in HBase currently has to rely on its own way of marking and excluding slow 
> nodes while 1) creating pipelines and 2) handling acks, based on factors like 
> the data length of the packet, processing time with the last ack timestamp, 
> and whether the flush to replicas has finished. If it can use the slow-node 
> API from HDFS to exclude nodes appropriately while writing a block, much of 
> its own post-ack computation of slow nodes can be _saved_ or _improved_, or, 
> based on further experiments, we could find a _better solution_ for managing 
> slow node detection logic in both HDFS and HBase. However, in order to collect 
> more data points and run more POCs in this area, HDFS should provide an API 
> for downstreamers to use slow-node info efficiently for such critical 
> low-latency use cases (like writing WALs).
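
As a rough illustration of the exclusion step described above, the sketch below filters a set of reported slow nodes out of a candidate list before a write pipeline would be built. It does not use the HDFS or HBase APIs: the class `SlowNodeFilter`, the method `excludeSlowNodes`, and the `host:port` strings are all hypothetical names chosen for this example, and how the slow set is obtained is left out.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical illustration only: none of these names come from HDFS or HBase.
class SlowNodeFilter {

  // Returns the candidates that are not in the slow set, preserving their order.
  static List<String> excludeSlowNodes(List<String> candidates, Set<String> slowNodes) {
    List<String> usable = new ArrayList<>();
    for (String node : candidates) {
      if (!slowNodes.contains(node)) {
        usable.add(node);
      }
    }
    return usable;
  }

  public static void main(String[] args) {
    // Assumed inputs: candidate datanodes and the subset reported as slow.
    List<String> candidates =
        Arrays.asList("dn1:9866", "dn2:9866", "dn3:9866", "dn4:9866");
    Set<String> slow = new HashSet<>(Arrays.asList("dn2:9866"));

    // Only the nodes not reported as slow remain as pipeline candidates.
    System.out.println(excludeSlowNodes(candidates, slow));
  }
}
```

A real client would presumably refresh the slow set periodically rather than on every write, since the point of the proposal is to avoid per-ack bookkeeping on the low-latency path.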



--
This message was sent by Atlassian Jira
(v8.20.7#820007)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Work logged] (HDFS-16521) DFS API to retrieve slow datanodes

2022-05-05 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-16521?focusedWorklogId=766922&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-766922
 ]

ASF GitHub Bot logged work on HDFS-16521:
-

Author: ASF GitHub Bot
Created on: 05/May/22 20:55
Start Date: 05/May/22 20:55
Worklog Time Spent: 10m 
  Work Description: jojochuang merged PR #4259:
URL: https://github.com/apache/hadoop/pull/4259




Issue Time Tracking
---

Worklog Id: (was: 766922)
Time Spent: 7h  (was: 6h 50m)




[jira] [Work logged] (HDFS-16521) DFS API to retrieve slow datanodes

2022-05-05 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-16521?focusedWorklogId=766921&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-766921
 ]

ASF GitHub Bot logged work on HDFS-16521:
-

Author: ASF GitHub Bot
Created on: 05/May/22 20:53
Start Date: 05/May/22 20:53
Worklog Time Spent: 10m 
  Work Description: jojochuang commented on PR #4259:
URL: https://github.com/apache/hadoop/pull/4259#issuecomment-1119036951

   The failed tests do not look related to this change. Some of them appear to 
have been failing before it.




Issue Time Tracking
---

Worklog Id: (was: 766921)
Time Spent: 6h 50m  (was: 6h 40m)




[jira] [Work logged] (HDFS-16521) DFS API to retrieve slow datanodes

2022-05-04 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-16521?focusedWorklogId=766364&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-766364
 ]

ASF GitHub Bot logged work on HDFS-16521:
-

Author: ASF GitHub Bot
Created on: 05/May/22 00:46
Start Date: 05/May/22 00:46
Worklog Time Spent: 10m 
  Work Description: hadoop-yetus commented on PR #4259:
URL: https://github.com/apache/hadoop/pull/4259#issuecomment-1118063690

   :broken_heart: **-1 overall**

   | Vote | Subsystem | Runtime | Logfile | Comment |
   |:----:|----------:|--------:|:-------:|:-------:|
   | +0 :ok: | reexec | 0m 51s | | Docker mode activated. |
   |||| _ Prechecks _ |
   | +1 :green_heart: | dupname | 0m 0s | | No case conflicting files found. |
   | +0 :ok: | codespell | 0m 1s | | codespell was not available. |
   | +0 :ok: | buf | 0m 1s | | buf was not available. |
   | +0 :ok: | markdownlint | 0m 1s | | markdownlint was not available. |
   | +1 :green_heart: | @author | 0m 0s | | The patch does not contain any @author tags. |
   | +1 :green_heart: | test4tests | 0m 0s | | The patch appears to include 4 new or modified test files. |
   |||| _ branch-3.3 Compile Tests _ |
   | +0 :ok: | mvndep | 15m 2s | | Maven dependency ordering for branch |
   | +1 :green_heart: | mvninstall | 26m 30s | | branch-3.3 passed |
   | +1 :green_heart: | compile | 3m 58s | | branch-3.3 passed |
   | +1 :green_heart: | checkstyle | 1m 19s | | branch-3.3 passed |
   | +1 :green_heart: | mvnsite | 3m 29s | | branch-3.3 passed |
   | +1 :green_heart: | javadoc | 3m 33s | | branch-3.3 passed |
   | +1 :green_heart: | spotbugs | 7m 42s | | branch-3.3 passed |
   | +1 :green_heart: | shadedclient | 26m 29s | | branch has no errors when building and testing our client artifacts. |
   |||| _ Patch Compile Tests _ |
   | +0 :ok: | mvndep | 0m 28s | | Maven dependency ordering for patch |
   | +1 :green_heart: | mvninstall | 2m 50s | | the patch passed |
   | +1 :green_heart: | compile | 3m 52s | | the patch passed |
   | +1 :green_heart: | cc | 3m 52s | | the patch passed |
   | -1 :x: | javac | 3m 52s | [/results-compile-javac-hadoop-hdfs-project.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-4259/6/artifact/out/results-compile-javac-hadoop-hdfs-project.txt) | hadoop-hdfs-project generated 1 new + 746 unchanged - 0 fixed = 747 total (was 746) |
   | +1 :green_heart: | blanks | 0m 1s | | The patch has no blanks issues. |
   | +1 :green_heart: | checkstyle | 1m 2s | | hadoop-hdfs-project: The patch generated 0 new + 470 unchanged - 1 fixed = 470 total (was 471) |
   | +1 :green_heart: | mvnsite | 2m 53s | | the patch passed |
   | +1 :green_heart: | xml | 0m 1s | | The patch has no ill-formed XML file. |
   | +1 :green_heart: | javadoc | 2m 53s | | the patch passed |
   | +1 :green_heart: | spotbugs | 7m 35s | | the patch passed |
   | +1 :green_heart: | shadedclient | 26m 32s | | patch has no errors when building and testing our client artifacts. |
   |||| _ Other Tests _ |
   | +1 :green_heart: | unit | 2m 18s | | hadoop-hdfs-client in the patch passed. |
   | -1 :x: | unit | 209m 42s | [/patch-unit-hadoop-hdfs-project_hadoop-hdfs.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-4259/6/artifact/out/patch-unit-hadoop-hdfs-project_hadoop-hdfs.txt) | hadoop-hdfs in the patch passed. |
   | -1 :x: | unit | 15m 52s | [/patch-unit-hadoop-hdfs-project_hadoop-hdfs-rbf.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-4259/6/artifact/out/patch-unit-hadoop-hdfs-project_hadoop-hdfs-rbf.txt) | hadoop-hdfs-rbf in the patch passed. |
   | +1 :green_heart: | asflicense | 1m 0s | | The patch does not generate ASF License warnings. |
   | | | 367m 35s | | |

   | Reason | Tests |
   |-------:|:------|
   | Failed junit tests | hadoop.hdfs.server.balancer.TestBalancerWithHANameNodes |
   | | hadoop.hdfs.server.datanode.TestNNHandlesBlockReportPerStorage |
   | | hadoop.hdfs.TestReconstructStripedFile |
   | | hadoop.hdfs.TestRollingUpgrade |
   | | hadoop.hdfs.server.federation.router.TestRouterRpc |
   | | hadoop.hdfs.server.federation.security.TestRouterSecurityManager |
   | | hadoop.hdfs.server.federation.router.TestRouterRpcMultiDestination |

   | Subsystem | Report/Notes |
   |----------:|:-------------|
   | Docker | ClientAPI=1.41 ServerAPI=1.41 base: https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-4259/6/artifact/out/Dockerfile |
   | GITHUB PR | https://github.com/apache/hadoop/pull/4259 |
   | Optional Tests | dupname asflicense compile javac javadoc mvninstall 
mvnsite unit shadedclient spotbugs checkstyle 

[jira] [Work logged] (HDFS-16521) DFS API to retrieve slow datanodes

2022-05-04 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-16521?focusedWorklogId=766360&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-766360
 ]

ASF GitHub Bot logged work on HDFS-16521:
-

Author: ASF GitHub Bot
Created on: 05/May/22 00:29
Start Date: 05/May/22 00:29
Worklog Time Spent: 10m 
  Work Description: hadoop-yetus commented on PR #4259:
URL: https://github.com/apache/hadoop/pull/4259#issuecomment-1118056654

   :broken_heart: **-1 overall**

   | Vote | Subsystem | Runtime | Logfile | Comment |
   |:----:|----------:|--------:|:-------:|:-------:|
   | +0 :ok: | reexec | 0m 36s | | Docker mode activated. |
   |||| _ Prechecks _ |
   | +1 :green_heart: | dupname | 0m 1s | | No case conflicting files found. |
   | +0 :ok: | codespell | 0m 0s | | codespell was not available. |
   | +0 :ok: | buf | 0m 0s | | buf was not available. |
   | +0 :ok: | markdownlint | 0m 0s | | markdownlint was not available. |
   | +1 :green_heart: | @author | 0m 0s | | The patch does not contain any @author tags. |
   | +1 :green_heart: | test4tests | 0m 0s | | The patch appears to include 4 new or modified test files. |
   |||| _ branch-3.3 Compile Tests _ |
   | +0 :ok: | mvndep | 15m 44s | | Maven dependency ordering for branch |
   | +1 :green_heart: | mvninstall | 23m 43s | | branch-3.3 passed |
   | +1 :green_heart: | compile | 3m 58s | | branch-3.3 passed |
   | +1 :green_heart: | checkstyle | 1m 26s | | branch-3.3 passed |
   | +1 :green_heart: | mvnsite | 3m 55s | | branch-3.3 passed |
   | +1 :green_heart: | javadoc | 4m 1s | | branch-3.3 passed |
   | +1 :green_heart: | spotbugs | 7m 44s | | branch-3.3 passed |
   | +1 :green_heart: | shadedclient | 24m 44s | | branch has no errors when building and testing our client artifacts. |
   |||| _ Patch Compile Tests _ |
   | +0 :ok: | mvndep | 0m 30s | | Maven dependency ordering for patch |
   | +1 :green_heart: | mvninstall | 3m 0s | | the patch passed |
   | +1 :green_heart: | compile | 3m 41s | | the patch passed |
   | +1 :green_heart: | cc | 3m 41s | | the patch passed |
   | -1 :x: | javac | 3m 41s | [/results-compile-javac-hadoop-hdfs-project.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-4259/5/artifact/out/results-compile-javac-hadoop-hdfs-project.txt) | hadoop-hdfs-project generated 1 new + 746 unchanged - 0 fixed = 747 total (was 746) |
   | +1 :green_heart: | blanks | 0m 0s | | The patch has no blanks issues. |
   | +1 :green_heart: | checkstyle | 1m 2s | | hadoop-hdfs-project: The patch generated 0 new + 470 unchanged - 1 fixed = 470 total (was 471) |
   | +1 :green_heart: | mvnsite | 3m 3s | | the patch passed |
   | +1 :green_heart: | xml | 0m 2s | | The patch has no ill-formed XML file. |
   | +1 :green_heart: | javadoc | 3m 10s | | the patch passed |
   | +1 :green_heart: | spotbugs | 7m 22s | | the patch passed |
   | +1 :green_heart: | shadedclient | 24m 21s | | patch has no errors when building and testing our client artifacts. |
   |||| _ Other Tests _ |
   | +1 :green_heart: | unit | 2m 30s | | hadoop-hdfs-client in the patch passed. |
   | -1 :x: | unit | 196m 54s | [/patch-unit-hadoop-hdfs-project_hadoop-hdfs.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-4259/5/artifact/out/patch-unit-hadoop-hdfs-project_hadoop-hdfs.txt) | hadoop-hdfs in the patch passed. |
   | -1 :x: | unit | 15m 38s | [/patch-unit-hadoop-hdfs-project_hadoop-hdfs-rbf.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-4259/5/artifact/out/patch-unit-hadoop-hdfs-project_hadoop-hdfs-rbf.txt) | hadoop-hdfs-rbf in the patch passed. |
   | +1 :green_heart: | asflicense | 1m 17s | | The patch does not generate ASF License warnings. |
   | | | 350m 22s | | |

   | Reason | Tests |
   |-------:|:------|
   | Failed junit tests | hadoop.hdfs.server.namenode.ha.TestDFSUpgradeWithHA |
   | | hadoop.hdfs.server.blockmanagement.TestUnderReplicatedBlocks |
   | | hadoop.hdfs.server.federation.router.TestRouterRpcMultiDestination |
   | | hadoop.hdfs.server.federation.security.TestRouterSecurityManager |
   | | hadoop.hdfs.server.federation.router.TestRouterRpc |

   | Subsystem | Report/Notes |
   |----------:|:-------------|
   | Docker | ClientAPI=1.41 ServerAPI=1.41 base: https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-4259/5/artifact/out/Dockerfile |
   | GITHUB PR | https://github.com/apache/hadoop/pull/4259 |
   | Optional Tests | dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient spotbugs checkstyle codespell cc buflint bufcompat markdownlint xml |
   | uname | Linux cc4e016a4963 4.15.0-112-generic 

[jira] [Work logged] (HDFS-16521) DFS API to retrieve slow datanodes

2022-05-03 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-16521?focusedWorklogId=765785&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-765785
 ]

ASF GitHub Bot logged work on HDFS-16521:
-

Author: ASF GitHub Bot
Created on: 04/May/22 00:47
Start Date: 04/May/22 00:47
Worklog Time Spent: 10m 
  Work Description: hadoop-yetus commented on PR #4259:
URL: https://github.com/apache/hadoop/pull/4259#issuecomment-1116837508

   :broken_heart: **-1 overall**

   | Vote | Subsystem | Runtime | Logfile | Comment |
   |:----:|----------:|--------:|:-------:|:-------:|
   | +0 :ok: | reexec | 0m 51s | | Docker mode activated. |
   |||| _ Prechecks _ |
   | +1 :green_heart: | dupname | 0m 0s | | No case conflicting files found. |
   | +0 :ok: | codespell | 0m 0s | | codespell was not available. |
   | +0 :ok: | buf | 0m 1s | | buf was not available. |
   | +0 :ok: | markdownlint | 0m 1s | | markdownlint was not available. |
   | +1 :green_heart: | @author | 0m 0s | | The patch does not contain any @author tags. |
   | +1 :green_heart: | test4tests | 0m 0s | | The patch appears to include 3 new or modified test files. |
   |||| _ branch-3.3 Compile Tests _ |
   | +0 :ok: | mvndep | 14m 56s | | Maven dependency ordering for branch |
   | +1 :green_heart: | mvninstall | 26m 27s | | branch-3.3 passed |
   | +1 :green_heart: | compile | 3m 57s | | branch-3.3 passed |
   | +1 :green_heart: | checkstyle | 1m 20s | | branch-3.3 passed |
   | +1 :green_heart: | mvnsite | 3m 28s | | branch-3.3 passed |
   | +1 :green_heart: | javadoc | 3m 32s | | branch-3.3 passed |
   | +1 :green_heart: | spotbugs | 7m 49s | | branch-3.3 passed |
   | +1 :green_heart: | shadedclient | 26m 17s | | branch has no errors when building and testing our client artifacts. |
   |||| _ Patch Compile Tests _ |
   | +0 :ok: | mvndep | 0m 28s | | Maven dependency ordering for patch |
   | +1 :green_heart: | mvninstall | 2m 52s | | the patch passed |
   | +1 :green_heart: | compile | 3m 47s | | the patch passed |
   | +1 :green_heart: | cc | 3m 47s | | the patch passed |
   | -1 :x: | javac | 3m 47s | [/results-compile-javac-hadoop-hdfs-project.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-4259/2/artifact/out/results-compile-javac-hadoop-hdfs-project.txt) | hadoop-hdfs-project generated 1 new + 746 unchanged - 0 fixed = 747 total (was 746) |
   | +1 :green_heart: | blanks | 0m 0s | | The patch has no blanks issues. |
   | +1 :green_heart: | checkstyle | 1m 1s | | hadoop-hdfs-project: The patch generated 0 new + 470 unchanged - 1 fixed = 470 total (was 471) |
   | +1 :green_heart: | mvnsite | 2m 51s | | the patch passed |
   | +1 :green_heart: | javadoc | 2m 52s | | the patch passed |
   | +1 :green_heart: | spotbugs | 7m 35s | | the patch passed |
   | +1 :green_heart: | shadedclient | 26m 5s | | patch has no errors when building and testing our client artifacts. |
   |||| _ Other Tests _ |
   | +1 :green_heart: | unit | 2m 18s | | hadoop-hdfs-client in the patch passed. |
   | -1 :x: | unit | 208m 47s | [/patch-unit-hadoop-hdfs-project_hadoop-hdfs.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-4259/2/artifact/out/patch-unit-hadoop-hdfs-project_hadoop-hdfs.txt) | hadoop-hdfs in the patch passed. |
   | -1 :x: | unit | 15m 45s | [/patch-unit-hadoop-hdfs-project_hadoop-hdfs-rbf.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-4259/2/artifact/out/patch-unit-hadoop-hdfs-project_hadoop-hdfs-rbf.txt) | hadoop-hdfs-rbf in the patch passed. |
   | +1 :green_heart: | asflicense | 1m 0s | | The patch does not generate ASF License warnings. |
   | | | 365m 39s | | |

   | Reason | Tests |
   |-------:|:------|
   | Failed junit tests | hadoop.cli.TestHDFSCLI |
   | | hadoop.hdfs.TestDecommissionWithBackoffMonitor |
   | | hadoop.hdfs.server.blockmanagement.TestBlockManager |
   | | hadoop.hdfs.server.datanode.TestDataNodeRollingUpgrade |
   | | hadoop.hdfs.server.federation.router.TestRouterRpc |
   | | hadoop.hdfs.server.federation.security.TestRouterSecurityManager |
   | | hadoop.hdfs.server.federation.router.TestRouterRpcMultiDestination |

   | Subsystem | Report/Notes |
   |----------:|:-------------|
   | Docker | ClientAPI=1.41 ServerAPI=1.41 base: https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-4259/2/artifact/out/Dockerfile |
   | GITHUB PR | https://github.com/apache/hadoop/pull/4259 |
   | Optional Tests | dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient spotbugs checkstyle codespell cc buflint bufcompat markdownlint |
   | uname | Linux f5439c5e6ce8 4.15.0-175-generic 

[jira] [Work logged] (HDFS-16521) DFS API to retrieve slow datanodes

2022-05-03 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-16521?focusedWorklogId=765773&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-765773
 ]

ASF GitHub Bot logged work on HDFS-16521:
-

Author: ASF GitHub Bot
Created on: 04/May/22 00:30
Start Date: 04/May/22 00:30
Worklog Time Spent: 10m 
  Work Description: hadoop-yetus commented on PR #4259:
URL: https://github.com/apache/hadoop/pull/4259#issuecomment-1116812093

   :broken_heart: **-1 overall**

   | Vote | Subsystem | Runtime | Logfile | Comment |
   |:----:|----------:|--------:|:-------:|:-------:|
   | +0 :ok: | reexec | 7m 1s | | Docker mode activated. |
   |||| _ Prechecks _ |
   | +1 :green_heart: | dupname | 0m 1s | | No case conflicting files found. |
   | +0 :ok: | codespell | 0m 0s | | codespell was not available. |
   | +0 :ok: | buf | 0m 0s | | buf was not available. |
   | +0 :ok: | markdownlint | 0m 0s | | markdownlint was not available. |
   | +1 :green_heart: | @author | 0m 0s | | The patch does not contain any @author tags. |
   | +1 :green_heart: | test4tests | 0m 0s | | The patch appears to include 3 new or modified test files. |
   |||| _ branch-3.3 Compile Tests _ |
   | +0 :ok: | mvndep | 15m 23s | | Maven dependency ordering for branch |
   | +1 :green_heart: | mvninstall | 23m 41s | | branch-3.3 passed |
   | +1 :green_heart: | compile | 3m 57s | | branch-3.3 passed |
   | +1 :green_heart: | checkstyle | 1m 26s | | branch-3.3 passed |
   | +1 :green_heart: | mvnsite | 3m 54s | | branch-3.3 passed |
   | +1 :green_heart: | javadoc | 4m 3s | | branch-3.3 passed |
   | +1 :green_heart: | spotbugs | 7m 48s | | branch-3.3 passed |
   | +1 :green_heart: | shadedclient | 25m 2s | | branch has no errors when building and testing our client artifacts. |
   |||| _ Patch Compile Tests _ |
   | +0 :ok: | mvndep | 0m 29s | | Maven dependency ordering for patch |
   | +1 :green_heart: | mvninstall | 3m 1s | | the patch passed |
   | +1 :green_heart: | compile | 3m 43s | | the patch passed |
   | +1 :green_heart: | cc | 3m 43s | | the patch passed |
   | -1 :x: | javac | 3m 43s | [/results-compile-javac-hadoop-hdfs-project.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-4259/3/artifact/out/results-compile-javac-hadoop-hdfs-project.txt) | hadoop-hdfs-project generated 1 new + 746 unchanged - 0 fixed = 747 total (was 746) |
   | +1 :green_heart: | blanks | 0m 0s | | The patch has no blanks issues. |
   | +1 :green_heart: | checkstyle | 1m 3s | | hadoop-hdfs-project: The patch generated 0 new + 470 unchanged - 1 fixed = 470 total (was 471) |
   | +1 :green_heart: | mvnsite | 3m 5s | | the patch passed |
   | +1 :green_heart: | javadoc | 3m 10s | | the patch passed |
   | +1 :green_heart: | spotbugs | 7m 24s | | the patch passed |
   | +1 :green_heart: | shadedclient | 24m 39s | | patch has no errors when building and testing our client artifacts. |
   |||| _ Other Tests _ |
   | +1 :green_heart: | unit | 2m 27s | | hadoop-hdfs-client in the patch passed. |
   | -1 :x: | unit | 188m 45s | [/patch-unit-hadoop-hdfs-project_hadoop-hdfs.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-4259/3/artifact/out/patch-unit-hadoop-hdfs-project_hadoop-hdfs.txt) | hadoop-hdfs in the patch passed. |
   | -1 :x: | unit | 15m 38s | [/patch-unit-hadoop-hdfs-project_hadoop-hdfs-rbf.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-4259/3/artifact/out/patch-unit-hadoop-hdfs-project_hadoop-hdfs-rbf.txt) | hadoop-hdfs-rbf in the patch passed. |
   | +1 :green_heart: | asflicense | 1m 14s | | The patch does not generate ASF License warnings. |
   | | | 348m 56s | | |

   | Reason | Tests |
   |-------:|:------|
   | Failed junit tests | hadoop.hdfs.TestFileCreation |
   | | hadoop.hdfs.server.datanode.TestDataNodeRollingUpgrade |
   | | hadoop.cli.TestHDFSCLI |
   | | hadoop.hdfs.server.datanode.TestBPOfferService |
   | | hadoop.hdfs.server.blockmanagement.TestAvailableSpaceRackFaultTolerantBPP |
   | | hadoop.hdfs.server.federation.router.TestRouterRpcMultiDestination |
   | | hadoop.hdfs.server.federation.security.TestRouterSecurityManager |
   | | hadoop.hdfs.server.federation.router.TestRouterRpc |

   | Subsystem | Report/Notes |
   |----------:|:-------------|
   | Docker | ClientAPI=1.41 ServerAPI=1.41 base: https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-4259/3/artifact/out/Dockerfile |
   | GITHUB PR | https://github.com/apache/hadoop/pull/4259 |
   | Optional Tests | dupname asflicense compile javac javadoc mvninstall 
mvnsite unit shadedclient spotbugs checkstyle codespell cc buflint bufcompat 

[jira] [Work logged] (HDFS-16521) DFS API to retrieve slow datanodes

2022-05-03 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-16521?focusedWorklogId=765310&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-765310
 ]

ASF GitHub Bot logged work on HDFS-16521:
-

Author: ASF GitHub Bot
Created on: 03/May/22 07:35
Start Date: 03/May/22 07:35
Worklog Time Spent: 10m 
  Work Description: hadoop-yetus commented on PR #4259:
URL: https://github.com/apache/hadoop/pull/4259#issuecomment-1115816832

   :broken_heart: **-1 overall**

   | Vote | Subsystem | Runtime | Logfile | Comment |
   |:----:|----------:|--------:|:-------:|:-------:|
   | +0 :ok: | reexec | 10m 33s | | Docker mode activated. |
   |||| _ Prechecks _ |
   | +1 :green_heart: | dupname | 0m 0s | | No case conflicting files found. |
   | +0 :ok: | codespell | 0m 1s | | codespell was not available. |
   | +0 :ok: | buf | 0m 1s | | buf was not available. |
   | +0 :ok: | markdownlint | 0m 1s | | markdownlint was not available. |
   | +1 :green_heart: | @author | 0m 0s | | The patch does not contain any @author tags. |
   | +1 :green_heart: | test4tests | 0m 0s | | The patch appears to include 3 new or modified test files. |
   |||| _ branch-3.3 Compile Tests _ |
   | +0 :ok: | mvndep | 15m 46s | | Maven dependency ordering for branch |
   | +1 :green_heart: | mvninstall | 27m 8s | | branch-3.3 passed |
   | +1 :green_heart: | compile | 4m 2s | | branch-3.3 passed |
   | +1 :green_heart: | checkstyle | 1m 18s | | branch-3.3 passed |
   | +1 :green_heart: | mvnsite | 3m 28s | | branch-3.3 passed |
   | +1 :green_heart: | javadoc | 3m 34s | | branch-3.3 passed |
   | +1 :green_heart: | spotbugs | 7m 46s | | branch-3.3 passed |
   | +1 :green_heart: | shadedclient | 26m 29s | | branch has no errors when building and testing our client artifacts. |
   |||| _ Patch Compile Tests _ |
   | +0 :ok: | mvndep | 0m 29s | | Maven dependency ordering for patch |
   | +1 :green_heart: | mvninstall | 2m 51s | | the patch passed |
   | +1 :green_heart: | compile | 3m 46s | | the patch passed |
   | +1 :green_heart: | cc | 3m 46s | | the patch passed |
   | -1 :x: | javac | 3m 46s | [/results-compile-javac-hadoop-hdfs-project.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-4259/1/artifact/out/results-compile-javac-hadoop-hdfs-project.txt) | hadoop-hdfs-project generated 1 new + 746 unchanged - 0 fixed = 747 total (was 746) |
   | +1 :green_heart: | blanks | 0m 0s | | The patch has no blanks issues. |
   | +1 :green_heart: | checkstyle | 1m 2s | | hadoop-hdfs-project: The patch generated 0 new + 470 unchanged - 1 fixed = 470 total (was 471) |
   | +1 :green_heart: | mvnsite | 2m 54s | | the patch passed |
   | +1 :green_heart: | javadoc | 2m 54s | | the patch passed |
   | +1 :green_heart: | spotbugs | 7m 37s | | the patch passed |
   | +1 :green_heart: | shadedclient | 26m 2s | | patch has no errors when building and testing our client artifacts. |
   |||| _ Other Tests _ |
   | +1 :green_heart: | unit | 2m 16s | | hadoop-hdfs-client in the patch passed. |
   | -1 :x: | unit | 207m 47s | [/patch-unit-hadoop-hdfs-project_hadoop-hdfs.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-4259/1/artifact/out/patch-unit-hadoop-hdfs-project_hadoop-hdfs.txt) | hadoop-hdfs in the patch passed. |
   | -1 :x: | unit | 15m 57s | [/patch-unit-hadoop-hdfs-project_hadoop-hdfs-rbf.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-4259/1/artifact/out/patch-unit-hadoop-hdfs-project_hadoop-hdfs-rbf.txt) | hadoop-hdfs-rbf in the patch passed. |
   | +1 :green_heart: | asflicense | 0m 58s | | The patch does not generate ASF License warnings. |
   | | | 376m 41s | | |

   | Reason | Tests |
   |-------:|:------|
   | Failed junit tests | hadoop.hdfs.server.datanode.TestBPOfferService |
   | | hadoop.cli.TestHDFSCLI |
   | | hadoop.hdfs.TestRollingUpgrade |
   | | hadoop.hdfs.server.datanode.TestDataNodeRollingUpgrade |
   | | hadoop.hdfs.server.federation.router.TestRouterRpc |
   | | hadoop.hdfs.server.federation.security.TestRouterSecurityManager |
   | | hadoop.hdfs.server.federation.router.TestRouterRpcMultiDestination |

   | Subsystem | Report/Notes |
   |----------:|:-------------|
   | Docker | ClientAPI=1.41 ServerAPI=1.41 base: https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-4259/1/artifact/out/Dockerfile |
   | GITHUB PR | https://github.com/apache/hadoop/pull/4259 |
   | Optional Tests | dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient spotbugs checkstyle codespell cc buflint bufcompat markdownlint |
   | uname | Linux 1a4d0d83a476 4.15.0-175-generic #184-Ubuntu SMP Thu Mar 24 

[jira] [Work logged] (HDFS-16521) DFS API to retrieve slow datanodes

2022-05-02 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-16521?focusedWorklogId=765255=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-765255
 ]

ASF GitHub Bot logged work on HDFS-16521:
-

Author: ASF GitHub Bot
Created on: 03/May/22 01:18
Start Date: 03/May/22 01:18
Worklog Time Spent: 10m 
  Work Description: virajjasani commented on PR #4107:
URL: https://github.com/apache/hadoop/pull/4107#issuecomment-1115523698

   Thanks everyone for the reviews. Here is the branch-3.3 backport PR: #4259 




Issue Time Tracking
---

Worklog Id: (was: 765255)
Time Spent: 5h 50m  (was: 5h 40m)

> DFS API to retrieve slow datanodes
> --
>
> Key: HDFS-16521
> URL: https://issues.apache.org/jira/browse/HDFS-16521
> Project: Hadoop HDFS
>  Issue Type: New Feature
>Reporter: Viraj Jasani
>Assignee: Viraj Jasani
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 5h 50m
>  Remaining Estimate: 0h
>
> Providing a DFS API to retrieve slow nodes would help add an additional option 
> to "dfsadmin -report" that lists slow datanode info for operators to look at, 
> a filter that is especially useful for larger clusters.
> The other purpose of such an API is to let HDFS downstreamers without direct 
> access to the namenode http port (only the rpc port is accessible) retrieve slow nodes.
> Moreover, 
> [FanOutOneBlockAsyncDFSOutput|https://github.com/apache/hbase/blob/master/hbase-asyncfs/src/main/java/org/apache/hadoop/hbase/io/asyncfs/FanOutOneBlockAsyncDFSOutput.java]
>  in HBase currently has to rely on its own way of marking and excluding slow 
> nodes while 1) creating pipelines and 2) handling ack, based on factors like 
> the data length of the packet, processing time with last ack timestamp, 
> whether flush to replicas is finished, etc. If it can utilize the slownode API 
> from HDFS to exclude nodes appropriately while writing a block, a lot of its 
> own post-ack computation of slow nodes can be _saved_ or _improved_, or, based 
> on further experiments, we could find a _better solution_ to manage slow node 
> detection logic in both HDFS and HBase. However, in order to collect more 
> data points and run more POCs around this area, HDFS should provide an API for 
> downstreamers to efficiently utilize slownode info for such critical 
> low-latency use cases (like writing WALs).



--
This message was sent by Atlassian Jira
(v8.20.7#820007)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Work logged] (HDFS-16521) DFS API to retrieve slow datanodes

2022-05-02 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-16521?focusedWorklogId=765254=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-765254
 ]

ASF GitHub Bot logged work on HDFS-16521:
-

Author: ASF GitHub Bot
Created on: 03/May/22 01:18
Start Date: 03/May/22 01:18
Worklog Time Spent: 10m 
  Work Description: virajjasani opened a new pull request, #4259:
URL: https://github.com/apache/hadoop/pull/4259

   branch-3.3 backport PR of #4107 




Issue Time Tracking
---

Worklog Id: (was: 765254)
Time Spent: 5h 40m  (was: 5.5h)




[jira] [Work logged] (HDFS-16521) DFS API to retrieve slow datanodes

2022-05-02 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-16521?focusedWorklogId=765144=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-765144
 ]

ASF GitHub Bot logged work on HDFS-16521:
-

Author: ASF GitHub Bot
Created on: 02/May/22 21:05
Start Date: 02/May/22 21:05
Worklog Time Spent: 10m 
  Work Description: jojochuang merged PR #4107:
URL: https://github.com/apache/hadoop/pull/4107




Issue Time Tracking
---

Worklog Id: (was: 765144)
Time Spent: 5.5h  (was: 5h 20m)




[jira] [Work logged] (HDFS-16521) DFS API to retrieve slow datanodes

2022-05-02 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-16521?focusedWorklogId=765113=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-765113
 ]

ASF GitHub Bot logged work on HDFS-16521:
-

Author: ASF GitHub Bot
Created on: 02/May/22 19:28
Start Date: 02/May/22 19:28
Worklog Time Spent: 10m 
  Work Description: jojochuang commented on code in PR #4107:
URL: https://github.com/apache/hadoop/pull/4107#discussion_r863122472


##
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/tools/DFSAdmin.java:
##
@@ -632,6 +638,20 @@ private static void printDataNodeReports(DistributedFileSystem dfs,
 }
   }
 
+  private static void printSlowDataNodeReports(DistributedFileSystem dfs, boolean listNodes,

Review Comment:
   Fantastic. Thanks for offering the output screenshot.





Issue Time Tracking
---

Worklog Id: (was: 765113)
Time Spent: 5h 20m  (was: 5h 10m)




[jira] [Work logged] (HDFS-16521) DFS API to retrieve slow datanodes

2022-04-30 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-16521?focusedWorklogId=764682=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-764682
 ]

ASF GitHub Bot logged work on HDFS-16521:
-

Author: ASF GitHub Bot
Created on: 30/Apr/22 16:37
Start Date: 30/Apr/22 16:37
Worklog Time Spent: 10m 
  Work Description: hadoop-yetus commented on PR #4107:
URL: https://github.com/apache/hadoop/pull/4107#issuecomment-1114016617

   :broken_heart: **-1 overall**
   
   
   
   
   
   
   | Vote | Subsystem | Runtime |  Logfile | Comment |
   |:----:|----------:|--------:|:--------:|:-------:|
   | +0 :ok: |  reexec  |  17m 48s |  |  Docker mode activated.  |
   |||| _ Prechecks _ |
   | +1 :green_heart: |  dupname  |   0m  0s |  |  No case conflicting files found.  |
   | +0 :ok: |  codespell  |   0m  0s |  |  codespell was not available.  |
   | +0 :ok: |  buf  |   0m  0s |  |  buf was not available.  |
   | +0 :ok: |  markdownlint  |   0m  0s |  |  markdownlint was not available.  |
   | +1 :green_heart: |  @author  |   0m  0s |  |  The patch does not contain any @author tags.  |
   | +1 :green_heart: |  test4tests  |   0m  0s |  |  The patch appears to include 3 new or modified test files.  |
   |||| _ trunk Compile Tests _ |
   | +0 :ok: |  mvndep  |  16m  7s |  |  Maven dependency ordering for branch  |
   | +1 :green_heart: |  mvninstall  |  28m 42s |  |  trunk passed  |
   | +1 :green_heart: |  compile  |   7m 18s |  |  trunk passed with JDK Private Build-11.0.15+10-Ubuntu-0ubuntu0.20.04.1  |
   | +1 :green_heart: |  compile  |   6m 54s |  |  trunk passed with JDK Private Build-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07  |
   | +1 :green_heart: |  checkstyle  |   1m 35s |  |  trunk passed  |
   | +1 :green_heart: |  mvnsite  |   3m 44s |  |  trunk passed  |
   | +1 :green_heart: |  javadoc  |   3m  5s |  |  trunk passed with JDK Private Build-11.0.15+10-Ubuntu-0ubuntu0.20.04.1  |
   | +1 :green_heart: |  javadoc  |   3m 43s |  |  trunk passed with JDK Private Build-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07  |
   | +1 :green_heart: |  spotbugs  |   8m  1s |  |  trunk passed  |
   | +1 :green_heart: |  shadedclient  |  24m  1s |  |  branch has no errors when building and testing our client artifacts.  |
   |||| _ Patch Compile Tests _ |
   | +0 :ok: |  mvndep  |   0m 25s |  |  Maven dependency ordering for patch  |
   | +1 :green_heart: |  mvninstall  |   2m 56s |  |  the patch passed  |
   | +1 :green_heart: |  compile  |   6m 43s |  |  the patch passed with JDK Private Build-11.0.15+10-Ubuntu-0ubuntu0.20.04.1  |
   | +1 :green_heart: |  cc  |   6m 43s |  |  the patch passed  |
   | -1 :x: |  javac  |   6m 43s | [/results-compile-javac-hadoop-hdfs-project-jdkPrivateBuild-11.0.15+10-Ubuntu-0ubuntu0.20.04.1.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-4107/11/artifact/out/results-compile-javac-hadoop-hdfs-project-jdkPrivateBuild-11.0.15+10-Ubuntu-0ubuntu0.20.04.1.txt) |  hadoop-hdfs-project-jdkPrivateBuild-11.0.15+10-Ubuntu-0ubuntu0.20.04.1 with JDK Private Build-11.0.15+10-Ubuntu-0ubuntu0.20.04.1 generated 1 new + 651 unchanged - 0 fixed = 652 total (was 651)  |
   | +1 :green_heart: |  compile  |   6m 18s |  |  the patch passed with JDK Private Build-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07  |
   | +1 :green_heart: |  cc  |   6m 18s |  |  the patch passed  |
   | -1 :x: |  javac  |   6m 18s | [/results-compile-javac-hadoop-hdfs-project-jdkPrivateBuild-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-4107/11/artifact/out/results-compile-javac-hadoop-hdfs-project-jdkPrivateBuild-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07.txt) |  hadoop-hdfs-project-jdkPrivateBuild-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07 with JDK Private Build-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07 generated 1 new + 629 unchanged - 0 fixed = 630 total (was 629)  |
   | +1 :green_heart: |  blanks  |   0m  0s |  |  The patch has no blanks issues.  |
   | +1 :green_heart: |  checkstyle  |   1m 21s |  |  hadoop-hdfs-project: The patch generated 0 new + 455 unchanged - 1 fixed = 455 total (was 456)  |
   | +1 :green_heart: |  mvnsite  |   3m  6s |  |  the patch passed  |
   | +1 :green_heart: |  javadoc  |   2m 21s |  |  the patch passed with JDK Private Build-11.0.15+10-Ubuntu-0ubuntu0.20.04.1  |
   | +1 :green_heart: |  javadoc  |   3m  7s |  |  the patch passed with JDK Private Build-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07  |
   | +1 :green_heart: |  spotbugs  |   7m 48s |  |  the patch passed  |
   | +1 :green_heart: |  shadedclient  |  23m 31s |  |  patch has no errors when building and testing our client artifacts.  |
   |||| _ Other Tests _ |
   | +1 :green_heart: |  unit  |   2m 29s |  |  hadoop-hdfs-client in the patch passed.  |
   | +1 :green_heart: |  unit  | 369m  4s |  |  hadoop-hdfs in the patch passed.  |
   | -1 :x: |  unit  |  41m  0s | 

[jira] [Work logged] (HDFS-16521) DFS API to retrieve slow datanodes

2022-04-30 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-16521?focusedWorklogId=764624=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-764624
 ]

ASF GitHub Bot logged work on HDFS-16521:
-

Author: ASF GitHub Bot
Created on: 30/Apr/22 06:43
Start Date: 30/Apr/22 06:43
Worklog Time Spent: 10m 
  Work Description: virajjasani commented on PR #4107:
URL: https://github.com/apache/hadoop/pull/4107#issuecomment-1113930859

   Conflicts resolved with latest changes




Issue Time Tracking
---

Worklog Id: (was: 764624)
Time Spent: 5h  (was: 4h 50m)




[jira] [Work logged] (HDFS-16521) DFS API to retrieve slow datanodes

2022-04-29 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-16521?focusedWorklogId=764591=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-764591
 ]

ASF GitHub Bot logged work on HDFS-16521:
-

Author: ASF GitHub Bot
Created on: 30/Apr/22 02:19
Start Date: 30/Apr/22 02:19
Worklog Time Spent: 10m 
  Work Description: hadoop-yetus commented on PR #4107:
URL: https://github.com/apache/hadoop/pull/4107#issuecomment-1113894018

   :broken_heart: **-1 overall**
   
   
   
   
   
   
   | Vote | Subsystem | Runtime |  Logfile | Comment |
   |:----:|----------:|--------:|:--------:|:-------:|
   | +0 :ok: |  reexec  |   1m  2s |  |  Docker mode activated.  |
   |||| _ Prechecks _ |
   | +1 :green_heart: |  dupname  |   0m  0s |  |  No case conflicting files found.  |
   | +0 :ok: |  codespell  |   0m  2s |  |  codespell was not available.  |
   | +0 :ok: |  buf  |   0m  2s |  |  buf was not available.  |
   | +0 :ok: |  markdownlint  |   0m  2s |  |  markdownlint was not available.  |
   | +1 :green_heart: |  @author  |   0m  0s |  |  The patch does not contain any @author tags.  |
   | +1 :green_heart: |  test4tests  |   0m  0s |  |  The patch appears to include 3 new or modified test files.  |
   |||| _ trunk Compile Tests _ |
   | +0 :ok: |  mvndep  |  14m 36s |  |  Maven dependency ordering for branch  |
   | +1 :green_heart: |  mvninstall  |  27m 40s |  |  trunk passed  |
   | +1 :green_heart: |  compile  |   6m 58s |  |  trunk passed with JDK Ubuntu-11.0.14.1+1-Ubuntu-0ubuntu1.20.04  |
   | +1 :green_heart: |  compile  |   6m 23s |  |  trunk passed with JDK Private Build-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07  |
   | +1 :green_heart: |  checkstyle  |   1m 35s |  |  trunk passed  |
   | +1 :green_heart: |  mvnsite  |   3m 42s |  |  trunk passed  |
   | +1 :green_heart: |  javadoc  |   3m  2s |  |  trunk passed with JDK Ubuntu-11.0.14.1+1-Ubuntu-0ubuntu1.20.04  |
   | +1 :green_heart: |  javadoc  |   3m 44s |  |  trunk passed with JDK Private Build-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07  |
   | +1 :green_heart: |  spotbugs  |   7m 59s |  |  trunk passed  |
   | +1 :green_heart: |  shadedclient  |  24m  1s |  |  branch has no errors when building and testing our client artifacts.  |
   | -0 :warning: |  patch  |  24m 26s |  |  Used diff version of patch file. Binary files and potentially other changes not applied. Please rebase and squash commits if necessary.  |
   |||| _ Patch Compile Tests _ |
   | +0 :ok: |  mvndep  |   0m 25s |  |  Maven dependency ordering for patch  |
   | +1 :green_heart: |  mvninstall  |   2m 57s |  |  the patch passed  |
   | +1 :green_heart: |  compile  |   6m 33s |  |  the patch passed with JDK Ubuntu-11.0.14.1+1-Ubuntu-0ubuntu1.20.04  |
   | +1 :green_heart: |  cc  |   6m 33s |  |  the patch passed  |
   | -1 :x: |  javac  |   6m 33s | [/results-compile-javac-hadoop-hdfs-project-jdkUbuntu-11.0.14.1+1-Ubuntu-0ubuntu1.20.04.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-4107/10/artifact/out/results-compile-javac-hadoop-hdfs-project-jdkUbuntu-11.0.14.1+1-Ubuntu-0ubuntu1.20.04.txt) |  hadoop-hdfs-project-jdkUbuntu-11.0.14.1+1-Ubuntu-0ubuntu1.20.04 with JDK Ubuntu-11.0.14.1+1-Ubuntu-0ubuntu1.20.04 generated 1 new + 651 unchanged - 0 fixed = 652 total (was 651)  |
   | +1 :green_heart: |  compile  |   6m 14s |  |  the patch passed with JDK Private Build-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07  |
   | +1 :green_heart: |  cc  |   6m 14s |  |  the patch passed  |
   | -1 :x: |  javac  |   6m 14s | [/results-compile-javac-hadoop-hdfs-project-jdkPrivateBuild-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-4107/10/artifact/out/results-compile-javac-hadoop-hdfs-project-jdkPrivateBuild-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07.txt) |  hadoop-hdfs-project-jdkPrivateBuild-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07 with JDK Private Build-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07 generated 1 new + 629 unchanged - 0 fixed = 630 total (was 629)  |
   | +1 :green_heart: |  blanks  |   0m  0s |  |  The patch has no blanks issues.  |
   | +1 :green_heart: |  checkstyle  |   1m 17s |  |  hadoop-hdfs-project: The patch generated 0 new + 455 unchanged - 1 fixed = 455 total (was 456)  |
   | +1 :green_heart: |  mvnsite  |   3m  6s |  |  the patch passed  |
   | +1 :green_heart: |  javadoc  |   2m 21s |  |  the patch passed with JDK Ubuntu-11.0.14.1+1-Ubuntu-0ubuntu1.20.04  |
   | +1 :green_heart: |  javadoc  |   3m  8s |  |  the patch passed with JDK Private Build-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07  |
   | +1 :green_heart: |  spotbugs  |   7m 56s |  |  the patch passed  |
   | +1 :green_heart: |  shadedclient  |  23m 18s |  |  patch has no errors when building and testing our client artifacts.  |
   |||| _ Other Tests _ |
   | +1 :green_heart: |  unit  |   2m 27s |  |  hadoop-hdfs-client in the patch passed.  |

[jira] [Work logged] (HDFS-16521) DFS API to retrieve slow datanodes

2022-04-28 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-16521?focusedWorklogId=763505=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-763505
 ]

ASF GitHub Bot logged work on HDFS-16521:
-

Author: ASF GitHub Bot
Created on: 28/Apr/22 13:24
Start Date: 28/Apr/22 13:24
Worklog Time Spent: 10m 
  Work Description: hadoop-yetus commented on PR #4107:
URL: https://github.com/apache/hadoop/pull/4107#issuecomment-1112200478

   :broken_heart: **-1 overall**
   
   
   
   
   
   
   | Vote | Subsystem | Runtime |  Logfile | Comment |
   |:----:|----------:|--------:|:--------:|:-------:|
   | +0 :ok: |  reexec  |   0m 57s |  |  Docker mode activated.  |
   |||| _ Prechecks _ |
   | +1 :green_heart: |  dupname  |   0m  0s |  |  No case conflicting files found.  |
   | +0 :ok: |  codespell  |   0m  0s |  |  codespell was not available.  |
   | +0 :ok: |  buf  |   0m  1s |  |  buf was not available.  |
   | +0 :ok: |  markdownlint  |   0m  1s |  |  markdownlint was not available.  |
   | +1 :green_heart: |  @author  |   0m  0s |  |  The patch does not contain any @author tags.  |
   | +1 :green_heart: |  test4tests  |   0m  0s |  |  The patch appears to include 3 new or modified test files.  |
   |||| _ trunk Compile Tests _ |
   | +0 :ok: |  mvndep  |  15m  9s |  |  Maven dependency ordering for branch  |
   | +1 :green_heart: |  mvninstall  |  25m  6s |  |  trunk passed  |
   | +1 :green_heart: |  compile  |   6m 23s |  |  trunk passed with JDK Ubuntu-11.0.14.1+1-Ubuntu-0ubuntu1.20.04  |
   | +1 :green_heart: |  compile  |   6m  2s |  |  trunk passed with JDK Private Build-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07  |
   | +1 :green_heart: |  checkstyle  |   1m 41s |  |  trunk passed  |
   | +1 :green_heart: |  mvnsite  |   4m  4s |  |  trunk passed  |
   | +1 :green_heart: |  javadoc  |   3m 24s |  |  trunk passed with JDK Ubuntu-11.0.14.1+1-Ubuntu-0ubuntu1.20.04  |
   | +1 :green_heart: |  javadoc  |   4m  4s |  |  trunk passed with JDK Private Build-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07  |
   | +1 :green_heart: |  spotbugs  |   8m  8s |  |  trunk passed  |
   | +1 :green_heart: |  shadedclient  |  21m 24s |  |  branch has no errors when building and testing our client artifacts.  |
   |||| _ Patch Compile Tests _ |
   | +0 :ok: |  mvndep  |   0m 33s |  |  Maven dependency ordering for patch  |
   | +1 :green_heart: |  mvninstall  |   3m  2s |  |  the patch passed  |
   | +1 :green_heart: |  compile  |   6m  3s |  |  the patch passed with JDK Ubuntu-11.0.14.1+1-Ubuntu-0ubuntu1.20.04  |
   | +1 :green_heart: |  cc  |   6m  3s |  |  the patch passed  |
   | -1 :x: |  javac  |   6m  3s | [/results-compile-javac-hadoop-hdfs-project-jdkUbuntu-11.0.14.1+1-Ubuntu-0ubuntu1.20.04.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-4107/9/artifact/out/results-compile-javac-hadoop-hdfs-project-jdkUbuntu-11.0.14.1+1-Ubuntu-0ubuntu1.20.04.txt) |  hadoop-hdfs-project-jdkUbuntu-11.0.14.1+1-Ubuntu-0ubuntu1.20.04 with JDK Ubuntu-11.0.14.1+1-Ubuntu-0ubuntu1.20.04 generated 1 new + 651 unchanged - 0 fixed = 652 total (was 651)  |
   | +1 :green_heart: |  compile  |   5m 51s |  |  the patch passed with JDK Private Build-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07  |
   | +1 :green_heart: |  cc  |   5m 51s |  |  the patch passed  |
   | -1 :x: |  javac  |   5m 51s | [/results-compile-javac-hadoop-hdfs-project-jdkPrivateBuild-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-4107/9/artifact/out/results-compile-javac-hadoop-hdfs-project-jdkPrivateBuild-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07.txt) |  hadoop-hdfs-project-jdkPrivateBuild-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07 with JDK Private Build-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07 generated 1 new + 629 unchanged - 0 fixed = 630 total (was 629)  |
   | +1 :green_heart: |  blanks  |   0m  0s |  |  The patch has no blanks issues.  |
   | +1 :green_heart: |  checkstyle  |   1m 20s |  |  hadoop-hdfs-project: The patch generated 0 new + 455 unchanged - 1 fixed = 455 total (was 456)  |
   | +1 :green_heart: |  mvnsite  |   3m 16s |  |  the patch passed  |
   | +1 :green_heart: |  javadoc  |   2m 28s |  |  the patch passed with JDK Ubuntu-11.0.14.1+1-Ubuntu-0ubuntu1.20.04  |
   | +1 :green_heart: |  javadoc  |   3m 16s |  |  the patch passed with JDK Private Build-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07  |
   | +1 :green_heart: |  spotbugs  |   7m 37s |  |  the patch passed  |
   | +1 :green_heart: |  shadedclient  |  20m 59s |  |  patch has no errors when building and testing our client artifacts.  |
   |||| _ Other Tests _ |
   | +1 :green_heart: |  unit  |   2m 39s |  |  hadoop-hdfs-client in the patch passed.  |
   | -1 :x: |  unit  | 442m 24s | 

[jira] [Work logged] (HDFS-16521) DFS API to retrieve slow datanodes

2022-04-27 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-16521?focusedWorklogId=763266=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-763266
 ]

ASF GitHub Bot logged work on HDFS-16521:
-

Author: ASF GitHub Bot
Created on: 28/Apr/22 02:51
Start Date: 28/Apr/22 02:51
Worklog Time Spent: 10m 
  Work Description: virajjasani commented on code in PR #4107:
URL: https://github.com/apache/hadoop/pull/4107#discussion_r860405718


##
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/datanode/metrics/DataNodePeerMetrics.java:
##
@@ -142,14 +144,28 @@ public void collectThreadLocalStates() {
* than their peers.
*/
   public Map<String, Double> getOutliers() {
-// This maps the metric name to the aggregate latency.
-// The metric name is the datanode ID.
-final Map<String, Double> stats =
-sendPacketDownstreamRollingAverages.getStats(
-minOutlierDetectionSamples);
-LOG.trace("DataNodePeerMetrics: Got stats: {}", stats);
-
-return slowNodeDetector.getOutliers(stats);
+// outlier must be null for source code.
+if (testOutlier == null) {
+  // This maps the metric name to the aggregate latency.
+  // The metric name is the datanode ID.
+  final Map<String, Double> stats =
+      sendPacketDownstreamRollingAverages.getStats(minOutlierDetectionSamples);
+  LOG.trace("DataNodePeerMetrics: Got stats: {}", stats);
+  return slowNodeDetector.getOutliers(stats);
+} else {
+  // this happens only for test code.
+  return testOutlier;
+}
+  }
+
+  /**
+   * Strictly to be used by test code only. Source code is not supposed to use this. This method
+   * directly sets outlier mapping so that aggregate latency metrics are not calculated for tests.
+   *
+   * @param outlier outlier directly set by tests.
+   */
+  public void setTestOutliers(Map<String, Double> outlier) {

Review Comment:
   Yeah, it's very difficult to reproduce an actual slow node in a UT, hence I had 
to do it this way. Sure, added a comment on the `testOutlier` member as well (in 
addition to this setter method's Javadoc).
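
The test-only override discussed above can be sketched in isolation. This is a minimal illustration, not the Hadoop code: `PeerMetricsSketch`, `record`, and the fixed latency threshold are hypothetical stand-ins (the real class delegates to its own outlier detector); only the `testOutlier` / `setTestOutliers` naming follows the diff.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for a peer-metrics class; names other than
// testOutlier/setTestOutliers are assumptions made for this sketch.
final class PeerMetricsSketch {
  // Must stay null in production; tests set it to bypass latency aggregation.
  private volatile Map<String, Double> testOutlier;
  private final Map<String, Double> observedLatencyMs = new HashMap<>();
  private final double thresholdMs;

  PeerMetricsSketch(double thresholdMs) {
    this.thresholdMs = thresholdMs;
  }

  // Records the latest observed latency for a peer (assumed helper).
  void record(String peer, double latencyMs) {
    observedLatencyMs.put(peer, latencyMs);
  }

  Map<String, Double> getOutliers() {
    if (testOutlier != null) {
      // Test path: return the injected mapping unchanged.
      return testOutlier;
    }
    // Production path: a fixed threshold stands in for real outlier detection.
    Map<String, Double> outliers = new HashMap<>();
    for (Map.Entry<String, Double> e : observedLatencyMs.entrySet()) {
      if (e.getValue() > thresholdMs) {
        outliers.put(e.getKey(), e.getValue());
      }
    }
    return outliers;
  }

  // Strictly for tests: directly sets the outlier mapping.
  void setTestOutliers(Map<String, Double> outlier) {
    this.testOutlier = outlier;
  }
}
```

A test would call `setTestOutliers(...)` once and then exercise the reporting path, without having to make a real datanode slow.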



##
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/tools/DFSAdmin.java:
##
@@ -632,6 +638,20 @@ private static void printDataNodeReports(DistributedFileSystem dfs,
 }
   }
 
+  private static void printSlowDataNodeReports(DistributedFileSystem dfs, boolean listNodes,

Review Comment:
   > One comment on the slow datanode report is that it seems to say nothing 
about why the NN thinks it slow;
   
   It's a datanode that determines whether its peer datanodes are slower; the NN 
just aggregates all DN reports.
   
   > For example, say something about how in excess a DN's latency is? (Perhaps 
this could be added later)
   
   Sure, this can be added as additional info. Will create a follow-up Jira. 
Thanks @saintstack 
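
To illustrate the division of labour described in this reply, here is a minimal sketch in which each reporting datanode names the peers it considers slow and an aggregator merely collects those reports. `SlowPeerAggregatorSketch` and its methods are hypothetical names for illustration, not the namenode's actual classes.

```java
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;
import java.util.TreeSet;

// Hypothetical aggregator: it makes no slowness judgement of its own and
// only merges what the reporting datanodes send.
final class SlowPeerAggregatorSketch {
  // slow node -> datanodes that reported it as slow
  private final Map<String, Set<String>> reportersBySlowNode = new TreeMap<>();

  // Merges one datanode's report of the peers it found slow.
  void addReport(String reporter, List<String> slowPeers) {
    for (String peer : slowPeers) {
      reportersBySlowNode.computeIfAbsent(peer, k -> new TreeSet<>()).add(reporter);
    }
  }

  // All nodes reported slow by at least one peer.
  Set<String> slowNodes() {
    return new TreeSet<>(reportersBySlowNode.keySet());
  }

  // Reporters behind a given slow node; empty if none reported it.
  Set<String> reportersOf(String node) {
    return new TreeSet<>(reportersBySlowNode.getOrDefault(node, new TreeSet<>()));
  }
}
```

Keeping the reporters per slow node is one way the "why" could later be surfaced to operators, as the review comment suggests.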



##
hadoop-hdfs-project/hadoop-hdfs/src/site/markdown/HDFSCommands.md:
##
@@ -394,7 +394,7 @@ Usage:
 
 | COMMAND\_OPTION | Description |
 |:--

Issue Time Tracking
---

Worklog Id: (was: 763266)
Time Spent: 4.5h  (was: 4h 20m)


[jira] [Work logged] (HDFS-16521) DFS API to retrieve slow datanodes

2022-04-27 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-16521?focusedWorklogId=763037=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-763037
 ]

ASF GitHub Bot logged work on HDFS-16521:
-

Author: ASF GitHub Bot
Created on: 27/Apr/22 16:40
Start Date: 27/Apr/22 16:40
Worklog Time Spent: 10m 
  Work Description: saintstack commented on code in PR #4107:
URL: https://github.com/apache/hadoop/pull/4107#discussion_r860008767


##
hadoop-hdfs-project/hadoop-hdfs/src/site/markdown/HDFSCommands.md:
##
@@ -394,7 +394,7 @@ Usage:
 
 | COMMAND\_OPTION | Description |
 |:--

Issue Time Tracking
---

Worklog Id: (was: 763037)
Time Spent: 4h 20m  (was: 4h 10m)

> DFS API to retrieve slow datanodes
> --
>
> Key: HDFS-16521
> URL: https://issues.apache.org/jira/browse/HDFS-16521
> Project: Hadoop HDFS
>  Issue Type: New Feature
>Reporter: Viraj Jasani
>Assignee: Viraj Jasani
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 4h 20m
>  Remaining Estimate: 0h
>
> Providing a DFS API to retrieve slow nodes would help add an additional option 
> to "dfsadmin -report" that lists slow datanode info for operators to review, 
> a filter that is especially useful for larger clusters.
> The other purpose of such an API is to let HDFS downstreamers without direct access 
> to the namenode HTTP port (only the RPC port is accessible) retrieve slow nodes.
> Moreover, 
> [FanOutOneBlockAsyncDFSOutput|https://github.com/apache/hbase/blob/master/hbase-asyncfs/src/main/java/org/apache/hadoop/hbase/io/asyncfs/FanOutOneBlockAsyncDFSOutput.java]
>  in HBase currently has to rely on its own way of marking and excluding slow 
> nodes while 1) creating pipelines and 2) handling acks, based on factors like 
> the data length of the packet, processing time with the last ack timestamp, 
> whether the flush to replicas is finished, etc. If it can use a slow-node API 
> from HDFS to exclude nodes appropriately while writing a block, a lot of its 
> own post-ack computation of slow nodes can be _saved_ or _improved_, or, based 
> on further experiments, we could find a _better solution_ to manage slow node 
> detection logic in both HDFS and HBase. However, in order to collect more 
> data points and run more POCs in this area, HDFS should provide an API for 
> downstreamers to efficiently use slow-node info for such critical 
> low-latency use cases (like writing WALs).



--
This message was sent by Atlassian Jira
(v8.20.7#820007)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Work logged] (HDFS-16521) DFS API to retrieve slow datanodes

2022-04-27 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-16521?focusedWorklogId=762776=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-762776
 ]

ASF GitHub Bot logged work on HDFS-16521:
-

Author: ASF GitHub Bot
Created on: 27/Apr/22 09:39
Start Date: 27/Apr/22 09:39
Worklog Time Spent: 10m 
  Work Description: hadoop-yetus commented on PR #4107:
URL: https://github.com/apache/hadoop/pull/4107#issuecomment-1110789635

   :broken_heart: **-1 overall**
   
   
   
   
   
   
   | Vote | Subsystem | Runtime |  Logfile | Comment |
   |:----:|----------:|--------:|:--------:|:-------:|
   | +0 :ok: |  reexec  |   1m 34s |  |  Docker mode activated.  |
   |||| _ Prechecks _ |
   | +1 :green_heart: |  dupname  |   0m  1s |  |  No case conflicting files found.  |
   | +0 :ok: |  codespell  |   0m  0s |  |  codespell was not available.  |
   | +0 :ok: |  buf  |   0m  0s |  |  buf was not available.  |
   | +0 :ok: |  markdownlint  |   0m  0s |  |  markdownlint was not available.  |
   | +1 :green_heart: |  @author  |   0m  0s |  |  The patch does not contain any @author tags.  |
   | +1 :green_heart: |  test4tests  |   0m  0s |  |  The patch appears to include 3 new or modified test files.  |
   |||| _ trunk Compile Tests _ |
   | +0 :ok: |  mvndep  |  16m  2s |  |  Maven dependency ordering for branch  |
   | +1 :green_heart: |  mvninstall  |  26m  8s |  |  trunk passed  |
   | +1 :green_heart: |  compile  |   6m 52s |  |  trunk passed with JDK Ubuntu-11.0.14.1+1-Ubuntu-0ubuntu1.20.04  |
   | +1 :green_heart: |  compile  |   6m 33s |  |  trunk passed with JDK Private Build-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07  |
   | +1 :green_heart: |  checkstyle  |   1m 41s |  |  trunk passed  |
   | +1 :green_heart: |  mvnsite  |   4m  1s |  |  trunk passed  |
   | +1 :green_heart: |  javadoc  |   3m 26s |  |  trunk passed with JDK Ubuntu-11.0.14.1+1-Ubuntu-0ubuntu1.20.04  |
   | +1 :green_heart: |  javadoc  |   4m  8s |  |  trunk passed with JDK Private Build-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07  |
   | +1 :green_heart: |  spotbugs  |   8m  3s |  |  trunk passed  |
   | +1 :green_heart: |  shadedclient  |  21m 31s |  |  branch has no errors when building and testing our client artifacts.  |
   | -0 :warning: |  patch  |  21m 58s |  |  Used diff version of patch file. Binary files and potentially other changes not applied. Please rebase and squash commits if necessary.  |
   |||| _ Patch Compile Tests _ |
   | +0 :ok: |  mvndep  |   0m 28s |  |  Maven dependency ordering for patch  |
   | +1 :green_heart: |  mvninstall  |   3m 30s |  |  the patch passed  |
   | +1 :green_heart: |  compile  |   7m 33s |  |  the patch passed with JDK Ubuntu-11.0.14.1+1-Ubuntu-0ubuntu1.20.04  |
   | +1 :green_heart: |  cc  |   7m 33s |  |  the patch passed  |
   | -1 :x: |  javac  |   7m 33s | [/results-compile-javac-hadoop-hdfs-project-jdkUbuntu-11.0.14.1+1-Ubuntu-0ubuntu1.20.04.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-4107/8/artifact/out/results-compile-javac-hadoop-hdfs-project-jdkUbuntu-11.0.14.1+1-Ubuntu-0ubuntu1.20.04.txt) |  hadoop-hdfs-project-jdkUbuntu-11.0.14.1+1-Ubuntu-0ubuntu1.20.04 with JDK Ubuntu-11.0.14.1+1-Ubuntu-0ubuntu1.20.04 generated 1 new + 651 unchanged - 0 fixed = 652 total (was 651)  |
   | +1 :green_heart: |  compile  |   7m 18s |  |  the patch passed with JDK Private Build-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07  |
   | +1 :green_heart: |  cc  |   7m 18s |  |  the patch passed  |
   | -1 :x: |  javac  |   7m 18s | [/results-compile-javac-hadoop-hdfs-project-jdkPrivateBuild-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-4107/8/artifact/out/results-compile-javac-hadoop-hdfs-project-jdkPrivateBuild-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07.txt) |  hadoop-hdfs-project-jdkPrivateBuild-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07 with JDK Private Build-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07 generated 1 new + 629 unchanged - 0 fixed = 630 total (was 629)  |
   | +1 :green_heart: |  blanks  |   0m  1s |  |  The patch has no blanks issues.  |
   | +1 :green_heart: |  checkstyle  |   1m 32s |  |  hadoop-hdfs-project: The patch generated 0 new + 455 unchanged - 1 fixed = 455 total (was 456)  |
   | +1 :green_heart: |  mvnsite  |   3m 41s |  |  the patch passed  |
   | +1 :green_heart: |  javadoc  |   2m 51s |  |  the patch passed with JDK Ubuntu-11.0.14.1+1-Ubuntu-0ubuntu1.20.04  |
   | +1 :green_heart: |  javadoc  |   3m 39s |  |  the patch passed with JDK Private Build-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07  |
   | +1 :green_heart: |  spotbugs  |  10m 11s |  |  the patch passed  |
   | +1 :green_heart: |  shadedclient  |  26m 42s |  |  patch has no errors when building and testing our client artifacts.  |
   |||| _ Other Tests _ |
   | -1 :x: |  unit  |   0m 43s | 

[jira] [Work logged] (HDFS-16521) DFS API to retrieve slow datanodes

2022-04-27 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-16521?focusedWorklogId=762741=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-762741
 ]

ASF GitHub Bot logged work on HDFS-16521:
-

Author: ASF GitHub Bot
Created on: 27/Apr/22 06:45
Start Date: 27/Apr/22 06:45
Worklog Time Spent: 10m 
  Work Description: virajjasani commented on code in PR #4107:
URL: https://github.com/apache/hadoop/pull/4107#discussion_r859429956


##
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/tools/DFSAdmin.java:
##
@@ -632,6 +638,20 @@ private static void printDataNodeReports(DistributedFileSystem dfs,
 }
   }
 
+  private static void printSlowDataNodeReports(DistributedFileSystem dfs, boolean listNodes,

Review Comment:
   > I suspect you would need some kind of header to distinguish from the other 
data node reports.
   
   This is called only if condition `listAll || listSlowNodes` is true:
   ```
   if (listAll || listSlowNodes) {
     printSlowDataNodeReports(dfs, listSlowNodes, "Slow");
   }
   ```
   
   Sample output:
   
   Header:
   ```
   -
   Slow datanodes (n):
   
   ```
   
   https://user-images.githubusercontent.com/34790606/165455352-303eb506-0a5f-491d-ac44-bcc243a8f0f6.png
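
The guard and header shown in this comment can be combined into a small, self-contained sketch. It assumes a hypothetical `SlowNodeReportFormatter.format` helper that returns the section text instead of printing it, and the separator width is a guess because the archived sample collapses the dashed line; the real method takes a `DistributedFileSystem` and is not reproduced here.

```java
import java.util.Arrays;
import java.util.List;

/** Hypothetical sketch of a slow-node section with its own header. */
public class SlowNodeReportFormatter {
  // Separator width is an assumption; the archived sample output is collapsed.
  private static final String SEPARATOR =
      "-------------------------------------------------";

  /** Returns the section text, or "" when the section was not requested. */
  public static String format(boolean listAll, boolean listSlowNodes,
      List<String> slowNodes) {
    if (!(listAll || listSlowNodes)) {
      return "";
    }
    StringBuilder out = new StringBuilder();
    out.append(SEPARATOR).append('\n');
    // Header distinguishes this section from the other datanode reports.
    out.append("Slow datanodes (").append(slowNodes.size()).append("):\n\n");
    for (String node : slowNodes) {
      out.append(node).append('\n');
    }
    return out.toString();
  }

  public static void main(String[] args) {
    System.out.print(format(false, true, Arrays.asList("dn3.example.com:9866")));
  }
}
```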
   





Issue Time Tracking
---

Worklog Id: (was: 762741)
Time Spent: 4h  (was: 3h 50m)




[jira] [Work logged] (HDFS-16521) DFS API to retrieve slow datanodes

2022-04-27 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-16521?focusedWorklogId=762739=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-762739
 ]

ASF GitHub Bot logged work on HDFS-16521:
-

Author: ASF GitHub Bot
Created on: 27/Apr/22 06:44
Start Date: 27/Apr/22 06:44
Worklog Time Spent: 10m 
  Work Description: hadoop-yetus commented on PR #4107:
URL: https://github.com/apache/hadoop/pull/4107#issuecomment-1110606241

   :broken_heart: **-1 overall**
   
   
   
   
   
   
   | Vote | Subsystem | Runtime |  Logfile | Comment |
   |:----:|----------:|--------:|:--------:|:-------:|
   | +0 :ok: |  reexec  |   0m  0s |  |  Docker mode activated.  |
   | -1 :x: |  patch  |   0m 27s |  |  https://github.com/apache/hadoop/pull/4107 does not apply to trunk. Rebase required? Wrong Branch? See https://cwiki.apache.org/confluence/display/HADOOP/How+To+Contribute for help.  |
   
   
   | Subsystem | Report/Notes |
   |----------:|:-------------|
   | GITHUB PR | https://github.com/apache/hadoop/pull/4107 |
   | Console output | https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-4107/7/console |
   | versions | git=2.17.1 |
   | Powered by | Apache Yetus 0.14.0-SNAPSHOT https://yetus.apache.org |
   
   
   This message was automatically generated.
   
   




Issue Time Tracking
---

Worklog Id: (was: 762739)
Time Spent: 3h 50m  (was: 3h 40m)




[jira] [Work logged] (HDFS-16521) DFS API to retrieve slow datanodes

2022-04-27 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-16521?focusedWorklogId=762734=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-762734
 ]

ASF GitHub Bot logged work on HDFS-16521:
-

Author: ASF GitHub Bot
Created on: 27/Apr/22 06:42
Start Date: 27/Apr/22 06:42
Worklog Time Spent: 10m 
  Work Description: virajjasani commented on code in PR #4107:
URL: https://github.com/apache/hadoop/pull/4107#discussion_r859427464


##
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/tools/DFSAdmin.java:
##
@@ -433,7 +433,7 @@ static int run(DistributedFileSystem dfs, String[] argv, int idx) throws IOExcep
*/
   private static final String commonUsageSummary =
 "\t[-report [-live] [-dead] [-decommissioning] " +
-"[-enteringmaintenance] [-inmaintenance]]\n" +
+  "[-enteringmaintenance] [-inmaintenance] [-slownodes]]\n" +

Review Comment:
   Regarding the command options, I believe filters can ideally be used for both 1) the state of DNs (decommissioning, dead, live, etc.) and 2) the nature of DNs (slow outliers). Updated the doc, please review.
   Thanks
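
To make the "state versus nature" point concrete, here is a rough sketch of treating every `-report` option as one filter set. `ReportFilters` and its `parse` method are hypothetical names, and the rule that no options selects every section is an assumption about the CLI rather than something this thread states.

```java
import java.util.EnumSet;
import java.util.Locale;
import java.util.Set;

/** Hypothetical sketch: state filters and the slow-node filter in one set. */
public class ReportFilters {
  /** Names mirror the options in the usage summary, without the leading dash. */
  public enum Filter {
    LIVE, DEAD, DECOMMISSIONING, ENTERINGMAINTENANCE, INMAINTENANCE, SLOWNODES
  }

  /** Parses the arguments that follow -report. */
  public static Set<Filter> parse(String[] argv) {
    if (argv.length == 0) {
      // Assumption: no filter given means every section is listed.
      return EnumSet.allOf(Filter.class);
    }
    Set<Filter> selected = EnumSet.noneOf(Filter.class);
    for (String arg : argv) {
      if (!arg.startsWith("-")) {
        throw new IllegalArgumentException("Unknown option: " + arg);
      }
      // valueOf throws IllegalArgumentException for unrecognized names.
      selected.add(Filter.valueOf(arg.substring(1).toUpperCase(Locale.ROOT)));
    }
    return selected;
  }

  public static void main(String[] args) {
    System.out.println(parse(new String[] {"-dead", "-slownodes"}));
  }
}
```

With this shape, `SLOWNODES` is just another member of the set, so adding it does not change how the existing state filters are handled.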



##
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/tools/DFSAdmin.java:
##
@@ -632,6 +638,20 @@ private static void printDataNodeReports(DistributedFileSystem dfs,
 }
   }
 
+  private static void printSlowDataNodeReports(DistributedFileSystem dfs, boolean listNodes,

Review Comment:
   > I suspect you would need some kind of header to distinguish from the other 
data node reports.
   
   This is called only if condition `listAll || listSlowNodes` is true:
   ```
   if (listAll || listSlowNodes) {
     printSlowDataNodeReports(dfs, listSlowNodes, "Slow");
   }
   ```
   
   Sample output:
   
   https://user-images.githubusercontent.com/34790606/165455352-303eb506-0a5f-491d-ac44-bcc243a8f0f6.png
   



##
hadoop-hdfs-project/hadoop-hdfs-client/src/main/java/org/apache/hadoop/hdfs/protocol/ClientProtocol.java:
##
@@ -1868,4 +1868,16 @@ BatchedEntries listOpenFiles(long prevId,
*/
   @AtMostOnce
   void satisfyStoragePolicy(String path) throws IOException;
+
+  /**
+   * Get report on all of the slow Datanodes. Slow running datanodes are identified based on
+   * the Outlier detection algorithm, if slow peer tracking is enabled for the DFS cluster.
+   *
+   * @return Datanode report for slow running datanodes.
+   * @throws IOException If an I/O error occurs.
+   */
+  @Idempotent
+  @ReadOnly
+  DatanodeInfo[] getSlowDatanodeReport() throws IOException;

Review Comment:
   I thought a List would also be fine, but kept it as an array so that the API contract stays in line with `getDatanodeReport()` and both APIs can use the same underlying utility methods (e.g. `getDatanodeInfoFromDescriptors()`).
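
The design choice can be illustrated with a small sketch in which both report methods return arrays and share one conversion helper. Everything here is hypothetical: `ReportContract` and its methods are illustrative names, and plain strings stand in for the `DatanodeInfo` and descriptor types.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Hypothetical sketch: two report methods sharing one array-returning helper. */
public class ReportContract {
  /** Shared helper; a stand-in for a utility like getDatanodeInfoFromDescriptors(). */
  static String[] toInfoArray(List<String> descriptors) {
    return descriptors.toArray(new String[0]);
  }

  /** Existing-style report: every known node. */
  public static String[] datanodeReport(List<String> all) {
    return toInfoArray(all);
  }

  /** New report: only the nodes flagged slow, same return shape as above. */
  public static String[] slowDatanodeReport(List<String> all, List<String> slow) {
    List<String> selected = new ArrayList<String>(all);
    selected.retainAll(slow); // keeps the ordering of the full node list
    return toInfoArray(selected);
  }

  public static void main(String[] args) {
    List<String> all = Arrays.asList("dn1", "dn2", "dn3");
    System.out.println(Arrays.toString(slowDatanodeReport(all, Arrays.asList("dn3", "dn1"))));
  }
}
```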



##
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/FSNamesystem.java:
##
@@ -4914,6 +4914,33 @@ int getNumberOfDatanodes(DatanodeReportType type) {
 }
   }
 
+  DatanodeInfo[] slowDataNodesReport() throws IOException {
+String operationName = "slowDataNodesReport";
+DatanodeInfo[] datanodeInfos;
+checkSuperuserPrivilege(operationName);

Review Comment:
   Not really, removed, thanks.





Issue Time Tracking
---

Worklog Id: (was: 762734)
Time Spent: 3h 40m  (was: 3.5h)


[jira] [Work logged] (HDFS-16521) DFS API to retrieve slow datanodes

2022-04-26 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-16521?focusedWorklogId=762716=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-762716
 ]

ASF GitHub Bot logged work on HDFS-16521:
-

Author: ASF GitHub Bot
Created on: 27/Apr/22 05:32
Start Date: 27/Apr/22 05:32
Worklog Time Spent: 10m 
  Work Description: jojochuang commented on code in PR #4107:
URL: https://github.com/apache/hadoop/pull/4107#discussion_r859381403


##
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/FSNamesystem.java:
##
@@ -4914,6 +4914,33 @@ int getNumberOfDatanodes(DatanodeReportType type) {
 }
   }
 
+  DatanodeInfo[] slowDataNodesReport() throws IOException {
+String operationName = "slowDataNodesReport";
+DatanodeInfo[] datanodeInfos;
+checkSuperuserPrivilege(operationName);

Review Comment:
   Does it need to require superuser privilege?



##
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/tools/DFSAdmin.java:
##
@@ -433,7 +433,7 @@ static int run(DistributedFileSystem dfs, String[] argv, int idx) throws IOExcep
*/
   private static final String commonUsageSummary =
 "\t[-report [-live] [-dead] [-decommissioning] " +
-"[-enteringmaintenance] [-inmaintenance]]\n" +
+  "[-enteringmaintenance] [-inmaintenance] [-slownodes]]\n" +

Review Comment:
   The corresponding documentation needs to be updated when CLI commands are added or changed.



##
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/tools/DFSAdmin.java:
##
@@ -632,6 +638,20 @@ private static void printDataNodeReports(DistributedFileSystem dfs,
 }
   }
 
+  private static void printSlowDataNodeReports(DistributedFileSystem dfs, boolean listNodes,

Review Comment:
   Can you provide a sample output? It would be confusing, I guess. I suspect 
you would need some kind of header to distinguish from the other data node 
reports. 



##
hadoop-hdfs-project/hadoop-hdfs-client/src/main/java/org/apache/hadoop/hdfs/protocol/ClientProtocol.java:
##
@@ -1868,4 +1868,16 @@ BatchedEntries listOpenFiles(long prevId,
*/
   @AtMostOnce
   void satisfyStoragePolicy(String path) throws IOException;
+
+  /**
+   * Get report on all of the slow Datanodes. Slow running datanodes are identified based on
+   * the Outlier detection algorithm, if slow peer tracking is enabled for the DFS cluster.
+   *
+   * @return Datanode report for slow running datanodes.
+   * @throws IOException If an I/O error occurs.
+   */
+  @Idempotent
+  @ReadOnly
+  DatanodeInfo[] getSlowDatanodeReport() throws IOException;

Review Comment:
   I just want to check with everyone that it is okay to have an array of objects as the return value.
   I think it's fine, but I want to confirm, because once we decide on the interface it can't be changed later.



##
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/tools/DFSAdmin.java:
##
@@ -433,7 +433,7 @@ static int run(DistributedFileSystem dfs, String[] argv, int idx) throws IOExcep
*/
   private static final String commonUsageSummary =
 "\t[-report [-live] [-dead] [-decommissioning] " +
-"[-enteringmaintenance] [-inmaintenance]]\n" +
+  "[-enteringmaintenance] [-inmaintenance] [-slownodes]]\n" +

Review Comment:
   In fact, this would appear confusing to HDFS administrators. These subcommands are meant to filter the DNs in these states, and "slownodes" is not a defined DataNode state.





Issue Time Tracking
---

Worklog Id: (was: 762716)
Time Spent: 3.5h  (was: 3h 20m)


[jira] [Work logged] (HDFS-16521) DFS API to retrieve slow datanodes

2022-04-25 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-16521?focusedWorklogId=761902=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-761902
 ]

ASF GitHub Bot logged work on HDFS-16521:
-

Author: ASF GitHub Bot
Created on: 25/Apr/22 16:44
Start Date: 25/Apr/22 16:44
Worklog Time Spent: 10m 
  Work Description: hadoop-yetus commented on PR #4107:
URL: https://github.com/apache/hadoop/pull/4107#issuecomment-1108803632

   :broken_heart: **-1 overall**
   
   
   
   
   
   
   | Vote | Subsystem | Runtime |  Logfile | Comment |
   |:----:|----------:|--------:|:--------:|:-------:|
   | +0 :ok: |  reexec  |   0m 57s |  |  Docker mode activated.  |
   |||| _ Prechecks _ |
   | +1 :green_heart: |  dupname  |   0m  1s |  |  No case conflicting files found.  |
   | +0 :ok: |  codespell  |   0m  0s |  |  codespell was not available.  |
   | +0 :ok: |  buf  |   0m  0s |  |  buf was not available.  |
   | +1 :green_heart: |  @author  |   0m  0s |  |  The patch does not contain any @author tags.  |
   | +1 :green_heart: |  test4tests  |   0m  0s |  |  The patch appears to include 3 new or modified test files.  |
   |||| _ trunk Compile Tests _ |
   | +0 :ok: |  mvndep  |  15m 51s |  |  Maven dependency ordering for branch  |
   | +1 :green_heart: |  mvninstall  |  28m 17s |  |  trunk passed  |
   | +1 :green_heart: |  compile  |   7m 16s |  |  trunk passed with JDK Ubuntu-11.0.14.1+1-Ubuntu-0ubuntu1.20.04  |
   | +1 :green_heart: |  compile  |   6m 52s |  |  trunk passed with JDK Private Build-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07  |
   | +1 :green_heart: |  checkstyle  |   1m 36s |  |  trunk passed  |
   | +1 :green_heart: |  mvnsite  |   3m 50s |  |  trunk passed  |
   | +1 :green_heart: |  javadoc  |   3m  9s |  |  trunk passed with JDK Ubuntu-11.0.14.1+1-Ubuntu-0ubuntu1.20.04  |
   | +1 :green_heart: |  javadoc  |   3m 39s |  |  trunk passed with JDK Private Build-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07  |
   | +1 :green_heart: |  spotbugs  |   8m 17s |  |  trunk passed  |
   | +1 :green_heart: |  shadedclient  |  24m 14s |  |  branch has no errors when building and testing our client artifacts.  |
   |||| _ Patch Compile Tests _ |
   | +0 :ok: |  mvndep  |   0m 25s |  |  Maven dependency ordering for patch  |
   | +1 :green_heart: |  mvninstall  |   2m 58s |  |  the patch passed  |
   | +1 :green_heart: |  compile  |   6m 43s |  |  the patch passed with JDK Ubuntu-11.0.14.1+1-Ubuntu-0ubuntu1.20.04  |
   | +1 :green_heart: |  cc  |   6m 43s |  |  the patch passed  |
   | -1 :x: |  javac  |   6m 43s | [/results-compile-javac-hadoop-hdfs-project-jdkUbuntu-11.0.14.1+1-Ubuntu-0ubuntu1.20.04.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-4107/6/artifact/out/results-compile-javac-hadoop-hdfs-project-jdkUbuntu-11.0.14.1+1-Ubuntu-0ubuntu1.20.04.txt) |  hadoop-hdfs-project-jdkUbuntu-11.0.14.1+1-Ubuntu-0ubuntu1.20.04 with JDK Ubuntu-11.0.14.1+1-Ubuntu-0ubuntu1.20.04 generated 1 new + 651 unchanged - 0 fixed = 652 total (was 651)  |
   | +1 :green_heart: |  compile  |   6m 25s |  |  the patch passed with JDK Private Build-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07  |
   | +1 :green_heart: |  cc  |   6m 25s |  |  the patch passed  |
   | -1 :x: |  javac  |   6m 25s | [/results-compile-javac-hadoop-hdfs-project-jdkPrivateBuild-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-4107/6/artifact/out/results-compile-javac-hadoop-hdfs-project-jdkPrivateBuild-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07.txt) |  hadoop-hdfs-project-jdkPrivateBuild-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07 with JDK Private Build-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07 generated 1 new + 629 unchanged - 0 fixed = 630 total (was 629)  |
   | +1 :green_heart: |  blanks  |   0m  0s |  |  The patch has no blanks issues.  |
   | +1 :green_heart: |  checkstyle  |   1m 20s |  |  hadoop-hdfs-project: The patch generated 0 new + 456 unchanged - 1 fixed = 456 total (was 457)  |
   | +1 :green_heart: |  mvnsite  |   3m  6s |  |  the patch passed  |
   | +1 :green_heart: |  javadoc  |   2m 22s |  |  the patch passed with JDK Ubuntu-11.0.14.1+1-Ubuntu-0ubuntu1.20.04  |
   | +1 :green_heart: |  javadoc  |   3m  3s |  |  the patch passed with JDK Private Build-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07  |
   | +1 :green_heart: |  spotbugs  |   7m 53s |  |  the patch passed  |
   | +1 :green_heart: |  shadedclient  |  23m 55s |  |  patch has no errors when building and testing our client artifacts.  |
   |||| _ Other Tests _ |
   | +1 :green_heart: |  unit  |   2m 29s |  |  hadoop-hdfs-client in the patch passed.  |
   | -1 :x: |  unit  | 379m 33s | [/patch-unit-hadoop-hdfs-project_hadoop-hdfs.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-4107/6/artifact/out/patch-unit-hadoop-hdfs-project_hadoop-hdfs.txt) |  hadoop-hdfs in the patch passed.  |
   | -1 :x: |  

[jira] [Work logged] (HDFS-16521) DFS API to retrieve slow datanodes

2022-04-19 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-16521?focusedWorklogId=758849=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-758849
 ]

ASF GitHub Bot logged work on HDFS-16521:
-

Author: ASF GitHub Bot
Created on: 19/Apr/22 22:30
Start Date: 19/Apr/22 22:30
Worklog Time Spent: 10m 
  Work Description: hadoop-yetus commented on PR #4107:
URL: https://github.com/apache/hadoop/pull/4107#issuecomment-1103229191

   :broken_heart: **-1 overall**
   
   
   
   
   
   
   | Vote | Subsystem | Runtime |  Logfile | Comment |
   |:----:|----------:|--------:|:--------:|:-------:|
   | +0 :ok: |  reexec  |   1m  5s |  |  Docker mode activated.  |
   |||| _ Prechecks _ |
   | +1 :green_heart: |  dupname  |   0m  0s |  |  No case conflicting files found.  |
   | +0 :ok: |  codespell  |   0m  1s |  |  codespell was not available.  |
   | +0 :ok: |  buf  |   0m  1s |  |  buf was not available.  |
   | +1 :green_heart: |  @author  |   0m  0s |  |  The patch does not contain any @author tags.  |
   | +1 :green_heart: |  test4tests  |   0m  0s |  |  The patch appears to include 3 new or modified test files.  |
   |||| _ trunk Compile Tests _ |
   | +0 :ok: |  mvndep  |  15m 48s |  |  Maven dependency ordering for branch  |
   | +1 :green_heart: |  mvninstall  |  28m 17s |  |  trunk passed  |
   | +1 :green_heart: |  compile  |   7m  0s |  |  trunk passed with JDK Ubuntu-11.0.14.1+1-Ubuntu-0ubuntu1.20.04  |
   | +1 :green_heart: |  compile  |   6m 31s |  |  trunk passed with JDK Private Build-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07  |
   | +1 :green_heart: |  checkstyle  |   1m 37s |  |  trunk passed  |
   | +1 :green_heart: |  mvnsite  |   3m 38s |  |  trunk passed  |
   | +1 :green_heart: |  javadoc  |   3m  0s |  |  trunk passed with JDK Ubuntu-11.0.14.1+1-Ubuntu-0ubuntu1.20.04  |
   | +1 :green_heart: |  javadoc  |   3m 39s |  |  trunk passed with JDK Private Build-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07  |
   | +1 :green_heart: |  spotbugs  |   8m  8s |  |  trunk passed  |
   | +1 :green_heart: |  shadedclient  |  23m 47s |  |  branch has no errors when building and testing our client artifacts.  |
   |||| _ Patch Compile Tests _ |
   | +0 :ok: |  mvndep  |   0m 27s |  |  Maven dependency ordering for patch  |
   | +1 :green_heart: |  mvninstall  |   2m 56s |  |  the patch passed  |
   | +1 :green_heart: |  compile  |   6m 50s |  |  the patch passed with JDK Ubuntu-11.0.14.1+1-Ubuntu-0ubuntu1.20.04  |
   | +1 :green_heart: |  cc  |   6m 50s |  |  the patch passed  |
   | -1 :x: |  javac  |   6m 50s | [/results-compile-javac-hadoop-hdfs-project-jdkUbuntu-11.0.14.1+1-Ubuntu-0ubuntu1.20.04.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-4107/5/artifact/out/results-compile-javac-hadoop-hdfs-project-jdkUbuntu-11.0.14.1+1-Ubuntu-0ubuntu1.20.04.txt) |  hadoop-hdfs-project-jdkUbuntu-11.0.14.1+1-Ubuntu-0ubuntu1.20.04 with JDK Ubuntu-11.0.14.1+1-Ubuntu-0ubuntu1.20.04 generated 1 new + 651 unchanged - 0 fixed = 652 total (was 651)  |
   | +1 :green_heart: |  compile  |   6m 19s |  |  the patch passed with JDK Private Build-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07  |
   | +1 :green_heart: |  cc  |   6m 19s |  |  the patch passed  |
   | -1 :x: |  javac  |   6m 19s | [/results-compile-javac-hadoop-hdfs-project-jdkPrivateBuild-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-4107/5/artifact/out/results-compile-javac-hadoop-hdfs-project-jdkPrivateBuild-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07.txt) |  hadoop-hdfs-project-jdkPrivateBuild-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07 with JDK Private Build-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07 generated 1 new + 629 unchanged - 0 fixed = 630 total (was 629)  |
   | +1 :green_heart: |  blanks  |   0m  0s |  |  The patch has no blanks issues.  |
   | +1 :green_heart: |  checkstyle  |   1m 19s |  |  hadoop-hdfs-project: The patch generated 0 new + 456 unchanged - 1 fixed = 456 total (was 457)  |
   | +1 :green_heart: |  mvnsite  |   3m 27s |  |  the patch passed  |
   | +1 :green_heart: |  javadoc  |   2m 37s |  |  the patch passed with JDK Ubuntu-11.0.14.1+1-Ubuntu-0ubuntu1.20.04  |
   | +1 :green_heart: |  javadoc  |   3m 28s |  |  the patch passed with JDK Private Build-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07  |
   | +1 :green_heart: |  spotbugs  |   8m 58s |  |  the patch passed  |
   | +1 :green_heart: |  shadedclient  |  25m 58s |  |  patch has no errors when building and testing our client artifacts.  |
   |||| _ Other Tests _ |
   | +1 :green_heart: |  unit  |   2m 27s |  |  hadoop-hdfs-client in the patch passed.  |
   | -1 :x: |  unit  | 392m 41s | [/patch-unit-hadoop-hdfs-project_hadoop-hdfs.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-4107/5/artifact/out/patch-unit-hadoop-hdfs-project_hadoop-hdfs.txt) |  hadoop-hdfs in the patch passed.  |
   | +1 

[jira] [Work logged] (HDFS-16521) DFS API to retrieve slow datanodes

2022-04-10 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-16521?focusedWorklogId=755010=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-755010
 ]

ASF GitHub Bot logged work on HDFS-16521:
-

Author: ASF GitHub Bot
Created on: 10/Apr/22 09:49
Start Date: 10/Apr/22 09:49
Worklog Time Spent: 10m 
  Work Description: virajjasani commented on PR #4107:
URL: https://github.com/apache/hadoop/pull/4107#issuecomment-1094233257

   To provide more insight: 
[FanOutOneBlockAsyncDFSOutput](https://github.com/apache/hbase/blob/master/hbase-asyncfs/src/main/java/org/apache/hadoop/hbase/io/asyncfs/FanOutOneBlockAsyncDFSOutput.java)
 in HBase currently has to rely on its own way of marking and excluding slow 
nodes while 1) creating pipelines and 2) handling acks, based on factors such as 
the data length of the packet, processing time relative to the last ack 
timestamp, and whether the flush to replicas has finished. If it could use a 
slownode API from HDFS to exclude nodes appropriately while writing a block, 
much of its own post-ack computation of slow nodes could be _saved_ or 
_improved_. Alternatively, based on further experiments, we could find a 
_better solution_ for managing slow node detection logic in both HDFS and 
HBase. However, in order to collect more data points and run more POCs in this 
area, HDFS should at least provide an API that lets downstreamers efficiently 
use slownode info for such critical low-latency use cases (like writing WALs).
   
   cc @jojochuang @saintstack 
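
As a rough illustration of the idea in the comment above (this is not HBase or HDFS code), a client could filter out nodes that a slow-node report lists before it picks a write pipeline. Everything in this sketch is hypothetical: the `choose_pipeline` function, the node names, and the assumption that the report can be reduced to a set of host names.

```python
def choose_pipeline(candidates, slow_nodes, replication=3):
    """Pick `replication` nodes, preferring ones not reported as slow.

    Slow nodes are used only as a fallback when there are too few healthy
    candidates, so a write is not blocked just because the report is long.
    """
    healthy = [n for n in candidates if n not in slow_nodes]
    fallback = [n for n in candidates if n in slow_nodes]
    pipeline = (healthy + fallback)[:replication]
    if len(pipeline) < replication:
        raise ValueError("not enough datanodes for the requested replication")
    return pipeline


# Hypothetical inputs: four candidate datanodes, one of them reported slow.
candidates = ["dn1", "dn2", "dn3", "dn4"]
slow_nodes = {"dn2"}
print(choose_pipeline(candidates, slow_nodes))
```

Keeping slow nodes as a fallback rather than excluding them outright is a design choice of this sketch; a real client might instead fail fast or retry with a fresh report.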




Issue Time Tracking
---

Worklog Id: (was: 755010)
Time Spent: 3h  (was: 2h 50m)

> DFS API to retrieve slow datanodes
> --
>
> Key: HDFS-16521
> URL: https://issues.apache.org/jira/browse/HDFS-16521
> Project: Hadoop HDFS
>  Issue Type: New Feature
>Reporter: Viraj Jasani
>Assignee: Viraj Jasani
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 3h
>  Remaining Estimate: 0h
>
> Providing DFS API to retrieve slow nodes would help add an additional option 
> to "dfsadmin -report" that lists slow datanodes info for operators to take a 
> look, specifically useful filter for larger clusters.
> The other purpose of such API is for HDFS downstreamers without direct access 
> to namenode http port (only rpc port accessible) to retrieve slownodes.



--
This message was sent by Atlassian Jira
(v8.20.1#820001)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Work logged] (HDFS-16521) DFS API to retrieve slow datanodes

2022-04-08 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-16521?focusedWorklogId=754489=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-754489
 ]

ASF GitHub Bot logged work on HDFS-16521:
-

Author: ASF GitHub Bot
Created on: 08/Apr/22 09:41
Start Date: 08/Apr/22 09:41
Worklog Time Spent: 10m 
  Work Description: virajjasani commented on PR #4107:
URL: https://github.com/apache/hadoop/pull/4107#issuecomment-1092670492

   I have updated Jira/PR description to summarize the above points.




Issue Time Tracking
---

Worklog Id: (was: 754489)
Time Spent: 2h 50m  (was: 2h 40m)

> DFS API to retrieve slow datanodes
> --
>
> Key: HDFS-16521
> URL: https://issues.apache.org/jira/browse/HDFS-16521
> Project: Hadoop HDFS
>  Issue Type: New Feature
>Reporter: Viraj Jasani
>Assignee: Viraj Jasani
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 2h 50m
>  Remaining Estimate: 0h
>
> Providing DFS API to retrieve slow nodes would help add an additional option 
> to "dfsadmin -report" that lists slow datanodes info for operators to take a 
> look, specifically useful filter for larger clusters.
> The other purpose of such API is for HDFS downstreamers without direct access 
> to namenode http port (only rpc port accessible) to retrieve slownodes.






[jira] [Work logged] (HDFS-16521) DFS API to retrieve slow datanodes

2022-04-08 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-16521?focusedWorklogId=754431=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-754431
 ]

ASF GitHub Bot logged work on HDFS-16521:
-

Author: ASF GitHub Bot
Created on: 08/Apr/22 06:20
Start Date: 08/Apr/22 06:20
Worklog Time Spent: 10m 
  Work Description: virajjasani commented on PR #4107:
URL: https://github.com/apache/hadoop/pull/4107#issuecomment-1092480863

   @jojochuang @iwasakims @ayushtkn @aajisaka Could you please take a look?




Issue Time Tracking
---

Worklog Id: (was: 754431)
Time Spent: 2h 40m  (was: 2.5h)

> DFS API to retrieve slow datanodes
> --
>
> Key: HDFS-16521
> URL: https://issues.apache.org/jira/browse/HDFS-16521
> Project: Hadoop HDFS
>  Issue Type: New Feature
>Reporter: Viraj Jasani
>Assignee: Viraj Jasani
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 2h 40m
>  Remaining Estimate: 0h
>
> In order to build some automation around slow datanodes that regularly show 
> up in the slow peer tracking report, e.g. decommission such nodes and queue 
> them up for external processing and add them back later to the cluster after 
> fixing issues etc, we should expose DFS API to retrieve all slow nodes at a 
> given time.
> Providing such API would also help add an additional option to "dfsadmin 
> -report" that lists slow datanodes info for operators to take a look, 
> specifically useful filter for larger clusters.






[jira] [Work logged] (HDFS-16521) DFS API to retrieve slow datanodes

2022-04-01 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-16521?focusedWorklogId=751438=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-751438
 ]

ASF GitHub Bot logged work on HDFS-16521:
-

Author: ASF GitHub Bot
Created on: 01/Apr/22 07:55
Start Date: 01/Apr/22 07:55
Worklog Time Spent: 10m 
  Work Description: virajjasani commented on pull request #4107:
URL: https://github.com/apache/hadoop/pull/4107#issuecomment-1085550751


   @jojochuang @iwasakims @ayushtkn 
   Created HDFS-16528 to support enabling peer stats without having to restart 
the Namenode. I plan to take it up after this PR, because this PR has a couple 
of tests where we can explicitly add outliers for a few datanodes and have them 
reported as slownodes. I can reuse the same UTs to provide tests for the 
reconfig options in HDFS-16528 and reduce redundant work.
   
   Could you please help review this PR?
   Thanks


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 751438)
Time Spent: 2.5h  (was: 2h 20m)

> DFS API to retrieve slow datanodes
> --
>
> Key: HDFS-16521
> URL: https://issues.apache.org/jira/browse/HDFS-16521
> Project: Hadoop HDFS
>  Issue Type: New Feature
>Reporter: Viraj Jasani
>Assignee: Viraj Jasani
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 2.5h
>  Remaining Estimate: 0h
>
> In order to build some automation around slow datanodes that regularly show 
> up in the slow peer tracking report, e.g. decommission such nodes and queue 
> them up for external processing and add them back later to the cluster after 
> fixing issues etc, we should expose DFS API to retrieve all slow nodes at a 
> given time.
> Providing such API would also help add an additional option to "dfsadmin 
> -report" that lists slow datanodes info for operators to take a look, 
> specifically useful filter for larger clusters.






[jira] [Work logged] (HDFS-16521) DFS API to retrieve slow datanodes

2022-03-31 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-16521?focusedWorklogId=750944=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-750944
 ]

ASF GitHub Bot logged work on HDFS-16521:
-

Author: ASF GitHub Bot
Created on: 31/Mar/22 11:53
Start Date: 31/Mar/22 11:53
Worklog Time Spent: 10m 
  Work Description: virajjasani edited a comment on pull request #4107:
URL: https://github.com/apache/hadoop/pull/4107#issuecomment-1084475215


   > You mean 
[HDFS-16327](https://issues.apache.org/jira/browse/HDFS-16327)(#3716) needs 
follow-up?
   
   I mean HDFS-16396 (#3827) can be extended for namenode as well (reconfig of 
`dfs.datanode.peer.stats.enabled`). This work relates to what @jojochuang 
mentioned regarding the ability to enable slownode tracking without a cluster 
restart.




Issue Time Tracking
---

Worklog Id: (was: 750944)
Time Spent: 2h 20m  (was: 2h 10m)

> DFS API to retrieve slow datanodes
> --
>
> Key: HDFS-16521
> URL: https://issues.apache.org/jira/browse/HDFS-16521
> Project: Hadoop HDFS
>  Issue Type: New Feature
>Reporter: Viraj Jasani
>Assignee: Viraj Jasani
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 2h 20m
>  Remaining Estimate: 0h
>
> In order to build some automation around slow datanodes that regularly show 
> up in the slow peer tracking report, e.g. decommission such nodes and queue 
> them up for external processing and add them back later to the cluster after 
> fixing issues etc, we should expose DFS API to retrieve all slow nodes at a 
> given time.
> Providing such API would also help add an additional option to "dfsadmin 
> -report" that lists slow datanodes info for operators to take a look, 
> specifically useful filter for larger clusters.






[jira] [Work logged] (HDFS-16521) DFS API to retrieve slow datanodes

2022-03-31 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-16521?focusedWorklogId=750940=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-750940
 ]

ASF GitHub Bot logged work on HDFS-16521:
-

Author: ASF GitHub Bot
Created on: 31/Mar/22 11:45
Start Date: 31/Mar/22 11:45
Worklog Time Spent: 10m 
  Work Description: virajjasani commented on pull request #4107:
URL: https://github.com/apache/hadoop/pull/4107#issuecomment-1084475215


   > You mean 
[HDFS-16327](https://issues.apache.org/jira/browse/HDFS-16327)(#3716) needs 
follow-up?
   
   I mean HDFS-16396 (#3827) can be extended for namenode as well (reconfig of 
`dfs.datanode.peer.stats.enabled`).




Issue Time Tracking
---

Worklog Id: (was: 750940)
Time Spent: 2h 10m  (was: 2h)

> DFS API to retrieve slow datanodes
> --
>
> Key: HDFS-16521
> URL: https://issues.apache.org/jira/browse/HDFS-16521
> Project: Hadoop HDFS
>  Issue Type: New Feature
>Reporter: Viraj Jasani
>Assignee: Viraj Jasani
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 2h 10m
>  Remaining Estimate: 0h
>
> In order to build some automation around slow datanodes that regularly show 
> up in the slow peer tracking report, e.g. decommission such nodes and queue 
> them up for external processing and add them back later to the cluster after 
> fixing issues etc, we should expose DFS API to retrieve all slow nodes at a 
> given time.
> Providing such API would also help add an additional option to "dfsadmin 
> -report" that lists slow datanodes info for operators to take a look, 
> specifically useful filter for larger clusters.






[jira] [Work logged] (HDFS-16521) DFS API to retrieve slow datanodes

2022-03-31 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-16521?focusedWorklogId=750879=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-750879
 ]

ASF GitHub Bot logged work on HDFS-16521:
-

Author: ASF GitHub Bot
Created on: 31/Mar/22 10:07
Start Date: 31/Mar/22 10:07
Worklog Time Spent: 10m 
  Work Description: iwasakims commented on pull request #4107:
URL: https://github.com/apache/hadoop/pull/4107#issuecomment-1084364741


   You mean 
[HDFS-16327](https://issues.apache.org/jira/browse/HDFS-16327)(#3716) needs 
follow-up?




Issue Time Tracking
---

Worklog Id: (was: 750879)
Time Spent: 2h  (was: 1h 50m)

> DFS API to retrieve slow datanodes
> --
>
> Key: HDFS-16521
> URL: https://issues.apache.org/jira/browse/HDFS-16521
> Project: Hadoop HDFS
>  Issue Type: New Feature
>Reporter: Viraj Jasani
>Assignee: Viraj Jasani
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 2h
>  Remaining Estimate: 0h
>
> In order to build some automation around slow datanodes that regularly show 
> up in the slow peer tracking report, e.g. decommission such nodes and queue 
> them up for external processing and add them back later to the cluster after 
> fixing issues etc, we should expose DFS API to retrieve all slow nodes at a 
> given time.
> Providing such API would also help add an additional option to "dfsadmin 
> -report" that lists slow datanodes info for operators to take a look, 
> specifically useful filter for larger clusters.






[jira] [Work logged] (HDFS-16521) DFS API to retrieve slow datanodes

2022-03-31 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-16521?focusedWorklogId=750870=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-750870
 ]

ASF GitHub Bot logged work on HDFS-16521:
-

Author: ASF GitHub Bot
Created on: 31/Mar/22 09:36
Start Date: 31/Mar/22 09:36
Worklog Time Spent: 10m 
  Work Description: virajjasani commented on pull request #4107:
URL: https://github.com/apache/hadoop/pull/4107#issuecomment-1084329656


   > It would be much more useful if it can be made available on demand at runtime.
   
   HDFS-16396 makes a good attempt at making it reconfigurable for datanodes; 
we can extend the support to the namenode as well in a follow-up Jira.




Issue Time Tracking
---

Worklog Id: (was: 750870)
Time Spent: 1h 50m  (was: 1h 40m)

> DFS API to retrieve slow datanodes
> --
>
> Key: HDFS-16521
> URL: https://issues.apache.org/jira/browse/HDFS-16521
> Project: Hadoop HDFS
>  Issue Type: New Feature
>Reporter: Viraj Jasani
>Assignee: Viraj Jasani
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 1h 50m
>  Remaining Estimate: 0h
>
> In order to build some automation around slow datanodes that regularly show 
> up in the slow peer tracking report, e.g. decommission such nodes and queue 
> them up for external processing and add them back later to the cluster after 
> fixing issues etc, we should expose DFS API to retrieve all slow nodes at a 
> given time.
> Providing such API would also help add an additional option to "dfsadmin 
> -report" that lists slow datanodes info for operators to take a look, 
> specifically useful filter for larger clusters.






[jira] [Work logged] (HDFS-16521) DFS API to retrieve slow datanodes

2022-03-31 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-16521?focusedWorklogId=750775=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-750775
 ]

ASF GitHub Bot logged work on HDFS-16521:
-

Author: ASF GitHub Bot
Created on: 31/Mar/22 06:06
Start Date: 31/Mar/22 06:06
Worklog Time Spent: 10m 
  Work Description: jojochuang commented on pull request #4107:
URL: https://github.com/apache/hadoop/pull/4107#issuecomment-1084132385


   I've wanted to build a UI to expose the slow datanode metrics more easily, 
for example in the NameNode itself or in the Cloudera Manager chart system, but 
never got the time to make one.
   
   But the biggest complaint from users was that it is disabled by default, 
and it's annoying to restart the cluster just to refresh the configuration and 
wait for the slow node to show up again. It would be much more useful if it can 
be made available on demand at runtime.




Issue Time Tracking
---

Worklog Id: (was: 750775)
Time Spent: 1h 40m  (was: 1.5h)

> DFS API to retrieve slow datanodes
> --
>
> Key: HDFS-16521
> URL: https://issues.apache.org/jira/browse/HDFS-16521
> Project: Hadoop HDFS
>  Issue Type: New Feature
>Reporter: Viraj Jasani
>Assignee: Viraj Jasani
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 1h 40m
>  Remaining Estimate: 0h
>
> In order to build some automation around slow datanodes that regularly show 
> up in the slow peer tracking report, e.g. decommission such nodes and queue 
> them up for external processing and add them back later to the cluster after 
> fixing issues etc, we should expose DFS API to retrieve all slow nodes at a 
> given time.
> Providing such API would also help add an additional option to "dfsadmin 
> -report" that lists slow datanodes info for operators to take a look, 
> specifically useful filter for larger clusters.






[jira] [Work logged] (HDFS-16521) DFS API to retrieve slow datanodes

2022-03-30 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-16521?focusedWorklogId=750689=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-750689
 ]

ASF GitHub Bot logged work on HDFS-16521:
-

Author: ASF GitHub Bot
Created on: 31/Mar/22 02:29
Start Date: 31/Mar/22 02:29
Worklog Time Spent: 10m 
  Work Description: virajjasani commented on pull request #4107:
URL: https://github.com/apache/hadoop/pull/4107#issuecomment-1084006873


   > namenode port access might be restricted to only namenode and datanode 
pods/containers
   
   I meant only http port (not rpc).




Issue Time Tracking
---

Worklog Id: (was: 750689)
Time Spent: 1.5h  (was: 1h 20m)

> DFS API to retrieve slow datanodes
> --
>
> Key: HDFS-16521
> URL: https://issues.apache.org/jira/browse/HDFS-16521
> Project: Hadoop HDFS
>  Issue Type: New Feature
>Reporter: Viraj Jasani
>Assignee: Viraj Jasani
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 1.5h
>  Remaining Estimate: 0h
>
> In order to build some automation around slow datanodes that regularly show 
> up in the slow peer tracking report, e.g. decommission such nodes and queue 
> them up for external processing and add them back later to the cluster after 
> fixing issues etc, we should expose DFS API to retrieve all slow nodes at a 
> given time.
> Providing such API would also help add an additional option to "dfsadmin 
> -report" that lists slow datanodes info for operators to take a look, 
> specifically useful filter for larger clusters.






[jira] [Work logged] (HDFS-16521) DFS API to retrieve slow datanodes

2022-03-30 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-16521?focusedWorklogId=750038=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-750038
 ]

ASF GitHub Bot logged work on HDFS-16521:
-

Author: ASF GitHub Bot
Created on: 30/Mar/22 11:46
Start Date: 30/Mar/22 11:46
Worklog Time Spent: 10m 
  Work Description: virajjasani commented on pull request #4107:
URL: https://github.com/apache/hadoop/pull/4107#issuecomment-1083038635


   @iwasakims @ayushtkn I was earlier thinking about adding SLOW_NODE to 
`DatanodeReportType` so that ClientProtocol#getDatanodeReport could take care 
of retrieving slownodes, but the server-side implementation was getting a bit 
more complicated with it. Hence, to make this a separate and clean workflow, I 
thought of adding it as a new API in ClientProtocol. Other than that, it is 
quite similar to the getDatanodeReport() API.
   
   When HDFS throughput is affected, it would be really helpful for operators 
to check slownode details using the `dfsadmin -report` command (similar to the 
options that retrieve decommissioning, dead, and live nodes).
   
   > How about enhancing metrics if the current information in the 
SlowPeersReport is insufficient?
   
   We can do this, but I believe that adding more info to the slownode report 
only when required, i.e. via a user-triggered API (similar to ClientProtocol), 
would have less overhead than continuously exposing additional details in the 
metrics. WDYT?
   
   
   > Thanks to 
[JMXJsonServlet](https://github.com/apache/hadoop/blob/trunk/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/jmx/JMXJsonServlet.java),
 we can get metrics in JSON format via HTTP/HTTPS port of NameNode without 
additional configuration.
   
   Yes, this is helpful for sure, but only if the Namenode port is exposed to 
the downstream application.
   For instance, in a K8S cluster, namenode port access might be restricted to 
only namenode and datanode pods/containers, so other service pods (e.g. hbase 
service pods/containers) would not even have access to the namenode port and 
hence would have no way to derive metric values. Metric exposure is definitely 
good for end customers to get a high-level view, I agree. But applications, 
depending on the environment, might not even have access to values derived 
from metrics.
   




Issue Time Tracking
---

Worklog Id: (was: 750038)
Time Spent: 1h 20m  (was: 1h 10m)

> DFS API to retrieve slow datanodes
> --
>
> Key: HDFS-16521
> URL: https://issues.apache.org/jira/browse/HDFS-16521
> Project: Hadoop HDFS
>  Issue Type: New Feature
>Reporter: Viraj Jasani
>Assignee: Viraj Jasani
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 1h 20m
>  Remaining Estimate: 0h
>
> In order to build some automation around slow datanodes that regularly show 
> up in the slow peer tracking report, e.g. decommission such nodes and queue 
> them up for external processing and add them back later to the cluster after 
> fixing issues etc, we should expose DFS API to retrieve all slow nodes at a 
> given time.
> Providing such API would also help add an additional option to "dfsadmin 
> -report" that lists slow datanodes info for operators to take a look, 
> specifically useful filter for larger clusters.






[jira] [Work logged] (HDFS-16521) DFS API to retrieve slow datanodes

2022-03-30 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-16521?focusedWorklogId=750012=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-750012
 ]

ASF GitHub Bot logged work on HDFS-16521:
-

Author: ASF GitHub Bot
Created on: 30/Mar/22 11:08
Start Date: 30/Mar/22 11:08
Worklog Time Spent: 10m 
  Work Description: iwasakims commented on pull request #4107:
URL: https://github.com/apache/hadoop/pull/4107#issuecomment-1082994545


   I agree with @ayushtkn that modifying ClientProtocol is overkill for the use 
case. @virajjasani
   
   > While I agree that JMX metric for slownode is already available, not every 
downstreamer might have access to it directly, for instance in K8S managed 
clusters, unless port forward is enabled (not so common case in prod), HDFS 
downstreamer would not be able to access JMX metrics.
   
   Thanks to 
[JMXJsonServlet](https://github.com/apache/hadoop/blob/trunk/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/jmx/JMXJsonServlet.java),
 we can get metrics in JSON format via HTTP/HTTPS port of NameNode without 
additional configuration. JSON on HTTP is usually easier to access from 
outside/downstream than Protobuf on RPC.
   
   ```
   $ curl namenode:9870/jmx?qry=Hadoop:service=NameNode,name=NameNodeStatus
   {
     "beans" : [ {
       "name" : "Hadoop:service=NameNode,name=NameNodeStatus",
       "modelerType" : "org.apache.hadoop.hdfs.server.namenode.NameNode",
       "NNRole" : "NameNode",
       "HostAndPort" : "localhost:8020",
       "SecurityEnabled" : false,
       "LastHATransitionTime" : 0,
       "BytesWithFutureGenerationStamps" : 0,
       "SlowPeersReport" : "[]",
       "SlowDisksReport" : null,
       "State" : "active"
     } ]
   }
   ```
   
   How about enhancing metrics if the current information in the 
SlowPeersReport is insufficient?
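
A minimal sketch of how a downstream script might read the `SlowPeersReport` field from the JMX response shown above, assuming the NameNode HTTP port is reachable from where the script runs. The function names `fetch_namenode_status` and `extract_slow_peers_report` are hypothetical. The thread does not show the layout of a non-empty report, so the sketch only decodes the field as JSON and does not interpret its entries.

```python
import json
import urllib.request

JMX_QUERY = "Hadoop:service=NameNode,name=NameNodeStatus"


def fetch_namenode_status(host_port):
    """Fetch the NameNodeStatus bean over HTTP (needs access to the HTTP port)."""
    url = "http://%s/jmx?qry=%s" % (host_port, JMX_QUERY)
    with urllib.request.urlopen(url, timeout=10) as resp:
        return resp.read().decode("utf-8")


def extract_slow_peers_report(jmx_json):
    """Return the decoded SlowPeersReport, or [] when it is absent or null."""
    beans = json.loads(jmx_json).get("beans", [])
    for bean in beans:
        if bean.get("name") == JMX_QUERY:
            raw = bean.get("SlowPeersReport")
            # In the sample response the field is a JSON array serialized
            # as a string, so it needs a second decoding step.
            return json.loads(raw) if raw else []
    return []


# Trimmed copy of the sample response from the comment above.
SAMPLE = """
{ "beans" : [ {
    "name" : "Hadoop:service=NameNode,name=NameNodeStatus",
    "SlowPeersReport" : "[]",
    "SlowDisksReport" : null,
    "State" : "active"
} ] }
"""

print(extract_slow_peers_report(SAMPLE))
```

Against a live cluster the call would look like `extract_slow_peers_report(fetch_namenode_status("namenode:9870"))`, which only works where the HTTP port is exposed.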
   




Issue Time Tracking
---

Worklog Id: (was: 750012)
Time Spent: 1h 10m  (was: 1h)

> DFS API to retrieve slow datanodes
> --
>
> Key: HDFS-16521
> URL: https://issues.apache.org/jira/browse/HDFS-16521
> Project: Hadoop HDFS
>  Issue Type: New Feature
>Reporter: Viraj Jasani
>Assignee: Viraj Jasani
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 1h 10m
>  Remaining Estimate: 0h
>
> In order to build some automation around slow datanodes that regularly show 
> up in the slow peer tracking report, e.g. decommission such nodes and queue 
> them up for external processing and add them back later to the cluster after 
> fixing issues etc, we should expose DFS API to retrieve all slow nodes at a 
> given time.
> Providing such API would also help add an additional option to "dfsadmin 
> -report" that lists slow datanodes info for operators to take a look, 
> specifically useful filter for larger clusters.






[jira] [Work logged] (HDFS-16521) DFS API to retrieve slow datanodes

2022-03-27 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-16521?focusedWorklogId=748334=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-748334
 ]

ASF GitHub Bot logged work on HDFS-16521:
-

Author: ASF GitHub Bot
Created on: 27/Mar/22 14:04
Start Date: 27/Mar/22 14:04
Worklog Time Spent: 10m 
  Work Description: hadoop-yetus commented on pull request #4107:
URL: https://github.com/apache/hadoop/pull/4107#issuecomment-1079938585


   :broken_heart: **-1 overall**
   
   
   
   
   
   
   | Vote | Subsystem | Runtime |  Logfile | Comment |
   |:----:|----------:|--------:|:--------:|:-------:|
   | +0 :ok: |  reexec  |   0m 56s |  |  Docker mode activated.  |
   |||| _ Prechecks _ |
   | +1 :green_heart: |  dupname  |   0m  0s |  |  No case conflicting files found.  |
   | +0 :ok: |  codespell  |   0m  1s |  |  codespell was not available.  |
   | +0 :ok: |  buf  |   0m  1s |  |  buf was not available.  |
   | +1 :green_heart: |  @author  |   0m  0s |  |  The patch does not contain any @author tags.  |
   | +1 :green_heart: |  test4tests  |   0m  0s |  |  The patch appears to include 3 new or modified test files.  |
   |||| _ trunk Compile Tests _ |
   | +0 :ok: |  mvndep  |  12m 29s |  |  Maven dependency ordering for branch  |
   | +1 :green_heart: |  mvninstall  |  26m 20s |  |  trunk passed  |
   | +1 :green_heart: |  compile  |   6m 29s |  |  trunk passed with JDK Ubuntu-11.0.14+9-Ubuntu-0ubuntu2.20.04  |
   | +1 :green_heart: |  compile  |   6m  5s |  |  trunk passed with JDK Private Build-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07  |
   | +1 :green_heart: |  checkstyle  |   1m 15s |  |  trunk passed  |
   | +1 :green_heart: |  mvnsite  |   3m  3s |  |  trunk passed  |
   | +1 :green_heart: |  javadoc  |   2m 21s |  |  trunk passed with JDK Ubuntu-11.0.14+9-Ubuntu-0ubuntu2.20.04  |
   | +1 :green_heart: |  javadoc  |   3m  2s |  |  trunk passed with JDK Private Build-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07  |
   | +1 :green_heart: |  spotbugs  |   7m 14s |  |  trunk passed  |
   | +1 :green_heart: |  shadedclient  |  24m  0s |  |  branch has no errors when building and testing our client artifacts.  |
   |||| _ Patch Compile Tests _ |
   | +0 :ok: |  mvndep  |   0m 26s |  |  Maven dependency ordering for patch  |
   | +1 :green_heart: |  mvninstall  |   2m 40s |  |  the patch passed  |
   | +1 :green_heart: |  compile  |   6m 22s |  |  the patch passed with JDK Ubuntu-11.0.14+9-Ubuntu-0ubuntu2.20.04  |
   | +1 :green_heart: |  cc  |   6m 22s |  |  the patch passed  |
   | -1 :x: |  javac  |   6m 22s | [/results-compile-javac-hadoop-hdfs-project-jdkUbuntu-11.0.14+9-Ubuntu-0ubuntu2.20.04.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-4107/4/artifact/out/results-compile-javac-hadoop-hdfs-project-jdkUbuntu-11.0.14+9-Ubuntu-0ubuntu2.20.04.txt) |  hadoop-hdfs-project-jdkUbuntu-11.0.14+9-Ubuntu-0ubuntu2.20.04 with JDK Ubuntu-11.0.14+9-Ubuntu-0ubuntu2.20.04 generated 1 new + 651 unchanged - 0 fixed = 652 total (was 651)  |
   | +1 :green_heart: |  compile  |   6m  0s |  |  the patch passed with JDK Private Build-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07  |
   | +1 :green_heart: |  cc  |   6m  0s |  |  the patch passed  |
   | -1 :x: |  javac  |   6m  0s | [/results-compile-javac-hadoop-hdfs-project-jdkPrivateBuild-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-4107/4/artifact/out/results-compile-javac-hadoop-hdfs-project-jdkPrivateBuild-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07.txt) |  hadoop-hdfs-project-jdkPrivateBuild-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07 with JDK Private Build-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07 generated 1 new + 629 unchanged - 0 fixed = 630 total (was 629)  |
   | +1 :green_heart: |  blanks  |   0m  0s |  |  The patch has no blanks issues.  |
   | +1 :green_heart: |  checkstyle  |   1m  9s |  |  hadoop-hdfs-project: The patch generated 0 new + 456 unchanged - 1 fixed = 456 total (was 457)  |
   | +1 :green_heart: |  mvnsite  |   2m 48s |  |  the patch passed  |
   | +1 :green_heart: |  javadoc  |   2m  4s |  |  the patch passed with JDK Ubuntu-11.0.14+9-Ubuntu-0ubuntu2.20.04  |
   | +1 :green_heart: |  javadoc  |   2m 51s |  |  the patch passed with JDK Private Build-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07  |
   | +1 :green_heart: |  spotbugs  |   7m 35s |  |  the patch passed  |
   | +1 :green_heart: |  shadedclient  |  23m 41s |  |  patch has no errors when building and testing our client artifacts.  |
   |||| _ Other Tests _ |
   | +1 :green_heart: |  unit  |   2m 19s |  |  hadoop-hdfs-client in the patch passed.  |
   | +1 :green_heart: |  unit  | 359m 46s |  |  hadoop-hdfs in the patch passed.  |
   | -1 :x: |  unit  |  39m 14s | 

[jira] [Work logged] (HDFS-16521) DFS API to retrieve slow datanodes

2022-03-26 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-16521?focusedWorklogId=748300=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-748300
 ]

ASF GitHub Bot logged work on HDFS-16521:
-

Author: ASF GitHub Bot
Created on: 27/Mar/22 05:08
Start Date: 27/Mar/22 05:08
Worklog Time Spent: 10m 
  Work Description: hadoop-yetus commented on pull request #4107:
URL: https://github.com/apache/hadoop/pull/4107#issuecomment-1079841252


   :broken_heart: **-1 overall**
   
   
   
   
   
   
   | Vote | Subsystem | Runtime |  Logfile | Comment |
   |:----:|----------:|--------:|:--------:|:-------:|
   | +0 :ok: |  reexec  |   1m 25s |  |  Docker mode activated.  |
   |||| _ Prechecks _ |
   | +1 :green_heart: |  dupname  |   0m  1s |  |  No case conflicting files found.  |
   | +0 :ok: |  codespell  |   0m  0s |  |  codespell was not available.  |
   | +0 :ok: |  buf  |   0m  0s |  |  buf was not available.  |
   | +1 :green_heart: |  @author  |   0m  0s |  |  The patch does not contain any @author tags.  |
   | +1 :green_heart: |  test4tests  |   0m  0s |  |  The patch appears to include 3 new or modified test files.  |
   |||| _ trunk Compile Tests _ |
   | +0 :ok: |  mvndep  |  12m 52s |  |  Maven dependency ordering for branch  |
   | +1 :green_heart: |  mvninstall  |  25m  9s |  |  trunk passed  |
   | +1 :green_heart: |  compile  |   6m 26s |  |  trunk passed with JDK Ubuntu-11.0.14+9-Ubuntu-0ubuntu2.20.04  |
   | +1 :green_heart: |  compile  |   6m  6s |  |  trunk passed with JDK Private Build-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07  |
   | +1 :green_heart: |  checkstyle  |   1m 15s |  |  trunk passed  |
   | +1 :green_heart: |  mvnsite  |   3m 21s |  |  trunk passed  |
   | +1 :green_heart: |  javadoc  |   2m 34s |  |  trunk passed with JDK Ubuntu-11.0.14+9-Ubuntu-0ubuntu2.20.04  |
   | +1 :green_heart: |  javadoc  |   3m 18s |  |  trunk passed with JDK Private Build-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07  |
   | +1 :green_heart: |  spotbugs  |   7m 32s |  |  trunk passed  |
   | +1 :green_heart: |  shadedclient  |  23m 13s |  |  branch has no errors when building and testing our client artifacts.  |
   |||| _ Patch Compile Tests _ |
   | +0 :ok: |  mvndep  |   0m 28s |  |  Maven dependency ordering for patch  |
   | +1 :green_heart: |  mvninstall  |   2m 55s |  |  the patch passed  |
   | +1 :green_heart: |  compile  |   6m  8s |  |  the patch passed with JDK Ubuntu-11.0.14+9-Ubuntu-0ubuntu2.20.04  |
   | +1 :green_heart: |  cc  |   6m  8s |  |  the patch passed  |
   | -1 :x: |  javac  |   6m  8s | [/results-compile-javac-hadoop-hdfs-project-jdkUbuntu-11.0.14+9-Ubuntu-0ubuntu2.20.04.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-4107/3/artifact/out/results-compile-javac-hadoop-hdfs-project-jdkUbuntu-11.0.14+9-Ubuntu-0ubuntu2.20.04.txt) |  hadoop-hdfs-project-jdkUbuntu-11.0.14+9-Ubuntu-0ubuntu2.20.04 with JDK Ubuntu-11.0.14+9-Ubuntu-0ubuntu2.20.04 generated 1 new + 651 unchanged - 0 fixed = 652 total (was 651)  |
   | +1 :green_heart: |  compile  |   5m 47s |  |  the patch passed with JDK Private Build-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07  |
   | +1 :green_heart: |  cc  |   5m 47s |  |  the patch passed  |
   | -1 :x: |  javac  |   5m 47s | [/results-compile-javac-hadoop-hdfs-project-jdkPrivateBuild-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-4107/3/artifact/out/results-compile-javac-hadoop-hdfs-project-jdkPrivateBuild-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07.txt) |  hadoop-hdfs-project-jdkPrivateBuild-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07 with JDK Private Build-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07 generated 1 new + 629 unchanged - 0 fixed = 630 total (was 629)  |
   | +1 :green_heart: |  blanks  |   0m  1s |  |  The patch has no blanks issues.  |
   | +1 :green_heart: |  checkstyle  |   1m  9s |  |  hadoop-hdfs-project: The patch generated 0 new + 456 unchanged - 1 fixed = 456 total (was 457)  |
   | +1 :green_heart: |  mvnsite  |   2m 57s |  |  the patch passed  |
   | +1 :green_heart: |  javadoc  |   2m 13s |  |  the patch passed with JDK Ubuntu-11.0.14+9-Ubuntu-0ubuntu2.20.04  |
   | +1 :green_heart: |  javadoc  |   3m  1s |  |  the patch passed with JDK Private Build-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07  |
   | +1 :green_heart: |  spotbugs  |   7m 37s |  |  the patch passed  |
   | +1 :green_heart: |  shadedclient  |  22m 41s |  |  patch has no errors when building and testing our client artifacts.  |
   |||| _ Other Tests _ |
   | +1 :green_heart: |  unit  |   2m 26s |  |  hadoop-hdfs-client in the patch passed.  |
   | -1 :x: |  unit  | 500m 40s | [/patch-unit-hadoop-hdfs-project_hadoop-hdfs.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-4107/3/artifact/out/patch-unit-hadoop-hdfs-project_hadoop-hdfs.txt) |  hadoop-hdfs in the patch passed.  |
   | -1 :x: |  unit  |  

[jira] [Work logged] (HDFS-16521) DFS API to retrieve slow datanodes

2022-03-26 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-16521?focusedWorklogId=748245=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-748245
 ]

ASF GitHub Bot logged work on HDFS-16521:
-

Author: ASF GitHub Bot
Created on: 26/Mar/22 18:50
Start Date: 26/Mar/22 18:50
Worklog Time Spent: 10m 
  Work Description: hadoop-yetus commented on pull request #4107:
URL: https://github.com/apache/hadoop/pull/4107#issuecomment-1079753292


   :broken_heart: **-1 overall**
   
   
   
   
   
   
   | Vote | Subsystem | Runtime |  Logfile | Comment |
   |:----:|----------:|--------:|:--------:|:-------:|
   | +0 :ok: |  reexec  |   0m 57s |  |  Docker mode activated.  |
   |||| _ Prechecks _ |
   | +1 :green_heart: |  dupname  |   0m  0s |  |  No case conflicting files found.  |
   | +0 :ok: |  codespell  |   0m  1s |  |  codespell was not available.  |
   | +0 :ok: |  buf  |   0m  1s |  |  buf was not available.  |
   | +1 :green_heart: |  @author  |   0m  0s |  |  The patch does not contain any @author tags.  |
   | +1 :green_heart: |  test4tests  |   0m  0s |  |  The patch appears to include 3 new or modified test files.  |
   |||| _ trunk Compile Tests _ |
   | +0 :ok: |  mvndep  |  12m 26s |  |  Maven dependency ordering for branch  |
   | +1 :green_heart: |  mvninstall  |  26m  6s |  |  trunk passed  |
   | +1 :green_heart: |  compile  |   6m 39s |  |  trunk passed with JDK Ubuntu-11.0.14+9-Ubuntu-0ubuntu2.20.04  |
   | +1 :green_heart: |  compile  |   6m  2s |  |  trunk passed with JDK Private Build-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07  |
   | +1 :green_heart: |  checkstyle  |   1m 15s |  |  trunk passed  |
   | +1 :green_heart: |  mvnsite  |   3m  5s |  |  trunk passed  |
   | +1 :green_heart: |  javadoc  |   2m 20s |  |  trunk passed with JDK Ubuntu-11.0.14+9-Ubuntu-0ubuntu2.20.04  |
   | +1 :green_heart: |  javadoc  |   3m  6s |  |  trunk passed with JDK Private Build-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07  |
   | +1 :green_heart: |  spotbugs  |   7m 11s |  |  trunk passed  |
   | +1 :green_heart: |  shadedclient  |  24m  6s |  |  branch has no errors when building and testing our client artifacts.  |
   |||| _ Patch Compile Tests _ |
   | +0 :ok: |  mvndep  |   0m 23s |  |  Maven dependency ordering for patch  |
   | +1 :green_heart: |  mvninstall  |   2m 55s |  |  the patch passed  |
   | +1 :green_heart: |  compile  |   7m 44s |  |  the patch passed with JDK Ubuntu-11.0.14+9-Ubuntu-0ubuntu2.20.04  |
   | +1 :green_heart: |  cc  |   7m 44s |  |  the patch passed  |
   | -1 :x: |  javac  |   7m 44s | [/results-compile-javac-hadoop-hdfs-project-jdkUbuntu-11.0.14+9-Ubuntu-0ubuntu2.20.04.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-4107/2/artifact/out/results-compile-javac-hadoop-hdfs-project-jdkUbuntu-11.0.14+9-Ubuntu-0ubuntu2.20.04.txt) |  hadoop-hdfs-project-jdkUbuntu-11.0.14+9-Ubuntu-0ubuntu2.20.04 with JDK Ubuntu-11.0.14+9-Ubuntu-0ubuntu2.20.04 generated 1 new + 651 unchanged - 0 fixed = 652 total (was 651)  |
   | +1 :green_heart: |  compile  |   6m 29s |  |  the patch passed with JDK Private Build-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07  |
   | +1 :green_heart: |  cc  |   6m 30s |  |  the patch passed  |
   | -1 :x: |  javac  |   6m 29s | [/results-compile-javac-hadoop-hdfs-project-jdkPrivateBuild-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-4107/2/artifact/out/results-compile-javac-hadoop-hdfs-project-jdkPrivateBuild-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07.txt) |  hadoop-hdfs-project-jdkPrivateBuild-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07 with JDK Private Build-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07 generated 1 new + 629 unchanged - 0 fixed = 630 total (was 629)  |
   | +1 :green_heart: |  blanks  |   0m  0s |  |  The patch has no blanks issues.  |
   | +1 :green_heart: |  checkstyle  |   1m 13s |  |  hadoop-hdfs-project: The patch generated 0 new + 456 unchanged - 1 fixed = 456 total (was 457)  |
   | +1 :green_heart: |  mvnsite  |   3m  3s |  |  the patch passed  |
   | +1 :green_heart: |  javadoc  |   2m  8s |  |  the patch passed with JDK Ubuntu-11.0.14+9-Ubuntu-0ubuntu2.20.04  |
   | +1 :green_heart: |  javadoc  |   2m 49s |  |  the patch passed with JDK Private Build-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07  |
   | +1 :green_heart: |  spotbugs  |   7m 55s |  |  the patch passed  |
   | +1 :green_heart: |  shadedclient  |  24m 32s |  |  patch has no errors when building and testing our client artifacts.  |
   |||| _ Other Tests _ |
   | +1 :green_heart: |  unit  |   2m 18s |  |  hadoop-hdfs-client in the patch passed.  |
   | -1 :x: |  unit  | 405m 51s | [/patch-unit-hadoop-hdfs-project_hadoop-hdfs.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-4107/2/artifact/out/patch-unit-hadoop-hdfs-project_hadoop-hdfs.txt) |  hadoop-hdfs in the patch passed.  |
   | -1 :x: |  unit  |  

[jira] [Work logged] (HDFS-16521) DFS API to retrieve slow datanodes

2022-03-25 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-16521?focusedWorklogId=748124=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-748124
 ]

ASF GitHub Bot logged work on HDFS-16521:
-

Author: ASF GitHub Bot
Created on: 26/Mar/22 01:42
Start Date: 26/Mar/22 01:42
Worklog Time Spent: 10m 
  Work Description: hadoop-yetus commented on pull request #4107:
URL: https://github.com/apache/hadoop/pull/4107#issuecomment-1079556487


   :broken_heart: **-1 overall**
   
   
   
   
   
   
   | Vote | Subsystem | Runtime |  Logfile | Comment |
   |:----:|----------:|--------:|:--------:|:-------:|
   | +0 :ok: |  reexec  |   1m  4s |  |  Docker mode activated.  |
   |||| _ Prechecks _ |
   | +1 :green_heart: |  dupname  |   0m  1s |  |  No case conflicting files found.  |
   | +0 :ok: |  codespell  |   0m  0s |  |  codespell was not available.  |
   | +0 :ok: |  buf  |   0m  0s |  |  buf was not available.  |
   | +1 :green_heart: |  @author  |   0m  0s |  |  The patch does not contain any @author tags.  |
   | +1 :green_heart: |  test4tests  |   0m  0s |  |  The patch appears to include 2 new or modified test files.  |
   |||| _ trunk Compile Tests _ |
   | +0 :ok: |  mvndep  |  12m 24s |  |  Maven dependency ordering for branch  |
   | +1 :green_heart: |  mvninstall  |  26m 27s |  |  trunk passed  |
   | +1 :green_heart: |  compile  |   6m 32s |  |  trunk passed with JDK Ubuntu-11.0.14+9-Ubuntu-0ubuntu2.20.04  |
   | +1 :green_heart: |  compile  |   6m  9s |  |  trunk passed with JDK Private Build-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07  |
   | +1 :green_heart: |  checkstyle  |   1m 17s |  |  trunk passed  |
   | +1 :green_heart: |  mvnsite  |   3m  7s |  |  trunk passed  |
   | +1 :green_heart: |  javadoc  |   2m 23s |  |  trunk passed with JDK Ubuntu-11.0.14+9-Ubuntu-0ubuntu2.20.04  |
   | +1 :green_heart: |  javadoc  |   3m  2s |  |  trunk passed with JDK Private Build-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07  |
   | +1 :green_heart: |  spotbugs  |   7m 15s |  |  trunk passed  |
   | +1 :green_heart: |  shadedclient  |  24m 11s |  |  branch has no errors when building and testing our client artifacts.  |
   |||| _ Patch Compile Tests _ |
   | +0 :ok: |  mvndep  |   0m 23s |  |  Maven dependency ordering for patch  |
   | +1 :green_heart: |  mvninstall  |   2m 40s |  |  the patch passed  |
   | +1 :green_heart: |  compile  |   6m 31s |  |  the patch passed with JDK Ubuntu-11.0.14+9-Ubuntu-0ubuntu2.20.04  |
   | +1 :green_heart: |  cc  |   6m 31s |  |  the patch passed  |
   | -1 :x: |  javac  |   6m 31s | [/results-compile-javac-hadoop-hdfs-project-jdkUbuntu-11.0.14+9-Ubuntu-0ubuntu2.20.04.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-4107/1/artifact/out/results-compile-javac-hadoop-hdfs-project-jdkUbuntu-11.0.14+9-Ubuntu-0ubuntu2.20.04.txt) |  hadoop-hdfs-project-jdkUbuntu-11.0.14+9-Ubuntu-0ubuntu2.20.04 with JDK Ubuntu-11.0.14+9-Ubuntu-0ubuntu2.20.04 generated 1 new + 651 unchanged - 0 fixed = 652 total (was 651)  |
   | +1 :green_heart: |  compile  |   6m  1s |  |  the patch passed with JDK Private Build-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07  |
   | +1 :green_heart: |  cc  |   6m  1s |  |  the patch passed  |
   | -1 :x: |  javac  |   6m  1s | [/results-compile-javac-hadoop-hdfs-project-jdkPrivateBuild-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-4107/1/artifact/out/results-compile-javac-hadoop-hdfs-project-jdkPrivateBuild-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07.txt) |  hadoop-hdfs-project-jdkPrivateBuild-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07 with JDK Private Build-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07 generated 1 new + 629 unchanged - 0 fixed = 630 total (was 629)  |
   | +1 :green_heart: |  blanks  |   0m  0s |  |  The patch has no blanks issues.  |
   | -0 :warning: |  checkstyle  |   1m 13s | [/results-checkstyle-hadoop-hdfs-project.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-4107/1/artifact/out/results-checkstyle-hadoop-hdfs-project.txt) |  hadoop-hdfs-project: The patch generated 1 new + 456 unchanged - 1 fixed = 457 total (was 457)  |
   | +1 :green_heart: |  mvnsite  |   2m 49s |  |  the patch passed  |
   | +1 :green_heart: |  javadoc  |   2m  3s |  |  the patch passed with JDK Ubuntu-11.0.14+9-Ubuntu-0ubuntu2.20.04  |
   | +1 :green_heart: |  javadoc  |   2m 49s |  |  the patch passed with JDK Private Build-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07  |
   | +1 :green_heart: |  spotbugs  |   7m 35s |  |  the patch passed  |
   | +1 :green_heart: |  shadedclient  |  23m 41s |  |  patch has no errors when building and testing our client artifacts.  |
   |||| _ Other Tests _ |
   | -1 :x: |  unit  |   2m 17s | 

[jira] [Work logged] (HDFS-16521) DFS API to retrieve slow datanodes

2022-03-25 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-16521?focusedWorklogId=747939=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-747939
 ]

ASF GitHub Bot logged work on HDFS-16521:
-

Author: ASF GitHub Bot
Created on: 25/Mar/22 19:29
Start Date: 25/Mar/22 19:29
Worklog Time Spent: 10m 
  Work Description: virajjasani commented on pull request #4107:
URL: https://github.com/apache/hadoop/pull/4107#issuecomment-1079369473


   @ayushtkn I agree that a JMX metric for slow nodes is already available, 
but not every downstreamer has direct access to it. For instance, in 
K8S-managed clusters an HDFS downstreamer cannot reach JMX metrics unless port 
forwarding is enabled, which is uncommon in prod. The `DFS.getDataNodeStats()` 
API is a similar case: it provides live/decomm/dead node info that JMX metrics 
already expose, yet downstream and deployment management applications prefer 
the DFS API over JMX for the same reasons.
   
   Moreover, this is not only about downstreamers using the API. We should also 
provide a `dfsadmin -report` option that reports slow node info to operators, 
which only an API can offer.
   The JMX metric exposes only slowNode and reportingNodes for each unique slow 
peer detection. It does not expose other important data, e.g. how many blocks 
are currently available or what the DFS usage is, and it does not need to. 
Providing as much concrete info as possible about each slow node would instead 
be the API's responsibility.
   With the API, we also don't need to keep tuning 
`dfs.datanode.max.nodes.to.report` to adjust how many of the top N slow nodes 
get exposed (which is a reasonable limit for JMX metrics, for sure).
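
The difference between a capped report and an uncapped listing can be sketched as follows. This is a simplified illustration, not Hadoop's implementation: the class `SlowNodeReportSketch`, its method names, and the in-memory map of slow node to reporting nodes are all hypothetical, and it assumes the capped report keeps the nodes with the most reporters.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;

// Hypothetical model: slow node name -> set of nodes that reported it as slow.
class SlowNodeReportSketch {
    /** Every slow node in name order, as an uncapped API call could return. */
    static List<String> allSlowNodes(Map<String, Set<String>> reports) {
        return new ArrayList<>(new TreeMap<>(reports).keySet());
    }

    /** Only the n slow nodes with the most reporters, as a capped report would. */
    static List<String> topSlowNodes(Map<String, Set<String>> reports, int n) {
        List<String> nodes = allSlowNodes(reports);
        // List.sort is stable, so nodes with equal reporter counts stay in name order.
        nodes.sort(Comparator
            .comparingInt((String node) -> reports.get(node).size())
            .reversed());
        return nodes.subList(0, Math.max(0, Math.min(n, nodes.size())));
    }

    public static void main(String[] args) {
        Map<String, Set<String>> reports = Map.of(
            "dn1", Set.of("dn2", "dn3"),
            "dn2", Set.of("dn1"),
            "dn3", Set.of("dn1", "dn2", "dn4"));
        System.out.println("capped at 2: " + topSlowNodes(reports, 2));
        System.out.println("uncapped:    " + allSlowNodes(reports));
    }
}
```

With a cap of 2, `dn2` never appears in the output even though it was reported as slow, which is the tuning concern raised above.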




Issue Time Tracking
---

Worklog Id: (was: 747939)
Time Spent: 20m  (was: 10m)

> DFS API to retrieve slow datanodes
> --
>
> Key: HDFS-16521
> URL: https://issues.apache.org/jira/browse/HDFS-16521
> Project: Hadoop HDFS
>  Issue Type: New Feature
>Reporter: Viraj Jasani
>Assignee: Viraj Jasani
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 20m
>  Remaining Estimate: 0h






[jira] [Work logged] (HDFS-16521) DFS API to retrieve slow datanodes

2022-03-25 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-16521?focusedWorklogId=747862=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-747862
 ]

ASF GitHub Bot logged work on HDFS-16521:
-

Author: ASF GitHub Bot
Created on: 25/Mar/22 16:34
Start Date: 25/Mar/22 16:34
Worklog Time Spent: 10m 
  Work Description: virajjasani opened a new pull request #4107:
URL: https://github.com/apache/hadoop/pull/4107


   ### Description of PR
   To build automation around slow datanodes that regularly show up in the 
slow peer tracking report (e.g. decommission such nodes, queue them up for 
external processing, and add them back to the cluster after fixing the 
issues), we should expose a DFS API to retrieve all slow nodes at a given 
time.
   
   Such an API would also make it possible to add an option to "dfsadmin 
-report" that lists slow datanode info for operators, a filter that is 
especially useful on larger clusters.
   
   ### How was this patch tested?
   Dev cluster:
   https://user-images.githubusercontent.com/34790606/160162198-f7629df2-81a2-48f8-97bd-14d16855d03b.png
   
   https://user-images.githubusercontent.com/34790606/160162218-b76f62eb-0f32-4d92-bc58-c2cbbe2c4848.png
   
   
   ### For code changes:
   
   - [X] Does the title of this PR start with the corresponding JIRA issue id 
(e.g. 'HADOOP-17799. Your PR title ...')?
   




Issue Time Tracking
---

Worklog Id: (was: 747862)
Remaining Estimate: 0h
Time Spent: 10m

> DFS API to retrieve slow datanodes
> --
>
> Key: HDFS-16521
> URL: https://issues.apache.org/jira/browse/HDFS-16521
> Project: Hadoop HDFS
>  Issue Type: New Feature
>Reporter: Viraj Jasani
>Assignee: Viraj Jasani
>Priority: Major
>  Time Spent: 10m
>  Remaining Estimate: 0h


