[jira] [Work logged] (HDFS-16456) EC: Decommission a rack with only one dn will fail when the rack number is equal to replication

2022-05-20 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-16456?focusedWorklogId=773102&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-773102
 ]

ASF GitHub Bot logged work on HDFS-16456:
-

Author: ASF GitHub Bot
Created on: 21/May/22 06:01
Start Date: 21/May/22 06:01
Worklog Time Spent: 10m 
  Work Description: hadoop-yetus commented on PR #4304:
URL: https://github.com/apache/hadoop/pull/4304#issuecomment-1133542869

   :broken_heart: **-1 overall**
   
   
   
   
   
   
   | Vote | Subsystem | Runtime |  Logfile | Comment |
   |:----:|----------:|:--------|:--------:|:-------:|
   | +0 :ok: |  reexec  |  10m 53s |  |  Docker mode activated.  |
    _ Prechecks _ |
   | +1 :green_heart: |  dupname  |   0m  0s |  |  No case conflicting files 
found.  |
   | +0 :ok: |  codespell  |   0m  0s |  |  codespell was not available.  |
   | +1 :green_heart: |  @author  |   0m  1s |  |  The patch does not contain 
any @author tags.  |
   | +1 :green_heart: |  test4tests  |   0m  0s |  |  The patch appears to 
include 2 new or modified test files.  |
    _ branch-3.3 Compile Tests _ |
   | +0 :ok: |  mvndep  |  14m 14s |  |  Maven dependency ordering for branch  |
   | +1 :green_heart: |  mvninstall  |  23m 42s |  |  branch-3.3 passed  |
   | +1 :green_heart: |  compile  |  18m  3s |  |  branch-3.3 passed  |
   | +1 :green_heart: |  checkstyle  |   3m 18s |  |  branch-3.3 passed  |
   | +1 :green_heart: |  mvnsite  |   4m 14s |  |  branch-3.3 passed  |
   | +1 :green_heart: |  javadoc  |   4m 38s |  |  branch-3.3 passed  |
   | +1 :green_heart: |  spotbugs  |   7m  0s |  |  branch-3.3 passed  |
   | +1 :green_heart: |  shadedclient  |  28m  7s |  |  branch has no errors 
when building and testing our client artifacts.  |
    _ Patch Compile Tests _ |
   | +0 :ok: |  mvndep  |   0m 57s |  |  Maven dependency ordering for patch  |
   | +1 :green_heart: |  mvninstall  |   2m 28s |  |  the patch passed  |
   | +1 :green_heart: |  compile  |  17m 17s |  |  the patch passed  |
   | +1 :green_heart: |  javac  |  17m 17s |  |  the patch passed  |
   | +1 :green_heart: |  blanks  |   0m  0s |  |  The patch has no blanks 
issues.  |
   | +1 :green_heart: |  checkstyle  |   3m 13s |  |  the patch passed  |
   | +1 :green_heart: |  mvnsite  |   4m 12s |  |  the patch passed  |
   | +1 :green_heart: |  javadoc  |   4m 23s |  |  the patch passed  |
   | +1 :green_heart: |  spotbugs  |   6m 58s |  |  the patch passed  |
   | +1 :green_heart: |  shadedclient  |  28m 24s |  |  patch has no errors 
when building and testing our client artifacts.  |
    _ Other Tests _ |
   | +1 :green_heart: |  unit  |  18m  3s |  |  hadoop-common in the patch 
passed.  |
   | -1 :x: |  unit  | 239m 44s | 
[/patch-unit-hadoop-hdfs-project_hadoop-hdfs.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-4304/2/artifact/out/patch-unit-hadoop-hdfs-project_hadoop-hdfs.txt)
 |  hadoop-hdfs in the patch passed.  |
   | +1 :green_heart: |  asflicense  |   1m 51s |  |  The patch does not 
generate ASF License warnings.  |
   |  |   | 440m 42s |  |  |
   
   
   | Reason | Tests |
   |---:|:--|
   | Failed junit tests | 
hadoop.hdfs.server.balancer.TestBalancerWithHANameNodes |
   
   
   | Subsystem | Report/Notes |
   |--:|:-|
   | Docker | ClientAPI=1.41 ServerAPI=1.41 base: 
https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-4304/2/artifact/out/Dockerfile
 |
   | GITHUB PR | https://github.com/apache/hadoop/pull/4304 |
   | Optional Tests | dupname asflicense compile javac javadoc mvninstall 
mvnsite unit shadedclient spotbugs checkstyle codespell |
   | uname | Linux 50c5cd921baa 4.15.0-65-generic #74-Ubuntu SMP Tue Sep 17 
17:06:04 UTC 2019 x86_64 x86_64 x86_64 GNU/Linux |
   | Build tool | maven |
   | Personality | dev-support/bin/hadoop.sh |
   | git revision | branch-3.3 / 509b0f013f0797c99026e276966cf678c0fe576d |
   | Default Java | Private Build-1.8.0_312-8u312-b07-0ubuntu1~18.04-b07 |
   |  Test Results | 
https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-4304/2/testReport/ |
   | Max. process+thread count | 2650 (vs. ulimit of 5500) |
   | modules | C: hadoop-common-project/hadoop-common 
hadoop-hdfs-project/hadoop-hdfs U: . |
   | Console output | 
https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-4304/2/console |
   | versions | git=2.17.1 maven=3.6.0 spotbugs=4.2.2 |
   | Powered by | Apache Yetus 0.14.0-SNAPSHOT https://yetus.apache.org |
   
   
   This message was automatically generated.
   
   




Issue Time Tracking
---

Worklog Id: (was: 773102)
Time Spent: 2h 40m  (was: 2.5h)

> EC: Decommission a rack with only one dn will fail when the rack number is 
> equal to replication
> 
>
> Key: HDFS-16456
> 

[jira] [Work logged] (HDFS-16586) Purge FsDatasetAsyncDiskService threadgroup; it causes BPServiceActor$CommandProcessingThread IllegalThreadStateException 'fatal exception and exit'

2022-05-20 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-16586?focusedWorklogId=773097&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-773097
 ]

ASF GitHub Bot logged work on HDFS-16586:
-

Author: ASF GitHub Bot
Created on: 21/May/22 04:22
Start Date: 21/May/22 04:22
Worklog Time Spent: 10m 
  Work Description: hadoop-yetus commented on PR #4338:
URL: https://github.com/apache/hadoop/pull/4338#issuecomment-1133528157

   :broken_heart: **-1 overall**
   
   
   
   
   
   
   | Vote | Subsystem | Runtime |  Logfile | Comment |
   |:----:|----------:|:--------|:--------:|:-------:|
   | +0 :ok: |  reexec  |   0m 40s |  |  Docker mode activated.  |
    _ Prechecks _ |
   | +1 :green_heart: |  dupname  |   0m  0s |  |  No case conflicting files 
found.  |
   | +0 :ok: |  codespell  |   0m  0s |  |  codespell was not available.  |
   | +1 :green_heart: |  @author  |   0m  0s |  |  The patch does not contain 
any @author tags.  |
   | -1 :x: |  test4tests  |   0m  0s |  |  The patch doesn't appear to include 
any new or modified tests. Please justify why no new tests are needed for this 
patch. Also please list what manual steps were performed to verify this patch.  
|
    _ trunk Compile Tests _ |
   | +1 :green_heart: |  mvninstall  |  37m 23s |  |  trunk passed  |
   | +1 :green_heart: |  compile  |   1m 42s |  |  trunk passed with JDK 
Private Build-11.0.15+10-Ubuntu-0ubuntu0.20.04.1  |
   | +1 :green_heart: |  compile  |   1m 40s |  |  trunk passed with JDK 
Private Build-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07  |
   | +1 :green_heart: |  checkstyle  |   1m 22s |  |  trunk passed  |
   | +1 :green_heart: |  mvnsite  |   1m 45s |  |  trunk passed  |
   | +1 :green_heart: |  javadoc  |   1m 23s |  |  trunk passed with JDK 
Private Build-11.0.15+10-Ubuntu-0ubuntu0.20.04.1  |
   | +1 :green_heart: |  javadoc  |   1m 46s |  |  trunk passed with JDK 
Private Build-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07  |
   | +1 :green_heart: |  spotbugs  |   3m 41s |  |  trunk passed  |
   | +1 :green_heart: |  shadedclient  |  22m 53s |  |  branch has no errors 
when building and testing our client artifacts.  |
    _ Patch Compile Tests _ |
   | +1 :green_heart: |  mvninstall  |   1m 23s |  |  the patch passed  |
   | +1 :green_heart: |  compile  |   1m 24s |  |  the patch passed with JDK 
Private Build-11.0.15+10-Ubuntu-0ubuntu0.20.04.1  |
   | +1 :green_heart: |  javac  |   1m 24s |  |  the patch passed  |
   | +1 :green_heart: |  compile  |   1m 20s |  |  the patch passed with JDK 
Private Build-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07  |
   | +1 :green_heart: |  javac  |   1m 20s |  |  the patch passed  |
   | +1 :green_heart: |  blanks  |   0m  0s |  |  The patch has no blanks 
issues.  |
   | +1 :green_heart: |  checkstyle  |   1m  1s |  |  the patch passed  |
   | +1 :green_heart: |  mvnsite  |   1m 26s |  |  the patch passed  |
   | +1 :green_heart: |  javadoc  |   0m 58s |  |  the patch passed with JDK 
Private Build-11.0.15+10-Ubuntu-0ubuntu0.20.04.1  |
   | +1 :green_heart: |  javadoc  |   1m 33s |  |  the patch passed with JDK 
Private Build-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07  |
   | +1 :green_heart: |  spotbugs  |   3m 18s |  |  the patch passed  |
   | +1 :green_heart: |  shadedclient  |  22m 18s |  |  patch has no errors 
when building and testing our client artifacts.  |
    _ Other Tests _ |
   | -1 :x: |  unit  | 242m 57s | 
[/patch-unit-hadoop-hdfs-project_hadoop-hdfs.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-4338/1/artifact/out/patch-unit-hadoop-hdfs-project_hadoop-hdfs.txt)
 |  hadoop-hdfs in the patch passed.  |
   | +1 :green_heart: |  asflicense  |   1m 16s |  |  The patch does not 
generate ASF License warnings.  |
   |  |   | 351m 37s |  |  |
   
   
   | Reason | Tests |
   |---:|:--|
   | Failed junit tests | hadoop.hdfs.TestClientProtocolForPipelineRecovery |
   
   
   | Subsystem | Report/Notes |
   |--:|:-|
   | Docker | ClientAPI=1.41 ServerAPI=1.41 base: 
https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-4338/1/artifact/out/Dockerfile
 |
   | GITHUB PR | https://github.com/apache/hadoop/pull/4338 |
   | Optional Tests | dupname asflicense compile javac javadoc mvninstall 
mvnsite unit shadedclient spotbugs checkstyle codespell |
   | uname | Linux 1515b3777a23 4.15.0-112-generic #113-Ubuntu SMP Thu Jul 9 
23:41:39 UTC 2020 x86_64 x86_64 x86_64 GNU/Linux |
   | Build tool | maven |
   | Personality | dev-support/bin/hadoop.sh |
   | git revision | trunk / fd5ba66a9e6342c61e969e3f9db2ff570fe202a5 |
   | Default Java | Private Build-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07 |
   | Multi-JDK versions | /usr/lib/jvm/java-11-openjdk-amd64:Private 
Build-11.0.15+10-Ubuntu-0ubuntu0.20.04.1 
/usr/lib/jvm/java-8-openjdk-amd64:Private 
Build-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07 |
   |  Test Results | 
https://ci-h

[jira] [Work logged] (HDFS-16557) BootstrapStandby failed because of checking gap for inprogress EditLogInputStream

2022-05-20 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-16557?focusedWorklogId=773091&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-773091
 ]

ASF GitHub Bot logged work on HDFS-16557:
-

Author: ASF GitHub Bot
Created on: 21/May/22 02:23
Start Date: 21/May/22 02:23
Worklog Time Spent: 10m 
  Work Description: tomscut commented on PR #4219:
URL: https://github.com/apache/hadoop/pull/4219#issuecomment-1133511045

   Hi @xkrogen, please take a look if you have enough bandwidth. Thanks a lot.




Issue Time Tracking
---

Worklog Id: (was: 773091)
Time Spent: 1h 50m  (was: 1h 40m)

> BootstrapStandby failed because of checking gap for inprogress 
> EditLogInputStream
> -
>
> Key: HDFS-16557
> URL: https://issues.apache.org/jira/browse/HDFS-16557
> Project: Hadoop HDFS
>  Issue Type: Bug
>Reporter: Tao Li
>Assignee: Tao Li
>Priority: Major
>  Labels: pull-request-available
> Attachments: image-2022-04-22-17-17-14-577.png, 
> image-2022-04-22-17-17-14-618.png, image-2022-04-22-17-17-23-113.png, 
> image-2022-04-22-17-17-32-487.png
>
>  Time Spent: 1h 50m
>  Remaining Estimate: 0h
>
> The lastTxId of an in-progress EditLogInputStream isn't necessarily 
> HdfsServerConstants.INVALID_TXID. We can determine its status directly via 
> EditLogInputStream#isInProgress.
> We enabled [SBN READ] and set {{dfs.ha.tail-edits.in-progress=true}}. When 
> running bootstrapStandby, the in-progress EditLogInputStream is then 
> misjudged, causing the gap check to fail and bootstrapStandby to abort.
> hdfs namenode -bootstrapStandby
> !image-2022-04-22-17-17-32-487.png|width=766,height=161!
> !image-2022-04-22-17-17-14-577.png|width=598,height=187!
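As a hedged illustration of the fix idea described in this report (not the actual Hadoop patch), the gap check could consult EditLogInputStream#isInProgress() instead of comparing lastTxId against INVALID_TXID. The class and method names below (GapCheckSketch, checkForGaps) are assumptions made for the example.

{code:java}
import java.io.IOException;
import java.util.List;

import org.apache.hadoop.hdfs.server.namenode.EditLogInputStream;

/** Illustrative sketch only; the class and method names are assumptions. */
final class GapCheckSketch {
  static void checkForGaps(List<EditLogInputStream> streams, long fromTxId)
      throws IOException {
    long expected = fromTxId;
    for (EditLogInputStream elis : streams) {
      // An in-progress stream may report a real lastTxId rather than
      // HdfsServerConstants.INVALID_TXID, so consult isInProgress() directly
      // instead of inferring the state from lastTxId.
      if (elis.isInProgress()) {
        continue; // do not enforce the gap check against an open segment
      }
      if (elis.getFirstTxId() != expected) {
        throw new IOException("Gap in edit logs: expected txid " + expected
            + " but the next finalized segment starts at " + elis.getFirstTxId());
      }
      expected = elis.getLastTxId() + 1;
    }
  }
}
{code}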



--
This message was sent by Atlassian Jira
(v8.20.7#820007)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Resolved] (HDFS-16582) Expose aggregate latency of slow node as perceived by the reporting node

2022-05-20 Thread Tao Li (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-16582?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tao Li resolved HDFS-16582.
---
Fix Version/s: 3.4.0
   3.3.4
   Resolution: Fixed

> Expose aggregate latency of slow node as perceived by the reporting node
> 
>
> Key: HDFS-16582
> URL: https://issues.apache.org/jira/browse/HDFS-16582
> Project: Hadoop HDFS
>  Issue Type: New Feature
>Reporter: Viraj Jasani
>Assignee: Viraj Jasani
>Priority: Major
>  Labels: pull-request-available
> Fix For: 3.4.0, 3.3.4
>
>  Time Spent: 1.5h
>  Remaining Estimate: 0h
>
> When any datanode is reported to be slower by another node, we expose the 
> slow node as well as the reporting nodes list for the slow node. However, we 
> don't provide latency numbers of the slow node as reported by the reporting 
> node. Having the latency exposed in the metrics would be really helpful for 
> operators to keep track of how far behind a given slow node is performing 
> compared to the rest of the nodes in the cluster.
> The operator should be able to gather aggregated latencies of all slow nodes 
> with their reporting nodes in Namenode metrics.



--
This message was sent by Atlassian Jira
(v8.20.7#820007)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-16582) Expose aggregate latency of slow node as perceived by the reporting node

2022-05-20 Thread Tao Li (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-16582?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17540338#comment-17540338
 ] 

Tao Li commented on HDFS-16582:
---

Cherry-picked to branch-3.3.

> Expose aggregate latency of slow node as perceived by the reporting node
> 
>
> Key: HDFS-16582
> URL: https://issues.apache.org/jira/browse/HDFS-16582
> Project: Hadoop HDFS
>  Issue Type: New Feature
>Reporter: Viraj Jasani
>Assignee: Viraj Jasani
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 1.5h
>  Remaining Estimate: 0h
>
> When any datanode is reported to be slower by another node, we expose the 
> slow node as well as the reporting nodes list for the slow node. However, we 
> don't provide latency numbers of the slow node as reported by the reporting 
> node. Having the latency exposed in the metrics would be really helpful for 
> operators to keep track of how far behind a given slow node is performing 
> compared to the rest of the nodes in the cluster.
> The operator should be able to gather aggregated latencies of all slow nodes 
> with their reporting nodes in Namenode metrics.



--
This message was sent by Atlassian Jira
(v8.20.7#820007)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Work logged] (HDFS-16582) Expose aggregate latency of slow node as perceived by the reporting node

2022-05-20 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-16582?focusedWorklogId=773080&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-773080
 ]

ASF GitHub Bot logged work on HDFS-16582:
-

Author: ASF GitHub Bot
Created on: 21/May/22 01:45
Start Date: 21/May/22 01:45
Worklog Time Spent: 10m 
  Work Description: tomscut commented on PR #4323:
URL: https://github.com/apache/hadoop/pull/4323#issuecomment-1133497676

   Thanks @virajjasani for the contribution!  Thanks @jojochuang for the review!
   
   




Issue Time Tracking
---

Worklog Id: (was: 773080)
Time Spent: 1.5h  (was: 1h 20m)

> Expose aggregate latency of slow node as perceived by the reporting node
> 
>
> Key: HDFS-16582
> URL: https://issues.apache.org/jira/browse/HDFS-16582
> Project: Hadoop HDFS
>  Issue Type: New Feature
>Reporter: Viraj Jasani
>Assignee: Viraj Jasani
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 1.5h
>  Remaining Estimate: 0h
>
> When any datanode is reported to be slower by another node, we expose the 
> slow node as well as the reporting nodes list for the slow node. However, we 
> don't provide latency numbers of the slow node as reported by the reporting 
> node. Having the latency exposed in the metrics would be really helpful for 
> operators to keep track of how far behind a given slow node is performing 
> compared to the rest of the nodes in the cluster.
> The operator should be able to gather aggregated latencies of all slow nodes 
> with their reporting nodes in Namenode metrics.



--
This message was sent by Atlassian Jira
(v8.20.7#820007)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Work logged] (HDFS-16582) Expose aggregate latency of slow node as perceived by the reporting node

2022-05-20 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-16582?focusedWorklogId=773078&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-773078
 ]

ASF GitHub Bot logged work on HDFS-16582:
-

Author: ASF GitHub Bot
Created on: 21/May/22 01:29
Start Date: 21/May/22 01:29
Worklog Time Spent: 10m 
  Work Description: tomscut merged PR #4323:
URL: https://github.com/apache/hadoop/pull/4323




Issue Time Tracking
---

Worklog Id: (was: 773078)
Time Spent: 1h 20m  (was: 1h 10m)

> Expose aggregate latency of slow node as perceived by the reporting node
> 
>
> Key: HDFS-16582
> URL: https://issues.apache.org/jira/browse/HDFS-16582
> Project: Hadoop HDFS
>  Issue Type: New Feature
>Reporter: Viraj Jasani
>Assignee: Viraj Jasani
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 1h 20m
>  Remaining Estimate: 0h
>
> When any datanode is reported to be slower by another node, we expose the 
> slow node as well as the reporting nodes list for the slow node. However, we 
> don't provide latency numbers of the slow node as reported by the reporting 
> node. Having the latency exposed in the metrics would be really helpful for 
> operators to keep track of how far behind a given slow node is performing 
> compared to the rest of the nodes in the cluster.
> The operator should be able to gather aggregated latencies of all slow nodes 
> with their reporting nodes in Namenode metrics.



--
This message was sent by Atlassian Jira
(v8.20.7#820007)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Work logged] (HDFS-16585) Add @VisibleForTesting in Dispatcher.java after HDFS-16268

2022-05-20 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-16585?focusedWorklogId=773070&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-773070
 ]

ASF GitHub Bot logged work on HDFS-16585:
-

Author: ASF GitHub Bot
Created on: 21/May/22 00:32
Start Date: 21/May/22 00:32
Worklog Time Spent: 10m 
  Work Description: hadoop-yetus commented on PR #4337:
URL: https://github.com/apache/hadoop/pull/4337#issuecomment-1133482241

   :broken_heart: **-1 overall**
   
   
   
   
   
   
   | Vote | Subsystem | Runtime |  Logfile | Comment |
   |:----:|----------:|:--------|:--------:|:-------:|
   | +0 :ok: |  reexec  |   0m 42s |  |  Docker mode activated.  |
    _ Prechecks _ |
   | +1 :green_heart: |  dupname  |   0m  0s |  |  No case conflicting files 
found.  |
   | +0 :ok: |  codespell  |   0m  1s |  |  codespell was not available.  |
   | +1 :green_heart: |  @author  |   0m  0s |  |  The patch does not contain 
any @author tags.  |
   | -1 :x: |  test4tests  |   0m  0s |  |  The patch doesn't appear to include 
any new or modified tests. Please justify why no new tests are needed for this 
patch. Also please list what manual steps were performed to verify this patch.  
|
    _ trunk Compile Tests _ |
   | +1 :green_heart: |  mvninstall  |  37m 19s |  |  trunk passed  |
   | +1 :green_heart: |  compile  |   1m 40s |  |  trunk passed with JDK 
Private Build-11.0.15+10-Ubuntu-0ubuntu0.20.04.1  |
   | +1 :green_heart: |  compile  |   1m 34s |  |  trunk passed with JDK 
Private Build-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07  |
   | +1 :green_heart: |  checkstyle  |   1m 26s |  |  trunk passed  |
   | +1 :green_heart: |  mvnsite  |   1m 45s |  |  trunk passed  |
   | +1 :green_heart: |  javadoc  |   1m 24s |  |  trunk passed with JDK 
Private Build-11.0.15+10-Ubuntu-0ubuntu0.20.04.1  |
   | +1 :green_heart: |  javadoc  |   1m 48s |  |  trunk passed with JDK 
Private Build-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07  |
   | +1 :green_heart: |  spotbugs  |   3m 42s |  |  trunk passed  |
   | +1 :green_heart: |  shadedclient  |  22m 58s |  |  branch has no errors 
when building and testing our client artifacts.  |
    _ Patch Compile Tests _ |
   | +1 :green_heart: |  mvninstall  |   1m 24s |  |  the patch passed  |
   | +1 :green_heart: |  compile  |   1m 26s |  |  the patch passed with JDK 
Private Build-11.0.15+10-Ubuntu-0ubuntu0.20.04.1  |
   | +1 :green_heart: |  javac  |   1m 26s |  |  the patch passed  |
   | +1 :green_heart: |  compile  |   1m 20s |  |  the patch passed with JDK 
Private Build-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07  |
   | +1 :green_heart: |  javac  |   1m 20s |  |  the patch passed  |
   | +1 :green_heart: |  blanks  |   0m  0s |  |  The patch has no blanks 
issues.  |
   | +1 :green_heart: |  checkstyle  |   1m  4s |  |  the patch passed  |
   | +1 :green_heart: |  mvnsite  |   1m 26s |  |  the patch passed  |
   | +1 :green_heart: |  javadoc  |   0m 57s |  |  the patch passed with JDK 
Private Build-11.0.15+10-Ubuntu-0ubuntu0.20.04.1  |
   | +1 :green_heart: |  javadoc  |   1m 32s |  |  the patch passed with JDK 
Private Build-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07  |
   | +1 :green_heart: |  spotbugs  |   3m 18s |  |  the patch passed  |
   | +1 :green_heart: |  shadedclient  |  22m 23s |  |  patch has no errors 
when building and testing our client artifacts.  |
    _ Other Tests _ |
   | +1 :green_heart: |  unit  | 247m  2s |  |  hadoop-hdfs in the patch 
passed.  |
   | +1 :green_heart: |  asflicense  |   1m 15s |  |  The patch does not 
generate ASF License warnings.  |
   |  |   | 355m 50s |  |  |
   
   
   | Subsystem | Report/Notes |
   |--:|:-|
   | Docker | ClientAPI=1.41 ServerAPI=1.41 base: 
https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-4337/1/artifact/out/Dockerfile
 |
   | GITHUB PR | https://github.com/apache/hadoop/pull/4337 |
   | Optional Tests | dupname asflicense compile javac javadoc mvninstall 
mvnsite unit shadedclient spotbugs checkstyle codespell |
   | uname | Linux 7719c07d5ce8 4.15.0-58-generic #64-Ubuntu SMP Tue Aug 6 
11:12:41 UTC 2019 x86_64 x86_64 x86_64 GNU/Linux |
   | Build tool | maven |
   | Personality | dev-support/bin/hadoop.sh |
   | git revision | trunk / 153756847d8b013d00d15acddcb9b7af15d4e7df |
   | Default Java | Private Build-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07 |
   | Multi-JDK versions | /usr/lib/jvm/java-11-openjdk-amd64:Private 
Build-11.0.15+10-Ubuntu-0ubuntu0.20.04.1 
/usr/lib/jvm/java-8-openjdk-amd64:Private 
Build-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07 |
   |  Test Results | 
https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-4337/1/testReport/ |
   | Max. process+thread count | 3460 (vs. ulimit of 5500) |
   | modules | C: hadoop-hdfs-project/hadoop-hdfs U: 
hadoop-hdfs-project/hadoop-hdfs |
   | Console output | 
https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-

[jira] [Work logged] (HDFS-16582) Expose aggregate latency of slow node as perceived by the reporting node

2022-05-20 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-16582?focusedWorklogId=773069&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-773069
 ]

ASF GitHub Bot logged work on HDFS-16582:
-

Author: ASF GitHub Bot
Created on: 21/May/22 00:31
Start Date: 21/May/22 00:31
Worklog Time Spent: 10m 
  Work Description: tomscut commented on PR #4323:
URL: https://github.com/apache/hadoop/pull/4323#issuecomment-1133482070

   > @tomscut would you like to give it a review too?
   
   Thanks @jojochuang for pinging me, I'm looking at this.




Issue Time Tracking
---

Worklog Id: (was: 773069)
Time Spent: 1h 10m  (was: 1h)

> Expose aggregate latency of slow node as perceived by the reporting node
> 
>
> Key: HDFS-16582
> URL: https://issues.apache.org/jira/browse/HDFS-16582
> Project: Hadoop HDFS
>  Issue Type: New Feature
>Reporter: Viraj Jasani
>Assignee: Viraj Jasani
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 1h 10m
>  Remaining Estimate: 0h
>
> When any datanode is reported to be slower by another node, we expose the 
> slow node as well as the reporting nodes list for the slow node. However, we 
> don't provide latency numbers of the slow node as reported by the reporting 
> node. Having the latency exposed in the metrics would be really helpful for 
> operators to keep track of how far behind a given slow node is performing 
> compared to the rest of the nodes in the cluster.
> The operator should be able to gather aggregated latencies of all slow nodes 
> with their reporting nodes in Namenode metrics.



--
This message was sent by Atlassian Jira
(v8.20.7#820007)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Work logged] (HDFS-16582) Expose aggregate latency of slow node as perceived by the reporting node

2022-05-20 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-16582?focusedWorklogId=773066&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-773066
 ]

ASF GitHub Bot logged work on HDFS-16582:
-

Author: ASF GitHub Bot
Created on: 21/May/22 00:20
Start Date: 21/May/22 00:20
Worklog Time Spent: 10m 
  Work Description: jojochuang commented on PR #4323:
URL: https://github.com/apache/hadoop/pull/4323#issuecomment-1133479419

   @tomscut would you like to give it a review too?




Issue Time Tracking
---

Worklog Id: (was: 773066)
Time Spent: 1h  (was: 50m)

> Expose aggregate latency of slow node as perceived by the reporting node
> 
>
> Key: HDFS-16582
> URL: https://issues.apache.org/jira/browse/HDFS-16582
> Project: Hadoop HDFS
>  Issue Type: New Feature
>Reporter: Viraj Jasani
>Assignee: Viraj Jasani
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 1h
>  Remaining Estimate: 0h
>
> When any datanode is reported to be slower by another node, we expose the 
> slow node as well as the reporting nodes list for the slow node. However, we 
> don't provide latency numbers of the slow node as reported by the reporting 
> node. Having the latency exposed in the metrics would be really helpful for 
> operators to keep track of how far behind a given slow node is performing 
> compared to the rest of the nodes in the cluster.
> The operator should be able to gather aggregated latencies of all slow nodes 
> with their reporting nodes in Namenode metrics.



--
This message was sent by Atlassian Jira
(v8.20.7#820007)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Work logged] (HDFS-16582) Expose aggregate latency of slow node as perceived by the reporting node

2022-05-20 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-16582?focusedWorklogId=773065&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-773065
 ]

ASF GitHub Bot logged work on HDFS-16582:
-

Author: ASF GitHub Bot
Created on: 21/May/22 00:19
Start Date: 21/May/22 00:19
Worklog Time Spent: 10m 
  Work Description: jojochuang commented on code in PR #4323:
URL: https://github.com/apache/hadoop/pull/4323#discussion_r878603680


##
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/blockmanagement/SlowPeerJsonReport.java:
##
@@ -0,0 +1,84 @@
+/**
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ * 
+ * http://www.apache.org/licenses/LICENSE-2.0
+ * 
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.hadoop.hdfs.server.blockmanagement;
+
+import java.util.SortedSet;
+
+import com.fasterxml.jackson.annotation.JsonProperty;
+
+import org.apache.commons.lang3.builder.EqualsBuilder;
+import org.apache.commons.lang3.builder.HashCodeBuilder;
+import org.apache.hadoop.classification.InterfaceAudience;
+
+/**
+ * This structure is a thin wrapper over slow peer reports to make Json
+ * [de]serialization easy.
+ */
+@InterfaceAudience.Private
+final class SlowPeerJsonReport {

Review Comment:
   this class is adapted from the subclass ReportForJson in SlowPeerTracker.
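As a hedged illustration of the kind of thin Jackson wrapper the review discusses, the sketch below serializes a report-like object via @JsonProperty annotations. The class and field names here are assumptions, not the actual SlowPeerJsonReport members.

{code:java}
import java.util.Arrays;
import java.util.List;

import com.fasterxml.jackson.annotation.JsonProperty;
import com.fasterxml.jackson.databind.ObjectMapper;

/** Illustrative sketch only; not the actual SlowPeerJsonReport class. */
final class SlowPeerReportSketch {
  @JsonProperty("SlowNode")
  private String slowNode;

  @JsonProperty("ReportingNodes")
  private List<String> reportingNodes;

  public static void main(String[] args) throws Exception {
    SlowPeerReportSketch report = new SlowPeerReportSketch();
    report.slowNode = "dn1:9866";
    report.reportingNodes = Arrays.asList("dn2:9866", "dn3:9866");
    // Jackson maps the annotated fields to and from a JSON object.
    System.out.println(new ObjectMapper().writeValueAsString(report));
  }
}
{code}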





Issue Time Tracking
---

Worklog Id: (was: 773065)
Time Spent: 50m  (was: 40m)

> Expose aggregate latency of slow node as perceived by the reporting node
> 
>
> Key: HDFS-16582
> URL: https://issues.apache.org/jira/browse/HDFS-16582
> Project: Hadoop HDFS
>  Issue Type: New Feature
>Reporter: Viraj Jasani
>Assignee: Viraj Jasani
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 50m
>  Remaining Estimate: 0h
>
> When any datanode is reported to be slower by another node, we expose the 
> slow node as well as the reporting nodes list for the slow node. However, we 
> don't provide latency numbers of the slow node as reported by the reporting 
> node. Having the latency exposed in the metrics would be really helpful for 
> operators to keep track of how far behind a given slow node is performing 
> compared to the rest of the nodes in the cluster.
> The operator should be able to gather aggregated latencies of all slow nodes 
> with their reporting nodes in Namenode metrics.



--
This message was sent by Atlassian Jira
(v8.20.7#820007)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-14997) BPServiceActor processes commands from NameNode asynchronously

2022-05-20 Thread Michael Stack (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-14997?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17540321#comment-17540321
 ] 

Michael Stack commented on HDFS-14997:
--

Linking an issue where the change in how we do command processing uncovers an 
old, existing problem. Perhaps it is of interest.

> BPServiceActor processes commands from NameNode asynchronously
> --
>
> Key: HDFS-14997
> URL: https://issues.apache.org/jira/browse/HDFS-14997
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: datanode
>Reporter: Xiaoqiao He
>Assignee: Xiaoqiao He
>Priority: Major
> Fix For: 3.3.0, 3.2.3, 3.2.4
>
> Attachments: HDFS-14997-branch-3.2.001.patch, HDFS-14997.001.patch, 
> HDFS-14997.002.patch, HDFS-14997.003.patch, HDFS-14997.004.patch, 
> HDFS-14997.005.patch, HDFS-14997.addendum.patch, 
> image-2019-12-26-16-15-44-814.png
>
>
> The #BPServiceActor main loop has two core functions: reporting 
> (#sendHeartbeat, #blockReport, #cacheReport) and #processCommand. If 
> processCommand takes a long time, it blocks the reporting flow. 
> processCommand can indeed take a long time (over 1000s in the worst case I 
> have seen) when the IO load of the DataNode is very high. Since some IO 
> operations are performed under #datasetLock, processing some commands (such 
> as #DNA_INVALIDATE) has to wait a long time to acquire #datasetLock. In such 
> cases, the #heartbeat is not sent to the NameNode in time, which triggers 
> other disasters.
> I propose to run #processCommand asynchronously so that it does not block 
> #BPServiceActor from sending heartbeats back to the NameNode under high IO 
> load.
> Notes:
> 1. Lifeline could be one effective solution, but some old branches do not 
> support this feature.
> 2. IO operations under #datasetLock are a separate issue; I think we should 
> address it in another JIRA.
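A minimal sketch of the asynchronous pattern proposed above, under assumed names (AsyncCommandProcessorSketch, enqueue): commands are queued and drained by a dedicated thread, so a slow command no longer delays the heartbeat loop. This is not the actual BPServiceActor implementation.

{code:java}
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;

import org.apache.hadoop.hdfs.server.protocol.DatanodeCommand;

/** Illustrative sketch only; names are assumptions, not the HDFS-14997 patch. */
final class AsyncCommandProcessorSketch implements Runnable {
  private final BlockingQueue<DatanodeCommand> queue = new LinkedBlockingQueue<>();

  /** Called from the heartbeat path: enqueue and return immediately. */
  void enqueue(DatanodeCommand cmd) {
    queue.offer(cmd);
  }

  /** Runs on a dedicated thread so slow commands never delay heartbeats. */
  @Override
  public void run() {
    while (!Thread.currentThread().isInterrupted()) {
      try {
        DatanodeCommand cmd = queue.take();
        process(cmd); // may block on the dataset lock without affecting heartbeats
      } catch (InterruptedException e) {
        Thread.currentThread().interrupt();
      }
    }
  }

  private void process(DatanodeCommand cmd) {
    // placeholder for the actual command handling (e.g. block invalidation)
  }
}
{code}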



--
This message was sent by Atlassian Jira
(v8.20.7#820007)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-16586) Purge FsDatasetAsyncDiskService threadgroup; it causes BPServiceActor$CommandProcessingThread IllegalThreadStateException 'fatal exception and exit'

2022-05-20 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-16586?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated HDFS-16586:
--
Labels: pull-request-available  (was: )

> Purge FsDatasetAsyncDiskService threadgroup; it causes 
> BPServiceActor$CommandProcessingThread IllegalThreadStateException 'fatal 
> exception and exit' 
> -
>
> Key: HDFS-16586
> URL: https://issues.apache.org/jira/browse/HDFS-16586
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: datanode
>Affects Versions: 3.3.0, 3.2.3
>Reporter: Michael Stack
>Assignee: Michael Stack
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> The failed block finalize below is causing a downstream project's test to 
> fail when it uses hadoop 3.2.3 or 3.3.0+:
> {code:java}
> 2022-05-19T18:21:08,243 INFO  [Command processor] 
> impl.FsDatasetAsyncDiskService(234): Scheduling blk_1073741840_1016 replica 
> FinalizedReplica, blk_1073741840_1016, FINALIZED
>   getNumBytes()     = 52
>   getBytesOnDisk()  = 52
>   getVisibleLength()= 52
>   getVolume()       = 
> /Users/stack/checkouts/hbase.apache.git/hbase-server/target/test-data/d544dd1e-b42d-8fae-aa9a-99e3eb52f61c/cluster_e8660d1b-733a-b023-2e91-dc3f951cf189/dfs/data/data2
>   getBlockURI()     = 
> file:/Users/stack/checkouts/hbase.apache.git/hbase-server/target/test-data/d544dd1e-b42d-8fae-aa9a-99e3eb52f61c/cluster_e8660d1b-733a-b023-2e91-dc3f951cf189/dfs/data/data2/current/BP-62743752-127.0.0.1-1653009535881/current/finalized/subdir0/subdir0/blk_1073741840
>  for deletion
> 2022-05-19T18:21:08,243 DEBUG [IPC Server handler 0 on default port 54774] 
> metrics.TopMetrics(134): a metric is reported: cmd: delete user: stack.hfs.0 
> (auth:SIMPLE)
> 2022-05-19T18:21:08,243 DEBUG [IPC Server handler 0 on default port 54774] 
> top.TopAuditLogger(78): --- logged event for top service: 
> allowed=true ugi=stack.hfs.0 (auth:SIMPLE) ip=/127.0.0.1 cmd=delete  
> src=/user/stack/test-data/b8167d53-bcd7-c682-a767-55faaf7f3e96/data/default/t1/4499521075f51d5138fe4f1916daf92d/.tmp
>   dst=null  perm=null
> 2022-05-19T18:21:08,243 DEBUG [PacketResponder: 
> BP-62743752-127.0.0.1-1653009535881:blk_1073741830_1006, 
> type=LAST_IN_PIPELINE] datanode.BlockReceiver$PacketResponder(1645): 
> PacketResponder: BP-62743752-127.0.0.1-1653009535881:blk_1073741830_1006, 
> type=LAST_IN_PIPELINE, replyAck=seqno: 901 reply: SUCCESS 
> downstreamAckTimeNanos: 0 flag: 0
> 2022-05-19T18:21:08,243 DEBUG [PacketResponder: 
> BP-62743752-127.0.0.1-1653009535881:blk_1073741830_1006, 
> type=LAST_IN_PIPELINE] datanode.BlockReceiver$PacketResponder(1327): 
> PacketResponder: BP-62743752-127.0.0.1-1653009535881:blk_1073741830_1006, 
> type=LAST_IN_PIPELINE: seqno=-2 waiting for local datanode to finish write.
> 2022-05-19T18:21:08,243 ERROR [Command processor] 
> datanode.BPServiceActor$CommandProcessingThread(1276): Command processor 
> encountered fatal exception and exit.
> java.lang.IllegalThreadStateException: null
>   at java.lang.ThreadGroup.addUnstarted(ThreadGroup.java:865) ~[?:?]
>   at java.lang.Thread.(Thread.java:430) ~[?:?]
>   at java.lang.Thread.(Thread.java:704) ~[?:?]
>   at java.lang.Thread.(Thread.java:525) ~[?:?]
>   at 
> org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetAsyncDiskService$1.newThread(FsDatasetAsyncDiskService.java:113)
>  ~[hadoop-hdfs-3.2.3.jar:?]
>   at 
> java.util.concurrent.ThreadPoolExecutor$Worker.(ThreadPoolExecutor.java:623)
>  ~[?:?]
>   at 
> java.util.concurrent.ThreadPoolExecutor.addWorker(ThreadPoolExecutor.java:912)
>  ~[?:?]
>   at 
> java.util.concurrent.ThreadPoolExecutor.execute(ThreadPoolExecutor.java:1343) 
> ~[?:?]
>   at 
> org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetAsyncDiskService.execute(FsDatasetAsyncDiskService.java:189)
>  ~[hadoop-hdfs-3.2.3.jar:?]
>   at 
> org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetAsyncDiskService.deleteAsync(FsDatasetAsyncDiskService.java:238)
>  ~[hadoop-hdfs-3.2.3.jar:?]
>   at 
> org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl.invalidate(FsDatasetImpl.java:2184)
>  ~[hadoop-hdfs-3.2.3.jar:?]
>   at 
> org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl.invalidate(FsDatasetImpl.java:2103)
>  ~[hadoop-hdfs-3.2.3.jar:?]
>   at 
> org.apache.hadoop.hdfs.server.datanode.BPOfferService.processCommandFromActive(BPOfferService.java:736)
>  ~[hadoop-hdfs-3.2.3.jar:?]
>   at 
> org.apache.hadoop.hdfs.server.datanode.BPOfferService.processCommandFromActor(BPOfferService.java:682)
>  ~[hadoop-hdfs-3.2.3.jar:?]
>   at 
> org.apache.hadoop.hdfs.server.datanode.BPServiceActor$Com

[jira] [Work logged] (HDFS-16586) Purge FsDatasetAsyncDiskService threadgroup; it causes BPServiceActor$CommandProcessingThread IllegalThreadStateException 'fatal exception and exit'

2022-05-20 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-16586?focusedWorklogId=773038&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-773038
 ]

ASF GitHub Bot logged work on HDFS-16586:
-

Author: ASF GitHub Bot
Created on: 20/May/22 22:29
Start Date: 20/May/22 22:29
Worklog Time Spent: 10m 
  Work Description: saintstack opened a new pull request, #4338:
URL: https://github.com/apache/hadoop/pull/4338

   …ServiceActor IllegalThreadStateException 'fatal exception and exit'
   
   
   
   
   
   ### Description of PR
   
   Remove the ThreadGroup used by the executor thread factories; it is unused,
   and ThreadGroups auto-destroy when their thread-member count drops to zero.
   That behavior is incompatible with the configuration of the per-volume
   executor, which lets all threads die if they are unused within the
   keepalive time.
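
A hedged sketch of the pattern this description refers to: building the per-volume executor's ThreadFactory without a ThreadGroup, so that letting every thread time out does not later trigger IllegalThreadStateException when a new task arrives. The names below are illustrative assumptions, not the exact patch.

{code:java}
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.ThreadFactory;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

/** Illustrative sketch; not the actual FsDatasetAsyncDiskService change. */
final class VolumeExecutorSketch {
  static ThreadPoolExecutor newVolumeExecutor(String volumeName) {
    final AtomicInteger counter = new AtomicInteger();
    // No ThreadGroup here: a group auto-destroys once its last member thread
    // exits, and a destroyed group throws IllegalThreadStateException when the
    // factory later tries to create a new thread after an idle period.
    ThreadFactory factory = r -> {
      Thread t = new Thread(r, "async-disk-" + volumeName + "-" + counter.incrementAndGet());
      t.setDaemon(true);
      return t;
    };
    ThreadPoolExecutor executor = new ThreadPoolExecutor(
        1, 4, 60L, TimeUnit.SECONDS, new LinkedBlockingQueue<>(), factory);
    // All threads, including core threads, may exit when idle; this is the
    // configuration that made an auto-destroying ThreadGroup problematic.
    executor.allowCoreThreadTimeOut(true);
    return executor;
  }
}
{code}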
   
   ### How was this patch tested?
   
   Yes. The downstream project test that was failing passes with this PR in 
place.
   
   Adding a test as part of this PR would be awkward.
   
   ### For code changes:
   
   - [x] Does the title or this PR starts with the corresponding JIRA issue id 
(e.g. 'HADOOP-17799. Your PR title ...')?
   - [ ] Object storage: have the integration tests been executed and the 
endpoint declared according to the connector-specific documentation?
   - [ ] If adding new dependencies to the code, are these dependencies 
licensed in a way that is compatible for inclusion under [ASF 
2.0](http://www.apache.org/legal/resolved.html#category-a)?
   - [ ] If applicable, have you updated the `LICENSE`, `LICENSE-binary`, 
`NOTICE-binary` files?
   
   




Issue Time Tracking
---

Worklog Id: (was: 773038)
Remaining Estimate: 0h
Time Spent: 10m

> Purge FsDatasetAsyncDiskService threadgroup; it causes 
> BPServiceActor$CommandProcessingThread IllegalThreadStateException 'fatal 
> exception and exit' 
> -
>
> Key: HDFS-16586
> URL: https://issues.apache.org/jira/browse/HDFS-16586
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: datanode
>Affects Versions: 3.3.0, 3.2.3
>Reporter: Michael Stack
>Assignee: Michael Stack
>Priority: Major
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> The failed block finalize below is causing a downstream project's test to 
> fail when it uses hadoop 3.2.3 or 3.3.0+:
> {code:java}
> 2022-05-19T18:21:08,243 INFO  [Command processor] 
> impl.FsDatasetAsyncDiskService(234): Scheduling blk_1073741840_1016 replica 
> FinalizedReplica, blk_1073741840_1016, FINALIZED
>   getNumBytes()     = 52
>   getBytesOnDisk()  = 52
>   getVisibleLength()= 52
>   getVolume()       = 
> /Users/stack/checkouts/hbase.apache.git/hbase-server/target/test-data/d544dd1e-b42d-8fae-aa9a-99e3eb52f61c/cluster_e8660d1b-733a-b023-2e91-dc3f951cf189/dfs/data/data2
>   getBlockURI()     = 
> file:/Users/stack/checkouts/hbase.apache.git/hbase-server/target/test-data/d544dd1e-b42d-8fae-aa9a-99e3eb52f61c/cluster_e8660d1b-733a-b023-2e91-dc3f951cf189/dfs/data/data2/current/BP-62743752-127.0.0.1-1653009535881/current/finalized/subdir0/subdir0/blk_1073741840
>  for deletion
> 2022-05-19T18:21:08,243 DEBUG [IPC Server handler 0 on default port 54774] 
> metrics.TopMetrics(134): a metric is reported: cmd: delete user: stack.hfs.0 
> (auth:SIMPLE)
> 2022-05-19T18:21:08,243 DEBUG [IPC Server handler 0 on default port 54774] 
> top.TopAuditLogger(78): --- logged event for top service: 
> allowed=true ugi=stack.hfs.0 (auth:SIMPLE) ip=/127.0.0.1 cmd=delete  
> src=/user/stack/test-data/b8167d53-bcd7-c682-a767-55faaf7f3e96/data/default/t1/4499521075f51d5138fe4f1916daf92d/.tmp
>   dst=null  perm=null
> 2022-05-19T18:21:08,243 DEBUG [PacketResponder: 
> BP-62743752-127.0.0.1-1653009535881:blk_1073741830_1006, 
> type=LAST_IN_PIPELINE] datanode.BlockReceiver$PacketResponder(1645): 
> PacketResponder: BP-62743752-127.0.0.1-1653009535881:blk_1073741830_1006, 
> type=LAST_IN_PIPELINE, replyAck=seqno: 901 reply: SUCCESS 
> downstreamAckTimeNanos: 0 flag: 0
> 2022-05-19T18:21:08,243 DEBUG [PacketResponder: 
> BP-62743752-127.0.0.1-1653009535881:blk_1073741830_1006, 
> type=LAST_IN_PIPELINE] datanode.BlockReceiver$PacketResponder(1327): 
> PacketResponder: BP-62743752-127.0.0.1-1653009535881:blk_1073741830_1006, 
> type=LAST_IN_PIPELINE: seqno=-2 waiting for local datanode to finish write.
> 2022-05-19T18:21:08,243 ERROR [Command processor] 
> datanode.BPServiceActor$CommandProcessingThread(1276): Command processor 
> encountered fatal exception and exit.
> java.lang.IllegalThreadStateException: null
>   at java.lang.ThreadGroup.addUnstarted(ThreadGroup

[jira] [Created] (HDFS-16586) Purge FsDatasetAsyncDiskService threadgroup; it causes BPServiceActor$CommandProcessingThread IllegalThreadStateException 'fatal exception and exit'

2022-05-20 Thread Michael Stack (Jira)
Michael Stack created HDFS-16586:


 Summary: Purge FsDatasetAsyncDiskService threadgroup; it causes 
BPServiceActor$CommandProcessingThread IllegalThreadStateException 'fatal 
exception and exit' 
 Key: HDFS-16586
 URL: https://issues.apache.org/jira/browse/HDFS-16586
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: datanode
Affects Versions: 3.2.3, 3.3.0
Reporter: Michael Stack
Assignee: Michael Stack


The failed block finalize below is causing a downstream project's test to fail 
when it uses hadoop 3.2.3 or 3.3.0+:
{code:java}
2022-05-19T18:21:08,243 INFO  [Command processor] 
impl.FsDatasetAsyncDiskService(234): Scheduling blk_1073741840_1016 replica 
FinalizedReplica, blk_1073741840_1016, FINALIZED
  getNumBytes()     = 52
  getBytesOnDisk()  = 52
  getVisibleLength()= 52
  getVolume()       = 
/Users/stack/checkouts/hbase.apache.git/hbase-server/target/test-data/d544dd1e-b42d-8fae-aa9a-99e3eb52f61c/cluster_e8660d1b-733a-b023-2e91-dc3f951cf189/dfs/data/data2
  getBlockURI()     = 
file:/Users/stack/checkouts/hbase.apache.git/hbase-server/target/test-data/d544dd1e-b42d-8fae-aa9a-99e3eb52f61c/cluster_e8660d1b-733a-b023-2e91-dc3f951cf189/dfs/data/data2/current/BP-62743752-127.0.0.1-1653009535881/current/finalized/subdir0/subdir0/blk_1073741840
 for deletion
2022-05-19T18:21:08,243 DEBUG [IPC Server handler 0 on default port 54774] 
metrics.TopMetrics(134): a metric is reported: cmd: delete user: stack.hfs.0 
(auth:SIMPLE)
2022-05-19T18:21:08,243 DEBUG [IPC Server handler 0 on default port 54774] 
top.TopAuditLogger(78): --- logged event for top service: 
allowed=true ugi=stack.hfs.0 (auth:SIMPLE) ip=/127.0.0.1 cmd=delete  
src=/user/stack/test-data/b8167d53-bcd7-c682-a767-55faaf7f3e96/data/default/t1/4499521075f51d5138fe4f1916daf92d/.tmp
  dst=null  perm=null
2022-05-19T18:21:08,243 DEBUG [PacketResponder: 
BP-62743752-127.0.0.1-1653009535881:blk_1073741830_1006, type=LAST_IN_PIPELINE] 
datanode.BlockReceiver$PacketResponder(1645): PacketResponder: 
BP-62743752-127.0.0.1-1653009535881:blk_1073741830_1006, type=LAST_IN_PIPELINE, 
replyAck=seqno: 901 reply: SUCCESS downstreamAckTimeNanos: 0 flag: 0
2022-05-19T18:21:08,243 DEBUG [PacketResponder: 
BP-62743752-127.0.0.1-1653009535881:blk_1073741830_1006, type=LAST_IN_PIPELINE] 
datanode.BlockReceiver$PacketResponder(1327): PacketResponder: 
BP-62743752-127.0.0.1-1653009535881:blk_1073741830_1006, type=LAST_IN_PIPELINE: 
seqno=-2 waiting for local datanode to finish write.
2022-05-19T18:21:08,243 ERROR [Command processor] 
datanode.BPServiceActor$CommandProcessingThread(1276): Command processor 
encountered fatal exception and exit.
java.lang.IllegalThreadStateException: null
  at java.lang.ThreadGroup.addUnstarted(ThreadGroup.java:865) ~[?:?]
  at java.lang.Thread.(Thread.java:430) ~[?:?]
  at java.lang.Thread.(Thread.java:704) ~[?:?]
  at java.lang.Thread.(Thread.java:525) ~[?:?]
  at 
org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetAsyncDiskService$1.newThread(FsDatasetAsyncDiskService.java:113)
 ~[hadoop-hdfs-3.2.3.jar:?]
  at 
java.util.concurrent.ThreadPoolExecutor$Worker.(ThreadPoolExecutor.java:623)
 ~[?:?]
  at 
java.util.concurrent.ThreadPoolExecutor.addWorker(ThreadPoolExecutor.java:912) 
~[?:?]
  at 
java.util.concurrent.ThreadPoolExecutor.execute(ThreadPoolExecutor.java:1343) 
~[?:?]
  at 
org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetAsyncDiskService.execute(FsDatasetAsyncDiskService.java:189)
 ~[hadoop-hdfs-3.2.3.jar:?]
  at 
org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetAsyncDiskService.deleteAsync(FsDatasetAsyncDiskService.java:238)
 ~[hadoop-hdfs-3.2.3.jar:?]
  at 
org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl.invalidate(FsDatasetImpl.java:2184)
 ~[hadoop-hdfs-3.2.3.jar:?]
  at 
org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl.invalidate(FsDatasetImpl.java:2103)
 ~[hadoop-hdfs-3.2.3.jar:?]
  at 
org.apache.hadoop.hdfs.server.datanode.BPOfferService.processCommandFromActive(BPOfferService.java:736)
 ~[hadoop-hdfs-3.2.3.jar:?]
  at 
org.apache.hadoop.hdfs.server.datanode.BPOfferService.processCommandFromActor(BPOfferService.java:682)
 ~[hadoop-hdfs-3.2.3.jar:?]
  at 
org.apache.hadoop.hdfs.server.datanode.BPServiceActor$CommandProcessingThread.processCommand(BPServiceActor.java:1318)
 ~[hadoop-hdfs-3.2.3.jar:?]
  at 
org.apache.hadoop.hdfs.server.datanode.BPServiceActor$CommandProcessingThread.lambda$enqueue$2(BPServiceActor.java:1364)
 ~[hadoop-hdfs-3.2.3.jar:?]
  at 
org.apache.hadoop.hdfs.server.datanode.BPServiceActor$CommandProcessingThread.processQueue(BPServiceActor.java:1291)
 ~[hadoop-hdfs-3.2.3.jar:?]
  at 
org.apache.hadoop.hdfs.server.datanode.BPServiceActor$CommandProcessingThread.run(BPServiceActor.java:1274)
 ~[hadoop-hdfs-3.2.3.jar:?]
2022-05-19T18:21:08,243 DEBUG [DataXceiver for client 

[jira] [Work logged] (HDFS-16583) DatanodeAdminDefaultMonitor can get stuck in an infinite loop

2022-05-20 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-16583?focusedWorklogId=772991&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-772991
 ]

ASF GitHub Bot logged work on HDFS-16583:
-

Author: ASF GitHub Bot
Created on: 20/May/22 18:58
Start Date: 20/May/22 18:58
Worklog Time Spent: 10m 
  Work Description: hadoop-yetus commented on PR #4332:
URL: https://github.com/apache/hadoop/pull/4332#issuecomment-1133217136

   :broken_heart: **-1 overall**
   
   
   
   
   
   
   | Vote | Subsystem | Runtime |  Logfile | Comment |
   |:----:|----------:|:--------|:--------:|:-------:|
   | +0 :ok: |  reexec  |   1m  3s |  |  Docker mode activated.  |
    _ Prechecks _ |
   | +1 :green_heart: |  dupname  |   0m  0s |  |  No case conflicting files 
found.  |
   | +0 :ok: |  codespell  |   0m  0s |  |  codespell was not available.  |
   | +1 :green_heart: |  @author  |   0m  0s |  |  The patch does not contain 
any @author tags.  |
   | -1 :x: |  test4tests  |   0m  0s |  |  The patch doesn't appear to include 
any new or modified tests. Please justify why no new tests are needed for this 
patch. Also please list what manual steps were performed to verify this patch.  
|
    _ trunk Compile Tests _ |
   | +1 :green_heart: |  mvninstall  |  37m 50s |  |  trunk passed  |
   | +1 :green_heart: |  compile  |   1m 44s |  |  trunk passed with JDK 
Private Build-11.0.15+10-Ubuntu-0ubuntu0.20.04.1  |
   | +1 :green_heart: |  compile  |   1m 33s |  |  trunk passed with JDK 
Private Build-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07  |
   | +1 :green_heart: |  checkstyle  |   1m 24s |  |  trunk passed  |
   | +1 :green_heart: |  mvnsite  |   1m 44s |  |  trunk passed  |
   | +1 :green_heart: |  javadoc  |   1m 25s |  |  trunk passed with JDK 
Private Build-11.0.15+10-Ubuntu-0ubuntu0.20.04.1  |
   | +1 :green_heart: |  javadoc  |   1m 45s |  |  trunk passed with JDK 
Private Build-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07  |
   | +1 :green_heart: |  spotbugs  |   3m 45s |  |  trunk passed  |
   | +1 :green_heart: |  shadedclient  |  23m 14s |  |  branch has no errors 
when building and testing our client artifacts.  |
    _ Patch Compile Tests _ |
   | +1 :green_heart: |  mvninstall  |   1m 21s |  |  the patch passed  |
   | +1 :green_heart: |  compile  |   1m 25s |  |  the patch passed with JDK 
Private Build-11.0.15+10-Ubuntu-0ubuntu0.20.04.1  |
   | +1 :green_heart: |  javac  |   1m 25s |  |  the patch passed  |
   | +1 :green_heart: |  compile  |   1m 19s |  |  the patch passed with JDK 
Private Build-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07  |
   | +1 :green_heart: |  javac  |   1m 19s |  |  the patch passed  |
   | +1 :green_heart: |  blanks  |   0m  0s |  |  The patch has no blanks 
issues.  |
   | -0 :warning: |  checkstyle  |   0m 59s | 
[/results-checkstyle-hadoop-hdfs-project_hadoop-hdfs.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-4332/2/artifact/out/results-checkstyle-hadoop-hdfs-project_hadoop-hdfs.txt)
 |  hadoop-hdfs-project/hadoop-hdfs: The patch generated 2 new + 6 unchanged - 
0 fixed = 8 total (was 6)  |
   | +1 :green_heart: |  mvnsite  |   1m 25s |  |  the patch passed  |
   | +1 :green_heart: |  javadoc  |   0m 57s |  |  the patch passed with JDK 
Private Build-11.0.15+10-Ubuntu-0ubuntu0.20.04.1  |
   | +1 :green_heart: |  javadoc  |   1m 30s |  |  the patch passed with JDK 
Private Build-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07  |
   | +1 :green_heart: |  spotbugs  |   3m 21s |  |  the patch passed  |
   | +1 :green_heart: |  shadedclient  |  22m 39s |  |  patch has no errors 
when building and testing our client artifacts.  |
    _ Other Tests _ |
   | -1 :x: |  unit  | 432m 39s | 
[/patch-unit-hadoop-hdfs-project_hadoop-hdfs.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-4332/2/artifact/out/patch-unit-hadoop-hdfs-project_hadoop-hdfs.txt)
 |  hadoop-hdfs in the patch passed.  |
   | +1 :green_heart: |  asflicense  |   1m 13s |  |  The patch does not 
generate ASF License warnings.  |
   |  |   | 542m 33s |  |  |
   
   
   | Reason | Tests |
   |---:|:--|
   | Failed junit tests | hadoop.hdfs.server.namenode.ha.TestSeveralNameNodes |
   
   
   | Subsystem | Report/Notes |
   |--:|:-|
   | Docker | ClientAPI=1.41 ServerAPI=1.41 base: 
https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-4332/2/artifact/out/Dockerfile
 |
   | GITHUB PR | https://github.com/apache/hadoop/pull/4332 |
   | Optional Tests | dupname asflicense compile javac javadoc mvninstall 
mvnsite unit shadedclient spotbugs checkstyle codespell |
   | uname | Linux 56a95fee91d3 4.15.0-65-generic #74-Ubuntu SMP Tue Sep 17 
17:06:04 UTC 2019 x86_64 x86_64 x86_64 GNU/Linux |
   | Build tool | maven |
   | Personality | dev-support/bin/hadoop.sh |
   | git revision | trunk / 18f25f515a0f64ec80ba590de915bf615c4f27cc |
   | Default Java | Private Build-1

[jira] [Work logged] (HDFS-16583) DatanodeAdminDefaultMonitor can get stuck in an infinite loop

2022-05-20 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-16583?focusedWorklogId=772962&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-772962
 ]

ASF GitHub Bot logged work on HDFS-16583:
-

Author: ASF GitHub Bot
Created on: 20/May/22 18:27
Start Date: 20/May/22 18:27
Worklog Time Spent: 10m 
  Work Description: sodonnel commented on PR #4332:
URL: https://github.com/apache/hadoop/pull/4332#issuecomment-1133193488

   Yea, the BackoffMonitor already used the cancelled-nodes technique.
   
   Initially, I tried to split the stopMaintenance method, but then I 
realised there is a place in the check loop where it yields the write lock, so 
someone could cancel a node while the lock is dropped and still cause the 
issue.
   
   So I figured it was better to do the asynchronous cancel in the same way as 
the BackoffMonitor, to fix it completely. 
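
A minimal sketch, under assumed names (AdminMonitorCancelSketch, stopTrackingNode), of the cancelled-nodes technique mentioned above: cancellations are queued and only applied by the monitor thread at a safe point in its check loop, so they cannot race with the loop while the write lock is briefly released. This is not the actual DatanodeAdminManager code.

{code:java}
import java.util.Queue;
import java.util.concurrent.ConcurrentLinkedQueue;

import org.apache.hadoop.hdfs.server.blockmanagement.DatanodeDescriptor;

/** Illustrative sketch; names are assumptions, not the HDFS-16583 patch. */
final class AdminMonitorCancelSketch {
  // Nodes whose decommission or maintenance was cancelled while the monitor runs.
  private final Queue<DatanodeDescriptor> cancelledNodes = new ConcurrentLinkedQueue<>();

  /** Called by the admin path; records the request instead of mutating monitor state. */
  void stopTrackingNode(DatanodeDescriptor dn) {
    cancelledNodes.add(dn);
  }

  /** Called by the monitor thread at a safe point in its check loop. */
  void processCancelledNodes() {
    DatanodeDescriptor dn;
    while ((dn = cancelledNodes.poll()) != null) {
      // remove dn from the set of nodes the monitor is iterating over
    }
  }
}
{code}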




Issue Time Tracking
---

Worklog Id: (was: 772962)
Time Spent: 40m  (was: 0.5h)

> DatanodeAdminDefaultMonitor can get stuck in an infinite loop
> -
>
> Key: HDFS-16583
> URL: https://issues.apache.org/jira/browse/HDFS-16583
> Project: Hadoop HDFS
>  Issue Type: Bug
>Reporter: Stephen O'Donnell
>Assignee: Stephen O'Donnell
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 40m
>  Remaining Estimate: 0h
>
> We encountered a case where the decommission monitor in the namenode got 
> stuck for about 6 hours. The logs give:
> {code}
> 2022-05-15 01:09:25,490 INFO 
> org.apache.hadoop.hdfs.server.blockmanagement.HeartbeatManager: Stopping 
> maintenance of dead node 10.185.3.132:50010
> 2022-05-15 01:10:20,918 INFO org.apache.hadoop.http.HttpServer2: Process 
> Thread Dump: jsp requested
> 
> 2022-05-15 01:19:06,810 WARN 
> org.apache.hadoop.hdfs.server.blockmanagement.BlockManager: 
> PendingReconstructionMonitor timed out blk_4501753665_3428271426
> 2022-05-15 01:19:06,810 WARN 
> org.apache.hadoop.hdfs.server.blockmanagement.BlockManager: 
> PendingReconstructionMonitor timed out blk_4501753659_3428271420
> 2022-05-15 01:19:06,810 WARN 
> org.apache.hadoop.hdfs.server.blockmanagement.BlockManager: 
> PendingReconstructionMonitor timed out blk_4501753662_3428271423
> 2022-05-15 01:19:06,810 WARN 
> org.apache.hadoop.hdfs.server.blockmanagement.BlockManager: 
> PendingReconstructionMonitor timed out blk_4501753663_3428271424
> 2022-05-15 06:00:57,281 INFO 
> org.apache.hadoop.hdfs.server.blockmanagement.HeartbeatManager: Stopping 
> maintenance of dead node 10.185.3.34:50010
> 2022-05-15 06:00:58,105 INFO 
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem: FSNamesystem write lock 
> held for 17492614 ms via
> java.lang.Thread.getStackTrace(Thread.java:1559)
> org.apache.hadoop.util.StringUtils.getStackTrace(StringUtils.java:1032)
> org.apache.hadoop.hdfs.server.namenode.FSNamesystemLock.writeUnlock(FSNamesystemLock.java:263)
> org.apache.hadoop.hdfs.server.namenode.FSNamesystemLock.writeUnlock(FSNamesystemLock.java:220)
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.writeUnlock(FSNamesystem.java:1601)
> org.apache.hadoop.hdfs.server.blockmanagement.DatanodeAdminManager$Monitor.run(DatanodeAdminManager.java:496)
> java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
> java.util.concurrent.FutureTask.runAndReset(FutureTask.java:308)
> java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(ScheduledThreadPoolExecutor.java:180)
> java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:294)
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
> java.lang.Thread.run(Thread.java:748)
>   Number of suppressed write-lock reports: 0
>   Longest write-lock held interval: 17492614
> {code}
> We only have the one thread dump triggered by the FC:
> {code}
> Thread 80 (DatanodeAdminMonitor-0):
>   State: RUNNABLE
>   Blocked count: 16
>   Waited count: 453693
>   Stack:
> 
> org.apache.hadoop.hdfs.server.blockmanagement.DatanodeAdminManager$Monitor.check(DatanodeAdminManager.java:538)
> 
> org.apache.hadoop.hdfs.server.blockmanagement.DatanodeAdminManager$Monitor.run(DatanodeAdminManager.java:494)
> java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
> java.util.concurrent.FutureTask.runAndReset(FutureTask.java:308)
> 
> java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(ScheduledThreadPoolExecutor.java:180)
> 
> java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:294)
> 
> java.util.

[jira] [Updated] (HDFS-16585) Add @VisibleForTesting in Dispatcher.java after HDFS-16268

2022-05-20 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-16585?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated HDFS-16585:
--
Labels: pull-request-available  (was: )

> Add @VisibleForTesting in Dispatcher.java after HDFS-16268
> --
>
> Key: HDFS-16585
> URL: https://issues.apache.org/jira/browse/HDFS-16585
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Reporter: Wei-Chiu Chuang
>Assignee: Ashutosh Gupta
>Priority: Trivial
>  Labels: pull-request-available
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> The scope of a few methods was opened up by HDFS-16268 to facilitate unit 
> testing. We should annotate them with {{@VisibleForTesting}} so that they 
> don't get used by production code.
> The affected methods include:
> PendingMove
> markMovedIfGoodBlock
> isGoodBlockCandidate
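
As a rough illustration of the requested change (using Guava's annotation here; the 
exact annotation class and the real method signatures in Dispatcher.java may differ), 
the patch amounts to tagging the widened members:

{code}
import com.google.common.annotations.VisibleForTesting;

class DispatcherExample {
  // Widened from private so unit tests can call it directly; the annotation
  // documents that production code should not rely on this visibility.
  @VisibleForTesting
  boolean isGoodBlockCandidate(long blockId) {
    return blockId >= 0; // placeholder logic for the sketch
  }
}
{code}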



--
This message was sent by Atlassian Jira
(v8.20.7#820007)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Work logged] (HDFS-16585) Add @VisibleForTesting in Dispatcher.java after HDFS-16268

2022-05-20 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-16585?focusedWorklogId=772969&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-772969
 ]

ASF GitHub Bot logged work on HDFS-16585:
-

Author: ASF GitHub Bot
Created on: 20/May/22 18:35
Start Date: 20/May/22 18:35
Worklog Time Spent: 10m 
  Work Description: ashutoshcipher opened a new pull request, #4337:
URL: https://github.com/apache/hadoop/pull/4337

   ### Description of PR
   Add @VisibleForTesting in Dispatcher.java
   * JIRA: HDFS-16585
   
   
   - [x] Does the title of this PR start with the corresponding JIRA issue id 
(e.g. 'HADOOP-17799. Your PR title ...')?
   




Issue Time Tracking
---

Worklog Id: (was: 772969)
Remaining Estimate: 0h
Time Spent: 10m

> Add @VisibleForTesting in Dispatcher.java after HDFS-16268
> --
>
> Key: HDFS-16585
> URL: https://issues.apache.org/jira/browse/HDFS-16585
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Reporter: Wei-Chiu Chuang
>Assignee: Ashutosh Gupta
>Priority: Trivial
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> The scope of a few methods was opened up by HDFS-16268 to facilitate unit 
> testing. We should annotate them with {{@VisibleForTesting}} so that they 
> don't get used by production code.
> The affected methods include:
> PendingMove
> markMovedIfGoodBlock
> isGoodBlockCandidate



--
This message was sent by Atlassian Jira
(v8.20.7#820007)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Resolved] (HDFS-16453) Upgrade okhttp from 2.7.5 to 4.9.3

2022-05-20 Thread Akira Ajisaka (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-16453?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Akira Ajisaka resolved HDFS-16453.
--
Fix Version/s: 3.4.0
   3.3.4
   Resolution: Fixed

Committed to trunk and branch-3.3. Thank you [~ivan.viaznikov] for your report 
and thank you [~groot] for your contribution!

> Upgrade okhttp from 2.7.5 to 4.9.3
> --
>
> Key: HDFS-16453
> URL: https://issues.apache.org/jira/browse/HDFS-16453
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: hdfs-client
>Affects Versions: 3.3.1
>Reporter: Ivan Viaznikov
>Assignee: Ashutosh Gupta
>Priority: Major
>  Labels: pull-request-available
> Fix For: 3.4.0, 3.3.4
>
>  Time Spent: 1h
>  Remaining Estimate: 0h
>
> {{org.apache.hadoop:hadoop-hdfs-client}} comes with 
> {{com.squareup.okhttp:okhttp:2.7.5}} as a dependency, which is vulnerable to 
> an information disclosure issue due to how the contents of sensitive headers, 
> such as the {{Authorization}} header, can be logged when an 
> {{IllegalArgumentException}} is thrown.
> This issue could allow an attacker or malicious user who has access to the 
> logs to obtain the sensitive contents of the affected headers which could 
> facilitate further attacks.
> Fixed in {{5.0.0-alpha3}} by 
> [this|https://github.com/square/okhttp/commit/dcc6483b7dc6d9c0b8e03ff7c30c13f3c75264a5]
>  commit. The fix was cherry-picked and backported into {{4.9.2}} with 
> [this|https://github.com/square/okhttp/commit/1fd7c0afdc2cee9ba982b07d49662af7f60e1518]
>  commit.
> Requesting clarification on whether this dependency will be updated to a 
> fixed version in upcoming releases.
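
The class of bug can be illustrated with a generic sketch (not okhttp's actual source): 
a header validator that echoes the rejected value in its exception message will leak 
secrets into any log that records the exception, while the fixed variant reports only 
the position of the offending character.

{code}
// Generic illustration of the leak pattern, not okhttp code.
class HeaderCheckExample {
  // Unsafe: the thrown message contains the full header value, so a logged
  // IllegalArgumentException exposes e.g. an Authorization token.
  static void checkValueUnsafe(String name, String value) {
    for (int i = 0; i < value.length(); i++) {
      char c = value.charAt(i);
      if (c != '\t' && (c < 0x20 || c > 0x7e)) {
        throw new IllegalArgumentException(
            "Unexpected char in header: " + name + ": " + value);
      }
    }
  }

  // Safer: report only the position of the bad character, never the value.
  static void checkValueSafe(String name, String value) {
    for (int i = 0; i < value.length(); i++) {
      char c = value.charAt(i);
      if (c != '\t' && (c < 0x20 || c > 0x7e)) {
        throw new IllegalArgumentException(
            "Unexpected char at index " + i + " in value of header: " + name);
      }
    }
  }
}
{code}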



--
This message was sent by Atlassian Jira
(v8.20.7#820007)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Assigned] (HDFS-16585) Add @VisibleForTesting in Dispatcher.java after HDFS-16268

2022-05-20 Thread Ashutosh Gupta (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-16585?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ashutosh Gupta reassigned HDFS-16585:
-

Assignee: Ashutosh Gupta

> Add @VisibleForTesting in Dispatcher.java after HDFS-16268
> --
>
> Key: HDFS-16585
> URL: https://issues.apache.org/jira/browse/HDFS-16585
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Reporter: Wei-Chiu Chuang
>Assignee: Ashutosh Gupta
>Priority: Trivial
>
> The scope of a few methods was opened up by HDFS-16268 to facilitate unit 
> testing. We should annotate them with {{@VisibleForTesting}} so that they 
> don't get used by production code.
> The affected methods include:
> PendingMove
> markMovedIfGoodBlock
> isGoodBlockCandidate



--
This message was sent by Atlassian Jira
(v8.20.7#820007)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-16453) Upgrade okhttp from 2.7.5 to 4.9.3

2022-05-20 Thread Akira Ajisaka (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-16453?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Akira Ajisaka updated HDFS-16453:
-
Summary: Upgrade okhttp from 2.7.5 to 4.9.3  (was: okhttp vulnerable 
library update)

> Upgrade okhttp from 2.7.5 to 4.9.3
> --
>
> Key: HDFS-16453
> URL: https://issues.apache.org/jira/browse/HDFS-16453
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: hdfs-client
>Affects Versions: 3.3.1
>Reporter: Ivan Viaznikov
>Assignee: Ashutosh Gupta
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 1h
>  Remaining Estimate: 0h
>
> {{org.apache.hadoop:hadoop-hdfs-client}} comes with 
> {{com.squareup.okhttp:okhttp:2.7.5}} as a dependency, which is vulnerable to 
> an information disclosure issue due to how the contents of sensitive headers, 
> such as the {{Authorization}} header, can be logged when an 
> {{IllegalArgumentException}} is thrown.
> This issue could allow an attacker or malicious user who has access to the 
> logs to obtain the sensitive contents of the affected headers which could 
> facilitate further attacks.
> Fixed in {{5.0.0-alpha3}} by 
> [this|https://github.com/square/okhttp/commit/dcc6483b7dc6d9c0b8e03ff7c30c13f3c75264a5]
>  commit. The fix was cherry-picked and backported into {{4.9.2}} with 
> [this|https://github.com/square/okhttp/commit/1fd7c0afdc2cee9ba982b07d49662af7f60e1518]
>  commit.
> Requesting clarification on whether this dependency will be updated to a 
> fixed version in upcoming releases.



--
This message was sent by Atlassian Jira
(v8.20.7#820007)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Work logged] (HDFS-16453) okhttp vulnerable library update

2022-05-20 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-16453?focusedWorklogId=772946&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-772946
 ]

ASF GitHub Bot logged work on HDFS-16453:
-

Author: ASF GitHub Bot
Created on: 20/May/22 17:53
Start Date: 20/May/22 17:53
Worklog Time Spent: 10m 
  Work Description: aajisaka commented on PR #4229:
URL: https://github.com/apache/hadoop/pull/4229#issuecomment-1133168703

   Merged. Thank you @ashutoshcipher for your contribution and thanks 
@steveloughran for your review.




Issue Time Tracking
---

Worklog Id: (was: 772946)
Time Spent: 1h  (was: 50m)

> okhttp vulnerable library update
> 
>
> Key: HDFS-16453
> URL: https://issues.apache.org/jira/browse/HDFS-16453
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: hdfs-client
>Affects Versions: 3.3.1
>Reporter: Ivan Viaznikov
>Assignee: Ashutosh Gupta
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 1h
>  Remaining Estimate: 0h
>
> {{org.apache.hadoop:hadoop-hdfs-client}} comes with 
> {{com.squareup.okhttp:okhttp:2.7.5}} as a dependency, which is vulnerable to 
> an information disclosure issue due to how the contents of sensitive headers, 
> such as the {{Authorization}} header, can be logged when an 
> {{IllegalArgumentException}} is thrown.
> This issue could allow an attacker or malicious user who has access to the 
> logs to obtain the sensitive contents of the affected headers which could 
> facilitate further attacks.
> Fixed in {{5.0.0-alpha3}} by 
> [this|https://github.com/square/okhttp/commit/dcc6483b7dc6d9c0b8e03ff7c30c13f3c75264a5]
>  commit. The fix was cherry-picked and backported into {{4.9.2}} with 
> [this|https://github.com/square/okhttp/commit/1fd7c0afdc2cee9ba982b07d49662af7f60e1518]
>  commit.
> Requesting clarification on whether this dependency will be updated to a 
> fixed version in upcoming releases.



--
This message was sent by Atlassian Jira
(v8.20.7#820007)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Work logged] (HDFS-16453) okhttp vulnerable library update

2022-05-20 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-16453?focusedWorklogId=772945&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-772945
 ]

ASF GitHub Bot logged work on HDFS-16453:
-

Author: ASF GitHub Bot
Created on: 20/May/22 17:53
Start Date: 20/May/22 17:53
Worklog Time Spent: 10m 
  Work Description: aajisaka merged PR #4229:
URL: https://github.com/apache/hadoop/pull/4229




Issue Time Tracking
---

Worklog Id: (was: 772945)
Time Spent: 50m  (was: 40m)

> okhttp vulnerable library update
> 
>
> Key: HDFS-16453
> URL: https://issues.apache.org/jira/browse/HDFS-16453
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: hdfs-client
>Affects Versions: 3.3.1
>Reporter: Ivan Viaznikov
>Assignee: Ashutosh Gupta
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 50m
>  Remaining Estimate: 0h
>
> {{org.apache.hadoop:hadoop-hdfs-client}} comes with 
> {{com.squareup.okhttp:okhttp:2.7.5}} as a dependency, which is vulnerable to 
> an information disclosure issue due to how the contents of sensitive headers, 
> such as the {{Authorization}} header, can be logged when an 
> {{IllegalArgumentException}} is thrown.
> This issue could allow an attacker or malicious user who has access to the 
> logs to obtain the sensitive contents of the affected headers which could 
> facilitate further attacks.
> Fixed in {{5.0.0-alpha3}} by 
> [this|https://github.com/square/okhttp/commit/dcc6483b7dc6d9c0b8e03ff7c30c13f3c75264a5]
>  commit. The fix was cherry-picked and backported into {{4.9.2}} with 
> [this|https://github.com/square/okhttp/commit/1fd7c0afdc2cee9ba982b07d49662af7f60e1518]
>  commit.
> Requesting clarification on whether this dependency will be updated to a 
> fixed version in upcoming releases.



--
This message was sent by Atlassian Jira
(v8.20.7#820007)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Created] (HDFS-16585) Add @VisibleForTesting in Dispatcher.java after HDFS-16268

2022-05-20 Thread Wei-Chiu Chuang (Jira)
Wei-Chiu Chuang created HDFS-16585:
--

 Summary: Add @VisibleForTesting in Dispatcher.java after HDFS-16268
 Key: HDFS-16585
 URL: https://issues.apache.org/jira/browse/HDFS-16585
 Project: Hadoop HDFS
  Issue Type: Improvement
Reporter: Wei-Chiu Chuang


The scope of a few methods was opened up by HDFS-16268 to facilitate unit 
testing. We should annotate them with {{@VisibleForTesting}} so that they don't 
get used by production code.

The affected methods include:
PendingMove
markMovedIfGoodBlock
isGoodBlockCandidate




--
This message was sent by Atlassian Jira
(v8.20.7#820007)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Work logged] (HDFS-16453) okhttp vulnerable library update

2022-05-20 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-16453?focusedWorklogId=772865&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-772865
 ]

ASF GitHub Bot logged work on HDFS-16453:
-

Author: ASF GitHub Bot
Created on: 20/May/22 14:35
Start Date: 20/May/22 14:35
Worklog Time Spent: 10m 
  Work Description: hadoop-yetus commented on PR #4229:
URL: https://github.com/apache/hadoop/pull/4229#issuecomment-1132972302

   :broken_heart: **-1 overall**
   
   
   
   
   
   
   | Vote | Subsystem | Runtime |  Logfile | Comment |
   |::|--:|:|::|:---:|
   | +0 :ok: |  reexec  |   0m 56s |  |  Docker mode activated.  |
    _ Prechecks _ |
   | +1 :green_heart: |  dupname  |   0m  0s |  |  No case conflicting files 
found.  |
   | +0 :ok: |  codespell  |   0m  0s |  |  codespell was not available.  |
   | +0 :ok: |  shelldocs  |   0m  0s |  |  Shelldocs was not available.  |
   | +1 :green_heart: |  @author  |   0m  0s |  |  The patch does not contain 
any @author tags.  |
   | -1 :x: |  test4tests  |   0m  0s |  |  The patch doesn't appear to include 
any new or modified tests. Please justify why no new tests are needed for this 
patch. Also please list what manual steps were performed to verify this patch.  
|
    _ trunk Compile Tests _ |
   | +0 :ok: |  mvndep  |  14m 55s |  |  Maven dependency ordering for branch  |
   | +1 :green_heart: |  mvninstall  |  28m  3s |  |  trunk passed  |
   | +1 :green_heart: |  compile  |  25m  1s |  |  trunk passed with JDK 
Private Build-11.0.15+10-Ubuntu-0ubuntu0.20.04.1  |
   | +1 :green_heart: |  compile  |  21m 32s |  |  trunk passed with JDK 
Private Build-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07  |
   | +1 :green_heart: |  checkstyle  |   4m 31s |  |  trunk passed  |
   | +1 :green_heart: |  mvnsite  |  19m 59s |  |  trunk passed  |
   | +1 :green_heart: |  javadoc  |   8m 27s |  |  trunk passed with JDK 
Private Build-11.0.15+10-Ubuntu-0ubuntu0.20.04.1  |
   | +1 :green_heart: |  javadoc  |   7m 31s |  |  trunk passed with JDK 
Private Build-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07  |
   | +0 :ok: |  spotbugs  |   0m 27s |  |  branch/hadoop-project no spotbugs 
output file (spotbugsXml.xml)  |
   | +0 :ok: |  spotbugs  |   0m 27s |  |  
branch/hadoop-client-modules/hadoop-client no spotbugs output file 
(spotbugsXml.xml)  |
   | +1 :green_heart: |  shadedclient  |  57m 21s |  |  branch has no errors 
when building and testing our client artifacts.  |
    _ Patch Compile Tests _ |
   | +0 :ok: |  mvndep  |   0m 37s |  |  Maven dependency ordering for patch  |
   | +1 :green_heart: |  mvninstall  |  26m 23s |  |  the patch passed  |
   | +1 :green_heart: |  compile  |  24m 18s |  |  the patch passed with JDK 
Private Build-11.0.15+10-Ubuntu-0ubuntu0.20.04.1  |
   | +1 :green_heart: |  javac  |  24m 18s |  |  the patch passed  |
   | +1 :green_heart: |  compile  |  21m 34s |  |  the patch passed with JDK 
Private Build-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07  |
   | +1 :green_heart: |  javac  |  21m 34s |  |  the patch passed  |
   | +1 :green_heart: |  blanks  |   0m  0s |  |  The patch has no blanks 
issues.  |
   | +1 :green_heart: |  checkstyle  |   4m 23s |  |  the patch passed  |
   | +1 :green_heart: |  mvnsite  |  19m 34s |  |  the patch passed  |
   | +1 :green_heart: |  shellcheck  |   0m  0s |  |  No new issues.  |
   | +1 :green_heart: |  xml  |   0m  6s |  |  The patch has no ill-formed XML 
file.  |
   | +1 :green_heart: |  javadoc  |   8m 21s |  |  the patch passed with JDK 
Private Build-11.0.15+10-Ubuntu-0ubuntu0.20.04.1  |
   | +1 :green_heart: |  javadoc  |   7m 25s |  |  the patch passed with JDK 
Private Build-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07  |
   | +0 :ok: |  spotbugs  |   0m 26s |  |  hadoop-project has no data from 
spotbugs  |
   | +0 :ok: |  spotbugs  |   0m 26s |  |  hadoop-client-modules/hadoop-client 
has no data from spotbugs  |
   | +1 :green_heart: |  shadedclient  |  57m 47s |  |  patch has no errors 
when building and testing our client artifacts.  |
    _ Other Tests _ |
   | -1 :x: |  unit  | 1058m 18s | 
[/patch-unit-root.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-4229/15/artifact/out/patch-unit-root.txt)
 |  root in the patch passed.  |
   | +1 :green_heart: |  asflicense  |   2m 18s |  |  The patch does not 
generate ASF License warnings.  |
   |  |   | 1433m 47s |  |  |
   
   
   | Reason | Tests |
   |---:|:--|
   | Failed junit tests | hadoop.mapred.TestLocalDistributedCacheManager |
   
   
   | Subsystem | Report/Notes |
   |--:|:-|
   | Docker | ClientAPI=1.41 ServerAPI=1.41 base: 
https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-4229/15/artifact/out/Dockerfile
 |
   | GITHUB PR | https://github.com/apache/hadoop/pull/4229 |
   | Optional Tests | dupname asflicense compile javac javadoc mvninstall 
mvn

[jira] [Work logged] (HDFS-16584) Record StandbyNameNode information when Balancer is running

2022-05-20 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-16584?focusedWorklogId=772737&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-772737
 ]

ASF GitHub Bot logged work on HDFS-16584:
-

Author: ASF GitHub Bot
Created on: 20/May/22 09:27
Start Date: 20/May/22 09:27
Worklog Time Spent: 10m 
  Work Description: jianghuazhu commented on PR #4333:
URL: https://github.com/apache/hadoop/pull/4333#issuecomment-1132683576

   @ayushtkn @Hexiaoqiao, could you help review this PR?
   Thank you very much.
   




Issue Time Tracking
---

Worklog Id: (was: 772737)
Time Spent: 40m  (was: 0.5h)

> Record StandbyNameNode information when Balancer is running
> ---
>
> Key: HDFS-16584
> URL: https://issues.apache.org/jira/browse/HDFS-16584
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: balancer & mover, namenode
>Affects Versions: 3.3.0
>Reporter: JiangHua Zhu
>Assignee: JiangHua Zhu
>Priority: Major
>  Labels: pull-request-available
> Attachments: image-2022-05-19-20-23-23-825.png
>
>  Time Spent: 40m
>  Remaining Estimate: 0h
>
> When the Balancer is running, we allow block data to be fetched from the 
> StandbyNameNode, which is nice. Here are some logs:
>  !image-2022-05-19-20-23-23-825.png! 
> But we have no way of knowing which NameNode the request was made to. We 
> should log more detailed information, such as the host associated with the 
> StandbyNameNode.



--
This message was sent by Atlassian Jira
(v8.20.7#820007)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Work logged] (HDFS-16584) Record StandbyNameNode information when Balancer is running

2022-05-20 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-16584?focusedWorklogId=772735&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-772735
 ]

ASF GitHub Bot logged work on HDFS-16584:
-

Author: ASF GitHub Bot
Created on: 20/May/22 09:25
Start Date: 20/May/22 09:25
Worklog Time Spent: 10m 
  Work Description: jianghuazhu commented on PR #4333:
URL: https://github.com/apache/hadoop/pull/4333#issuecomment-1132680794

   Here are some example log lines from a production cluster:
   `2022-05-20 17:13:46,532 INFO 
org.apache.hadoop.hdfs.server.balancer.NameNodeConnector: Request 
#getLiveDatanodeStorageReport to Standby NameNode success. remoteAddress: 
hadoop-.org`
   `2022-05-20 17:13:46,576 INFO 
org.apache.hadoop.hdfs.server.balancer.NameNodeConnector: Request #getBlocks to 
Standby NameNode success.remoteAddress:hadoop-.org`
   
   Here hadoop-.org plays the role of Standby NameNode.
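   
   A minimal sketch of the kind of logging shown above (class, method, and parameter 
names are illustrative, not the actual NameNodeConnector change): the resolved remote 
address is included in the message so operators can tell which Standby NameNode served 
the Balancer's request.
   
   ```java
   import java.net.InetSocketAddress;
   import org.slf4j.Logger;
   import org.slf4j.LoggerFactory;
   
   // Illustrative only; not the actual Hadoop patch.
   class BalancerRequestLoggingSketch {
     private static final Logger LOG =
         LoggerFactory.getLogger(BalancerRequestLoggingSketch.class);
   
     void logStandbySuccess(String rpcName, InetSocketAddress remoteAddress) {
       // Including the resolved address makes it possible to tell which
       // Standby NameNode actually answered.
       LOG.info("Request #{} to Standby NameNode success. remoteAddress: {}",
           rpcName, remoteAddress);
     }
   }
   ```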




Issue Time Tracking
---

Worklog Id: (was: 772735)
Time Spent: 0.5h  (was: 20m)

> Record StandbyNameNode information when Balancer is running
> ---
>
> Key: HDFS-16584
> URL: https://issues.apache.org/jira/browse/HDFS-16584
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: balancer & mover, namenode
>Affects Versions: 3.3.0
>Reporter: JiangHua Zhu
>Assignee: JiangHua Zhu
>Priority: Major
>  Labels: pull-request-available
> Attachments: image-2022-05-19-20-23-23-825.png
>
>  Time Spent: 0.5h
>  Remaining Estimate: 0h
>
> When the Balancer is running, we allow block data to be fetched from the 
> StandbyNameNode, which is nice. Here are some logs:
>  !image-2022-05-19-20-23-23-825.png! 
> But we have no way of knowing which NameNode the request was made to. We 
> should log more detailed information, such as the host associated with the 
> StandbyNameNode.



--
This message was sent by Atlassian Jira
(v8.20.7#820007)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org