[jira] [Commented] (HDFS-17532) RBF: Allow router state store cache update to overwrite and delete in parallel

2024-05-24 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-17532?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17849280#comment-17849280
 ] 

ASF GitHub Bot commented on HDFS-17532:
---

hadoop-yetus commented on PR #6839:
URL: https://github.com/apache/hadoop/pull/6839#issuecomment-2129423948

   :broken_heart: **-1 overall**

   | Vote | Subsystem | Runtime |  Logfile | Comment |
   |:----:|----------:|--------:|:--------:|:-------:|
   | +0 :ok: |  reexec  |   0m 31s |  |  Docker mode activated.  |
    _ Prechecks _ |
   | +1 :green_heart: |  dupname  |   0m  1s |  |  No case conflicting files 
found.  |
   | +0 :ok: |  codespell  |   0m  0s |  |  codespell was not available.  |
   | +0 :ok: |  detsecrets  |   0m  0s |  |  detect-secrets was not available.  
|
   | +0 :ok: |  xmllint  |   0m  0s |  |  xmllint was not available.  |
   | +0 :ok: |  markdownlint  |   0m  0s |  |  markdownlint was not available.  
|
   | +1 :green_heart: |  @author  |   0m  0s |  |  The patch does not contain 
any @author tags.  |
   | +1 :green_heart: |  test4tests  |   0m  0s |  |  The patch appears to 
include 2 new or modified test files.  |
    _ trunk Compile Tests _ |
   | -1 :x: |  mvninstall  |   3m 27s | 
[/branch-mvninstall-root.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-6839/10/artifact/out/branch-mvninstall-root.txt)
 |  root in trunk failed.  |
   | +1 :green_heart: |  compile  |   3m 36s |  |  trunk passed with JDK 
Ubuntu-11.0.22+7-post-Ubuntu-0ubuntu220.04.1  |
   | +1 :green_heart: |  compile  |   0m 31s |  |  trunk passed with JDK 
Private Build-1.8.0_402-8u402-ga-2ubuntu1~20.04-b06  |
   | +1 :green_heart: |  checkstyle  |   0m 32s |  |  trunk passed  |
   | +1 :green_heart: |  mvnsite  |   0m 39s |  |  trunk passed  |
   | +1 :green_heart: |  javadoc  |   0m 38s |  |  trunk passed with JDK 
Ubuntu-11.0.22+7-post-Ubuntu-0ubuntu220.04.1  |
   | +1 :green_heart: |  javadoc  |   0m 26s |  |  trunk passed with JDK 
Private Build-1.8.0_402-8u402-ga-2ubuntu1~20.04-b06  |
   | +1 :green_heart: |  spotbugs  |   1m 16s |  |  trunk passed  |
   | +1 :green_heart: |  shadedclient  |  37m 46s |  |  branch has no errors 
when building and testing our client artifacts.  |
    _ Patch Compile Tests _ |
   | +1 :green_heart: |  mvninstall  |   0m 38s |  |  the patch passed  |
   | +1 :green_heart: |  compile  |   0m 33s |  |  the patch passed with JDK 
Ubuntu-11.0.22+7-post-Ubuntu-0ubuntu220.04.1  |
   | +1 :green_heart: |  javac  |   0m 33s |  |  the patch passed  |
   | +1 :green_heart: |  compile  |   0m 30s |  |  the patch passed with JDK 
Private Build-1.8.0_402-8u402-ga-2ubuntu1~20.04-b06  |
   | +1 :green_heart: |  javac  |   0m 30s |  |  the patch passed  |
   | +1 :green_heart: |  blanks  |   0m  0s |  |  The patch has no blanks 
issues.  |
   | -0 :warning: |  checkstyle  |   0m 19s | 
[/results-checkstyle-hadoop-hdfs-project_hadoop-hdfs-rbf.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-6839/10/artifact/out/results-checkstyle-hadoop-hdfs-project_hadoop-hdfs-rbf.txt)
 |  hadoop-hdfs-project/hadoop-hdfs-rbf: The patch generated 1 new + 1 
unchanged - 0 fixed = 2 total (was 1)  |
   | +1 :green_heart: |  mvnsite  |   0m 33s |  |  the patch passed  |
   | -1 :x: |  javadoc  |   0m 29s | 
[/results-javadoc-javadoc-hadoop-hdfs-project_hadoop-hdfs-rbf-jdkUbuntu-11.0.22+7-post-Ubuntu-0ubuntu220.04.1.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-6839/10/artifact/out/results-javadoc-javadoc-hadoop-hdfs-project_hadoop-hdfs-rbf-jdkUbuntu-11.0.22+7-post-Ubuntu-0ubuntu220.04.1.txt)
 |  
hadoop-hdfs-project_hadoop-hdfs-rbf-jdkUbuntu-11.0.22+7-post-Ubuntu-0ubuntu220.04.1
 with JDK Ubuntu-11.0.22+7-post-Ubuntu-0ubuntu220.04.1 generated 1 new + 0 
unchanged - 0 fixed = 1 total (was 0)  |
   | +1 :green_heart: |  javadoc  |   0m 25s |  |  the patch passed with JDK 
Private Build-1.8.0_402-8u402-ga-2ubuntu1~20.04-b06  |
   | +1 :green_heart: |  spotbugs  |   1m 22s |  |  the patch passed  |
   | +1 :green_heart: |  shadedclient  |  33m 29s |  |  patch has no errors 
when building and testing our client artifacts.  |
    _ Other Tests _ |
   | +1 :green_heart: |  unit  |  29m 59s |  |  hadoop-hdfs-rbf in the patch 
passed.  |
   | +1 :green_heart: |  asflicense  |   0m 41s |  |  The patch does not 
generate ASF License warnings.  |
   |  |   | 123m 43s |  |  |
   
   
   | Subsystem | Report/Notes |
   |--:|:-|
   | Docker | ClientAPI=1.45 ServerAPI=1.45 base: 
https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-6839/10/artifact/out/Dockerfile
 |
   | GITHUB PR | https://github.com/apache/hadoop/pull/6839 |
   | Optional Tests | dupname asflicense compile javac javadoc mvninstall 
mvnsite unit shadedclient spotbugs checkstyle codespell detsecrets xmllint 
markdownlint |
   | uname | Linux 7f83cd0f118e 5.15.0-106-generic #116-Ubuntu SMP Wed Apr 17 
09:17:56 

[jira] [Commented] (HDFS-17536) RBF: Format safe-mode related logic and fix a race

2024-05-24 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-17536?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17849279#comment-17849279
 ] 

ASF GitHub Bot commented on HDFS-17536:
---

hadoop-yetus commented on PR #6844:
URL: https://github.com/apache/hadoop/pull/6844#issuecomment-2129396098

   :broken_heart: **-1 overall**

   | Vote | Subsystem | Runtime |  Logfile | Comment |
   |:----:|----------:|--------:|:--------:|:-------:|
   | +0 :ok: |  reexec  |   0m 29s |  |  Docker mode activated.  |
    _ Prechecks _ |
   | +1 :green_heart: |  dupname  |   0m  0s |  |  No case conflicting files 
found.  |
   | +0 :ok: |  codespell  |   0m  0s |  |  codespell was not available.  |
   | +0 :ok: |  detsecrets  |   0m  0s |  |  detect-secrets was not available.  
|
   | +1 :green_heart: |  @author  |   0m  0s |  |  The patch does not contain 
any @author tags.  |
   | -1 :x: |  test4tests  |   0m  0s |  |  The patch doesn't appear to include 
any new or modified tests. Please justify why no new tests are needed for this 
patch. Also please list what manual steps were performed to verify this patch.  
|
    _ trunk Compile Tests _ |
   | -1 :x: |  mvninstall  |  22m  1s | 
[/branch-mvninstall-root.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-6844/1/artifact/out/branch-mvninstall-root.txt)
 |  root in trunk failed.  |
   | +1 :green_heart: |  compile  |   0m 40s |  |  trunk passed with JDK 
Ubuntu-11.0.22+7-post-Ubuntu-0ubuntu220.04.1  |
   | +1 :green_heart: |  compile  |   0m 37s |  |  trunk passed with JDK 
Private Build-1.8.0_402-8u402-ga-2ubuntu1~20.04-b06  |
   | +1 :green_heart: |  checkstyle  |   0m 30s |  |  trunk passed  |
   | +1 :green_heart: |  mvnsite  |   0m 42s |  |  trunk passed  |
   | +1 :green_heart: |  javadoc  |   0m 44s |  |  trunk passed with JDK 
Ubuntu-11.0.22+7-post-Ubuntu-0ubuntu220.04.1  |
   | +1 :green_heart: |  javadoc  |   0m 30s |  |  trunk passed with JDK 
Private Build-1.8.0_402-8u402-ga-2ubuntu1~20.04-b06  |
   | +1 :green_heart: |  spotbugs  |   1m 20s |  |  trunk passed  |
   | +1 :green_heart: |  shadedclient  |  33m 31s |  |  branch has no errors 
when building and testing our client artifacts.  |
    _ Patch Compile Tests _ |
   | +1 :green_heart: |  mvninstall  |   0m 31s |  |  the patch passed  |
   | +1 :green_heart: |  compile  |   0m 31s |  |  the patch passed with JDK 
Ubuntu-11.0.22+7-post-Ubuntu-0ubuntu220.04.1  |
   | +1 :green_heart: |  javac  |   0m 31s |  |  the patch passed  |
   | +1 :green_heart: |  compile  |   0m 29s |  |  the patch passed with JDK 
Private Build-1.8.0_402-8u402-ga-2ubuntu1~20.04-b06  |
   | +1 :green_heart: |  javac  |   0m 29s |  |  the patch passed  |
   | +1 :green_heart: |  blanks  |   0m  0s |  |  The patch has no blanks 
issues.  |
   | -0 :warning: |  checkstyle  |   0m 17s | 
[/results-checkstyle-hadoop-hdfs-project_hadoop-hdfs-rbf.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-6844/1/artifact/out/results-checkstyle-hadoop-hdfs-project_hadoop-hdfs-rbf.txt)
 |  hadoop-hdfs-project/hadoop-hdfs-rbf: The patch generated 1 new + 2 
unchanged - 0 fixed = 3 total (was 2)  |
   | +1 :green_heart: |  mvnsite  |   0m 32s |  |  the patch passed  |
   | +1 :green_heart: |  javadoc  |   0m 28s |  |  the patch passed with JDK 
Ubuntu-11.0.22+7-post-Ubuntu-0ubuntu220.04.1  |
   | +1 :green_heart: |  javadoc  |   0m 23s |  |  the patch passed with JDK 
Private Build-1.8.0_402-8u402-ga-2ubuntu1~20.04-b06  |
   | -1 :x: |  spotbugs  |   1m 24s | 
[/new-spotbugs-hadoop-hdfs-project_hadoop-hdfs-rbf.html](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-6844/1/artifact/out/new-spotbugs-hadoop-hdfs-project_hadoop-hdfs-rbf.html)
 |  hadoop-hdfs-project/hadoop-hdfs-rbf generated 1 new + 0 unchanged - 0 fixed 
= 1 total (was 0)  |
   | +1 :green_heart: |  shadedclient  |  33m 47s |  |  patch has no errors 
when building and testing our client artifacts.  |
    _ Other Tests _ |
   | +1 :green_heart: |  unit  |  30m 46s |  |  hadoop-hdfs-rbf in the patch 
passed.  |
   | +1 :green_heart: |  asflicense  |   0m 37s |  |  The patch does not 
generate ASF License warnings.  |
   |  |   | 136m  9s |  |  |
   
   
   | Reason | Tests |
   |---:|:--|
   | SpotBugs | module:hadoop-hdfs-project/hadoop-hdfs-rbf |
   |  |  Inconsistent synchronization of 
org.apache.hadoop.hdfs.server.federation.router.RouterSafemodeService.startupTime;
 locked 50% of time. Unsynchronized access at RouterSafemodeService.java:[line 146] |
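
For readers unfamiliar with this SpotBugs category: "inconsistent synchronization" means a field is accessed with a lock held in some places and without it in others. A minimal sketch of one common remedy, guarding every access with the same monitor, is shown here. The class and method names (`StartupClockSketch`, `markStartup`, `elapsedSinceStartup`) are hypothetical and are not taken from `RouterSafemodeService` or from the patch.

```java
/** Hypothetical stand-in for a service that records its startup time. */
class StartupClockSketch {
  // Guarded by "this": every read and write below holds the same monitor.
  private long startupTime;

  /** Records the startup timestamp while holding the lock. */
  synchronized void markStartup(long nowMs) {
    startupTime = nowMs;
  }

  /** Reads the field under the same lock, so no access is unguarded. */
  synchronized long elapsedSinceStartup(long nowMs) {
    return nowMs - startupTime;
  }

  public static void main(String[] args) {
    StartupClockSketch clock = new StartupClockSketch();
    clock.markStartup(1_000L);
    System.out.println(clock.elapsedSinceStartup(1_500L)); // prints 500
  }
}
```

Whether locking, a `volatile` field, or an atomic is the right choice for the real service depends on how the field is used; this sketch only illustrates the pattern the warning is about.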
   
   
   | Subsystem | Report/Notes |
   |--:|:-|
   | Docker | ClientAPI=1.44 ServerAPI=1.44 base: 
https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-6844/1/artifact/out/Dockerfile
 |
   | GITHUB PR | https://github.com/apache/hadoop/pull/6844 |
   | Optional Tests | dupname asflicense compile javac 

[jira] [Commented] (HDFS-17532) RBF: Allow router state store cache update to overwrite and delete in parallel

2024-05-24 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-17532?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17849265#comment-17849265
 ] 

ASF GitHub Bot commented on HDFS-17532:
---

hadoop-yetus commented on PR #6839:
URL: https://github.com/apache/hadoop/pull/6839#issuecomment-2129282223

   :broken_heart: **-1 overall**

   | Vote | Subsystem | Runtime |  Logfile | Comment |
   |:----:|----------:|--------:|:--------:|:-------:|
   | +0 :ok: |  reexec  |   0m 19s |  |  Docker mode activated.  |
    _ Prechecks _ |
   | +1 :green_heart: |  dupname  |   0m  0s |  |  No case conflicting files 
found.  |
   | +0 :ok: |  codespell  |   0m  1s |  |  codespell was not available.  |
   | +0 :ok: |  detsecrets  |   0m  1s |  |  detect-secrets was not available.  
|
   | +0 :ok: |  xmllint  |   0m  1s |  |  xmllint was not available.  |
   | +0 :ok: |  markdownlint  |   0m  1s |  |  markdownlint was not available.  
|
   | +1 :green_heart: |  @author  |   0m  0s |  |  The patch does not contain 
any @author tags.  |
   | +1 :green_heart: |  test4tests  |   0m  0s |  |  The patch appears to 
include 2 new or modified test files.  |
    _ trunk Compile Tests _ |
   | +1 :green_heart: |  mvninstall  |  33m 12s |  |  trunk passed  |
   | +1 :green_heart: |  compile  |   0m 24s |  |  trunk passed with JDK 
Ubuntu-11.0.22+7-post-Ubuntu-0ubuntu220.04.1  |
   | +1 :green_heart: |  compile  |   0m 21s |  |  trunk passed with JDK 
Private Build-1.8.0_402-8u402-ga-2ubuntu1~20.04-b06  |
   | +1 :green_heart: |  checkstyle  |   0m 20s |  |  trunk passed  |
   | +1 :green_heart: |  mvnsite  |   0m 25s |  |  trunk passed  |
   | +1 :green_heart: |  javadoc  |   0m 37s |  |  trunk passed with JDK 
Ubuntu-11.0.22+7-post-Ubuntu-0ubuntu220.04.1  |
   | +1 :green_heart: |  javadoc  |   0m 19s |  |  trunk passed with JDK 
Private Build-1.8.0_402-8u402-ga-2ubuntu1~20.04-b06  |
   | +1 :green_heart: |  spotbugs  |   0m 54s |  |  trunk passed  |
   | +1 :green_heart: |  shadedclient  |  19m 50s |  |  branch has no errors 
when building and testing our client artifacts.  |
    _ Patch Compile Tests _ |
   | +1 :green_heart: |  mvninstall  |   0m 18s |  |  the patch passed  |
   | +1 :green_heart: |  compile  |   0m 19s |  |  the patch passed with JDK 
Ubuntu-11.0.22+7-post-Ubuntu-0ubuntu220.04.1  |
   | +1 :green_heart: |  javac  |   0m 19s |  |  the patch passed  |
   | +1 :green_heart: |  compile  |   0m 16s |  |  the patch passed with JDK 
Private Build-1.8.0_402-8u402-ga-2ubuntu1~20.04-b06  |
   | +1 :green_heart: |  javac  |   0m 16s |  |  the patch passed  |
   | +1 :green_heart: |  blanks  |   0m  0s |  |  The patch has no blanks 
issues.  |
   | -0 :warning: |  checkstyle  |   0m 11s | 
[/results-checkstyle-hadoop-hdfs-project_hadoop-hdfs-rbf.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-6839/9/artifact/out/results-checkstyle-hadoop-hdfs-project_hadoop-hdfs-rbf.txt)
 |  hadoop-hdfs-project/hadoop-hdfs-rbf: The patch generated 2 new + 1 
unchanged - 0 fixed = 3 total (was 1)  |
   | +1 :green_heart: |  mvnsite  |   0m 20s |  |  the patch passed  |
   | -1 :x: |  javadoc  |   0m 17s | 
[/results-javadoc-javadoc-hadoop-hdfs-project_hadoop-hdfs-rbf-jdkUbuntu-11.0.22+7-post-Ubuntu-0ubuntu220.04.1.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-6839/9/artifact/out/results-javadoc-javadoc-hadoop-hdfs-project_hadoop-hdfs-rbf-jdkUbuntu-11.0.22+7-post-Ubuntu-0ubuntu220.04.1.txt)
 |  
hadoop-hdfs-project_hadoop-hdfs-rbf-jdkUbuntu-11.0.22+7-post-Ubuntu-0ubuntu220.04.1
 with JDK Ubuntu-11.0.22+7-post-Ubuntu-0ubuntu220.04.1 generated 1 new + 0 
unchanged - 0 fixed = 1 total (was 0)  |
   | +1 :green_heart: |  javadoc  |   0m 16s |  |  the patch passed with JDK 
Private Build-1.8.0_402-8u402-ga-2ubuntu1~20.04-b06  |
   | -1 :x: |  spotbugs  |   0m 52s | 
[/new-spotbugs-hadoop-hdfs-project_hadoop-hdfs-rbf.html](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-6839/9/artifact/out/new-spotbugs-hadoop-hdfs-project_hadoop-hdfs-rbf.html)
 |  hadoop-hdfs-project/hadoop-hdfs-rbf generated 1 new + 0 unchanged - 0 fixed 
= 1 total (was 0)  |
   | +1 :green_heart: |  shadedclient  |  19m  3s |  |  patch has no errors 
when building and testing our client artifacts.  |
    _ Other Tests _ |
   | +1 :green_heart: |  unit  |  30m 43s |  |  hadoop-hdfs-rbf in the patch 
passed.  |
   | +1 :green_heart: |  asflicense  |   0m 29s |  |  The patch does not 
generate ASF License warnings.  |
   |  |   | 113m 25s |  |  |
   
   
   | Reason | Tests |
   |---:|:--|
   | SpotBugs | module:hadoop-hdfs-project/hadoop-hdfs-rbf |
   |  |  Exceptional return value of 
java.util.concurrent.ThreadPoolExecutor.submit(Callable) ignored in 
org.apache.hadoop.hdfs.server.federation.store.driver.StateStoreDriver.handleOverwriteAndDelete(List,
 List)  At StateStoreDriver.java:ignored in 

[jira] [Commented] (HDFS-17532) RBF: Allow router state store cache update to overwrite and delete in parallel

2024-05-24 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-17532?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17849263#comment-17849263
 ] 

ASF GitHub Bot commented on HDFS-17532:
---

hadoop-yetus commented on PR #6839:
URL: https://github.com/apache/hadoop/pull/6839#issuecomment-2129273278

   :broken_heart: **-1 overall**

   | Vote | Subsystem | Runtime |  Logfile | Comment |
   |:----:|----------:|--------:|:--------:|:-------:|
   | +0 :ok: |  reexec  |   0m 20s |  |  Docker mode activated.  |
    _ Prechecks _ |
   | +1 :green_heart: |  dupname  |   0m  0s |  |  No case conflicting files 
found.  |
   | +0 :ok: |  codespell  |   0m  0s |  |  codespell was not available.  |
   | +0 :ok: |  detsecrets  |   0m  0s |  |  detect-secrets was not available.  
|
   | +0 :ok: |  xmllint  |   0m  0s |  |  xmllint was not available.  |
   | +0 :ok: |  markdownlint  |   0m  0s |  |  markdownlint was not available.  
|
   | +1 :green_heart: |  @author  |   0m  0s |  |  The patch does not contain 
any @author tags.  |
   | +1 :green_heart: |  test4tests  |   0m  0s |  |  The patch appears to 
include 2 new or modified test files.  |
    _ trunk Compile Tests _ |
   | -1 :x: |  mvninstall  |  33m 23s | 
[/branch-mvninstall-root.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-6839/8/artifact/out/branch-mvninstall-root.txt)
 |  root in trunk failed.  |
   | +1 :green_heart: |  compile  |   0m 26s |  |  trunk passed with JDK 
Ubuntu-11.0.22+7-post-Ubuntu-0ubuntu220.04.1  |
   | +1 :green_heart: |  compile  |   0m 21s |  |  trunk passed with JDK 
Private Build-1.8.0_402-8u402-ga-2ubuntu1~20.04-b06  |
   | +1 :green_heart: |  checkstyle  |   0m 21s |  |  trunk passed  |
   | +1 :green_heart: |  mvnsite  |   0m 28s |  |  trunk passed  |
   | +1 :green_heart: |  javadoc  |   0m 33s |  |  trunk passed with JDK 
Ubuntu-11.0.22+7-post-Ubuntu-0ubuntu220.04.1  |
   | +1 :green_heart: |  javadoc  |   0m 18s |  |  trunk passed with JDK 
Private Build-1.8.0_402-8u402-ga-2ubuntu1~20.04-b06  |
   | +1 :green_heart: |  spotbugs  |   0m 54s |  |  trunk passed  |
   | +1 :green_heart: |  shadedclient  |  20m 12s |  |  branch has no errors 
when building and testing our client artifacts.  |
    _ Patch Compile Tests _ |
   | +1 :green_heart: |  mvninstall  |   0m 18s |  |  the patch passed  |
   | +1 :green_heart: |  compile  |   0m 18s |  |  the patch passed with JDK 
Ubuntu-11.0.22+7-post-Ubuntu-0ubuntu220.04.1  |
   | +1 :green_heart: |  javac  |   0m 18s |  |  the patch passed  |
   | +1 :green_heart: |  compile  |   0m 17s |  |  the patch passed with JDK 
Private Build-1.8.0_402-8u402-ga-2ubuntu1~20.04-b06  |
   | +1 :green_heart: |  javac  |   0m 17s |  |  the patch passed  |
   | +1 :green_heart: |  blanks  |   0m  0s |  |  The patch has no blanks 
issues.  |
   | -0 :warning: |  checkstyle  |   0m 12s | 
[/results-checkstyle-hadoop-hdfs-project_hadoop-hdfs-rbf.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-6839/8/artifact/out/results-checkstyle-hadoop-hdfs-project_hadoop-hdfs-rbf.txt)
 |  hadoop-hdfs-project/hadoop-hdfs-rbf: The patch generated 2 new + 1 
unchanged - 0 fixed = 3 total (was 1)  |
   | +1 :green_heart: |  mvnsite  |   0m 21s |  |  the patch passed  |
   | -1 :x: |  javadoc  |   0m 17s | 
[/results-javadoc-javadoc-hadoop-hdfs-project_hadoop-hdfs-rbf-jdkUbuntu-11.0.22+7-post-Ubuntu-0ubuntu220.04.1.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-6839/8/artifact/out/results-javadoc-javadoc-hadoop-hdfs-project_hadoop-hdfs-rbf-jdkUbuntu-11.0.22+7-post-Ubuntu-0ubuntu220.04.1.txt)
 |  
hadoop-hdfs-project_hadoop-hdfs-rbf-jdkUbuntu-11.0.22+7-post-Ubuntu-0ubuntu220.04.1
 with JDK Ubuntu-11.0.22+7-post-Ubuntu-0ubuntu220.04.1 generated 1 new + 0 
unchanged - 0 fixed = 1 total (was 0)  |
   | +1 :green_heart: |  javadoc  |   0m 16s |  |  the patch passed with JDK 
Private Build-1.8.0_402-8u402-ga-2ubuntu1~20.04-b06  |
   | -1 :x: |  spotbugs  |   0m 52s | 
[/new-spotbugs-hadoop-hdfs-project_hadoop-hdfs-rbf.html](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-6839/8/artifact/out/new-spotbugs-hadoop-hdfs-project_hadoop-hdfs-rbf.html)
 |  hadoop-hdfs-project/hadoop-hdfs-rbf generated 1 new + 0 unchanged - 0 fixed 
= 1 total (was 0)  |
   | +1 :green_heart: |  shadedclient  |  20m 12s |  |  patch has no errors 
when building and testing our client artifacts.  |
    _ Other Tests _ |
   | +1 :green_heart: |  unit  |  26m 54s |  |  hadoop-hdfs-rbf in the patch 
passed.  |
   | +1 :green_heart: |  asflicense  |   0m 27s |  |  The patch does not 
generate ASF License warnings.  |
   |  |   | 111m 18s |  |  |
   
   
   | Reason | Tests |
   |---:|:--|
   | SpotBugs | module:hadoop-hdfs-project/hadoop-hdfs-rbf |
   |  |  Exceptional return value of 
java.util.concurrent.ThreadPoolExecutor.submit(Callable) ignored in 

[jira] [Commented] (HDFS-17532) RBF: Allow router state store cache update to overwrite and delete in parallel

2024-05-24 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-17532?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17849262#comment-17849262
 ] 

ASF GitHub Bot commented on HDFS-17532:
---

hadoop-yetus commented on PR #6839:
URL: https://github.com/apache/hadoop/pull/6839#issuecomment-2129257791

   :broken_heart: **-1 overall**

   | Vote | Subsystem | Runtime |  Logfile | Comment |
   |:----:|----------:|--------:|:--------:|:-------:|
   | +0 :ok: |  reexec  |   0m 46s |  |  Docker mode activated.  |
    _ Prechecks _ |
   | +1 :green_heart: |  dupname  |   0m  0s |  |  No case conflicting files 
found.  |
   | +0 :ok: |  codespell  |   0m  0s |  |  codespell was not available.  |
   | +0 :ok: |  detsecrets  |   0m  0s |  |  detect-secrets was not available.  
|
   | +0 :ok: |  xmllint  |   0m  0s |  |  xmllint was not available.  |
   | +0 :ok: |  markdownlint  |   0m  0s |  |  markdownlint was not available.  
|
   | +1 :green_heart: |  @author  |   0m  0s |  |  The patch does not contain 
any @author tags.  |
   | +1 :green_heart: |  test4tests  |   0m  0s |  |  The patch appears to 
include 1 new or modified test files.  |
    _ trunk Compile Tests _ |
   | +1 :green_heart: |  mvninstall  |  50m 32s |  |  trunk passed  |
   | +1 :green_heart: |  compile  |   0m 39s |  |  trunk passed with JDK 
Ubuntu-11.0.22+7-post-Ubuntu-0ubuntu220.04.1  |
   | +1 :green_heart: |  compile  |   0m 36s |  |  trunk passed with JDK 
Private Build-1.8.0_402-8u402-ga-2ubuntu1~20.04-b06  |
   | +1 :green_heart: |  checkstyle  |   0m 31s |  |  trunk passed  |
   | +1 :green_heart: |  mvnsite  |   0m 43s |  |  trunk passed  |
   | +1 :green_heart: |  javadoc  |   0m 49s |  |  trunk passed with JDK 
Ubuntu-11.0.22+7-post-Ubuntu-0ubuntu220.04.1  |
   | +1 :green_heart: |  javadoc  |   0m 30s |  |  trunk passed with JDK 
Private Build-1.8.0_402-8u402-ga-2ubuntu1~20.04-b06  |
   | +1 :green_heart: |  spotbugs  |   1m 22s |  |  trunk passed  |
   | +1 :green_heart: |  shadedclient  |  39m 14s |  |  branch has no errors 
when building and testing our client artifacts.  |
    _ Patch Compile Tests _ |
   | +1 :green_heart: |  mvninstall  |   0m 31s |  |  the patch passed  |
   | +1 :green_heart: |  compile  |   0m 34s |  |  the patch passed with JDK 
Ubuntu-11.0.22+7-post-Ubuntu-0ubuntu220.04.1  |
   | +1 :green_heart: |  javac  |   0m 34s |  |  the patch passed  |
   | +1 :green_heart: |  compile  |   0m 28s |  |  the patch passed with JDK 
Private Build-1.8.0_402-8u402-ga-2ubuntu1~20.04-b06  |
   | +1 :green_heart: |  javac  |   0m 28s |  |  the patch passed  |
   | +1 :green_heart: |  blanks  |   0m  0s |  |  The patch has no blanks 
issues.  |
   | -0 :warning: |  checkstyle  |   0m 19s | 
[/results-checkstyle-hadoop-hdfs-project_hadoop-hdfs-rbf.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-6839/7/artifact/out/results-checkstyle-hadoop-hdfs-project_hadoop-hdfs-rbf.txt)
 |  hadoop-hdfs-project/hadoop-hdfs-rbf: The patch generated 3 new + 1 
unchanged - 0 fixed = 4 total (was 1)  |
   | +1 :green_heart: |  mvnsite  |   0m 32s |  |  the patch passed  |
   | +1 :green_heart: |  javadoc  |   0m 29s |  |  the patch passed with JDK 
Ubuntu-11.0.22+7-post-Ubuntu-0ubuntu220.04.1  |
   | +1 :green_heart: |  javadoc  |   0m 23s |  |  the patch passed with JDK 
Private Build-1.8.0_402-8u402-ga-2ubuntu1~20.04-b06  |
   | -1 :x: |  spotbugs  |   1m 24s | 
[/new-spotbugs-hadoop-hdfs-project_hadoop-hdfs-rbf.html](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-6839/7/artifact/out/new-spotbugs-hadoop-hdfs-project_hadoop-hdfs-rbf.html)
 |  hadoop-hdfs-project/hadoop-hdfs-rbf generated 1 new + 0 unchanged - 0 fixed 
= 1 total (was 0)  |
   | +1 :green_heart: |  shadedclient  |  38m 39s |  |  patch has no errors 
when building and testing our client artifacts.  |
    _ Other Tests _ |
   | -1 :x: |  unit  |  36m 43s | 
[/patch-unit-hadoop-hdfs-project_hadoop-hdfs-rbf.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-6839/7/artifact/out/patch-unit-hadoop-hdfs-project_hadoop-hdfs-rbf.txt)
 |  hadoop-hdfs-rbf in the patch passed.  |
   | +1 :green_heart: |  asflicense  |   0m 36s |  |  The patch does not 
generate ASF License warnings.  |
   |  |   | 181m 46s |  |  |
   
   
   | Reason | Tests |
   |---:|:--|
   | SpotBugs | module:hadoop-hdfs-project/hadoop-hdfs-rbf |
   |  |  Exceptional return value of 
java.util.concurrent.ThreadPoolExecutor.submit(Callable) ignored in 
org.apache.hadoop.hdfs.server.federation.store.CachedRecordStore.overrideExpiredRecords(QueryResult)
  At CachedRecordStore.java:[line 243] |
   | Failed junit tests | hadoop.hdfs.server.federation.router.TestRouterRpc |
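
The SpotBugs finding above concerns a `Future` returned by `submit(Callable)` being dropped, which means any exception thrown inside the task is never observed. A minimal sketch of the usual pattern, keeping each `Future` and calling `get()` on it, might look like the following. `SubmitResultSketch` and `runAll` are hypothetical names for illustration and do not reflect the code in `CachedRecordStore` or the eventual fix in the PR.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

class SubmitResultSketch {
  /** Runs all tasks on a small pool and returns how many returned true. */
  static int runAll(List<Callable<Boolean>> tasks) {
    ExecutorService pool = Executors.newFixedThreadPool(2);
    try {
      List<Future<Boolean>> futures = new ArrayList<>();
      for (Callable<Boolean> task : tasks) {
        // Keep the Future instead of discarding it.
        futures.add(pool.submit(task));
      }
      int succeeded = 0;
      for (Future<Boolean> future : futures) {
        try {
          if (Boolean.TRUE.equals(future.get())) {
            succeeded++;
          }
        } catch (ExecutionException e) {
          // The task threw; a real caller would log or rethrow the cause.
          System.err.println("task failed: " + e.getCause());
        } catch (InterruptedException e) {
          // Restore the interrupt flag and stop waiting for further results.
          Thread.currentThread().interrupt();
          break;
        }
      }
      return succeeded;
    } finally {
      pool.shutdown();
    }
  }

  public static void main(String[] args) {
    List<Callable<Boolean>> tasks = new ArrayList<>();
    tasks.add(() -> true);
    tasks.add(() -> { throw new IllegalStateException("boom"); });
    System.out.println(runAll(tasks)); // prints 1
  }
}
```

Waiting on every `Future` also makes the caller block until all submitted work finishes, which may or may not be the behavior a given call site wants.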
   
   
   | Subsystem | Report/Notes |
   |--:|:-|
   | 

[jira] [Commented] (HDFS-17532) RBF: Allow router state store cache update to overwrite and delete in parallel

2024-05-24 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-17532?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17849249#comment-17849249
 ] 

ASF GitHub Bot commented on HDFS-17532:
---

hadoop-yetus commented on PR #6839:
URL: https://github.com/apache/hadoop/pull/6839#issuecomment-2129175735

   :broken_heart: **-1 overall**

   | Vote | Subsystem | Runtime |  Logfile | Comment |
   |:----:|----------:|--------:|:--------:|:-------:|
   | +0 :ok: |  reexec  |   0m 32s |  |  Docker mode activated.  |
    _ Prechecks _ |
   | +1 :green_heart: |  dupname  |   0m  0s |  |  No case conflicting files 
found.  |
   | +0 :ok: |  codespell  |   0m  0s |  |  codespell was not available.  |
   | +0 :ok: |  detsecrets  |   0m  0s |  |  detect-secrets was not available.  
|
   | +0 :ok: |  xmllint  |   0m  0s |  |  xmllint was not available.  |
   | +0 :ok: |  markdownlint  |   0m  0s |  |  markdownlint was not available.  
|
   | +1 :green_heart: |  @author  |   0m  0s |  |  The patch does not contain 
any @author tags.  |
   | +1 :green_heart: |  test4tests  |   0m  0s |  |  The patch appears to 
include 1 new or modified test files.  |
    _ trunk Compile Tests _ |
   | +1 :green_heart: |  mvninstall  |  45m 16s |  |  trunk passed  |
   | +1 :green_heart: |  compile  |   0m 38s |  |  trunk passed with JDK 
Ubuntu-11.0.22+7-post-Ubuntu-0ubuntu220.04.1  |
   | +1 :green_heart: |  compile  |   0m 34s |  |  trunk passed with JDK 
Private Build-1.8.0_402-8u402-ga-2ubuntu1~20.04-b06  |
   | +1 :green_heart: |  checkstyle  |   0m 29s |  |  trunk passed  |
   | +1 :green_heart: |  mvnsite  |   0m 39s |  |  trunk passed  |
   | +1 :green_heart: |  javadoc  |   0m 39s |  |  trunk passed with JDK 
Ubuntu-11.0.22+7-post-Ubuntu-0ubuntu220.04.1  |
   | +1 :green_heart: |  javadoc  |   0m 29s |  |  trunk passed with JDK 
Private Build-1.8.0_402-8u402-ga-2ubuntu1~20.04-b06  |
   | +1 :green_heart: |  spotbugs  |   1m 19s |  |  trunk passed  |
   | +1 :green_heart: |  shadedclient  |  35m 10s |  |  branch has no errors 
when building and testing our client artifacts.  |
    _ Patch Compile Tests _ |
   | +1 :green_heart: |  mvninstall  |   0m 32s |  |  the patch passed  |
   | +1 :green_heart: |  compile  |   0m 33s |  |  the patch passed with JDK 
Ubuntu-11.0.22+7-post-Ubuntu-0ubuntu220.04.1  |
   | +1 :green_heart: |  javac  |   0m 33s |  |  the patch passed  |
   | +1 :green_heart: |  compile  |   0m 28s |  |  the patch passed with JDK 
Private Build-1.8.0_402-8u402-ga-2ubuntu1~20.04-b06  |
   | +1 :green_heart: |  javac  |   0m 28s |  |  the patch passed  |
   | +1 :green_heart: |  blanks  |   0m  0s |  |  The patch has no blanks 
issues.  |
   | -0 :warning: |  checkstyle  |   0m 19s | 
[/results-checkstyle-hadoop-hdfs-project_hadoop-hdfs-rbf.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-6839/5/artifact/out/results-checkstyle-hadoop-hdfs-project_hadoop-hdfs-rbf.txt)
 |  hadoop-hdfs-project/hadoop-hdfs-rbf: The patch generated 3 new + 1 
unchanged - 0 fixed = 4 total (was 1)  |
   | +1 :green_heart: |  mvnsite  |   0m 32s |  |  the patch passed  |
   | +1 :green_heart: |  javadoc  |   0m 28s |  |  the patch passed with JDK 
Ubuntu-11.0.22+7-post-Ubuntu-0ubuntu220.04.1  |
   | +1 :green_heart: |  javadoc  |   0m 24s |  |  the patch passed with JDK 
Private Build-1.8.0_402-8u402-ga-2ubuntu1~20.04-b06  |
   | -1 :x: |  spotbugs  |   1m 21s | 
[/new-spotbugs-hadoop-hdfs-project_hadoop-hdfs-rbf.html](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-6839/5/artifact/out/new-spotbugs-hadoop-hdfs-project_hadoop-hdfs-rbf.html)
 |  hadoop-hdfs-project/hadoop-hdfs-rbf generated 1 new + 0 unchanged - 0 fixed 
= 1 total (was 0)  |
   | +1 :green_heart: |  shadedclient  |  34m 54s |  |  patch has no errors 
when building and testing our client artifacts.  |
    _ Other Tests _ |
   | -1 :x: |  unit  |  25m 16s | 
[/patch-unit-hadoop-hdfs-project_hadoop-hdfs-rbf.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-6839/5/artifact/out/patch-unit-hadoop-hdfs-project_hadoop-hdfs-rbf.txt)
 |  hadoop-hdfs-rbf in the patch failed.  |
   | +1 :green_heart: |  asflicense  |   0m 43s |  |  The patch does not 
generate ASF License warnings.  |
   |  |   | 156m 42s |  |  |
   
   
   | Reason | Tests |
   |---:|:--|
   | SpotBugs | module:hadoop-hdfs-project/hadoop-hdfs-rbf |
   |  |  Exceptional return value of 
java.util.concurrent.ThreadPoolExecutor.submit(Callable) ignored in 
org.apache.hadoop.hdfs.server.federation.store.CachedRecordStore.overrideExpiredRecords(QueryResult,
 boolean)  At CachedRecordStore.java:[line 235] |
   
   
   | Subsystem | Report/Notes |
   |--:|:-|
   | Docker | ClientAPI=1.45 ServerAPI=1.45 base: 

[jira] [Commented] (HDFS-17532) RBF: Allow router state store cache update to overwrite and delete in parallel

2024-05-24 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-17532?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17849247#comment-17849247
 ] 

ASF GitHub Bot commented on HDFS-17532:
---

hadoop-yetus commented on PR #6839:
URL: https://github.com/apache/hadoop/pull/6839#issuecomment-2129168597

   :confetti_ball: **+1 overall**

   | Vote | Subsystem | Runtime |  Logfile | Comment |
   |:----:|----------:|--------:|:--------:|:-------:|
   | +0 :ok: |  reexec  |   0m 30s |  |  Docker mode activated.  |
    _ Prechecks _ |
   | +1 :green_heart: |  dupname  |   0m  0s |  |  No case conflicting files 
found.  |
   | +0 :ok: |  codespell  |   0m  1s |  |  codespell was not available.  |
   | +0 :ok: |  detsecrets  |   0m  1s |  |  detect-secrets was not available.  
|
   | +0 :ok: |  xmllint  |   0m  1s |  |  xmllint was not available.  |
   | +0 :ok: |  markdownlint  |   0m  1s |  |  markdownlint was not available.  
|
   | +1 :green_heart: |  @author  |   0m  0s |  |  The patch does not contain 
any @author tags.  |
   | +1 :green_heart: |  test4tests  |   0m  0s |  |  The patch appears to 
include 1 new or modified test files.  |
    _ trunk Compile Tests _ |
   | +1 :green_heart: |  mvninstall  |  45m  2s |  |  trunk passed  |
   | +1 :green_heart: |  compile  |   0m 40s |  |  trunk passed with JDK 
Ubuntu-11.0.22+7-post-Ubuntu-0ubuntu220.04.1  |
   | +1 :green_heart: |  compile  |   0m 35s |  |  trunk passed with JDK 
Private Build-1.8.0_402-8u402-ga-2ubuntu1~20.04-b06  |
   | +1 :green_heart: |  checkstyle  |   0m 30s |  |  trunk passed  |
   | +1 :green_heart: |  mvnsite  |   0m 42s |  |  trunk passed  |
   | +1 :green_heart: |  javadoc  |   0m 42s |  |  trunk passed with JDK 
Ubuntu-11.0.22+7-post-Ubuntu-0ubuntu220.04.1  |
   | +1 :green_heart: |  javadoc  |   0m 32s |  |  trunk passed with JDK 
Private Build-1.8.0_402-8u402-ga-2ubuntu1~20.04-b06  |
   | +1 :green_heart: |  spotbugs  |   1m 20s |  |  trunk passed  |
   | +1 :green_heart: |  shadedclient  |  34m 52s |  |  branch has no errors 
when building and testing our client artifacts.  |
    _ Patch Compile Tests _ |
   | +1 :green_heart: |  mvninstall  |   0m 28s |  |  the patch passed  |
   | +1 :green_heart: |  compile  |   0m 31s |  |  the patch passed with JDK 
Ubuntu-11.0.22+7-post-Ubuntu-0ubuntu220.04.1  |
   | +1 :green_heart: |  javac  |   0m 31s |  |  the patch passed  |
   | +1 :green_heart: |  compile  |   0m 27s |  |  the patch passed with JDK 
Private Build-1.8.0_402-8u402-ga-2ubuntu1~20.04-b06  |
   | +1 :green_heart: |  javac  |   0m 27s |  |  the patch passed  |
   | +1 :green_heart: |  blanks  |   0m  0s |  |  The patch has no blanks 
issues.  |
   | -0 :warning: |  checkstyle  |   0m 18s | 
[/results-checkstyle-hadoop-hdfs-project_hadoop-hdfs-rbf.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-6839/4/artifact/out/results-checkstyle-hadoop-hdfs-project_hadoop-hdfs-rbf.txt)
 |  hadoop-hdfs-project/hadoop-hdfs-rbf: The patch generated 1 new + 1 
unchanged - 0 fixed = 2 total (was 1)  |
   | +1 :green_heart: |  mvnsite  |   0m 31s |  |  the patch passed  |
   | +1 :green_heart: |  javadoc  |   0m 30s |  |  the patch passed with JDK 
Ubuntu-11.0.22+7-post-Ubuntu-0ubuntu220.04.1  |
   | +1 :green_heart: |  javadoc  |   0m 24s |  |  the patch passed with JDK 
Private Build-1.8.0_402-8u402-ga-2ubuntu1~20.04-b06  |
   | +1 :green_heart: |  spotbugs  |   1m 24s |  |  the patch passed  |
   | +1 :green_heart: |  shadedclient  |  35m 19s |  |  patch has no errors 
when building and testing our client artifacts.  |
    _ Other Tests _ |
   | +1 :green_heart: |  unit  |  30m 23s |  |  hadoop-hdfs-rbf in the patch 
passed.  |
   | +1 :green_heart: |  asflicense  |   0m 39s |  |  The patch does not 
generate ASF License warnings.  |
   |  |   | 161m 33s |  |  |
   
   
   | Subsystem | Report/Notes |
   |--:|:-|
   | Docker | ClientAPI=1.45 ServerAPI=1.45 base: 
https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-6839/4/artifact/out/Dockerfile
 |
   | GITHUB PR | https://github.com/apache/hadoop/pull/6839 |
   | Optional Tests | dupname asflicense compile javac javadoc mvninstall 
mvnsite unit shadedclient spotbugs checkstyle codespell detsecrets xmllint 
markdownlint |
   | uname | Linux 443e13f1bf14 5.15.0-106-generic #116-Ubuntu SMP Wed Apr 17 
09:17:56 UTC 2024 x86_64 x86_64 x86_64 GNU/Linux |
   | Build tool | maven |
   | Personality | dev-support/bin/hadoop.sh |
   | git revision | trunk / 6146f3a1547f47a0b06594b2e74032c1532e61d8 |
   | Default Java | Private Build-1.8.0_402-8u402-ga-2ubuntu1~20.04-b06 |
   | Multi-JDK versions | 
/usr/lib/jvm/java-11-openjdk-amd64:Ubuntu-11.0.22+7-post-Ubuntu-0ubuntu220.04.1 
/usr/lib/jvm/java-8-openjdk-amd64:Private 
Build-1.8.0_402-8u402-ga-2ubuntu1~20.04-b06 |
   |  Test Results | 
https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-6839/4/testReport/ |
   | 

[jira] [Updated] (HDFS-17536) RBF: Format safe-mode related logic and fix a race

2024-05-24 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-17536?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated HDFS-17536:
--
Labels: pull-request-available  (was: )

> RBF: Format safe-mode related logic and fix a race 
> ---
>
> Key: HDFS-17536
> URL: https://issues.apache.org/jira/browse/HDFS-17536
> Project: Hadoop HDFS
>  Issue Type: Task
>Reporter: ZanderXu
>Assignee: ZanderXu
>Priority: Major
>  Labels: pull-request-available
>
> RBF: Format safe-mode related logic and fix a race.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-17536) RBF: Format safe-mode related logic and fix a race

2024-05-24 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-17536?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17849244#comment-17849244
 ] 

ASF GitHub Bot commented on HDFS-17536:
---

ZanderXu opened a new pull request, #6844:
URL: https://github.com/apache/hadoop/pull/6844

   Both `RouterAdminServer#enterSafeMode()` and 
`RouterSafemodeService#periodicInvoke()#leave` can change the router state at 
the same time. 
   
   The safe-mode change logic should be condensed into one method. Races 
can happen in the current implementation, for example:
   
   1. `RouterAdminServer#enterSafeMode()` sets the router state to 
`RouterServiceState.SAFEMODE`.
   2. `RouterSafemodeService#periodicInvoke()#leave` gets `true` when checking 
`safeMode && !isSafeModeSetManually`.
   3. `RouterAdminServer#enterSafeMode()` sets `safeMode` and 
`isSafeModeSetManually` to `true`.
   4. `RouterAdminServer#enterSafeMode()` gets `true` when checking safe-mode.
   5. `RouterSafemodeService#periodicInvoke()#leave` calls `leave()` to leave 
safe-mode.
   
   As a result, the router is not in safe-mode and `safeMode` is `false`, but 
`isSafeModeSetManually` is `true`.
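
A minimal sketch of what condensing the safe-mode change logic could look like, assuming one lock guards the router state and both flags. `SafeModeGuard` and its method names are hypothetical and are not the actual RBF classes; the point is only that the check and the transition happen in the same critical section:

```java
// Hypothetical sketch; SafeModeGuard is not a Hadoop class.
final class SafeModeGuard {
  enum State { RUNNING, SAFEMODE }

  private State state = State.RUNNING;
  private boolean safeMode = false;
  private boolean setManually = false;

  /** Admin path: the state and both flags change under one lock. */
  synchronized void enterManually() {
    state = State.SAFEMODE;
    safeMode = true;
    setManually = true;
  }

  /** Periodic path: the check and the transition are a single atomic step. */
  synchronized boolean leaveIfAutomatic() {
    if (!safeMode || setManually) {
      return false;
    }
    state = State.RUNNING;
    safeMode = false;
    return true;
  }

  /** Admin path: leaving always clears the manual flag as well. */
  synchronized void leaveManually() {
    state = State.RUNNING;
    safeMode = false;
    setManually = false;
  }

  synchronized State getState() { return state; }
  synchronized boolean isSafeMode() { return safeMode; }
  synchronized boolean isSetManually() { return setManually; }
}
```

With this shape, the periodic check can no longer observe a half-applied manual transition, so the mismatch described in the steps above should not arise in the sketch.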




> RBF: Format safe-mode related logic and fix a race 
> ---
>
> Key: HDFS-17536
> URL: https://issues.apache.org/jira/browse/HDFS-17536
> Project: Hadoop HDFS
>  Issue Type: Task
>Reporter: ZanderXu
>Assignee: ZanderXu
>Priority: Major
>
> RBF: Format safe-mode related logic and fix a race.






[jira] [Commented] (HDFS-17532) RBF: Allow router state store cache update to overwrite and delete in parallel

2024-05-24 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-17532?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17849233#comment-17849233
 ] 

ASF GitHub Bot commented on HDFS-17532:
---

ZanderXu commented on code in PR #6839:
URL: https://github.com/apache/hadoop/pull/6839#discussion_r1613173537


##
hadoop-hdfs-project/hadoop-hdfs-rbf/src/main/java/org/apache/hadoop/hdfs/server/federation/store/driver/StateStoreDriver.java:
##
@@ -88,6 +101,13 @@ public boolean init(final Configuration config, final 
String id,
 return false;
   }
 }
+
+if (conf.getBoolean(
+RBFConfigKeys.FEDERATION_STORE_MEMBERSHIP_ASYNC_OVERRIDE,
+RBFConfigKeys.FEDERATION_STORE_MEMBERSHIP_ASYNC_OVERRIDE_DEFAULT)) {
+  executor = new ThreadPoolExecutor(2, 2, 1L, TimeUnit.MINUTES, new 
LinkedBlockingQueue<>());

Review Comment:
   You can refer to HDFS-16848 and change this configuration to the number of 
threads.
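
One possible reading of this suggestion, sketched under assumptions: the boolean switch becomes an integer thread count, and a non-positive value keeps the synchronous path. The helper name and the "0 disables async mode" convention are hypothetical and are not taken from HDFS-16848 or from this PR:

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

// Hypothetical helper; the name and the "<= 0 means synchronous" rule are assumptions.
final class AsyncOverrideExecutors {
  private AsyncOverrideExecutors() {}

  /** Returns a fixed-size pool, or null when threads <= 0 (synchronous mode). */
  static ExecutorService create(int threads) {
    if (threads <= 0) {
      return null;
    }
    ThreadPoolExecutor pool = new ThreadPoolExecutor(
        threads, threads, 1L, TimeUnit.MINUTES, new LinkedBlockingQueue<>());
    // Let idle core threads exit after the keep-alive time.
    pool.allowCoreThreadTimeOut(true);
    return pool;
  }
}
```

The caller would read the count from configuration and keep the existing `executor != null` check to choose between the async and sync paths.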



##
hadoop-hdfs-project/hadoop-hdfs-rbf/src/main/java/org/apache/hadoop/hdfs/server/federation/store/driver/StateStoreDriver.java:
##
@@ -17,13 +17,23 @@
  */
 package org.apache.hadoop.hdfs.server.federation.store.driver;
 
+import java.io.IOException;
 import java.net.InetAddress;
+import java.util.ArrayList;
 import java.util.Collection;
+import java.util.HashMap;

Review Comment:
   unused.



##
hadoop-hdfs-project/hadoop-hdfs-rbf/src/main/java/org/apache/hadoop/hdfs/server/federation/store/driver/StateStoreDriver.java:
##
@@ -206,4 +231,48 @@ private String getHostname() {
 }
 return hostname;
   }
+
+  /**
+   * Try to overwrite records in commitRecords and remove records in 
deleteRecords.
+   * Should return null if async mode is used. Else return removed records.
+   * @param commitRecords records to overwrite in state store
+   * @param deleteRecords records to remove from state store
+   * @param  record class
+   * @return null if async mode is used, else removed records
+   */
+  public  List handleOverwriteAndDelete(List 
commitRecords,
+  List deleteRecords) throws IOException {
+Callable overwriteCallable =
+() -> putAll(commitRecords, true, false);
+Callable> deletionCallable = () -> 
removeMultiple(deleteRecords);
+
+if (executor != null) {
+  // In async mode, just submit and let the tasks do their work and return 
asap.
+  if (!commitRecords.isEmpty()) {
+executor.submit(overwriteCallable);
+  }
+  if (!deleteRecords.isEmpty()) {
+executor.submit(deletionCallable);
+  }
+  return null;
+} else {
+  try {
+List result = new ArrayList<>();
+if (!commitRecords.isEmpty()) {
+  overwriteCallable.call();
+}
+if (!deleteRecords.isEmpty()) {
+  Map removedRecords = deletionCallable.call();
+  for (Map.Entry entry : removedRecords.entrySet()) {
+if (entry.getValue()) {
+  result.add(entry.getKey());
+}
+  }
+}
+return result;
+  } catch (Exception e) {
+throw new IOException(e);
+  }
+}
+  }

Review Comment:
   ```
   public  List handleOverwriteAndDelete(List commitRecords,
       List deleteRecords) throws IOException {
     List result = null;
     try {
       // overwrite all expired records.
       if (commitRecords != null && !commitRecords.isEmpty()) {
         Callable overwriteCallable =
             () -> putAll(commitRecords, true, false);
         if (executor != null) {
           executor.submit(overwriteCallable);
         } else {
           overwriteCallable.call();
         }
       }

       // delete all deletable records.
       if (deleteRecords != null && !deleteRecords.isEmpty()) {
         Callable> deletionCallable = () -> removeMultiple(deleteRecords);
         if (executor != null) {
           executor.submit(deletionCallable);
         } else {
           result = new ArrayList<>();
           Map removedRecords = deletionCallable.call();
           for (Map.Entry entry : removedRecords.entrySet()) {
             if (entry.getValue()) {
               result.add(entry.getKey());
             }
           }
         }
       }
     } catch (Exception e) {
       throw new IOException(e);
     }
     return result;
   }
   ```
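
In the async branches above, the `Future` returned by `executor.submit(...)` is dropped, which is the pattern an earlier Yetus run on this PR flagged through SpotBugs. A minimal sketch of one way to keep failures visible, assuming a callback that logs them; `ReportingSubmit` is a hypothetical helper and not part of `StateStoreDriver`:

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Future;
import java.util.function.Consumer;

// Hypothetical helper; not part of the Hadoop code base.
final class ReportingSubmit {
  private ReportingSubmit() {}

  /** Submits the task and reports any exception it throws to onFailure. */
  static <V> Future<V> submit(ExecutorService executor, Callable<V> task,
      Consumer<Exception> onFailure) {
    Callable<V> reporting = () -> {
      try {
        return task.call();
      } catch (Exception e) {
        onFailure.accept(e); // for example, log the failure
        throw e;             // the Future still completes exceptionally
      }
    };
    return executor.submit(reporting);
  }
}
```

A caller that does not need the result can still ignore the returned `Future`, because the failure has already been handed to the callback.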





> RBF: Allow router state store cache update to overwrite and delete in parallel
> --
>
> Key: HDFS-17532
> URL: https://issues.apache.org/jira/browse/HDFS-17532
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: hdfs, rbf
>Reporter: Felix N
>Assignee: Felix N
>Priority: Minor
>  Labels: pull-request-available
>
> Current implementation for 

[jira] [Commented] (HDFS-17532) RBF: Allow router state store cache update to overwrite and delete in parallel

2024-05-24 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-17532?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17849231#comment-17849231
 ] 

ASF GitHub Bot commented on HDFS-17532:
---

hadoop-yetus commented on PR #6839:
URL: https://github.com/apache/hadoop/pull/6839#issuecomment-2129075360

   :broken_heart: **-1 overall**
   
   
   
   
   
   
   | Vote | Subsystem | Runtime |  Logfile | Comment |
   |::|--:|:|::|:---:|
   | +0 :ok: |  reexec  |   0m 19s |  |  Docker mode activated.  |
    _ Prechecks _ |
   | +1 :green_heart: |  dupname  |   0m  0s |  |  No case conflicting files 
found.  |
   | +0 :ok: |  codespell  |   0m  0s |  |  codespell was not available.  |
   | +0 :ok: |  detsecrets  |   0m  0s |  |  detect-secrets was not available.  
|
   | +0 :ok: |  xmllint  |   0m  0s |  |  xmllint was not available.  |
   | +0 :ok: |  markdownlint  |   0m  0s |  |  markdownlint was not available.  
|
   | +1 :green_heart: |  @author  |   0m  0s |  |  The patch does not contain 
any @author tags.  |
   | +1 :green_heart: |  test4tests  |   0m  0s |  |  The patch appears to 
include 1 new or modified test files.  |
    _ trunk Compile Tests _ |
   | +1 :green_heart: |  mvninstall  |  31m 53s |  |  trunk passed  |
   | +1 :green_heart: |  compile  |   0m 24s |  |  trunk passed with JDK 
Ubuntu-11.0.22+7-post-Ubuntu-0ubuntu220.04.1  |
   | +1 :green_heart: |  compile  |   0m 22s |  |  trunk passed with JDK 
Private Build-1.8.0_402-8u402-ga-2ubuntu1~20.04-b06  |
   | +1 :green_heart: |  checkstyle  |   0m 20s |  |  trunk passed  |
   | +1 :green_heart: |  mvnsite  |   0m 28s |  |  trunk passed  |
   | +1 :green_heart: |  javadoc  |   0m 31s |  |  trunk passed with JDK 
Ubuntu-11.0.22+7-post-Ubuntu-0ubuntu220.04.1  |
   | +1 :green_heart: |  javadoc  |   0m 20s |  |  trunk passed with JDK 
Private Build-1.8.0_402-8u402-ga-2ubuntu1~20.04-b06  |
   | +1 :green_heart: |  spotbugs  |   0m 51s |  |  trunk passed  |
   | +1 :green_heart: |  shadedclient  |  19m 41s |  |  branch has no errors 
when building and testing our client artifacts.  |
    _ Patch Compile Tests _ |
   | +1 :green_heart: |  mvninstall  |   0m 20s |  |  the patch passed  |
   | +1 :green_heart: |  compile  |   0m 19s |  |  the patch passed with JDK 
Ubuntu-11.0.22+7-post-Ubuntu-0ubuntu220.04.1  |
   | +1 :green_heart: |  javac  |   0m 19s |  |  the patch passed  |
   | +1 :green_heart: |  compile  |   0m 16s |  |  the patch passed with JDK 
Private Build-1.8.0_402-8u402-ga-2ubuntu1~20.04-b06  |
   | +1 :green_heart: |  javac  |   0m 16s |  |  the patch passed  |
   | +1 :green_heart: |  blanks  |   0m  0s |  |  The patch has no blanks 
issues.  |
   | -0 :warning: |  checkstyle  |   0m 12s | 
[/results-checkstyle-hadoop-hdfs-project_hadoop-hdfs-rbf.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-6839/6/artifact/out/results-checkstyle-hadoop-hdfs-project_hadoop-hdfs-rbf.txt)
 |  hadoop-hdfs-project/hadoop-hdfs-rbf: The patch generated 3 new + 1 
unchanged - 0 fixed = 4 total (was 1)  |
   | +1 :green_heart: |  mvnsite  |   0m 20s |  |  the patch passed  |
   | +1 :green_heart: |  javadoc  |   0m 17s |  |  the patch passed with JDK 
Ubuntu-11.0.22+7-post-Ubuntu-0ubuntu220.04.1  |
   | +1 :green_heart: |  javadoc  |   0m 16s |  |  the patch passed with JDK 
Private Build-1.8.0_402-8u402-ga-2ubuntu1~20.04-b06  |
   | -1 :x: |  spotbugs  |   0m 49s | 
[/new-spotbugs-hadoop-hdfs-project_hadoop-hdfs-rbf.html](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-6839/6/artifact/out/new-spotbugs-hadoop-hdfs-project_hadoop-hdfs-rbf.html)
 |  hadoop-hdfs-project/hadoop-hdfs-rbf generated 1 new + 0 unchanged - 0 fixed 
= 1 total (was 0)  |
   | +1 :green_heart: |  shadedclient  |  19m 37s |  |  patch has no errors 
when building and testing our client artifacts.  |
    _ Other Tests _ |
   | +1 :green_heart: |  unit  |  30m 46s |  |  hadoop-hdfs-rbf in the patch 
passed.  |
   | +1 :green_heart: |  asflicense  |   0m 28s |  |  The patch does not 
generate ASF License warnings.  |
   |  |   | 112m 11s |  |  |
   
   
   | Reason | Tests |
   |---:|:--|
   | SpotBugs | module:hadoop-hdfs-project/hadoop-hdfs-rbf |
   |  |  Exceptional return value of 
java.util.concurrent.ThreadPoolExecutor.submit(Callable) ignored in 
org.apache.hadoop.hdfs.server.federation.store.CachedRecordStore.overrideExpiredRecords(QueryResult,
 boolean)  At CachedRecordStore.java:ignored in 
org.apache.hadoop.hdfs.server.federation.store.CachedRecordStore.overrideExpiredRecords(QueryResult,
 boolean)  At CachedRecordStore.java:[line 235] |
   
   
   | Subsystem | Report/Notes |
   |--:|:-|
   | Docker | ClientAPI=1.45 ServerAPI=1.45 base: 
https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-6839/6/artifact/out/Dockerfile
 |
   | GITHUB PR | https://github.com/apache/hadoop/pull/6839 |
   | Optional Tests | dupname 

[jira] [Commented] (HDFS-17529) RBF: Improve router state store cache entry deletion

2024-05-23 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-17529?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17849138#comment-17849138
 ] 

ASF GitHub Bot commented on HDFS-17529:
---

ZanderXu commented on PR #6833:
URL: https://github.com/apache/hadoop/pull/6833#issuecomment-2128331182

   Merged. Thanks @kokonguyen191 for your contribution. 




> RBF: Improve router state store cache entry deletion
> 
>
> Key: HDFS-17529
> URL: https://issues.apache.org/jira/browse/HDFS-17529
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: hdfs, rbf
>Reporter: Felix N
>Assignee: Felix N
>Priority: Major
>  Labels: pull-request-available
>
> The current implementation of the router state store update is quite inefficient, so 
> much so that when routers are removed and a lot of NameNodeMembership records 
> are deleted in a short burst, the deletions triggered router safemode in 
> our cluster and caused a lot of trouble.
> This ticket aims to improve the deletion process for ZK state store 
> implementation.
> See HDFS-17532 for the other half of this improvement






[jira] [Commented] (HDFS-17529) RBF: Improve router state store cache entry deletion

2024-05-23 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-17529?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17849137#comment-17849137
 ] 

ASF GitHub Bot commented on HDFS-17529:
---

ZanderXu merged PR #6833:
URL: https://github.com/apache/hadoop/pull/6833




> RBF: Improve router state store cache entry deletion
> 
>
> Key: HDFS-17529
> URL: https://issues.apache.org/jira/browse/HDFS-17529
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: hdfs, rbf
>Reporter: Felix N
>Assignee: Felix N
>Priority: Major
>  Labels: pull-request-available
>
> The current implementation of the router state store update is quite inefficient, so 
> much so that when routers are removed and a lot of NameNodeMembership records 
> are deleted in a short burst, the deletions triggered router safemode in 
> our cluster and caused a lot of trouble.
> This ticket aims to improve the deletion process for ZK state store 
> implementation.
> See HDFS-17532 for the other half of this improvement






[jira] [Commented] (HDFS-17518) In the lease monitor, if a file is closed, we should sync the editslog

2024-05-23 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-17518?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17848991#comment-17848991
 ] 

ASF GitHub Bot commented on HDFS-17518:
---

ThinkerLei commented on code in PR #6809:
URL: https://github.com/apache/hadoop/pull/6809#discussion_r1611850764


##
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/LeaseManager.java:
##
@@ -626,7 +626,8 @@ private synchronized boolean checkLeases(Collection 
leasesToCheck) {
 }
   }
   // If a lease recovery happened, we need to sync later.

Review Comment:
   @vinayakumarb Thank you for your reply. How about changing the method 
`checkLeases` to return true?





> In the lease monitor, if a file is closed, we should sync the editslog
> --
>
> Key: HDFS-17518
> URL: https://issues.apache.org/jira/browse/HDFS-17518
> Project: Hadoop HDFS
>  Issue Type: Bug
>Reporter: lei w
>Assignee: lei w
>Priority: Minor
>  Labels: pull-request-available
>
> In the lease monitor, if a file is closed, the `checkLeases` method will return 
> true, and then the edits log will not be synced. In my opinion, we should sync 
> the edits log so that the state does not stay unsynchronized with the standby 
> NameNode for a long time.






[jira] [Commented] (HDFS-17529) RBF: Improve router state store cache entry deletion

2024-05-23 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-17529?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17848919#comment-17848919
 ] 

ASF GitHub Bot commented on HDFS-17529:
---

hadoop-yetus commented on PR #6833:
URL: https://github.com/apache/hadoop/pull/6833#issuecomment-2126931785

   :confetti_ball: **+1 overall**
   
   
   
   
   
   
   | Vote | Subsystem | Runtime |  Logfile | Comment |
   |::|--:|:|::|:---:|
   | +0 :ok: |  reexec  |   0m 46s |  |  Docker mode activated.  |
    _ Prechecks _ |
   | +1 :green_heart: |  dupname  |   0m  0s |  |  No case conflicting files 
found.  |
   | +0 :ok: |  codespell  |   0m  1s |  |  codespell was not available.  |
   | +0 :ok: |  detsecrets  |   0m  1s |  |  detect-secrets was not available.  
|
   | +0 :ok: |  xmllint  |   0m  1s |  |  xmllint was not available.  |
   | +0 :ok: |  markdownlint  |   0m  1s |  |  markdownlint was not available.  
|
   | +1 :green_heart: |  @author  |   0m  0s |  |  The patch does not contain 
any @author tags.  |
   | +1 :green_heart: |  test4tests  |   0m  0s |  |  The patch appears to 
include 2 new or modified test files.  |
    _ trunk Compile Tests _ |
   | +1 :green_heart: |  mvninstall  |  49m 44s |  |  trunk passed  |
   | +1 :green_heart: |  compile  |   0m 40s |  |  trunk passed with JDK 
Ubuntu-11.0.22+7-post-Ubuntu-0ubuntu220.04.1  |
   | +1 :green_heart: |  compile  |   0m 36s |  |  trunk passed with JDK 
Private Build-1.8.0_402-8u402-ga-2ubuntu1~20.04-b06  |
   | +1 :green_heart: |  checkstyle  |   0m 28s |  |  trunk passed  |
   | +1 :green_heart: |  mvnsite  |   0m 41s |  |  trunk passed  |
   | +1 :green_heart: |  javadoc  |   0m 42s |  |  trunk passed with JDK 
Ubuntu-11.0.22+7-post-Ubuntu-0ubuntu220.04.1  |
   | +1 :green_heart: |  javadoc  |   0m 30s |  |  trunk passed with JDK 
Private Build-1.8.0_402-8u402-ga-2ubuntu1~20.04-b06  |
   | +1 :green_heart: |  spotbugs  |   1m 21s |  |  trunk passed  |
   | +1 :green_heart: |  shadedclient  |  40m 23s |  |  branch has no errors 
when building and testing our client artifacts.  |
    _ Patch Compile Tests _ |
   | +1 :green_heart: |  mvninstall  |   0m 35s |  |  the patch passed  |
   | +1 :green_heart: |  compile  |   0m 37s |  |  the patch passed with JDK 
Ubuntu-11.0.22+7-post-Ubuntu-0ubuntu220.04.1  |
   | +1 :green_heart: |  javac  |   0m 37s |  |  the patch passed  |
   | +1 :green_heart: |  compile  |   0m 31s |  |  the patch passed with JDK 
Private Build-1.8.0_402-8u402-ga-2ubuntu1~20.04-b06  |
   | +1 :green_heart: |  javac  |   0m 31s |  |  the patch passed  |
   | +1 :green_heart: |  blanks  |   0m  0s |  |  The patch has no blanks 
issues.  |
   | +1 :green_heart: |  checkstyle  |   0m 20s |  |  the patch passed  |
   | +1 :green_heart: |  mvnsite  |   0m 34s |  |  the patch passed  |
   | +1 :green_heart: |  javadoc  |   0m 32s |  |  the patch passed with JDK 
Ubuntu-11.0.22+7-post-Ubuntu-0ubuntu220.04.1  |
   | +1 :green_heart: |  javadoc  |   0m 25s |  |  the patch passed with JDK 
Private Build-1.8.0_402-8u402-ga-2ubuntu1~20.04-b06  |
   | +1 :green_heart: |  spotbugs  |   1m 34s |  |  the patch passed  |
   | +1 :green_heart: |  shadedclient  |  39m 47s |  |  patch has no errors 
when building and testing our client artifacts.  |
    _ Other Tests _ |
   | +1 :green_heart: |  unit  |  32m 56s |  |  hadoop-hdfs-rbf in the patch 
passed.  |
   | +1 :green_heart: |  asflicense  |   0m 36s |  |  The patch does not 
generate ASF License warnings.  |
   |  |   | 179m 28s |  |  |
   
   
   | Subsystem | Report/Notes |
   |--:|:-|
   | Docker | ClientAPI=1.44 ServerAPI=1.44 base: 
https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-6833/6/artifact/out/Dockerfile
 |
   | GITHUB PR | https://github.com/apache/hadoop/pull/6833 |
   | Optional Tests | dupname asflicense compile javac javadoc mvninstall 
mvnsite unit shadedclient spotbugs checkstyle codespell detsecrets xmllint 
markdownlint |
   | uname | Linux 1be2770da36d 5.15.0-94-generic #104-Ubuntu SMP Tue Jan 9 
15:25:40 UTC 2024 x86_64 x86_64 x86_64 GNU/Linux |
   | Build tool | maven |
   | Personality | dev-support/bin/hadoop.sh |
   | git revision | trunk / 2b7aeaa91aa57e5b26bced05e23c81d21adfe1da |
   | Default Java | Private Build-1.8.0_402-8u402-ga-2ubuntu1~20.04-b06 |
   | Multi-JDK versions | 
/usr/lib/jvm/java-11-openjdk-amd64:Ubuntu-11.0.22+7-post-Ubuntu-0ubuntu220.04.1 
/usr/lib/jvm/java-8-openjdk-amd64:Private 
Build-1.8.0_402-8u402-ga-2ubuntu1~20.04-b06 |
   |  Test Results | 
https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-6833/6/testReport/ |
   | Max. process+thread count | 3751 (vs. ulimit of 5500) |
   | modules | C: hadoop-hdfs-project/hadoop-hdfs-rbf U: 
hadoop-hdfs-project/hadoop-hdfs-rbf |
   | Console output | 
https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-6833/6/console |
   | versions | git=2.25.1 maven=3.6.3 

[jira] [Commented] (HDFS-17529) RBF: Improve router state store cache entry deletion

2024-05-23 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-17529?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17848881#comment-17848881
 ] 

ASF GitHub Bot commented on HDFS-17529:
---

ZanderXu commented on code in PR #6833:
URL: https://github.com/apache/hadoop/pull/6833#discussion_r1611258872


##
hadoop-hdfs-project/hadoop-hdfs-rbf/src/main/java/org/apache/hadoop/hdfs/server/federation/store/CachedRecordStore.java:
##
@@ -198,8 +195,15 @@ public void overrideExpiredRecords(QueryResult query) 
throws IOException {
 if (commitRecords.size() > 0) {
   getDriver().putAll(commitRecords, true, false);
 }
-if (deleteRecords.size() > 0) {
-  newRecords.removeAll(deleteRecords);
+if (!toDeleteRecords.isEmpty()) {
+  for (Map.Entry entry : 
getDriver().removeMultiple(toDeleteRecords).entrySet()) {
+if (entry.getValue()) {
+  deletedRecords.add(entry.getKey());

Review Comment:
   If this is changed to `newRecords.remove(entry.getKey())`, we can remove 
`deletedRecords`.
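
A minimal sketch of the shape this suggestion describes, assuming the removal call returns a map from record to a success flag as in the hunk above. `ConfirmedRemovals` is a hypothetical helper with a generic record type, not code from the PR:

```java
import java.util.List;
import java.util.Map;

// Hypothetical helper; T stands in for the record type.
final class ConfirmedRemovals {
  private ConfirmedRemovals() {}

  /** Drops from newRecords every record whose removal was reported as successful. */
  static <T> void apply(List<T> newRecords, Map<T, Boolean> removalResults) {
    for (Map.Entry<T, Boolean> entry : removalResults.entrySet()) {
      if (Boolean.TRUE.equals(entry.getValue())) {
        // Remove directly, so no intermediate deletedRecords list is needed.
        newRecords.remove(entry.getKey());
      }
    }
  }
}
```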





> RBF: Improve router state store cache entry deletion
> 
>
> Key: HDFS-17529
> URL: https://issues.apache.org/jira/browse/HDFS-17529
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: hdfs, rbf
>Reporter: Felix N
>Assignee: Felix N
>Priority: Major
>  Labels: pull-request-available
>
> The current implementation of the router state store update is quite inefficient, so 
> much so that when routers are removed and a lot of NameNodeMembership records 
> are deleted in a short burst, the deletions triggered router safemode in 
> our cluster and caused a lot of trouble.
> This ticket aims to improve the deletion process for ZK state store 
> implementation.
> See HDFS-17532 for the other half of this improvement






[jira] [Commented] (HDFS-17528) FsImageValidation: set txid when saving a new image

2024-05-22 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-17528?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17848796#comment-17848796
 ] 

ASF GitHub Bot commented on HDFS-17528:
---

szetszwo commented on PR #6828:
URL: https://github.com/apache/hadoop/pull/6828#issuecomment-2126096755

   @vinayakumarb , thanks a lot for reviewing this!




> FsImageValidation: set txid when saving a new image
> ---
>
> Key: HDFS-17528
> URL: https://issues.apache.org/jira/browse/HDFS-17528
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: tools
>Reporter: Tsz-wo Sze
>Assignee: Tsz-wo Sze
>Priority: Major
>  Labels: pull-request-available
>
> - When the fsimage is specified as a file and the FsImageValidation tool 
> saves a new image (for removing inaccessible inodes), the txid is not set.  
> Then, the resulting image will have 0 as its txid.
> - When the fsimage is specified as a directory, the txid is set.  However, it 
> will get an NPE since the NameNode metrics are uninitialized (although the 
> metrics are not used by FsImageValidation).






[jira] [Commented] (HDFS-17459) [FGL] Summarize this feature

2024-05-21 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-17459?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17848414#comment-17848414
 ] 

ASF GitHub Bot commented on HDFS-17459:
---

hfutatzhanghb commented on code in PR #6737:
URL: https://github.com/apache/hadoop/pull/6737#discussion_r1609154301


##
hadoop-hdfs-project/hadoop-hdfs/src/site/markdown/NamenodeFGL.md:
##
@@ -0,0 +1,210 @@
+
+
+HDFS Namenode Fine-grained Locking
+==
+
+ [FGL] Summarize this feature 
> -
>
> Key: HDFS-17459
> URL: https://issues.apache.org/jira/browse/HDFS-17459
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>Reporter: ZanderXu
>Assignee: Felix N
>Priority: Major
>  Labels: pull-request-available
>
> Write a doc to summarize this feature so we can merge it into the trunk.






[jira] [Commented] (HDFS-17518) In the lease monitor, if a file is closed, we should sync the editslog

2024-05-21 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-17518?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17848367#comment-17848367
 ] 

ASF GitHub Bot commented on HDFS-17518:
---

vinayakumarb commented on code in PR #6809:
URL: https://github.com/apache/hadoop/pull/6809#discussion_r1608866841


##
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/LeaseManager.java:
##
@@ -626,7 +626,8 @@ private synchronized boolean checkLeases(Collection 
leasesToCheck) {
 }
   }
   // If a lease recovery happened, we need to sync later.

Review Comment:
   As mentioned above, please change the logic below to always call logSync().





> In the lease monitor, if a file is closed, we should sync the editslog
> --
>
> Key: HDFS-17518
> URL: https://issues.apache.org/jira/browse/HDFS-17518
> Project: Hadoop HDFS
>  Issue Type: Bug
>Reporter: lei w
>Assignee: lei w
>Priority: Minor
>  Labels: pull-request-available
>
> In the lease monitor, if a file is closed, the `checkLeases` method will return 
> true, and then the edits log will not be synced. In my opinion, we should sync 
> the edits log so that the state does not stay unsynchronized with the standby 
> NameNode for a long time.






[jira] [Commented] (HDFS-17518) In the lease monitor, if a file is closed, we should sync the editslog

2024-05-21 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-17518?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17848365#comment-17848365
 ] 

ASF GitHub Bot commented on HDFS-17518:
---

vinayakumarb commented on code in PR #6809:
URL: https://github.com/apache/hadoop/pull/6809#discussion_r1608864957


##
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/LeaseManager.java:
##
@@ -626,7 +626,8 @@ private synchronized boolean checkLeases(Collection 
leasesToCheck) {
 }
   }
   // If a lease recovery happened, we need to sync later.

Review Comment:
   I don't think that special case needs to be handled. Even if there is no txn, 
calling logSync() won't be a problem.
   
   If there is no edit txn, logSync() will just return without doing anything.
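
The argument here is that an unconditional call is harmless. A toy model of that behavior, assuming the log tracks the last written and last synced transaction ids; `ToyEditLog` is hypothetical and far simpler than the real edit log:

```java
// Toy model only; this is not the FSEditLog implementation.
final class ToyEditLog {
  private long lastWrittenTxId = 0;
  private long lastSyncedTxId = 0;
  private int flushCount = 0;

  /** Records one new edit transaction. */
  synchronized void logEdit() {
    lastWrittenTxId++;
  }

  /** Flushes pending edits; returns immediately when nothing is pending. */
  synchronized void logSync() {
    if (lastSyncedTxId >= lastWrittenTxId) {
      return;
    }
    flushCount++;
    lastSyncedTxId = lastWrittenTxId;
  }

  synchronized int getFlushCount() {
    return flushCount;
  }
}
```

In this model, calling `logSync()` after every lease check costs nothing when no transaction was written, which matches the behavior described in the comment.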





> In the lease monitor, if a file is closed, we should sync the editslog
> --
>
> Key: HDFS-17518
> URL: https://issues.apache.org/jira/browse/HDFS-17518
> Project: Hadoop HDFS
>  Issue Type: Bug
>Reporter: lei w
>Assignee: lei w
>Priority: Minor
>  Labels: pull-request-available
>
> In the lease monitor, if a file is closed, the `checkLeases` method will return 
> true, and then the edits log will not be synced. In my opinion, we should sync 
> the edits log so that the state does not stay unsynchronized with the standby 
> NameNode for a long time.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-17529) Improve router state store cache entry deletion

2024-05-21 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-17529?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17848113#comment-17848113
 ] 

ASF GitHub Bot commented on HDFS-17529:
---

hadoop-yetus commented on PR #6833:
URL: https://github.com/apache/hadoop/pull/6833#issuecomment-2122116959

   :broken_heart: **-1 overall**
   
   | Vote | Subsystem | Runtime |  Logfile | Comment |
   |:----:|----------:|--------:|:--------:|:-------:|
   | +0 :ok: |  reexec  |   0m 46s |  |  Docker mode activated.  |
   |||| _ Prechecks _ |
   | +1 :green_heart: |  dupname  |   0m  0s |  |  No case conflicting files found.  |
   | +0 :ok: |  codespell  |   0m  1s |  |  codespell was not available.  |
   | +0 :ok: |  detsecrets  |   0m  1s |  |  detect-secrets was not available.  |
   | +0 :ok: |  xmllint  |   0m  1s |  |  xmllint was not available.  |
   | +0 :ok: |  markdownlint  |   0m  1s |  |  markdownlint was not available.  |
   | +1 :green_heart: |  @author  |   0m  0s |  |  The patch does not contain any @author tags.  |
   | +1 :green_heart: |  test4tests  |   0m  0s |  |  The patch appears to include 2 new or modified test files.  |
   |||| _ trunk Compile Tests _ |
   | +1 :green_heart: |  mvninstall  |  49m 18s |  |  trunk passed  |
   | +1 :green_heart: |  compile  |   0m 41s |  |  trunk passed with JDK Ubuntu-11.0.22+7-post-Ubuntu-0ubuntu220.04.1  |
   | +1 :green_heart: |  compile  |   0m 36s |  |  trunk passed with JDK Private Build-1.8.0_402-8u402-ga-2ubuntu1~20.04-b06  |
   | +1 :green_heart: |  checkstyle  |   0m 30s |  |  trunk passed  |
   | +1 :green_heart: |  mvnsite  |   0m 40s |  |  trunk passed  |
   | +1 :green_heart: |  javadoc  |   0m 41s |  |  trunk passed with JDK Ubuntu-11.0.22+7-post-Ubuntu-0ubuntu220.04.1  |
   | +1 :green_heart: |  javadoc  |   0m 29s |  |  trunk passed with JDK Private Build-1.8.0_402-8u402-ga-2ubuntu1~20.04-b06  |
   | +1 :green_heart: |  spotbugs  |   1m 20s |  |  trunk passed  |
   | +1 :green_heart: |  shadedclient  |  38m 43s |  |  branch has no errors when building and testing our client artifacts.  |
   |||| _ Patch Compile Tests _ |
   | +1 :green_heart: |  mvninstall  |   0m 30s |  |  the patch passed  |
   | +1 :green_heart: |  compile  |   0m 33s |  |  the patch passed with JDK Ubuntu-11.0.22+7-post-Ubuntu-0ubuntu220.04.1  |
   | +1 :green_heart: |  javac  |   0m 33s |  |  the patch passed  |
   | +1 :green_heart: |  compile  |   0m 29s |  |  the patch passed with JDK Private Build-1.8.0_402-8u402-ga-2ubuntu1~20.04-b06  |
   | +1 :green_heart: |  javac  |   0m 29s |  |  the patch passed  |
   | +1 :green_heart: |  blanks  |   0m  0s |  |  The patch has no blanks issues.  |
   | +1 :green_heart: |  checkstyle  |   0m 18s |  |  the patch passed  |
   | +1 :green_heart: |  mvnsite  |   0m 31s |  |  the patch passed  |
   | -1 :x: |  javadoc  |   0m 29s | [/patch-javadoc-hadoop-hdfs-project_hadoop-hdfs-rbf-jdkUbuntu-11.0.22+7-post-Ubuntu-0ubuntu220.04.1.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-6833/4/artifact/out/patch-javadoc-hadoop-hdfs-project_hadoop-hdfs-rbf-jdkUbuntu-11.0.22+7-post-Ubuntu-0ubuntu220.04.1.txt) |  hadoop-hdfs-rbf in the patch failed with JDK Ubuntu-11.0.22+7-post-Ubuntu-0ubuntu220.04.1.  |
   | +1 :green_heart: |  javadoc  |   0m 23s |  |  the patch passed with JDK Private Build-1.8.0_402-8u402-ga-2ubuntu1~20.04-b06  |
   | +1 :green_heart: |  spotbugs  |   1m 20s |  |  the patch passed  |
   | +1 :green_heart: |  shadedclient  |  38m 52s |  |  patch has no errors when building and testing our client artifacts.  |
   |||| _ Other Tests _ |
   | +1 :green_heart: |  unit  |  33m  8s |  |  hadoop-hdfs-rbf in the patch passed.  |
   | +1 :green_heart: |  asflicense  |   0m 36s |  |  The patch does not generate ASF License warnings.  |
   |  |   | 176m 18s |  |  |
   
   
   | Subsystem | Report/Notes |
   |----------:|:-------------|
   | Docker | ClientAPI=1.44 ServerAPI=1.44 base: https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-6833/4/artifact/out/Dockerfile |
   | GITHUB PR | https://github.com/apache/hadoop/pull/6833 |
   | Optional Tests | dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient spotbugs checkstyle codespell detsecrets xmllint markdownlint |
   | uname | Linux db3ba1bfe5a3 5.15.0-94-generic #104-Ubuntu SMP Tue Jan 9 15:25:40 UTC 2024 x86_64 x86_64 x86_64 GNU/Linux |
   | Build tool | maven |
   | Personality | dev-support/bin/hadoop.sh |
   | git revision | trunk / 35b9915a7d90f0d824fb584c28f0b4885000130e |
   | Default Java | Private Build-1.8.0_402-8u402-ga-2ubuntu1~20.04-b06 |
   | Multi-JDK versions | /usr/lib/jvm/java-11-openjdk-amd64:Ubuntu-11.0.22+7-post-Ubuntu-0ubuntu220.04.1 /usr/lib/jvm/java-8-openjdk-amd64:Private Build-1.8.0_402-8u402-ga-2ubuntu1~20.04-b06 |
   |  Test Results | https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-6833/4/testReport/ 

[jira] [Commented] (HDFS-17529) Improve router state store cache entry deletion

2024-05-21 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-17529?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17848070#comment-17848070
 ] 

ASF GitHub Bot commented on HDFS-17529:
---

kokonguyen191 commented on code in PR #6833:
URL: https://github.com/apache/hadoop/pull/6833#discussion_r1607729934


##
hadoop-hdfs-project/hadoop-hdfs-rbf/src/main/java/org/apache/hadoop/hdfs/server/federation/store/driver/StateStoreRecordOperations.java:
##
@@ -127,6 +128,17 @@  StateStoreOperationResult putAll(
   @AtMostOnce
boolean remove(T record) throws IOException;
 
+  /**
+   * Remove multiple records.
+   *
+   * @param <T> Record class of the records.
+   * @param records Records to be removed.
+   * @return Map of record -> boolean indicating if the record has being 
removed successfully.

Review Comment:
   Fixed





> Improve router state store cache entry deletion
> ---
>
> Key: HDFS-17529
> URL: https://issues.apache.org/jira/browse/HDFS-17529
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: hdfs, rbf
>Reporter: Felix N
>Assignee: Felix N
>Priority: Major
>  Labels: pull-request-available
>
> Current implementation for router state store update is quite inefficient, so 
> much that when routers are removed and a lot of NameNodeMembership records 
> are deleted in a short burst, the deletions triggered a router safemode in 
> our cluster and caused a lot of troubles.
> This ticket aims to improve the deletion process for ZK state store 
> implementation.
> See HDFS-17532 for the other half of this improvement






[jira] [Commented] (HDFS-17529) Improve router state store cache entry deletion

2024-05-21 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-17529?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17848061#comment-17848061
 ] 

ASF GitHub Bot commented on HDFS-17529:
---

ZanderXu commented on code in PR #6833:
URL: https://github.com/apache/hadoop/pull/6833#discussion_r1607709901


##
hadoop-hdfs-project/hadoop-hdfs-rbf/src/main/java/org/apache/hadoop/hdfs/server/federation/store/driver/StateStoreRecordOperations.java:
##
@@ -127,6 +128,17 @@  StateStoreOperationResult putAll(
   @AtMostOnce
boolean remove(T record) throws IOException;
 
+  /**
+   * Remove multiple records.
+   *
+   * @param <T> Record class of the records.
+   * @param records Records to be removed.
+   * @return Map of record -> boolean indicating if the record has being 
removed successfully.

Review Comment:
   ```
   [ERROR] Failed to execute goal 
org.apache.maven.plugins:maven-javadoc-plugin:3.0.1:javadoc-no-fork 
(default-cli) on project hadoop-hdfs-rbf: An error has occurred in Javadoc 
report generation: 
   [ERROR] Exit code: 1 - javadoc: warning - You have specified the HTML 
version as HTML 4.01 by using the -html4 option.
   [ERROR] The default is currently HTML5 and the support for HTML 4.01 will be 
removed
   [ERROR] in a future release. To suppress this warning, please ensure that 
any HTML constructs
   [ERROR] in your comments are valid in HTML5, and remove the -html4 option.
   [ERROR] 
/home/jenkins/jenkins-agent/workspace/hadoop-multibranch_PR-6833/ubuntu-focal/src/hadoop-hdfs-project/hadoop-hdfs-rbf/src/main/java/org/apache/hadoop/hdfs/server/federation/store/driver/StateStoreRecordOperations.java:136:
 error: bad use of '>'
   [ERROR]* @return Map of record -> boolean indicating any entries being 
deleted by this record.
   [ERROR] ^
   [ERROR] javadoc: warning - invalid usage of tag >
   ```
   
   @kokonguyen191 It seems that `->` is not allowed in the javadoc.
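One common way to keep an arrow in a Javadoc comment without tripping the doclint check is to wrap it in `{@literal ...}` (writing `-&gt;` or simply "to" would also work). The sketch below only illustrates that technique; `JavadocArrowSketch` and its `removeMultiple` helper are hypothetical and are not the code from the PR, and the thread does not say which wording the eventual fix used.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

class JavadocArrowSketch {

  /**
   * Remove multiple records (hypothetical helper, for illustration only).
   *
   * @param records Records to be removed.
   * @return Map of record {@literal ->} boolean indicating whether the
   *         record was removed.
   */
  static Map<String, Boolean> removeMultiple(List<String> records) {
    Map<String, Boolean> result = new LinkedHashMap<>();
    for (String record : records) {
      // Assumption for the demo: only non-empty records exist and can be removed.
      result.put(record, !record.isEmpty());
    }
    return result;
  }

  public static void main(String[] args) {
    System.out.println(removeMultiple(java.util.Arrays.asList("a", "")));
  }
}
```

The `{@literal}` tag tells the Javadoc tool to render its content as plain text, so the `>` character is no longer parsed as part of an HTML tag.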








[jira] [Commented] (HDFS-17529) Improve router state store cache entry deletion

2024-05-20 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-17529?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17848047#comment-17848047
 ] 

ASF GitHub Bot commented on HDFS-17529:
---

hadoop-yetus commented on PR #6833:
URL: https://github.com/apache/hadoop/pull/6833#issuecomment-2121778835

   :broken_heart: **-1 overall**
   
   | Vote | Subsystem | Runtime |  Logfile | Comment |
   |:----:|----------:|--------:|:--------:|:-------:|
   | +0 :ok: |  reexec  |   0m 46s |  |  Docker mode activated.  |
   |||| _ Prechecks _ |
   | +1 :green_heart: |  dupname  |   0m  0s |  |  No case conflicting files found.  |
   | +0 :ok: |  codespell  |   0m  0s |  |  codespell was not available.  |
   | +0 :ok: |  detsecrets  |   0m  0s |  |  detect-secrets was not available.  |
   | +0 :ok: |  xmllint  |   0m  0s |  |  xmllint was not available.  |
   | +0 :ok: |  markdownlint  |   0m  0s |  |  markdownlint was not available.  |
   | +1 :green_heart: |  @author  |   0m  0s |  |  The patch does not contain any @author tags.  |
   | +1 :green_heart: |  test4tests  |   0m  0s |  |  The patch appears to include 2 new or modified test files.  |
   |||| _ trunk Compile Tests _ |
   | +1 :green_heart: |  mvninstall  |  49m 34s |  |  trunk passed  |
   | +1 :green_heart: |  compile  |   0m 41s |  |  trunk passed with JDK Ubuntu-11.0.22+7-post-Ubuntu-0ubuntu220.04.1  |
   | +1 :green_heart: |  compile  |   0m 36s |  |  trunk passed with JDK Private Build-1.8.0_402-8u402-ga-2ubuntu1~20.04-b06  |
   | +1 :green_heart: |  checkstyle  |   0m 30s |  |  trunk passed  |
   | +1 :green_heart: |  mvnsite  |   0m 41s |  |  trunk passed  |
   | +1 :green_heart: |  javadoc  |   0m 43s |  |  trunk passed with JDK Ubuntu-11.0.22+7-post-Ubuntu-0ubuntu220.04.1  |
   | +1 :green_heart: |  javadoc  |   0m 31s |  |  trunk passed with JDK Private Build-1.8.0_402-8u402-ga-2ubuntu1~20.04-b06  |
   | +1 :green_heart: |  spotbugs  |   1m 22s |  |  trunk passed  |
   | +1 :green_heart: |  shadedclient  |  39m  4s |  |  branch has no errors when building and testing our client artifacts.  |
   |||| _ Patch Compile Tests _ |
   | +1 :green_heart: |  mvninstall  |   0m 32s |  |  the patch passed  |
   | +1 :green_heart: |  compile  |   0m 32s |  |  the patch passed with JDK Ubuntu-11.0.22+7-post-Ubuntu-0ubuntu220.04.1  |
   | +1 :green_heart: |  javac  |   0m 32s |  |  the patch passed  |
   | +1 :green_heart: |  compile  |   0m 28s |  |  the patch passed with JDK Private Build-1.8.0_402-8u402-ga-2ubuntu1~20.04-b06  |
   | +1 :green_heart: |  javac  |   0m 28s |  |  the patch passed  |
   | +1 :green_heart: |  blanks  |   0m  0s |  |  The patch has no blanks issues.  |
   | +1 :green_heart: |  checkstyle  |   0m 18s |  |  the patch passed  |
   | +1 :green_heart: |  mvnsite  |   0m 32s |  |  the patch passed  |
   | -1 :x: |  javadoc  |   0m 28s | [/patch-javadoc-hadoop-hdfs-project_hadoop-hdfs-rbf-jdkUbuntu-11.0.22+7-post-Ubuntu-0ubuntu220.04.1.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-6833/3/artifact/out/patch-javadoc-hadoop-hdfs-project_hadoop-hdfs-rbf-jdkUbuntu-11.0.22+7-post-Ubuntu-0ubuntu220.04.1.txt) |  hadoop-hdfs-rbf in the patch failed with JDK Ubuntu-11.0.22+7-post-Ubuntu-0ubuntu220.04.1.  |
   | +1 :green_heart: |  javadoc  |   0m 23s |  |  the patch passed with JDK Private Build-1.8.0_402-8u402-ga-2ubuntu1~20.04-b06  |
   | +1 :green_heart: |  spotbugs  |   1m 20s |  |  the patch passed  |
   | +1 :green_heart: |  shadedclient  |  39m  1s |  |  patch has no errors when building and testing our client artifacts.  |
   |||| _ Other Tests _ |
   | +1 :green_heart: |  unit  |  33m 38s |  |  hadoop-hdfs-rbf in the patch passed.  |
   | +1 :green_heart: |  asflicense  |   0m 36s |  |  The patch does not generate ASF License warnings.  |
   |  |   | 177m 29s |  |  |
   
   
   | Subsystem | Report/Notes |
   |----------:|:-------------|
   | Docker | ClientAPI=1.44 ServerAPI=1.44 base: https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-6833/3/artifact/out/Dockerfile |
   | GITHUB PR | https://github.com/apache/hadoop/pull/6833 |
   | Optional Tests | dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient spotbugs checkstyle codespell detsecrets xmllint markdownlint |
   | uname | Linux 1ecfff136614 5.15.0-94-generic #104-Ubuntu SMP Tue Jan 9 15:25:40 UTC 2024 x86_64 x86_64 x86_64 GNU/Linux |
   | Build tool | maven |
   | Personality | dev-support/bin/hadoop.sh |
   | git revision | trunk / 00bca37b88cdf179a429030c4b53fc2c69e2ef54 |
   | Default Java | Private Build-1.8.0_402-8u402-ga-2ubuntu1~20.04-b06 |
   | Multi-JDK versions | /usr/lib/jvm/java-11-openjdk-amd64:Ubuntu-11.0.22+7-post-Ubuntu-0ubuntu220.04.1 /usr/lib/jvm/java-8-openjdk-amd64:Private Build-1.8.0_402-8u402-ga-2ubuntu1~20.04-b06 |
   |  Test Results | https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-6833/3/testReport/ 

[jira] [Commented] (HDFS-17529) Improve router state store cache entry deletion

2024-05-20 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-17529?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17848038#comment-17848038
 ] 

ASF GitHub Bot commented on HDFS-17529:
---

ZanderXu commented on code in PR #6833:
URL: https://github.com/apache/hadoop/pull/6833#discussion_r1607542930


##
hadoop-hdfs-project/hadoop-hdfs-rbf/src/main/java/org/apache/hadoop/hdfs/server/federation/store/driver/StateStoreRecordOperations.java:
##
@@ -127,6 +128,17 @@  StateStoreOperationResult putAll(
   @AtMostOnce
boolean remove(T record) throws IOException;
 
+  /**
+   * Remove multiple records.
+   *
+   * @param <T> Record class of the records.
+   * @param records Records to be removed.
+   * @return Map of record -> boolean indicating any entries being deleted by 
this record.

Review Comment:
   `Map of record -> boolean indicating if the record has being removed 
successfully`



##
hadoop-hdfs-project/hadoop-hdfs-rbf/src/main/java/org/apache/hadoop/hdfs/server/federation/store/driver/StateStoreRecordOperations.java:
##
@@ -152,4 +164,17 @@  StateStoreOperationResult putAll(
int remove(Class clazz, Query query)
   throws IOException;
 
+  /**
+   * Remove all records of a specific class that match any query in a list of 
queries.
+   * Requires the getAll implementation to fetch fresh records on each call.
+   *
+   * @param clazz The class to match the records with.
+   * @param queries Queries (logical OR) to filter what to remove.
+   * @param <T> Record class of the records.
+   * @return Map of query to number of records deleted by that query.

Review Comment:
   `Map of query to number of records removed by that query.`








[jira] [Commented] (HDFS-17529) Improve router state store cache entry deletion

2024-05-20 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-17529?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17848029#comment-17848029
 ] 

ASF GitHub Bot commented on HDFS-17529:
---

ZanderXu commented on code in PR #6833:
URL: https://github.com/apache/hadoop/pull/6833#discussion_r1607499422


##
hadoop-hdfs-project/hadoop-hdfs-rbf/src/main/java/org/apache/hadoop/hdfs/server/federation/store/driver/impl/StateStoreBaseImpl.java:
##
@@ -86,4 +89,37 @@ public  boolean remove(T record) 
throws IOException {
 Class recordClass = (Class)StateStoreUtils.getRecordClass(clazz);
 return remove(recordClass, query) == 1;
   }
+
+  @Override
+  public  Map removeMultiple(List 
records) throws IOException {
+assert !records.isEmpty();
+// Fall back to iterative remove() calls if all records don't share 1 class
+Class expectedClazz = records.get(0).getClass();
+if (!records.stream().allMatch(x -> x.getClass() == expectedClazz)) {
+  Map result = new HashMap<>();
+  for (T record : records) {
+result.put(record, remove(record));
+  }
+  return result;
+}
+
+final List> queries = new ArrayList<>();
+for (T record: records) {
+  queries.add(new Query<>(record));
+}
+@SuppressWarnings("unchecked")
+Class recordClass = (Class) 
StateStoreUtils.getRecordClass(expectedClazz);
+Map, Integer> result = remove(recordClass, queries);
+return result.entrySet().stream()
+.collect(Collectors.toMap(e -> e.getKey().getPartial(), e -> 
e.getValue() > 0));

Review Comment:
   `remove(T record)` returns true only if `remove(recordClass, query)` returns 1, 
but this code uses `e.getValue() > 0`. How about making them consistent? 
   
   For example, how about using `e.getValue() == 1` here?
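The difference between the two predicates only shows up when a single query matches more than one record. The following sketch makes that visible; `RemoveResultSketch`, `toSuccess`, and the string query keys are hypothetical names chosen for illustration, not the state store API.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.function.IntPredicate;

class RemoveResultSketch {

  /** Convert per-query removal counts into per-query success flags. */
  static Map<String, Boolean> toSuccess(Map<String, Integer> counts,
      IntPredicate rule) {
    Map<String, Boolean> out = new LinkedHashMap<>();
    counts.forEach((query, removed) -> out.put(query, rule.test(removed)));
    return out;
  }

  public static void main(String[] args) {
    Map<String, Integer> counts = new LinkedHashMap<>();
    counts.put("q-none", 0); // query matched nothing
    counts.put("q-one", 1);  // query removed exactly one record
    counts.put("q-two", 2);  // query removed two records

    // "> 0" treats any removal as success; "== 1" mirrors remove(T record).
    System.out.println(toSuccess(counts, n -> n > 0));
    System.out.println(toSuccess(counts, n -> n == 1));
  }
}
```

With these inputs the two rules agree for `q-none` and `q-one` and disagree only for `q-two`, which is the case the reviewer's consistency question is about.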



##
hadoop-hdfs-project/hadoop-hdfs-rbf/src/test/java/org/apache/hadoop/hdfs/server/federation/store/TestStateStoreMembershipState.java:
##
@@ -565,7 +568,7 @@ public void testRegistrationExpiredRaceCondition()
 // Load cache
 MembershipStore memStoreSpy = spy(membershipStore);
 DelayAnswer delayer = new DelayAnswer(LOG);
-doAnswer(delayer).when(memStoreSpy).overrideExpiredRecords(any());
+doAnswer(delayer).when(memStoreSpy).overrideExpiredRecords(any(), 
anyBoolean());

Review Comment:
   remove this `anyBoolean()`



##
hadoop-hdfs-project/hadoop-hdfs-rbf/src/main/java/org/apache/hadoop/hdfs/server/federation/store/driver/impl/StateStoreZooKeeperImpl.java:
##
@@ -284,51 +288,88 @@ public  StateStoreOperationResult 
putAll(
   }
 
   @Override
-  public  int remove(
-  Class clazz, Query query) throws IOException {
+  public  Map, Integer> remove(Class clazz,
+  List> queries) throws IOException {
 verifyDriverReady();
-if (query == null) {
-  return 0;
+// Track how many entries are deleted by each query
+Map, Integer> ret = new HashMap<>();
+final List trueRemoved = Collections.synchronizedList(new 
ArrayList<>());
+if (queries.isEmpty()) {
+  return ret;
 }
 
 // Read the current data
 long start = monotonicNow();
-List records = null;
+List records;
 try {
   QueryResult result = get(clazz);
   records = result.getRecords();
 } catch (IOException ex) {
   LOG.error("Cannot get existing records", ex);
   getMetrics().addFailure(monotonicNow() - start);
-  return 0;
+  return ret;
 }
 
 // Check the records to remove
 String znode = getZNodeForClass(clazz);
-List recordsToRemove = filterMultiple(query, records);
+Set recordsToRemove = new HashSet<>();
+Map, List> queryToRecords = new HashMap<>();
+for (Query query : queries) {
+  List filtered = filterMultiple(query, records);
+  queryToRecords.put(query, filtered);
+  recordsToRemove.addAll(filtered);
+}
 
 // Remove the records
-int removed = 0;
-for (T existingRecord : recordsToRemove) {
+List> callables = new ArrayList<>();
+recordsToRemove.forEach(existingRecord -> callables.add(() -> {
   LOG.info("Removing \"{}\"", existingRecord);
   try {
 String primaryKey = getPrimaryKey(existingRecord);
 String path = getNodePath(znode, primaryKey);
 if (zkManager.delete(path)) {
-  removed++;
+  trueRemoved.add(existingRecord);
 } else {
   LOG.error("Did not remove \"{}\"", existingRecord);
 }
   } catch (Exception e) {
 LOG.error("Cannot remove \"{}\"", existingRecord, e);
 getMetrics().addFailure(monotonicNow() - start);
   }
+  return null;
+}));
+try {
+  if (enableConcurrent) {
+executorService.invokeAll(callables);
+  } else {
+for (Callable callable : callables) {
+  callable.call();
+}
+  }
+} catch (Exception e) {
+  LOG.error("Record removal failed : {}", e.getMessage(), e);
+  throw new 

[jira] [Commented] (HDFS-17464) Improve some logs output in class FsDatasetImpl

2024-05-20 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-17464?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17848024#comment-17848024
 ] 

ASF GitHub Bot commented on HDFS-17464:
---

haiyang1987 commented on PR #6724:
URL: https://github.com/apache/hadoop/pull/6724#issuecomment-2121550133

   Committed to trunk.
   Thanks @hfutatzhanghb for your contribution, and @ZanderXu @ayushtkn for the review!




> Improve some logs output in class FsDatasetImpl
> ---
>
> Key: HDFS-17464
> URL: https://issues.apache.org/jira/browse/HDFS-17464
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: datanode
>Affects Versions: 3.4.0
>Reporter: farmmamba
>Assignee: farmmamba
>Priority: Minor
>  Labels: pull-request-available
>







[jira] [Commented] (HDFS-17464) Improve some logs output in class FsDatasetImpl

2024-05-20 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-17464?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17848023#comment-17848023
 ] 

ASF GitHub Bot commented on HDFS-17464:
---

haiyang1987 merged PR #6724:
URL: https://github.com/apache/hadoop/pull/6724







[jira] [Commented] (HDFS-17531) RBF: Asynchronous router RPC.

2024-05-20 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-17531?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17848017#comment-17848017
 ] 

ASF GitHub Bot commented on HDFS-17531:
---

slfan1989 commented on PR #6838:
URL: https://github.com/apache/hadoop/pull/6838#issuecomment-2121500478

   > @ayushtkn @slfan1989 hi, thanks for your reply, I sent the discussion to 
[common-...@hadoop.apache.org](mailto:common-...@hadoop.apache.org).
   
   This PR has too many changes and affects multiple modules, causing the 
compilation to time out. I have seen the discussion emails, and the usual 
discussion process may take 5-7 days. 




> RBF: Asynchronous router RPC.
> -
>
> Key: HDFS-17531
> URL: https://issues.apache.org/jira/browse/HDFS-17531
> Project: Hadoop HDFS
>  Issue Type: New Feature
>Reporter: Jian Zhang
>Assignee: Jian Zhang
>Priority: Major
>  Labels: pull-request-available
> Attachments: Aynchronous router.pdf, HDFS-17531.001.patch, 
> image-2024-05-19-18-07-51-282.png
>
>
> *Description*
> Currently, the main function of the Router service is to accept client 
> requests, forward them to the corresponding downstream ns, and then 
> return the results from the downstream ns to the client. The link is as follows:
> *!image-2024-05-19-18-07-51-282.png|width=900,height=300!*
> The main threads involved in the rpc link are:
> {*}Read{*}: Get the client request and put it into the call queue *(1)*
> {*}Handler{*}:
> Extract a call *(2)* from the call queue, process it, generate a new 
> call, place it in the calls of the connection thread, and wait for the call 
> processing to complete *(3)*
> After being awakened by the connection thread, process the response and put 
> it into the response queue *(5)*
> *Connection:*
> Hold the link with the downstream ns, send the call from the calls to the 
> downstream ns (via {*}rpcRequestThread{*}), and obtain a response from the ns. 
> Based on the call in the response, notify the call to complete processing 
> *(4)*
> *Responder:*
> Retrieve the response from the response queue *(6)* and return it to the client
>  
> *Shortcoming*
> Even if the *connection* thread can send more requests to downstream 
> nameservices, since *(3)* and *(4)* are synchronous, when the *handler* 
> thread adds the call to connection.calls, it needs to wait until the 
> *connection* notifies the call to complete. Only after the response 
> is put into the response queue can a new call be obtained from the call queue 
> and processed. Therefore, the concurrency of the router is 
> limited by the number of handlers. A simple example: if the 
> number of handlers is 1 and the maximum number of calls in the connection 
> thread is 10, then even though the connection thread can send 10 requests to 
> the downstream ns, the single handler means the router can only process 
> one request at a time. 
>  
> Since the performance of router rpc is mainly limited by the number of 
> handlers, the most effective way to improve rpc performance currently is to 
> increase the number of handlers. However, letting the router create a large 
> number of handler threads also increases the number of thread switches and 
> cannot make full use of the machine.
>  
> There are usually multiple ns downstream of the router. If the handler 
> forwards the request to an ns with poor performance, the 
> handler waits for a long time. With fewer handlers available, 
> the router's ability to handle requests for ns with normal performance is 
> reduced. From the perspective of the client, the performance of the 
> downstream ns of the router appears to have deteriorated. We often find that 
> the call queue of the downstream ns is not high, but the call queue of the 
> router is very high.
>  
> Therefore, although the main function of the router is to federate and handle 
> requests for multiple NSs, the current synchronous RPC performance cannot 
> satisfy the scenario where there are many NSs downstream of the router. Even 
> if the concurrency of the router can be improved by increasing the 
> number of handlers, it is still relatively slow. More threads increase 
> the CPU context switching time, and in fact many of the handler threads are 
> in a blocked state, which is a waste of thread resources. When a 
> request enters the router, there is no guarantee that a handler will be 
> free to run it at that time.
>  
> Therefore, I propose asynchronous router rpc. Please see the *pdf* for the 
> complete solution.
>  
> Everyone is welcome to join the discussion!
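
The single-handler example in the description can be reproduced with a toy model. This is a sketch under stated assumptions, not the design in the attached pdf: `RouterHandlerSketch` and `run` are hypothetical names, the downstream ns is simulated with a scheduled executor that answers after 100 ms, and the asynchronous variant simply uses `CompletableFuture` callbacks. It reports the peak number of downstream calls in flight while one handler thread serves the requests.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

class RouterHandlerSketch {

  /** Serve requests with one handler thread; return peak downstream calls in flight. */
  static int run(int requests, boolean async) {
    ExecutorService handler = Executors.newSingleThreadExecutor();
    ScheduledExecutorService downstream = Executors.newScheduledThreadPool(1);
    AtomicInteger inFlight = new AtomicInteger();
    AtomicInteger peak = new AtomicInteger();
    List<CompletableFuture<Void>> done = new ArrayList<>();
    try {
      for (int i = 0; i < requests; i++) {
        CompletableFuture<Void> finished = new CompletableFuture<>();
        done.add(finished);
        handler.submit(() -> {
          // Forward the call to the simulated downstream ns.
          CompletableFuture<String> response = new CompletableFuture<>();
          peak.accumulateAndGet(inFlight.incrementAndGet(), Math::max);
          downstream.schedule(() -> {
            inFlight.decrementAndGet();
            response.complete("ok");
          }, 100, TimeUnit.MILLISECONDS);
          if (async) {
            // Register a callback and free the handler immediately.
            response.thenRun(() -> finished.complete(null));
          } else {
            // Block the handler until the downstream response arrives.
            response.join();
            finished.complete(null);
          }
        });
      }
      CompletableFuture.allOf(done.toArray(new CompletableFuture[0])).join();
      return peak.get();
    } finally {
      handler.shutdown();
      downstream.shutdown();
    }
  }

  public static void main(String[] args) {
    System.out.println("sync peak in flight:  " + run(10, false));
    System.out.println("async peak in flight: " + run(10, true));
  }
}
```

In the synchronous mode the handler never has more than one call outstanding, whatever the connection could carry. In the asynchronous mode the same single handler can hand off several calls before the first response returns; the exact peak depends on timing.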




[jira] [Commented] (HDFS-17464) Improve some logs output in class FsDatasetImpl

2024-05-20 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-17464?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17847961#comment-17847961
 ] 

ASF GitHub Bot commented on HDFS-17464:
---

ayushtkn commented on PR #6724:
URL: https://github.com/apache/hadoop/pull/6724#issuecomment-2121001399

   @haiyang1987 / @ZanderXu anyone hitting the merge button?







[jira] [Commented] (HDFS-17098) DatanodeManager does not handle null storage type properly

2024-05-20 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-17098?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17847888#comment-17847888
 ] 

ASF GitHub Bot commented on HDFS-17098:
---

Hexiaoqiao opened a new pull request, #6840:
URL: https://github.com/apache/hadoop/pull/6840

   
   
   ### Description of PR
   1. From https://github.com/apache/hadoop/pull/6035, which was contributed by 
@teamconfx.
   2. Fix checkstyle and try to trigger Yetus again.
   
   ### How was this patch tested?
   
   
   ### For code changes:
   
   - [ ] Does the title of this PR start with the corresponding JIRA issue id 
(e.g. 'HADOOP-17799. Your PR title ...')?
   - [ ] Object storage: have the integration tests been executed and the 
endpoint declared according to the connector-specific documentation?
   - [ ] If adding new dependencies to the code, are these dependencies 
licensed in a way that is compatible for inclusion under [ASF 
2.0](http://www.apache.org/legal/resolved.html#category-a)?
   - [ ] If applicable, have you updated the `LICENSE`, `LICENSE-binary`, 
`NOTICE-binary` files?
   




> DatanodeManager does not handle null storage type properly
> --
>
> Key: HDFS-17098
> URL: https://issues.apache.org/jira/browse/HDFS-17098
> Project: Hadoop HDFS
>  Issue Type: Bug
>Reporter: ConfX
>Priority: Critical
>  Labels: pull-request-available
> Attachments: reproduce.sh
>
>
> h2. What happened:
> Got a {{NullPointerException}} without a message when sorting datanodes in 
> {{{}NetworkTopology{}}}.
> h2. Where's the bug:
> In line 654 of {{{}DatanodeManager{}}}, the manager creates a secondary sorter 
> using the standard {{Comparator}} class:
> {noformat}
> Comparator<DatanodeInfoWithStorage> comp =
>         Comparator.comparing(DatanodeInfoWithStorage::getStorageType);
> secondarySort = list -> Collections.sort(list, comp);{noformat}
> This comparator is then used in {{NetworkTopology}} as a secondary sort to 
> break ties:
> {noformat}
> if (secondarySort != null) {
>         // a secondary sort breaks the tie between nodes.
>         secondarySort.accept(nodesList);
> }{noformat}
> However, if the storage type is {{null}}, a {{NullPointerException}} 
> is thrown, because the comparator returned by {{Comparator.comparing}} 
> does not handle null sort keys.
> h2. How to reproduce:
> (1) Set {{dfs.heartbeat.interval}} to {{1753310367}}, and 
> {{dfs.namenode.read.considerStorageType}} to {{true}}
> (2) Run test: {{org.apache.hadoop.hdfs.server.blockmanagement.TestSortLocatedBlock#testAviodStaleAndSlowDatanodes}}
> h2. Stacktrace:
> {noformat}
> java.lang.NullPointerException
>     at java.base/java.util.Comparator.lambda$comparing$77a9974f$1(Comparator.java:469)
>     at java.base/java.util.TimSort.countRunAndMakeAscending(TimSort.java:355)
>     at java.base/java.util.TimSort.sort(TimSort.java:220)
>     at java.base/java.util.Arrays.sort(Arrays.java:1515)
>     at java.base/java.util.ArrayList.sort(ArrayList.java:1750)
>     at java.base/java.util.Collections.sort(Collections.java:179)
>     at org.apache.hadoop.hdfs.server.blockmanagement.DatanodeManager.lambda$createSecondaryNodeSorter$0(DatanodeManager.java:654)
>     at org.apache.hadoop.net.NetworkTopology.sortByDistance(NetworkTopology.java:983)
>     at org.apache.hadoop.net.NetworkTopology.sortByDistanceUsingNetworkLocation(NetworkTopology.java:946)
>     at org.apache.hadoop.hdfs.server.blockmanagement.DatanodeManager.sortLocatedBlock(DatanodeManager.java:637)
>     at org.apache.hadoop.hdfs.server.blockmanagement.DatanodeManager.sortLocatedBlocks(DatanodeManager.java:554)
>     at org.apache.hadoop.hdfs.server.blockmanagement.TestSortLocatedBlock.testAviodStaleAndSlowDatanodes(TestSortLocatedBlock.java:144){noformat}
> For an easy reproduction, run the reproduce.sh in the attachment. We are 
> happy to provide a patch if this issue is confirmed.
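
For illustration only, here is a minimal sketch of one way a key comparator can be made null-tolerant, using the JDK's Comparator.nullsLast. The Node class and the StorageType enum below are hypothetical stand-ins, not the Hadoop classes, and this is not necessarily the approach taken in PR #6840.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

// Hypothetical stand-ins for StorageType and DatanodeInfoWithStorage;
// these are NOT the Hadoop classes.
class NullSafeStorageSortSketch {
  enum StorageType { RAM_DISK, SSD, DISK, ARCHIVE }

  static final class Node {
    private final String name;
    private final StorageType storageType; // may be null

    Node(String name, StorageType storageType) {
      this.name = name;
      this.storageType = storageType;
    }

    String getName() { return name; }
    StorageType getStorageType() { return storageType; }
  }

  // Same shape as the comparator in the report; throws NPE on a null key.
  static final Comparator<Node> UNSAFE =
      Comparator.comparing(Node::getStorageType);

  // Null-tolerant variant: nodes with a null storage type sort last.
  static final Comparator<Node> NULL_SAFE =
      Comparator.comparing(Node::getStorageType,
          Comparator.nullsLast(Comparator.naturalOrder()));

  // Sorts a copy of the list and returns the node names in sorted order.
  static List<String> sortedNames(List<Node> nodes, Comparator<Node> comp) {
    List<Node> copy = new ArrayList<>(nodes);
    copy.sort(comp);
    List<String> names = new ArrayList<>();
    for (Node n : copy) {
      names.add(n.getName());
    }
    return names;
  }

  // Three sample nodes, one of which has no storage type.
  static List<Node> sampleNodes() {
    return Arrays.asList(
        new Node("a", StorageType.DISK),
        new Node("b", null),
        new Node("c", StorageType.SSD));
  }

  public static void main(String[] args) {
    System.out.println(sortedNames(sampleNodes(), NULL_SAFE));
  }
}
```

Whether null storage types should sort first or last (or be rejected earlier) is a design decision for the real fix; the sketch only shows that the JDK comparator combinators can express either choice.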






[jira] [Commented] (HDFS-17529) Improve router state store cache entry deletion

2024-05-20 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-17529?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17847861#comment-17847861
 ] 

ASF GitHub Bot commented on HDFS-17529:
---

hadoop-yetus commented on PR #6833:
URL: https://github.com/apache/hadoop/pull/6833#issuecomment-2120301224

   :broken_heart: **-1 overall**
   
   
   
   
   
   
   | Vote | Subsystem | Runtime | Logfile | Comment |
   |:----:|----------:|--------:|:-------:|:-------:|
   | +0 :ok: |  reexec  |  17m 31s |  |  Docker mode activated.  |
   |||| _ Prechecks _ |
   | +1 :green_heart: |  dupname  |   0m  0s |  |  No case conflicting files found.  |
   | +0 :ok: |  codespell  |   0m  0s |  |  codespell was not available.  |
   | +0 :ok: |  detsecrets  |   0m  0s |  |  detect-secrets was not available.  |
   | +0 :ok: |  xmllint  |   0m  0s |  |  xmllint was not available.  |
   | +0 :ok: |  markdownlint  |   0m  0s |  |  markdownlint was not available.  |
   | +1 :green_heart: |  @author  |   0m  0s |  |  The patch does not contain any @author tags.  |
   | +1 :green_heart: |  test4tests  |   0m  0s |  |  The patch appears to include 2 new or modified test files.  |
   |||| _ trunk Compile Tests _ |
   | +1 :green_heart: |  mvninstall  |  50m  0s |  |  trunk passed  |
   | +1 :green_heart: |  compile  |   0m 41s |  |  trunk passed with JDK Ubuntu-11.0.22+7-post-Ubuntu-0ubuntu220.04.1  |
   | +1 :green_heart: |  compile  |   0m 36s |  |  trunk passed with JDK Private Build-1.8.0_402-8u402-ga-2ubuntu1~20.04-b06  |
   | +1 :green_heart: |  checkstyle  |   0m 29s |  |  trunk passed  |
   | +1 :green_heart: |  mvnsite  |   0m 41s |  |  trunk passed  |
   | +1 :green_heart: |  javadoc  |   0m 42s |  |  trunk passed with JDK Ubuntu-11.0.22+7-post-Ubuntu-0ubuntu220.04.1  |
   | +1 :green_heart: |  javadoc  |   0m 30s |  |  trunk passed with JDK Private Build-1.8.0_402-8u402-ga-2ubuntu1~20.04-b06  |
   | +1 :green_heart: |  spotbugs  |   1m 19s |  |  trunk passed  |
   | +1 :green_heart: |  shadedclient  |  41m 29s |  |  branch has no errors when building and testing our client artifacts.  |
   |||| _ Patch Compile Tests _ |
   | -1 :x: |  mvninstall  |   0m 28s | 
[/patch-mvninstall-hadoop-hdfs-project_hadoop-hdfs-rbf.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-6833/2/artifact/out/patch-mvninstall-hadoop-hdfs-project_hadoop-hdfs-rbf.txt)
 |  hadoop-hdfs-rbf in the patch failed.  |
   | -1 :x: |  compile  |   0m 32s | 
[/patch-compile-hadoop-hdfs-project_hadoop-hdfs-rbf-jdkUbuntu-11.0.22+7-post-Ubuntu-0ubuntu220.04.1.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-6833/2/artifact/out/patch-compile-hadoop-hdfs-project_hadoop-hdfs-rbf-jdkUbuntu-11.0.22+7-post-Ubuntu-0ubuntu220.04.1.txt)
 |  hadoop-hdfs-rbf in the patch failed with JDK 
Ubuntu-11.0.22+7-post-Ubuntu-0ubuntu220.04.1.  |
   | -1 :x: |  javac  |   0m 32s | 
[/patch-compile-hadoop-hdfs-project_hadoop-hdfs-rbf-jdkUbuntu-11.0.22+7-post-Ubuntu-0ubuntu220.04.1.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-6833/2/artifact/out/patch-compile-hadoop-hdfs-project_hadoop-hdfs-rbf-jdkUbuntu-11.0.22+7-post-Ubuntu-0ubuntu220.04.1.txt)
 |  hadoop-hdfs-rbf in the patch failed with JDK 
Ubuntu-11.0.22+7-post-Ubuntu-0ubuntu220.04.1.  |
   | -1 :x: |  compile  |   0m 28s | 
[/patch-compile-hadoop-hdfs-project_hadoop-hdfs-rbf-jdkPrivateBuild-1.8.0_402-8u402-ga-2ubuntu1~20.04-b06.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-6833/2/artifact/out/patch-compile-hadoop-hdfs-project_hadoop-hdfs-rbf-jdkPrivateBuild-1.8.0_402-8u402-ga-2ubuntu1~20.04-b06.txt)
 |  hadoop-hdfs-rbf in the patch failed with JDK Private 
Build-1.8.0_402-8u402-ga-2ubuntu1~20.04-b06.  |
   | -1 :x: |  javac  |   0m 28s | 
[/patch-compile-hadoop-hdfs-project_hadoop-hdfs-rbf-jdkPrivateBuild-1.8.0_402-8u402-ga-2ubuntu1~20.04-b06.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-6833/2/artifact/out/patch-compile-hadoop-hdfs-project_hadoop-hdfs-rbf-jdkPrivateBuild-1.8.0_402-8u402-ga-2ubuntu1~20.04-b06.txt)
 |  hadoop-hdfs-rbf in the patch failed with JDK Private 
Build-1.8.0_402-8u402-ga-2ubuntu1~20.04-b06.  |
   | +1 :green_heart: |  blanks  |   0m  0s |  |  The patch has no blanks 
issues.  |
   | +1 :green_heart: |  checkstyle  |   0m 18s |  |  the patch passed  |
   | -1 :x: |  mvnsite  |   0m 29s | 
[/patch-mvnsite-hadoop-hdfs-project_hadoop-hdfs-rbf.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-6833/2/artifact/out/patch-mvnsite-hadoop-hdfs-project_hadoop-hdfs-rbf.txt)
 |  hadoop-hdfs-rbf in the patch failed.  |
   | -1 :x: |  javadoc  |   0m 28s | 
[/patch-javadoc-hadoop-hdfs-project_hadoop-hdfs-rbf-jdkUbuntu-11.0.22+7-post-Ubuntu-0ubuntu220.04.1.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-6833/2/artifact/out/patch-javadoc-hadoop-hdfs-project_hadoop-hdfs-rbf-jdkUbuntu-11.0.22+7-post-Ubuntu-0ubuntu220.04.1.txt)
 |  hadoop-hdfs-rbf in the 

[jira] [Commented] (HDFS-17532) Allow router state store cache update to overwrite and delete in parallel

2024-05-20 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-17532?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17847846#comment-17847846
 ] 

ASF GitHub Bot commented on HDFS-17532:
---

hadoop-yetus commented on PR #6839:
URL: https://github.com/apache/hadoop/pull/6839#issuecomment-2120145776

   :broken_heart: **-1 overall**
   
   
   
   
   
   
   | Vote | Subsystem | Runtime | Logfile | Comment |
   |:----:|----------:|--------:|:-------:|:-------:|
   | +0 :ok: |  reexec  |  15m  9s |  |  Docker mode activated.  |
   |||| _ Prechecks _ |
   | +1 :green_heart: |  dupname  |   0m  0s |  |  No case conflicting files found.  |
   | +0 :ok: |  codespell  |   0m  0s |  |  codespell was not available.  |
   | +0 :ok: |  detsecrets  |   0m  0s |  |  detect-secrets was not available.  |
   | +0 :ok: |  xmllint  |   0m  0s |  |  xmllint was not available.  |
   | +0 :ok: |  markdownlint  |   0m  0s |  |  markdownlint was not available.  |
   | +1 :green_heart: |  @author  |   0m  0s |  |  The patch does not contain any @author tags.  |
   | +1 :green_heart: |  test4tests  |   0m  0s |  |  The patch appears to include 1 new or modified test files.  |
   |||| _ trunk Compile Tests _ |
   | -1 :x: |  mvninstall  |   2m  6s | 
[/branch-mvninstall-root.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-6839/1/artifact/out/branch-mvninstall-root.txt)
 |  root in trunk failed.  |
   | -1 :x: |  compile  |   0m 24s | 
[/branch-compile-hadoop-hdfs-project_hadoop-hdfs-rbf-jdkUbuntu-11.0.22+7-post-Ubuntu-0ubuntu220.04.1.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-6839/1/artifact/out/branch-compile-hadoop-hdfs-project_hadoop-hdfs-rbf-jdkUbuntu-11.0.22+7-post-Ubuntu-0ubuntu220.04.1.txt)
 |  hadoop-hdfs-rbf in trunk failed with JDK 
Ubuntu-11.0.22+7-post-Ubuntu-0ubuntu220.04.1.  |
   | -1 :x: |  compile  |   0m 24s | 
[/branch-compile-hadoop-hdfs-project_hadoop-hdfs-rbf-jdkPrivateBuild-1.8.0_402-8u402-ga-2ubuntu1~20.04-b06.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-6839/1/artifact/out/branch-compile-hadoop-hdfs-project_hadoop-hdfs-rbf-jdkPrivateBuild-1.8.0_402-8u402-ga-2ubuntu1~20.04-b06.txt)
 |  hadoop-hdfs-rbf in trunk failed with JDK Private 
Build-1.8.0_402-8u402-ga-2ubuntu1~20.04-b06.  |
   | -0 :warning: |  checkstyle  |   0m 21s | 
[/buildtool-branch-checkstyle-hadoop-hdfs-project_hadoop-hdfs-rbf.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-6839/1/artifact/out/buildtool-branch-checkstyle-hadoop-hdfs-project_hadoop-hdfs-rbf.txt)
 |  The patch fails to run checkstyle in hadoop-hdfs-rbf  |
   | -1 :x: |  mvnsite  |   0m 23s | 
[/branch-mvnsite-hadoop-hdfs-project_hadoop-hdfs-rbf.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-6839/1/artifact/out/branch-mvnsite-hadoop-hdfs-project_hadoop-hdfs-rbf.txt)
 |  hadoop-hdfs-rbf in trunk failed.  |
   | -1 :x: |  javadoc  |   0m 23s | 
[/branch-javadoc-hadoop-hdfs-project_hadoop-hdfs-rbf-jdkUbuntu-11.0.22+7-post-Ubuntu-0ubuntu220.04.1.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-6839/1/artifact/out/branch-javadoc-hadoop-hdfs-project_hadoop-hdfs-rbf-jdkUbuntu-11.0.22+7-post-Ubuntu-0ubuntu220.04.1.txt)
 |  hadoop-hdfs-rbf in trunk failed with JDK 
Ubuntu-11.0.22+7-post-Ubuntu-0ubuntu220.04.1.  |
   | -1 :x: |  javadoc  |   0m 23s | 
[/branch-javadoc-hadoop-hdfs-project_hadoop-hdfs-rbf-jdkPrivateBuild-1.8.0_402-8u402-ga-2ubuntu1~20.04-b06.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-6839/1/artifact/out/branch-javadoc-hadoop-hdfs-project_hadoop-hdfs-rbf-jdkPrivateBuild-1.8.0_402-8u402-ga-2ubuntu1~20.04-b06.txt)
 |  hadoop-hdfs-rbf in trunk failed with JDK Private 
Build-1.8.0_402-8u402-ga-2ubuntu1~20.04-b06.  |
   | -1 :x: |  spotbugs  |   0m 23s | 
[/branch-spotbugs-hadoop-hdfs-project_hadoop-hdfs-rbf.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-6839/1/artifact/out/branch-spotbugs-hadoop-hdfs-project_hadoop-hdfs-rbf.txt)
 |  hadoop-hdfs-rbf in trunk failed.  |
   | +1 :green_heart: |  shadedclient  |   2m 45s |  |  branch has no errors 
when building and testing our client artifacts.  |
    _ Patch Compile Tests _ |
   | -1 :x: |  mvninstall  |   0m 23s | 
[/patch-mvninstall-hadoop-hdfs-project_hadoop-hdfs-rbf.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-6839/1/artifact/out/patch-mvninstall-hadoop-hdfs-project_hadoop-hdfs-rbf.txt)
 |  hadoop-hdfs-rbf in the patch failed.  |
   | -1 :x: |  compile  |   0m 23s | 
[/patch-compile-hadoop-hdfs-project_hadoop-hdfs-rbf-jdkUbuntu-11.0.22+7-post-Ubuntu-0ubuntu220.04.1.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-6839/1/artifact/out/patch-compile-hadoop-hdfs-project_hadoop-hdfs-rbf-jdkUbuntu-11.0.22+7-post-Ubuntu-0ubuntu220.04.1.txt)
 |  hadoop-hdfs-rbf in the patch failed with JDK 
Ubuntu-11.0.22+7-post-Ubuntu-0ubuntu220.04.1.  |
   | -1 :x: |  javac  |   0m 23s | 

[jira] [Commented] (HDFS-17532) Allow router state store cache update to overwrite and delete in parallel

2024-05-20 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-17532?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17847835#comment-17847835
 ] 

ASF GitHub Bot commented on HDFS-17532:
---

kokonguyen191 commented on PR #6839:
URL: https://github.com/apache/hadoop/pull/6839#issuecomment-2120093100

   @ZanderXu This is the other half, split from 
https://github.com/apache/hadoop/pull/6833. Could you help review it when you're free? 
Thanks!




> Allow router state store cache update to overwrite and delete in parallel
> -
>
> Key: HDFS-17532
> URL: https://issues.apache.org/jira/browse/HDFS-17532
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: hdfs, rbf
>Reporter: Felix N
>Assignee: Felix N
>Priority: Minor
>  Labels: pull-request-available
>
> The current implementation of the router state store update is quite inefficient, so 
> much so that when routers are removed and a lot of NameNodeMembership records 
> are deleted in a short burst, the deletions triggered a router safemode in 
> our cluster and caused a lot of trouble.
> This ticket aims to allow the overwrite part and delete part of 
> org.apache.hadoop.hdfs.server.federation.store.CachedRecordStore#overrideExpiredRecords
>  to run in parallel.
> See HDFS-17529 for the other half of this improvement.
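
As a rough illustration of running an overwrite phase and a delete phase in parallel, the sketch below uses CompletableFuture on two disjoint key sets. Everything in it is hypothetical (the in-memory map, the ACTIVE/EXPIRED states, the process method); it is not the CachedRecordStore code and makes no claim about how PR #6839 implements this.

```java
import java.util.List;
import java.util.Map;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

// Hypothetical in-memory stand-in for a record store; not Hadoop code.
class ParallelExpirySketch {
  static final String ACTIVE = "ACTIVE";
  static final String EXPIRED = "EXPIRED";

  // Marks toOverwrite keys as EXPIRED and removes toDelete keys, running the
  // two phases concurrently. Assumes the two key lists are disjoint.
  static Map<String, String> process(List<String> toOverwrite,
                                     List<String> toDelete) {
    Map<String, String> store = new ConcurrentHashMap<>();
    for (String k : toOverwrite) {
      store.put(k, ACTIVE);
    }
    for (String k : toDelete) {
      store.put(k, ACTIVE);
    }

    ExecutorService pool = Executors.newFixedThreadPool(2);
    try {
      // Overwrite phase: records that expired but must be kept for now.
      CompletableFuture<Void> overwrite = CompletableFuture.runAsync(
          () -> toOverwrite.forEach(k -> store.put(k, EXPIRED)), pool);
      // Delete phase: records past their deletion deadline.
      CompletableFuture<Void> delete = CompletableFuture.runAsync(
          () -> toDelete.forEach(k -> store.remove(k)), pool);
      // Wait for both phases; join() rethrows a failure from either one.
      CompletableFuture.allOf(overwrite, delete).join();
    } finally {
      pool.shutdown();
    }
    return store;
  }
}
```

The sketch only works because the two phases touch disjoint keys; a real implementation would also need to decide how failures in one phase affect the other.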






[jira] [Commented] (HDFS-17532) Allow router state store cache update to overwrite and delete in parallel

2024-05-20 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-17532?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17847833#comment-17847833
 ] 

ASF GitHub Bot commented on HDFS-17532:
---

kokonguyen191 opened a new pull request, #6839:
URL: https://github.com/apache/hadoop/pull/6839

   ### Description of PR
   
   This ticket aims to allow the overwrite part and delete part of 
org.apache.hadoop.hdfs.server.federation.store.CachedRecordStore#overrideExpiredRecords
 to run in parallel.
   
   Sister ticket to HDFS-17529




> Allow router state store cache update to overwrite and delete in parallel
> -
>
> Key: HDFS-17532
> URL: https://issues.apache.org/jira/browse/HDFS-17532
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: hdfs, rbf
>Reporter: Felix N
>Assignee: Felix N
>Priority: Minor
>






[jira] [Updated] (HDFS-17532) Allow router state store cache update to overwrite and delete in parallel

2024-05-20 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-17532?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated HDFS-17532:
--
Labels: pull-request-available  (was: )

> Allow router state store cache update to overwrite and delete in parallel
> -
>
> Key: HDFS-17532
> URL: https://issues.apache.org/jira/browse/HDFS-17532
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: hdfs, rbf
>Reporter: Felix N
>Assignee: Felix N
>Priority: Minor
>  Labels: pull-request-available
>






[jira] [Commented] (HDFS-17529) Improve router state store cache entry deletion

2024-05-20 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-17529?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17847816#comment-17847816
 ] 

ASF GitHub Bot commented on HDFS-17529:
---

kokonguyen191 commented on PR #6833:
URL: https://github.com/apache/hadoop/pull/6833#issuecomment-2120015079

   @ZanderXu Thanks for the review. I have updated the code and changed the 
ticket/PR title to cover the deletion part only; I will open another PR for the async 
part later. I'm a bit confused about point 3. Can you elaborate a bit on that 
part?




> Improve router state store cache entry deletion
> ---
>
> Key: HDFS-17529
> URL: https://issues.apache.org/jira/browse/HDFS-17529
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: hdfs, rbf
>Reporter: Felix N
>Assignee: Felix N
>Priority: Major
>  Labels: pull-request-available
>
> The current implementation of the router state store update is quite inefficient, so 
> much so that when routers are removed and a lot of NameNodeMembership records 
> are deleted in a short burst, the deletions triggered a router safemode in 
> our cluster and caused a lot of trouble.
> This ticket contains 2 parts: improving the deletion process for ZK state 
> store implementation, and allowing the overwrite part and delete part of 
> org.apache.hadoop.hdfs.server.federation.store.CachedRecordStore#overrideExpiredRecords
>  to run in parallel.






[jira] [Commented] (HDFS-17529) Improve router state store cache update

2024-05-19 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-17529?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17847756#comment-17847756
 ] 

ASF GitHub Bot commented on HDFS-17529:
---

kokonguyen191 commented on PR #6833:
URL: https://github.com/apache/hadoop/pull/6833#issuecomment-2119557407

   @ZanderXu Could you take a look when you're free? Thanks!




> Improve router state store cache update
> ---
>
> Key: HDFS-17529
> URL: https://issues.apache.org/jira/browse/HDFS-17529
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: hdfs, rbf
>Reporter: Felix N
>Assignee: Felix N
>Priority: Major
>  Labels: pull-request-available
>






[jira] [Commented] (HDFS-17529) Improve router state store cache update

2024-05-19 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-17529?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17847755#comment-17847755
 ] 

ASF GitHub Bot commented on HDFS-17529:
---

kokonguyen191 commented on PR #6833:
URL: https://github.com/apache/hadoop/pull/6833#issuecomment-2119556931

   The failed unit tests look as though they are related to the changes, but 
they aren't. Both tests also fail without the patch and appear to have failed for 
some past MRs already.




> Improve router state store cache update
> ---
>
> Key: HDFS-17529
> URL: https://issues.apache.org/jira/browse/HDFS-17529
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: hdfs, rbf
>Reporter: Felix N
>Assignee: Felix N
>Priority: Major
>  Labels: pull-request-available
>






[jira] [Commented] (HDFS-17531) RBF: Asynchronous router RPC.

2024-05-19 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-17531?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17847745#comment-17847745
 ] 

ASF GitHub Bot commented on HDFS-17531:
---

KeeProMise commented on PR #6838:
URL: https://github.com/apache/hadoop/pull/6838#issuecomment-2119506196

   @ayushtkn @slfan1989 Hi, thanks for your reply. I have sent the discussion to 
common-...@hadoop.apache.org.




> RBF: Asynchronous router RPC.
> 
>
> Key: HDFS-17531
> URL: https://issues.apache.org/jira/browse/HDFS-17531
> Project: Hadoop HDFS
>  Issue Type: New Feature
>Reporter: Jian Zhang
>Assignee: Jian Zhang
>Priority: Major
>  Labels: pull-request-available
> Attachments: Aynchronous router.pdf, HDFS-17531.001.patch, 
> image-2024-05-19-18-07-51-282.png
>
>
> *Description*
> Currently, the main function of the Router service is to accept client 
> requests, forward them to the corresponding downstream ns, and then 
> return the results from the downstream ns to the client. The RPC path is as follows:
> *!image-2024-05-19-18-07-51-282.png|width=900,height=300!*
> The main threads involved in the RPC path are:
> {*}Reader{*}: Get the client request and put it into the call queue *(1)*
> {*}Handler{*}:
> Take a call *(2)* from the call queue, process it, generate a new 
> call, place it in the calls of the connection thread, and wait for that call 
> to complete *(3)*
> After being woken by the connection thread, process the response and put 
> it into the response queue *(5)*
> *Connection:*
> Hold the connection to the downstream ns, send the queued calls to the 
> downstream ns (via {*}rpcRequestThread{*}), and receive the responses from the ns. 
> Match each response to its call and notify that call that processing is complete 
> *(4)*
> *Responder:*
> Take the response from the response queue *(6)* and return it to the client
>  
> *Shortcoming*
> Even if the *connection* thread can send more requests to downstream 
> nameservices, *(3)* and *(4)* are synchronous: after the *handler* 
> thread adds the call to connection.calls, it must wait until the 
> *connection* notifies it that the call is complete, and only after the response 
> is put into the response queue can a new call be obtained from the call queue 
> and processed. Therefore, the concurrency of the router is 
> limited by the number of handlers. A simple example: if the 
> number of handlers is 1 and the maximum number of calls in the connection 
> thread is 10, then even though the connection thread can send 10 requests to the 
> downstream ns, the single handler means the router can only process 
> one request at a time. 
>  
> Since the performance of router rpc is mainly limited by the number of 
> handlers, the most effective way to improve rpc performance currently is to 
> increase the number of handlers. However, letting the router create a large number of 
> handler threads also increases thread context switching and does not 
> make full use of the machine.
>  
> There are usually multiple ns downstream of the router. If a handler 
> forwards a request to an ns with poor performance, the 
> handler waits for a long time. With fewer handlers available, 
> the router's ability to serve requests for ns with normal performance is also 
> reduced. From the client's perspective, the 
> downstream ns of the router appear to have degraded. We often find that 
> the call queue of the downstream ns is not high, but the call queue of the 
> router is very high.
>  
> Therefore, although the main function of the router is to federate and handle 
> requests for multiple NSs, the current synchronous RPC cannot 
> satisfy the scenario where there are many NSs downstream of the router. Even 
> if the concurrency of the router can be improved by increasing the 
> number of handlers, it is still relatively slow. More threads increase 
> CPU context-switching time, and in fact many of the handler threads are 
> in a blocked state, which wastes thread resources. When a 
> request enters the router, there is no guarantee that a handler will be 
> free to run it at that time.
>  
> Therefore, I propose asynchronous router rpc. Please see the *pdf* for the 
> complete solution.
>  
> Everyone is welcome to join the discussion!
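
The handler-count limit described above can be illustrated with a small simulation. This is only a sketch under stated assumptions: the class, the latch-based fake downstream call, and the two driver methods are all hypothetical, and it does not reflect the design in the attached PDF or the Hadoop IPC classes. It measures how many downstream calls are in flight at once when a single handler blocks on each call versus when it only submits calls and moves on.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical simulation; not the Router or Hadoop IPC implementation.
class RouterHandlerSketch {
  private final AtomicInteger inFlight = new AtomicInteger();
  private final AtomicInteger maxInFlight = new AtomicInteger();

  // Simulated downstream call: stays "in flight" until the latch opens.
  private void downstreamCall(CountDownLatch latch) {
    int now = inFlight.incrementAndGet();
    maxInFlight.accumulateAndGet(now, Math::max);
    latch.countDown();
    try {
      latch.await(5, TimeUnit.SECONDS);
    } catch (InterruptedException e) {
      Thread.currentThread().interrupt();
    } finally {
      inFlight.decrementAndGet();
    }
  }

  // Synchronous handler: waits for each call before taking the next one.
  static int maxInFlightSync(int requests) {
    RouterHandlerSketch s = new RouterHandlerSketch();
    ExecutorService connection = Executors.newFixedThreadPool(requests);
    try {
      for (int i = 0; i < requests; i++) {
        CountDownLatch perCall = new CountDownLatch(1);
        CompletableFuture
            .runAsync(() -> s.downstreamCall(perCall), connection)
            .join(); // the single handler blocks here, as in steps (3)/(4)
      }
    } finally {
      connection.shutdown();
    }
    return s.maxInFlight.get();
  }

  // Asynchronous handler: submits every call, then waits for all of them.
  static int maxInFlightAsync(int requests) {
    RouterHandlerSketch s = new RouterHandlerSketch();
    ExecutorService connection = Executors.newFixedThreadPool(requests);
    try {
      // Shared latch: no call finishes until all of them have started.
      CountDownLatch all = new CountDownLatch(requests);
      List<CompletableFuture<Void>> pending = new ArrayList<>();
      for (int i = 0; i < requests; i++) {
        pending.add(CompletableFuture.runAsync(
            () -> s.downstreamCall(all), connection));
      }
      CompletableFuture.allOf(
          pending.toArray(new CompletableFuture[0])).join();
    } finally {
      connection.shutdown();
    }
    return s.maxInFlight.get();
  }

  public static void main(String[] args) {
    System.out.println("sync:  " + maxInFlightSync(10));
    System.out.println("async: " + maxInFlightAsync(10));
  }
}
```

With one blocking handler the connection never has more than one call outstanding, no matter how many it could carry; once the handler stops waiting, all ten can be outstanding together. The simulation ignores response ordering, error handling, and back-pressure, which a real asynchronous design has to address.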






[jira] [Commented] (HDFS-17531) RBF: Asynchronous router RPC.

2024-05-19 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-17531?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17847740#comment-17847740
 ] 

ASF GitHub Bot commented on HDFS-17531:
---

slfan1989 commented on PR #6838:
URL: https://github.com/apache/hadoop/pull/6838#issuecomment-2119459510

   @KeeProMise Thanks for the contribution! This PR is too large to review 
as it stands.
   
   Let’s first follow the process and discuss it on the hadoop-common mailing 
list. We should split the PR for easier review and also provide benchmark data.




> RBF: Asynchronous router RPC.
> 
>
> Key: HDFS-17531
> URL: https://issues.apache.org/jira/browse/HDFS-17531
> Project: Hadoop HDFS
>  Issue Type: New Feature
>Reporter: Jian Zhang
>Assignee: Jian Zhang
>Priority: Major
>  Labels: pull-request-available
> Attachments: Aynchronous router.pdf, HDFS-17531.001.patch, 
> image-2024-05-19-18-07-51-282.png
>
>




[jira] [Commented] (HDFS-17531) RBF: Asynchronous router RPC.

2024-05-19 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-17531?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17847730#comment-17847730
 ] 

ASF GitHub Bot commented on HDFS-17531:
---

ayushtkn commented on PR #6838:
URL: https://github.com/apache/hadoop/pull/6838#issuecomment-2119352612

   Passing by: discussion threads should be on the Hadoop dev mailing lists.




> RBF: Aynchronous router RPC.
> 
>
> Key: HDFS-17531
> URL: https://issues.apache.org/jira/browse/HDFS-17531
> Project: Hadoop HDFS
>  Issue Type: New Feature
>Reporter: Jian Zhang
>Assignee: Jian Zhang
>Priority: Major
>  Labels: pull-request-available
> Attachments: Aynchronous router.pdf, HDFS-17531.001.patch, 
> image-2024-05-19-18-07-51-282.png
>
>
> *Description*
> Currently, the main function of the Router service is to accept client 
> requests, forward them to the corresponding downstream ns, and then 
> return the results from the downstream ns to the client. The link is as follows:
> *!image-2024-05-19-18-07-51-282.png|width=900,height=300!*
> The main threads involved in the rpc link are:
> {*}Read{*}: Get the client request and put it into the call queue *(1)*
> {*}Handler{*}:
> Take a call *(2)* from the call queue, process it, generate a new 
> call, place it in the calls of the connection thread, and wait for that call 
> to complete *(3)*
> After being awakened by the connection thread, process the response and put 
> it into the response queue *(5)*
> *Connection:*
> Hold the link with the downstream ns, send the calls in connection.calls to the 
> downstream ns (via {*}rpcRequestThread{*}), and obtain the response from the ns. 
> Based on the call in the response, notify the call that processing is complete 
> *(4)*
> *Responder:*
> Retrieve the response from the response queue *(6)* and return it to the client
>  
> *Shortcoming*
> Even if the *connection* thread can send more requests to downstream 
> nameservices, *(3)* and *(4)* are synchronous: after the *handler* 
> thread adds the call to connection.calls, it must wait until the 
> *connection* notifies it that the call is complete. Only after the response 
> is put into the response queue can the handler take a new call from the call 
> queue and process it. Therefore, the concurrency of the router is 
> limited by the number of handlers. A simple example: if the 
> number of handlers is 1 and the maximum number of calls in the connection 
> thread is 10, then even though the connection thread can send 10 requests to 
> the downstream ns, the router can only process 
> one request at a time. 
>  
> Since the performance of router rpc is mainly limited by the number of 
> handlers, the most effective way to improve rpc performance currently is to 
> increase the number of handlers. However, creating a large number of 
> handler threads also increases the number of thread switches and does not 
> make full use of the machine.
>  
> There are usually multiple ns downstream of the router. If a handler 
> forwards a request to an ns with poor performance, the 
> handler waits for a long time. With fewer handlers available, 
> the router's ability to handle requests for ns with normal performance is 
> reduced. From the client's perspective, the performance of the 
> downstream ns of the router appears to have deteriorated. We often find that 
> the call queue of the downstream ns is not high, but the call queue of the 
> router is very high.
>  
> Therefore, although the main function of the router is to federate and handle 
> requests for multiple NSs, the current synchronous RPC performance cannot 
> satisfy the scenario where there are many NSs downstream of the router. Even 
> if the concurrency of the router can be improved by increasing the 
> number of handlers, it is still relatively slow. More threads increase 
> the CPU context switching time, and in fact many of the handler threads are 
> in a blocked state, which is a waste of thread resources. When a 
> request enters the router, there is no guarantee that a handler will be 
> available to process it.
>  
> Therefore, I propose an asynchronous router rpc. Please see the *pdf* for the 
> complete solution.
>  
> Everyone is welcome to discuss!
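
The handler bottleneck described in the issue can be illustrated with a minimal sketch. This is not the design from the attached pdf or from the patch: every name here (`HandlerSketch`, `downstreamCall`, `syncHandler`, `asyncHandler`) is hypothetical, and `CompletableFuture` is only an assumed stand-in for whatever asynchronous mechanism the router ends up using. A single "handler" (the calling thread) either blocks on each downstream call or submits all calls and collects the responses afterwards.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

// Hypothetical illustration only; none of these names exist in Hadoop.
class HandlerSketch {
  // Stands in for the connection thread(s) talking to a downstream ns.
  // Daemon threads so the JVM can exit without an explicit shutdown.
  static final ExecutorService CONNECTION = Executors.newFixedThreadPool(10, r -> {
    Thread t = new Thread(r);
    t.setDaemon(true);
    return t;
  });

  // Simulates a downstream ns that needs latencyMs to answer call `id`.
  static CompletableFuture<String> downstreamCall(int id, long latencyMs) {
    return CompletableFuture.supplyAsync(() -> {
      try {
        TimeUnit.MILLISECONDS.sleep(latencyMs);
      } catch (InterruptedException e) {
        Thread.currentThread().interrupt();
      }
      return "response-" + id;
    }, CONNECTION);
  }

  // Synchronous style: the handler blocks on every call, as in steps (3)/(4).
  static List<String> syncHandler(int calls, long latencyMs) {
    List<String> responses = new ArrayList<>();
    for (int i = 0; i < calls; i++) {
      responses.add(downstreamCall(i, latencyMs).join()); // handler waits here
    }
    return responses;
  }

  // Asynchronous style: the handler submits every call without waiting and
  // only collects the responses once the downstream calls have completed.
  static List<String> asyncHandler(int calls, long latencyMs) {
    List<CompletableFuture<String>> pending = new ArrayList<>();
    for (int i = 0; i < calls; i++) {
      pending.add(downstreamCall(i, latencyMs)); // returns immediately
    }
    List<String> responses = new ArrayList<>();
    for (CompletableFuture<String> f : pending) {
      responses.add(f.join());
    }
    return responses;
  }

  public static void main(String[] args) {
    long t0 = System.nanoTime();
    syncHandler(10, 50);
    long syncMs = TimeUnit.NANOSECONDS.toMillis(System.nanoTime() - t0);

    t0 = System.nanoTime();
    asyncHandler(10, 50);
    long asyncMs = TimeUnit.NANOSECONDS.toMillis(System.nanoTime() - t0);

    System.out.println("sync ms: " + syncMs + ", async ms: " + asyncMs);
  }
}
```

With one handler and ten connection slots, the synchronous variant pays the downstream latency once per call, while the asynchronous variant lets the ten calls overlap. The exact timings depend on the machine, so none are quoted here.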



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-17531) RBF: Aynchronous router RPC.

2024-05-19 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-17531?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17847663#comment-17847663
 ] 

ASF GitHub Bot commented on HDFS-17531:
---

KeeProMise opened a new pull request, #6838:
URL: https://github.com/apache/hadoop/pull/6838

   
   
   ### Description of PR
   see: https://issues.apache.org/jira/browse/HDFS-17531
   
   ### How was this patch tested?
   TestNoNamenodesAvailableLongTime
   TestObserverWithRouter
   TestRouterFederationRename
   TestRouterFederationRenamePermission
   TestRouterQuota
   TestRouterRefreshSuperUserGroupsConfiguration
   TestRouterRpc
   TestRouterRpcMultiDestination
   TestRouterRPCMultipleDestinationMountTableResolver
   TestRouterRpcSingleNS
   TestRouterRpcStoragePolicySatisfier
   TestRouterUserMappings
   
   ### For code changes:
   
   - [ ] Does the title of this PR start with the corresponding JIRA issue id (e.g. 'HADOOP-17799. Your PR title ...')?
   - [ ] Object storage: have the integration tests been executed and the endpoint declared according to the connector-specific documentation?
   - [ ] If adding new dependencies to the code, are these dependencies licensed in a way that is compatible for inclusion under [ASF 2.0](http://www.apache.org/legal/resolved.html#category-a)?
   - [ ] If applicable, have you updated the `LICENSE`, `LICENSE-binary`, `NOTICE-binary` files?
   
   




> RBF: Aynchronous router RPC.
> 
>
> Key: HDFS-17531
> URL: https://issues.apache.org/jira/browse/HDFS-17531
> Project: Hadoop HDFS
>  Issue Type: New Feature
>Reporter: Jian Zhang
>Assignee: Jian Zhang
>Priority: Major
> Attachments: Aynchronous router.pdf, image-2024-05-19-18-07-51-282.png
>
>

[jira] [Updated] (HDFS-17531) RBF: Aynchronous router RPC.

2024-05-19 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-17531?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated HDFS-17531:
--
Labels: pull-request-available  (was: )

> RBF: Aynchronous router RPC.
> 
>
> Key: HDFS-17531
> URL: https://issues.apache.org/jira/browse/HDFS-17531
> Project: Hadoop HDFS
>  Issue Type: New Feature
>Reporter: Jian Zhang
>Assignee: Jian Zhang
>Priority: Major
>  Labels: pull-request-available
> Attachments: Aynchronous router.pdf, image-2024-05-19-18-07-51-282.png
>
>






[jira] [Commented] (HDFS-17410) [FGL] Client RPCs that changes file attributes supports fine-grained lock

2024-05-18 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-17410?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17847610#comment-17847610
 ] 

ASF GitHub Bot commented on HDFS-17410:
---

hfutatzhanghb commented on code in PR #6634:
URL: https://github.com/apache/hadoop/pull/6634#discussion_r1605921502


##
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/FSNamesystem.java:
##
@@ -3654,14 +3654,15 @@ void setQuota(String src, long nsQuota, long ssQuota, 
StorageType type)
   checkSuperuserPrivilege(operationName, src);
 }
 try {
-  writeLock();
+  // Need to compute the curren space usage
+  writeLock(FSNamesystemLockMode.GLOBAL);

Review Comment:
   @ZanderXu Got it, Thanks sir.





> [FGL] Client RPCs that changes file attributes supports fine-grained lock
> -
>
> Key: HDFS-17410
> URL: https://issues.apache.org/jira/browse/HDFS-17410
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>Reporter: ZanderXu
>Assignee: ZanderXu
>Priority: Major
>  Labels: pull-request-available
>
> Some client RPCs are used to change file attributes.
> This ticket makes these RPCs support fine-grained locking.
>  * setReplication
>  * getStoragePolicies
>  * setStoragePolicy
>  * unsetStoragePolicy
>  * satisfyStoragePolicy
>  * getStoragePolicy
>  * setPermission
>  * setOwner
>  * setTimes
>  * concat
>  * truncate
>  * setQuota
>  * getQuotaUsage
>  * modifyAclEntries
>  * removeAclEntries
>  * removeDefaultAcl
>  * removeAcl
>  * setAcl
>  * getAclStatus
>  * getEZForPath
>  * getEnclosingRoot
>  * listEncryptionZones
>  * reencryptEncryptionZone
>  * listReencryptionStatus
>  * setXAttr
>  * getXAttrs
>  * listXAttrs
>  * removeXAttr






[jira] [Commented] (HDFS-17410) [FGL] Client RPCs that changes file attributes supports fine-grained lock

2024-05-18 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-17410?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17847524#comment-17847524
 ] 

ASF GitHub Bot commented on HDFS-17410:
---

ZanderXu commented on code in PR #6634:
URL: https://github.com/apache/hadoop/pull/6634#discussion_r1605789539


##
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/FSNamesystem.java:
##
@@ -3654,14 +3654,15 @@ void setQuota(String src, long nsQuota, long ssQuota, 
StorageType type)
   checkSuperuserPrivilege(operationName, src);
 }
 try {
-  writeLock();
+  // Need to compute the curren space usage
+  writeLock(FSNamesystemLockMode.GLOBAL);

Review Comment:
   `computeQuotaUsage` needs to rely on block state to get 
storagespaceConsumed, so the GLOBAL lock is used here. After HDFS-17497 is 
merged, this global lock can be replaced by the fs lock.
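
A minimal sketch of why GLOBAL is the conservative choice here, assuming the fine-grained design splits the namesystem lock into a file-system (FS) lock and a block-management (BM) lock and that GLOBAL takes both. The class, the `Mode` enum, and the helper methods are hypothetical and only mirror the idea; this is not the `FSNamesystem` implementation.

```java
import java.util.concurrent.locks.ReentrantReadWriteLock;

// Hypothetical sketch; not the Hadoop FSNamesystem lock implementation.
class FineGrainedLockSketch {
  // Assumed modes: GLOBAL covers both the FS and the BM lock.
  enum Mode { GLOBAL, FS, BM }

  private final ReentrantReadWriteLock fsLock = new ReentrantReadWriteLock();
  private final ReentrantReadWriteLock bmLock = new ReentrantReadWriteLock();

  // Always acquire FS before BM so the lock order is consistent.
  void writeLock(Mode mode) {
    if (mode == Mode.GLOBAL || mode == Mode.FS) {
      fsLock.writeLock().lock();
    }
    if (mode == Mode.GLOBAL || mode == Mode.BM) {
      bmLock.writeLock().lock();
    }
  }

  // Release in the reverse order of acquisition.
  void writeUnlock(Mode mode) {
    if (mode == Mode.GLOBAL || mode == Mode.BM) {
      bmLock.writeLock().unlock();
    }
    if (mode == Mode.GLOBAL || mode == Mode.FS) {
      fsLock.writeLock().unlock();
    }
  }

  boolean holdsFsWriteLock() {
    return fsLock.isWriteLockedByCurrentThread();
  }

  boolean holdsBmWriteLock() {
    return bmLock.isWriteLockedByCurrentThread();
  }
}
```

Under this assumption, an operation that reads block state while changing namespace state needs both locks, which is what a GLOBAL mode would provide; an operation that only touches namespace state could use the FS mode alone.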





> [FGL] Client RPCs that changes file attributes supports fine-grained lock
> -
>
> Key: HDFS-17410
> URL: https://issues.apache.org/jira/browse/HDFS-17410
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>Reporter: ZanderXu
>Assignee: ZanderXu
>Priority: Major
>  Labels: pull-request-available
>






[jira] [Commented] (HDFS-17529) Improve router state store cache update

2024-05-17 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-17529?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17847290#comment-17847290
 ] 

ASF GitHub Bot commented on HDFS-17529:
---

hadoop-yetus commented on PR #6833:
URL: https://github.com/apache/hadoop/pull/6833#issuecomment-2117603479

   :broken_heart: **-1 overall**
   
   
   
   
   
   
   | Vote | Subsystem | Runtime |  Logfile | Comment |
   |:----:|----------:|--------:|:--------:|:-------:|
   | +0 :ok: |  reexec  |   0m 31s |  |  Docker mode activated.  |
   |||| _ Prechecks _ |
   | +1 :green_heart: |  dupname  |   0m  0s |  |  No case conflicting files found.  |
   | +0 :ok: |  codespell  |   0m  0s |  |  codespell was not available.  |
   | +0 :ok: |  detsecrets  |   0m  0s |  |  detect-secrets was not available.  |
   | +0 :ok: |  xmllint  |   0m  0s |  |  xmllint was not available.  |
   | +0 :ok: |  markdownlint  |   0m  0s |  |  markdownlint was not available.  |
   | +1 :green_heart: |  @author  |   0m  0s |  |  The patch does not contain any @author tags.  |
   | +1 :green_heart: |  test4tests  |   0m  0s |  |  The patch appears to include 2 new or modified test files.  |
   |||| _ trunk Compile Tests _ |
   | +1 :green_heart: |  mvninstall  |  44m 32s |  |  trunk passed  |
   | +1 :green_heart: |  compile  |   0m 41s |  |  trunk passed with JDK Ubuntu-11.0.22+7-post-Ubuntu-0ubuntu220.04.1  |
   | +1 :green_heart: |  compile  |   0m 37s |  |  trunk passed with JDK Private Build-1.8.0_402-8u402-ga-2ubuntu1~20.04-b06  |
   | +1 :green_heart: |  checkstyle  |   0m 32s |  |  trunk passed  |
   | +1 :green_heart: |  mvnsite  |   0m 45s |  |  trunk passed  |
   | +1 :green_heart: |  javadoc  |   0m 45s |  |  trunk passed with JDK Ubuntu-11.0.22+7-post-Ubuntu-0ubuntu220.04.1  |
   | +1 :green_heart: |  javadoc  |   0m 34s |  |  trunk passed with JDK Private Build-1.8.0_402-8u402-ga-2ubuntu1~20.04-b06  |
   | +1 :green_heart: |  spotbugs  |   1m 25s |  |  trunk passed  |
   | +1 :green_heart: |  shadedclient  |  33m 54s |  |  branch has no errors when building and testing our client artifacts.  |
   |||| _ Patch Compile Tests _ |
   | +1 :green_heart: |  mvninstall  |   0m 32s |  |  the patch passed  |
   | +1 :green_heart: |  compile  |   0m 32s |  |  the patch passed with JDK Ubuntu-11.0.22+7-post-Ubuntu-0ubuntu220.04.1  |
   | +1 :green_heart: |  javac  |   0m 32s |  |  the patch passed  |
   | +1 :green_heart: |  compile  |   0m 30s |  |  the patch passed with JDK Private Build-1.8.0_402-8u402-ga-2ubuntu1~20.04-b06  |
   | +1 :green_heart: |  javac  |   0m 30s |  |  the patch passed  |
   | +1 :green_heart: |  blanks  |   0m  0s |  |  The patch has no blanks issues.  |
   | +1 :green_heart: |  checkstyle  |   0m 20s |  |  the patch passed  |
   | +1 :green_heart: |  mvnsite  |   0m 33s |  |  the patch passed  |
   | +1 :green_heart: |  javadoc  |   0m 30s |  |  the patch passed with JDK Ubuntu-11.0.22+7-post-Ubuntu-0ubuntu220.04.1  |
   | +1 :green_heart: |  javadoc  |   0m 25s |  |  the patch passed with JDK Private Build-1.8.0_402-8u402-ga-2ubuntu1~20.04-b06  |
   | +1 :green_heart: |  spotbugs  |   1m 20s |  |  the patch passed  |
   | +1 :green_heart: |  shadedclient  |  33m 28s |  |  patch has no errors when building and testing our client artifacts.  |
   |||| _ Other Tests _ |
   | -1 :x: |  unit  |  30m  5s | [/patch-unit-hadoop-hdfs-project_hadoop-hdfs-rbf.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-6833/1/artifact/out/patch-unit-hadoop-hdfs-project_hadoop-hdfs-rbf.txt) |  hadoop-hdfs-rbf in the patch passed.  |
   | +1 :green_heart: |  asflicense  |   0m 40s |  |  The patch does not generate ASF License warnings.  |
   |  |   | 158m 32s |  |  |
   
   
   | Reason | Tests |
   |-------:|:------|
   | Failed junit tests | hadoop.hdfs.server.federation.router.security.token.TestSQLDelegationTokenSecretManagerImpl |
   |   | hadoop.hdfs.server.federation.store.driver.TestStateStoreMySQL |
   
   
   | Subsystem | Report/Notes |
   |----------:|:-------------|
   | Docker | ClientAPI=1.45 ServerAPI=1.45 base: https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-6833/1/artifact/out/Dockerfile |
   | GITHUB PR | https://github.com/apache/hadoop/pull/6833 |
   | Optional Tests | dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient spotbugs checkstyle codespell detsecrets xmllint markdownlint |
   | uname | Linux 1ff0867a1e1e 5.15.0-106-generic #116-Ubuntu SMP Wed Apr 17 09:17:56 UTC 2024 x86_64 x86_64 x86_64 GNU/Linux |
   | Build tool | maven |
   | Personality | dev-support/bin/hadoop.sh |
   | git revision | trunk / ea4e518423a537eced42b86222955195aea361f6 |
   | Default Java | Private Build-1.8.0_402-8u402-ga-2ubuntu1~20.04-b06 |
   | Multi-JDK versions | /usr/lib/jvm/java-11-openjdk-amd64:Ubuntu-11.0.22+7-post-Ubuntu-0ubuntu220.04.1 /usr/lib/jvm/java-8-openjdk-amd64:Private 

[jira] [Commented] (HDFS-17529) Improve router state store cache update

2024-05-17 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-17529?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17847230#comment-17847230
 ] 

ASF GitHub Bot commented on HDFS-17529:
---

kokonguyen191 opened a new pull request, #6833:
URL: https://github.com/apache/hadoop/pull/6833

   ### Description of PR
   
   The current implementation of the router state store update is quite inefficient, 
so much so that when routers are removed and a lot of NameNodeMembership records 
are deleted in a short burst, the deletions trigger router safemode in our 
cluster and cause a lot of trouble.
   
   This ticket contains 2 parts: improving the deletion process for ZK state 
store implementation, and allowing the overwrite part and delete part of 
org.apache.hadoop.hdfs.server.federation.store.CachedRecordStore#overrideExpiredRecords
 to run in parallel.
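
The second part can be pictured with a small sketch. It assumes the overwrite step and the delete step touch disjoint sets of records, so the two can safely run concurrently. `ParallelUpdateSketch`, `overwrite`, `delete`, and `updateInParallel` are hypothetical stand-ins, not the `CachedRecordStore` methods, and an in-memory map stands in for the state store.

```java
import java.util.List;
import java.util.Map;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical sketch; not the CachedRecordStore implementation.
class ParallelUpdateSketch {
  // In-memory stand-in for the state store: record key -> state.
  static final Map<String, String> STORE = new ConcurrentHashMap<>();

  // Rewrites each expired record with an EXPIRED state.
  static void overwrite(List<String> expired) {
    for (String key : expired) {
      STORE.put(key, "EXPIRED");
    }
  }

  // Removes each record that is past its deletion time.
  static void delete(List<String> toDelete) {
    for (String key : toDelete) {
      STORE.remove(key);
    }
  }

  // Runs both steps concurrently and waits for both to finish.
  // Assumes `expired` and `toDelete` do not share any key.
  static void updateInParallel(List<String> expired, List<String> toDelete) {
    CompletableFuture<Void> overwriteTask =
        CompletableFuture.runAsync(() -> overwrite(expired));
    CompletableFuture<Void> deleteTask =
        CompletableFuture.runAsync(() -> delete(toDelete));
    CompletableFuture.allOf(overwriteTask, deleteTask).join();
  }
}
```

If the two lists could overlap, the outcome for a shared key would depend on scheduling, so a real implementation would have to partition the records first.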
   
   ### How was this patch tested?
   UT
   
   ### For code changes:
   
   - [x] Does the title of this PR start with the corresponding JIRA issue id (e.g. 'HADOOP-17799. Your PR title ...')?




> Improve router state store cache update
> ---
>
> Key: HDFS-17529
> URL: https://issues.apache.org/jira/browse/HDFS-17529
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: hdfs, rbf
>Reporter: Felix N
>Assignee: Felix N
>Priority: Major
>
> The current implementation of the router state store update is quite inefficient, 
> so much so that when routers are removed and a lot of NameNodeMembership records 
> are deleted in a short burst, the deletions trigger router safemode in 
> our cluster and cause a lot of trouble.
> This ticket contains 2 parts: improving the deletion process for ZK state 
> store implementation, and allowing the overwrite part and delete part of 
> org.apache.hadoop.hdfs.server.federation.store.CachedRecordStore#overrideExpiredRecords
>  to run in parallel.






[jira] [Updated] (HDFS-17529) Improve router state store cache update

2024-05-17 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-17529?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated HDFS-17529:
--
Labels: pull-request-available  (was: )

> Improve router state store cache update
> ---
>
> Key: HDFS-17529
> URL: https://issues.apache.org/jira/browse/HDFS-17529
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: hdfs, rbf
>Reporter: Felix N
>Assignee: Felix N
>Priority: Major
>  Labels: pull-request-available
>
> The current implementation of the router state store update is quite inefficient, 
> so much so that when routers are removed and a lot of NameNodeMembership records 
> are deleted in a short burst, the deletions trigger router safemode in 
> our cluster and cause a lot of trouble.
> This ticket contains 2 parts: improving the deletion process for ZK state 
> store implementation, and allowing the overwrite part and delete part of 
> org.apache.hadoop.hdfs.server.federation.store.CachedRecordStore#overrideExpiredRecords
>  to run in parallel.






[jira] [Commented] (HDFS-17410) [FGL] Client RPCs that changes file attributes supports fine-grained lock

2024-05-17 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-17410?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17847203#comment-17847203
 ] 

ASF GitHub Bot commented on HDFS-17410:
---

hfutatzhanghb commented on code in PR #6634:
URL: https://github.com/apache/hadoop/pull/6634#discussion_r1604540074


##
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/FSNamesystem.java:
##
@@ -3654,14 +3654,15 @@ void setQuota(String src, long nsQuota, long ssQuota, 
StorageType type)
   checkSuperuserPrivilege(operationName, src);
 }
 try {
-  writeLock();
+  // Need to compute the curren space usage
+  writeLock(FSNamesystemLockMode.GLOBAL);

Review Comment:
   @ZanderXu Sir, I cannot understand why we use the GLOBAL lock here. It seems that 
setQuota does not involve block operations. The same applies to `getQuotaUsage`.



##
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/FSNamesystem.java:
##
@@ -3654,14 +3654,15 @@ void setQuota(String src, long nsQuota, long ssQuota, 
StorageType type)
   checkSuperuserPrivilege(operationName, src);
 }
 try {
-  writeLock();
+  // Need to compute the curren space usage

Review Comment:
   typo for current.





> [FGL] Client RPCs that changes file attributes supports fine-grained lock
> -
>
> Key: HDFS-17410
> URL: https://issues.apache.org/jira/browse/HDFS-17410
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>Reporter: ZanderXu
>Assignee: ZanderXu
>Priority: Major
>  Labels: pull-request-available
>






[jira] [Commented] (HDFS-17521) EC: Fix calculation errors caused by special index order

2024-05-16 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-17521?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17847143#comment-17847143
 ] 

ASF GitHub Bot commented on HDFS-17521:
---

haiyang1987 commented on PR #6813:
URL: https://github.com/apache/hadoop/pull/6813#issuecomment-2116574613

   > @ZanderXu Yes, I agree! This PR should be moved to common. Should I create 
another HADOOP JIRA? At least I can't change the HDFS jira to a Hadoop jira.
   
   You can move the ticket from HDFS to Common:
   
   
![image](https://github.com/apache/hadoop/assets/3760130/0a726074-f5c4-4ffd-9dc1-e5b7a26aba75)
   




> EC: Fix calculation errors caused by special index order
> 
>
> Key: HDFS-17521
> URL: https://issues.apache.org/jira/browse/HDFS-17521
> Project: Hadoop HDFS
>  Issue Type: Bug
>Reporter: Chenyu Zheng
>Assignee: Chenyu Zheng
>Priority: Critical
>  Labels: pull-request-available
>
> I found that if the erasedIndexes are ordered such that a parity index 
> comes before a data index, ec will produce wrong results when decoding.
> In fact, HDFS-15186 described this problem, but did not fundamentally 
> solve it.
> The reason is that the code assumes that erasedIndexes lists the 
> data indexes first, followed by the parity indexes. If a parity index is placed in 
> front of a data index, a calculation error will occur.
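
To make the problematic ordering concrete, here is a minimal sketch that checks whether an erasedIndexes array lists every data index before every parity index. It assumes that indexes below `numDataUnits` are data units and the rest are parity units; the class and method names are hypothetical, and this is only a check for the condition described in the issue, not the fix in the PR.

```java
// Hypothetical helper; not part of the Hadoop erasure-coding code.
class ErasedIndexOrderSketch {
  // Returns true if no data index appears after a parity index.
  // Assumption: index < numDataUnits means data, otherwise parity.
  static boolean dataBeforeParity(int[] erasedIndexes, int numDataUnits) {
    boolean seenParity = false;
    for (int idx : erasedIndexes) {
      if (idx >= numDataUnits) {
        seenParity = true;
      } else if (seenParity) {
        return false; // a data index follows a parity index
      }
    }
    return true;
  }
}
```

For a layout with 6 data units, `{0, 6}` passes this check while `{6, 0}` does not, and the latter is the kind of input the issue reports as producing wrong decoding results.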






[jira] [Commented] (HDFS-17521) EC: Fix calculation errors caused by special index order

2024-05-16 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-17521?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17847133#comment-17847133
 ] 

ASF GitHub Bot commented on HDFS-17521:
---

zhengchenyu commented on PR #6813:
URL: https://github.com/apache/hadoop/pull/6813#issuecomment-2116531137

   > Make sense.
   > 
   > Maybe you should move this ticket to HADOOP and move the UT to common.
   @ZanderXu 
   Yes, I agree! This PR should be moved to common. Should I create another 
HADOOP JIRA? At least I can't change the HDFS jira to a Hadoop jira.




> EC: Fix calculation errors caused by special index order
> 
>
> Key: HDFS-17521
> URL: https://issues.apache.org/jira/browse/HDFS-17521
> Project: Hadoop HDFS
>  Issue Type: Bug
>Reporter: Chenyu Zheng
>Assignee: Chenyu Zheng
>Priority: Critical
>  Labels: pull-request-available
>
> I found that if the erasedIndexes are ordered such that a parity index 
> comes before a data index, ec will produce wrong results when decoding.
> In fact, HDFS-15186 described this problem, but did not fundamentally 
> solve it.
> The reason is that the code assumes that erasedIndexes lists the 
> data indexes first, followed by the parity indexes. If a parity index is placed in 
> front of a data index, a calculation error will occur.






[jira] [Commented] (HDFS-17509) RBF: Fix ClientProtocol.concat will throw NPE if tgr is a empty file.

2024-05-16 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-17509?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17847131#comment-17847131
 ] 

ASF GitHub Bot commented on HDFS-17509:
---

ZanderXu commented on PR #6784:
URL: https://github.com/apache/hadoop/pull/6784#issuecomment-2116524515

   Merged. Thanks @LiuGuH for your contribution. 




> RBF: Fix ClientProtocol.concat  will throw NPE if tgr is a empty file.
> --
>
> Key: HDFS-17509
> URL: https://issues.apache.org/jira/browse/HDFS-17509
> Project: Hadoop HDFS
>  Issue Type: Bug
>Reporter: liuguanghua
>Priority: Minor
>  Labels: pull-request-available
>
> hdfs dfs -concat  /tmp/merge /tmp/t1 /tmp/t2
> When /tmp/merge is an empty file, this command throws an NPE via DFSRouter. 
>  
>  






[jira] [Commented] (HDFS-17509) RBF: Fix ClientProtocol.concat will throw NPE if tgr is a empty file.

2024-05-16 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-17509?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17847130#comment-17847130
 ] 

ASF GitHub Bot commented on HDFS-17509:
---

ZanderXu merged PR #6784:
URL: https://github.com/apache/hadoop/pull/6784




> RBF: Fix ClientProtocol.concat  will throw NPE if tgr is a empty file.
> --
>
> Key: HDFS-17509
> URL: https://issues.apache.org/jira/browse/HDFS-17509
> Project: Hadoop HDFS
>  Issue Type: Bug
>Reporter: liuguanghua
>Priority: Minor
>  Labels: pull-request-available
>
> hdfs dfs -concat  /tmp/merge /tmp/t1 /tmp/t2
> When /tmp/merge is an empty file, this command throws an NPE via DFSRouter. 
>  
>  






[jira] [Commented] (HDFS-17518) In the lease monitor, if a file is closed, we should sync the editslog

2024-05-16 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-17518?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17847005#comment-17847005
 ] 

ASF GitHub Bot commented on HDFS-17518:
---

ThinkerLei commented on code in PR #6809:
URL: https://github.com/apache/hadoop/pull/6809#discussion_r1603594294


##
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/LeaseManager.java:
##
@@ -626,7 +626,8 @@ private synchronized boolean checkLeases(Collection 
leasesToCheck) {
 }
   }
   // If a lease recovery happened, we need to sync later.

Review Comment:
   @vinayakumarb  Thank you for your reply. I understand what you mean. But 
there is another case where calling `logSync()` is not required, as mentioned in 
[HDFS-17519](https://issues.apache.org/jira/browse/HDFS-17519). Do we need to 
consider that scenario?





> In the lease monitor, if a file is closed, we should sync the editslog
> --
>
> Key: HDFS-17518
> URL: https://issues.apache.org/jira/browse/HDFS-17518
> Project: Hadoop HDFS
>  Issue Type: Bug
>Reporter: lei w
>Assignee: lei w
>Priority: Minor
>  Labels: pull-request-available
>
> In the lease monitor, if a file is closed, the checkLeases method returns 
> true, and the edit log is then not synced. In my opinion, we should sync 
> the edit log so that the state does not stay unsynchronized to the standby 
> NameNode for a long time.






[jira] [Commented] (HDFS-17518) In the lease monitor, if a file is closed, we should sync the editslog

2024-05-16 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-17518?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17846949#comment-17846949
 ] 

ASF GitHub Bot commented on HDFS-17518:
---

vinayakumarb commented on code in PR #6809:
URL: https://github.com/apache/hadoop/pull/6809#discussion_r160870


##
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/LeaseManager.java:
##
@@ -626,7 +626,8 @@ private synchronized boolean checkLeases(Collection 
leasesToCheck) {
 }
   }
   // If a lease recovery happened, we need to sync later.

Review Comment:
   @Hexiaoqiao It's perfectly fine if other write operations concurrently 
call `logSync()`. This transaction will also be included in the `logSync()` 
called by other threads, and vice versa.
   
   @ThinkerLei I wanted to make the if-else condition simple.
   ```
   // Arguments to internalReleaseLease are elided, as in the original comment.
   boolean completed = fsn.internalReleaseLease();
   if (!needSync) {
     needSync = true;
   }
   ```
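
The simplification suggested in this review comment amounts to requesting an edit log sync whenever a lease is released, whether or not the file ended up closed. A rough sketch under stated assumptions: `LeaseReleaser` and `EditLog` are hypothetical stand-ins rather than the real `FSNamesystem` or edit log interfaces, and the single deferred sync after the loop is one reading of the "sync later" comment in the diff.

```java
import java.util.List;

// Hypothetical stand-ins for illustration; these are not the real NameNode interfaces.
interface LeaseReleaser {
  // Assumed to return true if the file was closed, false if recovery was started.
  boolean internalReleaseLease(String path);
}

interface EditLog {
  void logSync();
}

final class LeaseCheckSketch {
  // Releases every expired lease and syncs the edit log once afterwards.
  // Returns whether a sync was requested.
  static boolean checkLeases(List<String> expiredPaths, LeaseReleaser fsn, EditLog log) {
    boolean needSync = false;
    for (String path : expiredPaths) {
      fsn.internalReleaseLease(path);
      // Assumption: either outcome records an edit, so a sync is requested regardless.
      needSync = true;
    }
    if (needSync) {
      // One sync covers all edits written in the loop above.
      log.logSync();
    }
    return needSync;
  }
}
```

Batching the sync after the loop keeps the number of `logSync()` calls at one per pass, which matches the point made above that a transaction can be covered by a sync issued elsewhere.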





> In the lease monitor, if a file is closed, we should sync the editslog
> --
>
> Key: HDFS-17518
> URL: https://issues.apache.org/jira/browse/HDFS-17518
> Project: Hadoop HDFS
>  Issue Type: Bug
>Reporter: lei w
>Assignee: lei w
>Priority: Minor
>  Labels: pull-request-available
>
> In the lease monitor, if a file is closed, the checkLeases method returns 
> true, and the edit log is then not synced. In my opinion, we should sync 
> the edit log so that the state does not stay unsynchronized to the standby 
> NameNode for a long time.






[jira] [Commented] (HDFS-17521) EC: Fix calculation errors caused by special index order

2024-05-16 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-17521?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17846920#comment-17846920
 ] 

ASF GitHub Bot commented on HDFS-17521:
---

zhangshuyan0 commented on PR #6813:
URL: https://github.com/apache/hadoop/pull/6813#issuecomment-2114986048

   @zhengchenyu Thanks for your reply. I agree.




> EC: Fix calculation errors caused by special index order
> 
>
> Key: HDFS-17521
> URL: https://issues.apache.org/jira/browse/HDFS-17521
> Project: Hadoop HDFS
>  Issue Type: Bug
>Reporter: Chenyu Zheng
>Assignee: Chenyu Zheng
>Priority: Critical
>  Labels: pull-request-available
>
> I found that if erasedIndexes is ordered so that a parity index comes before 
> a data index, EC produces wrong results when decoding.
> HDFS-15186 already described this problem but did not fundamentally 
> solve it.
> The cause is that the code assumes erasedIndexes lists data indexes first, 
> followed by parity indexes. If a parity index is placed in front of a data 
> index, a calculation error occurs.






[jira] [Commented] (HDFS-17521) EC: Fix calculation errors caused by special index order

2024-05-16 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-17521?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17846904#comment-17846904
 ] 

ASF GitHub Bot commented on HDFS-17521:
---

zhengchenyu commented on PR #6813:
URL: https://github.com/apache/hadoop/pull/6813#issuecomment-2114792376

   > @zhengchenyu Thanks for your explanation! I got it. This PR LGTM. I'm just 
curious about one thing now: when does the parity index precede the data index 
in current code? I haven't found one.
   
   [HDFS-15186](https://issues.apache.org/jira/browse/HDFS-15186) described 
this problem but did not fundamentally solve it.
   I think we should not expect to find this scenario, and if it occurs, it 
will leave data unrecoverable.
   




> EC: Fix calculation errors caused by special index order
> 
>
> Key: HDFS-17521
> URL: https://issues.apache.org/jira/browse/HDFS-17521
> Project: Hadoop HDFS
>  Issue Type: Bug
>Reporter: Chenyu Zheng
>Assignee: Chenyu Zheng
>Priority: Critical
>  Labels: pull-request-available
>
> I found that if erasedIndexes is ordered so that a parity index comes before 
> a data index, EC produces wrong results when decoding.
> HDFS-15186 already described this problem but did not fundamentally 
> solve it.
> The cause is that the code assumes erasedIndexes lists data indexes first, 
> followed by parity indexes. If a parity index is placed in front of a data 
> index, a calculation error occurs.






[jira] [Commented] (HDFS-17521) EC: Fix calculation errors caused by special index order

2024-05-16 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-17521?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17846899#comment-17846899
 ] 

ASF GitHub Bot commented on HDFS-17521:
---

zhangshuyan0 commented on PR #6813:
URL: https://github.com/apache/hadoop/pull/6813#issuecomment-2114775230

   @zhengchenyu Thanks for your explanation! I got it. This PR LGTM. 
   I'm just curious about one thing now: when does the parity index precede the 
data index in current code?  I haven't found one.




> EC: Fix calculation errors caused by special index order
> 
>
> Key: HDFS-17521
> URL: https://issues.apache.org/jira/browse/HDFS-17521
> Project: Hadoop HDFS
>  Issue Type: Bug
>Reporter: Chenyu Zheng
>Assignee: Chenyu Zheng
>Priority: Critical
>  Labels: pull-request-available
>
> I found that if erasedIndexes is ordered so that a parity index comes before 
> a data index, EC produces wrong results when decoding.
> HDFS-15186 already described this problem but did not fundamentally 
> solve it.
> The cause is that the code assumes erasedIndexes lists data indexes first, 
> followed by parity indexes. If a parity index is placed in front of a data 
> index, a calculation error occurs.






[jira] [Commented] (HDFS-17158) Show the rate of metrics in EC recovery task.

2024-05-15 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-17158?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17846793#comment-17846793
 ] 

ASF GitHub Bot commented on HDFS-17158:
---

YuanbenWang closed pull request #5947: HDFS-17158. Show the rate of metrics in 
EC recovery task.
URL: https://github.com/apache/hadoop/pull/5947




> Show the rate of metrics in EC recovery task.
> -
>
> Key: HDFS-17158
> URL: https://issues.apache.org/jira/browse/HDFS-17158
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: erasure-coding, metrics
>Reporter: WangYuanben
>Assignee: WangYuanben
>Priority: Minor
>  Labels: pull-request-available
> Attachments: 123124124.png, image2023-8-18_16-26-14.png
>
>
> From
> !image2023-8-18_16-26-14.png|width=551,height=83!
> To
> !123124124.png|width=559,height=100!
> These metrics may show the network and CPU load of the machine.






[jira] [Commented] (HDFS-17524) OIV: add Transformed processor which reconstructs an fsimage from another fsimage file

2024-05-15 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-17524?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17846757#comment-17846757
 ] 

ASF GitHub Bot commented on HDFS-17524:
---

hadoop-yetus commented on PR #6820:
URL: https://github.com/apache/hadoop/pull/6820#issuecomment-2113459362

   :confetti_ball: **+1 overall**
   
   
   
   
   
   
   | Vote | Subsystem | Runtime |  Logfile | Comment |
   |:----:|----------:|--------:|:--------:|:-------:|
   |||| _ Prechecks _ |
   | +1 :green_heart: |  dupname  |   0m 02s |  |  No case conflicting files found.  |
   | +0 :ok: |  spotbugs  |   0m 01s |  |  spotbugs executables are not available.  |
   | +0 :ok: |  codespell  |   0m 01s |  |  codespell was not available.  |
   | +0 :ok: |  detsecrets  |   0m 01s |  |  detect-secrets was not available.  |
   | +0 :ok: |  markdownlint  |   0m 01s |  |  markdownlint was not available.  |
   | +1 :green_heart: |  @author  |   0m 01s |  |  The patch does not contain any @author tags.  |
   | +1 :green_heart: |  test4tests  |   0m 00s |  |  The patch appears to include 1 new or modified test files.  |
   |||| _ trunk Compile Tests _ |
   | +1 :green_heart: |  mvninstall  | 120m 10s |  |  trunk passed  |
   | +1 :green_heart: |  compile  |   8m 13s |  |  trunk passed  |
   | +1 :green_heart: |  checkstyle  |   6m 39s |  |  trunk passed  |
   | +1 :green_heart: |  mvnsite  |   8m 43s |  |  trunk passed  |
   | +1 :green_heart: |  javadoc  |   8m 59s |  |  trunk passed  |
   | +1 :green_heart: |  shadedclient  | 187m 27s |  |  branch has no errors when building and testing our client artifacts.  |
   |||| _ Patch Compile Tests _ |
   | +1 :green_heart: |  mvninstall  |   6m 44s |  |  the patch passed  |
   | +1 :green_heart: |  compile  |   5m 09s |  |  the patch passed  |
   | +1 :green_heart: |  javac  |   5m 09s |  |  the patch passed  |
   | +1 :green_heart: |  blanks  |   0m 01s |  |  The patch has no blanks issues.  |
   | +1 :green_heart: |  checkstyle  |   3m 20s |  |  the patch passed  |
   | +1 :green_heart: |  mvnsite  |   5m 47s |  |  the patch passed  |
   | +1 :green_heart: |  javadoc  |   5m 02s |  |  the patch passed  |
   | +1 :green_heart: |  shadedclient  | 201m 30s |  |  patch has no errors when building and testing our client artifacts.  |
   |||| _ Other Tests _ |
   | +1 :green_heart: |  asflicense  |   8m 11s |  |  The patch does not generate ASF License warnings.  |
   |  |   | 548m 31s |  |  |
   
   
   | Subsystem | Report/Notes |
   |----------:|:-------------|
   | GITHUB PR | https://github.com/apache/hadoop/pull/6820 |
   | Optional Tests | dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient spotbugs checkstyle codespell detsecrets markdownlint |
   | uname | MINGW64_NT-10.0-17763 45ce868edf92 3.4.10-87d57229.x86_64 2024-02-14 20:17 UTC x86_64 Msys |
   | Build tool | maven |
   | Personality | /c/hadoop/dev-support/bin/hadoop.sh |
   | git revision | trunk / 0ab78d70a4518262f7d5fe759b9e1575bc087e33 |
   | Default Java | Azul Systems, Inc.-1.8.0_332-b09 |
   |  Test Results | https://ci-hadoop.apache.org/job/hadoop-multibranch-windows-10/job/PR-6820/4/testReport/ |
   | modules | C: hadoop-hdfs-project/hadoop-hdfs U: hadoop-hdfs-project/hadoop-hdfs |
   | Console output | https://ci-hadoop.apache.org/job/hadoop-multibranch-windows-10/job/PR-6820/4/console |
   | versions | git=2.44.0.windows.1 |
   | Powered by | Apache Yetus 0.14.0 https://yetus.apache.org |
   
   
   This message was automatically generated.
   
   




> OIV: add Transformed processor which reconstructs an fsimage from another 
> fsimage file
> --
>
> Key: HDFS-17524
> URL: https://issues.apache.org/jira/browse/HDFS-17524
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: tools
>Affects Versions: 3.2.0, 3.3.4
>Reporter: Xiaobao Wu
>Priority: Major
>  Labels: pull-request-available
>
> *Background:*
> The fsimage file generated by Hadoop 3.3.4 is not compatible with lower 
> versions. The higher version of HDFS therefore provides an fsimage conversion 
> tool that generates a forward-compatible fsimage file to support the 
> downgrade operation.
> *Description:*
> The structure and loading methods of some Sections differ between high and 
> low versions of fsimage files, especially the StringTable Section. This makes 
> it impossible to downgrade from a higher version of HDFS (e.g., 3.3.4) to a 
> lower version (e.g., 3.1.1), because when the lower version loads the fsimage 
> file generated by the higher version, it throws an 
> ArrayIndexOutOfBoundsException.
>  
> The code differences are as follows:
> {code:java}
> // 3.3.4  
> static 

[jira] [Commented] (HDFS-17524) OIV: add Transformed processor which reconstructs an fsimage from another fsimage file

2024-05-15 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-17524?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17846605#comment-17846605
 ] 

ASF GitHub Bot commented on HDFS-17524:
---

hadoop-yetus commented on PR #6820:
URL: https://github.com/apache/hadoop/pull/6820#issuecomment-2112366410

   :broken_heart: **-1 overall**
   
   
   
   
   
   
   | Vote | Subsystem | Runtime |  Logfile | Comment |
   |:----:|----------:|--------:|:--------:|:-------:|
   | +0 :ok: |  reexec  |   0m 21s |  |  Docker mode activated.  |
   |||| _ Prechecks _ |
   | +1 :green_heart: |  dupname  |   0m  0s |  |  No case conflicting files found.  |
   | +0 :ok: |  codespell  |   0m  0s |  |  codespell was not available.  |
   | +0 :ok: |  detsecrets  |   0m  0s |  |  detect-secrets was not available.  |
   | +0 :ok: |  markdownlint  |   0m  0s |  |  markdownlint was not available.  |
   | +1 :green_heart: |  @author  |   0m  0s |  |  The patch does not contain any @author tags.  |
   | +1 :green_heart: |  test4tests  |   0m  0s |  |  The patch appears to include 1 new or modified test files.  |
   |||| _ trunk Compile Tests _ |
   | -1 :x: |  mvninstall  |   1m 59s | [/branch-mvninstall-root.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-6820/4/artifact/out/branch-mvninstall-root.txt) |  root in trunk failed.  |
   | -1 :x: |  compile  |   0m 21s | [/branch-compile-hadoop-hdfs-project_hadoop-hdfs-jdkUbuntu-11.0.22+7-post-Ubuntu-0ubuntu220.04.1.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-6820/4/artifact/out/branch-compile-hadoop-hdfs-project_hadoop-hdfs-jdkUbuntu-11.0.22+7-post-Ubuntu-0ubuntu220.04.1.txt) |  hadoop-hdfs in trunk failed with JDK Ubuntu-11.0.22+7-post-Ubuntu-0ubuntu220.04.1.  |
   | -1 :x: |  compile  |   0m 20s | [/branch-compile-hadoop-hdfs-project_hadoop-hdfs-jdkPrivateBuild-1.8.0_402-8u402-ga-2ubuntu1~20.04-b06.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-6820/4/artifact/out/branch-compile-hadoop-hdfs-project_hadoop-hdfs-jdkPrivateBuild-1.8.0_402-8u402-ga-2ubuntu1~20.04-b06.txt) |  hadoop-hdfs in trunk failed with JDK Private Build-1.8.0_402-8u402-ga-2ubuntu1~20.04-b06.  |
   | -0 :warning: |  checkstyle  |   0m 19s | [/buildtool-branch-checkstyle-hadoop-hdfs-project_hadoop-hdfs.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-6820/4/artifact/out/buildtool-branch-checkstyle-hadoop-hdfs-project_hadoop-hdfs.txt) |  The patch fails to run checkstyle in hadoop-hdfs  |
   | -1 :x: |  mvnsite  |   0m 22s | [/branch-mvnsite-hadoop-hdfs-project_hadoop-hdfs.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-6820/4/artifact/out/branch-mvnsite-hadoop-hdfs-project_hadoop-hdfs.txt) |  hadoop-hdfs in trunk failed.  |
   | -1 :x: |  javadoc  |   0m 21s | [/branch-javadoc-hadoop-hdfs-project_hadoop-hdfs-jdkUbuntu-11.0.22+7-post-Ubuntu-0ubuntu220.04.1.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-6820/4/artifact/out/branch-javadoc-hadoop-hdfs-project_hadoop-hdfs-jdkUbuntu-11.0.22+7-post-Ubuntu-0ubuntu220.04.1.txt) |  hadoop-hdfs in trunk failed with JDK Ubuntu-11.0.22+7-post-Ubuntu-0ubuntu220.04.1.  |
   | -1 :x: |  javadoc  |   0m 22s | [/branch-javadoc-hadoop-hdfs-project_hadoop-hdfs-jdkPrivateBuild-1.8.0_402-8u402-ga-2ubuntu1~20.04-b06.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-6820/4/artifact/out/branch-javadoc-hadoop-hdfs-project_hadoop-hdfs-jdkPrivateBuild-1.8.0_402-8u402-ga-2ubuntu1~20.04-b06.txt) |  hadoop-hdfs in trunk failed with JDK Private Build-1.8.0_402-8u402-ga-2ubuntu1~20.04-b06.  |
   | -1 :x: |  spotbugs  |   0m 22s | [/branch-spotbugs-hadoop-hdfs-project_hadoop-hdfs.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-6820/4/artifact/out/branch-spotbugs-hadoop-hdfs-project_hadoop-hdfs.txt) |  hadoop-hdfs in trunk failed.  |
   | +1 :green_heart: |  shadedclient  |   2m 31s |  |  branch has no errors when building and testing our client artifacts.  |
   |||| _ Patch Compile Tests _ |
   | -1 :x: |  mvninstall  |   0m 22s | [/patch-mvninstall-hadoop-hdfs-project_hadoop-hdfs.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-6820/4/artifact/out/patch-mvninstall-hadoop-hdfs-project_hadoop-hdfs.txt) |  hadoop-hdfs in the patch failed.  |
   | -1 :x: |  compile  |   0m 22s | [/patch-compile-hadoop-hdfs-project_hadoop-hdfs-jdkUbuntu-11.0.22+7-post-Ubuntu-0ubuntu220.04.1.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-6820/4/artifact/out/patch-compile-hadoop-hdfs-project_hadoop-hdfs-jdkUbuntu-11.0.22+7-post-Ubuntu-0ubuntu220.04.1.txt) |  hadoop-hdfs in the patch failed with JDK Ubuntu-11.0.22+7-post-Ubuntu-0ubuntu220.04.1.  |
   | -1 :x: |  javac  |   0m 22s | 

[jira] [Commented] (HDFS-17098) DatanodeManager does not handle null storage type properly

2024-05-15 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-17098?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17846546#comment-17846546
 ] 

ASF GitHub Bot commented on HDFS-17098:
---

Hexiaoqiao commented on PR #6035:
URL: https://github.com/apache/hadoop/pull/6035#issuecomment-2111897786

   @teamconfx Would you mind fixing the checkstyle issues before check-in?




> DatanodeManager does not handle null storage type properly
> --
>
> Key: HDFS-17098
> URL: https://issues.apache.org/jira/browse/HDFS-17098
> Project: Hadoop HDFS
>  Issue Type: Bug
>Reporter: ConfX
>Priority: Critical
>  Labels: pull-request-available
> Attachments: reproduce.sh
>
>
> h2. What happened:
> Got a {{NullPointerException}} without message when sorting datanodes in 
> {{NetworkTopology}}.
> h2. Where's the bug:
> In line 654 of {{DatanodeManager}}, the manager creates a secondary sorter 
> using the standard {{Comparator}} class:
> {noformat}
> Comparator<DatanodeInfoWithStorage> comp =
>         Comparator.comparing(DatanodeInfoWithStorage::getStorageType);
> secondarySort = list -> Collections.sort(list, comp);{noformat}
> This comparator is then used in {{NetworkTopology}} as a secondary sort to 
> break ties:
> {noformat}
> if (secondarySort != null) {
>         // a secondary sort breaks the tie between nodes.
>         secondarySort.accept(nodesList);
> }{noformat}
> However, if the storage type is {{null}}, a {{NullPointerException}} 
> is thrown, since the default {{Comparator.comparing}} cannot handle 
> null keys.
> h2. How to reproduce:
> (1) Set {{dfs.heartbeat.interval}} to {{1753310367}}, and 
> {{dfs.namenode.read.considerStorageType}} to {{true}}
> (2) Run test: 
> {{org.apache.hadoop.hdfs.server.blockmanagement.TestSortLocatedBlock#testAviodStaleAndSlowDatanodes}}
> h2. Stacktrace:
> {noformat}
> java.lang.NullPointerException
>     at 
> java.base/java.util.Comparator.lambda$comparing$77a9974f$1(Comparator.java:469)
>     at java.base/java.util.TimSort.countRunAndMakeAscending(TimSort.java:355)
>     at java.base/java.util.TimSort.sort(TimSort.java:220)
>     at java.base/java.util.Arrays.sort(Arrays.java:1515)
>     at java.base/java.util.ArrayList.sort(ArrayList.java:1750)
>     at java.base/java.util.Collections.sort(Collections.java:179)
>     at 
> org.apache.hadoop.hdfs.server.blockmanagement.DatanodeManager.lambda$createSecondaryNodeSorter$0(DatanodeManager.java:654)
>     at 
> org.apache.hadoop.net.NetworkTopology.sortByDistance(NetworkTopology.java:983)
>     at 
> org.apache.hadoop.net.NetworkTopology.sortByDistanceUsingNetworkLocation(NetworkTopology.java:946)
>     at 
> org.apache.hadoop.hdfs.server.blockmanagement.DatanodeManager.sortLocatedBlock(DatanodeManager.java:637)
>     at 
> org.apache.hadoop.hdfs.server.blockmanagement.DatanodeManager.sortLocatedBlocks(DatanodeManager.java:554)
>     at 
> org.apache.hadoop.hdfs.server.blockmanagement.TestSortLocatedBlock.testAviodStaleAndSlowDatanodes(TestSortLocatedBlock.java:144){noformat}
> For an easy reproduction, run the reproduce.sh in the attachment. We are 
> happy to provide a patch if this issue is confirmed.
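
One way to make such a key comparator tolerate nulls is to wrap the key ordering in `Comparator.nullsLast` (or `nullsFirst`) from the standard library. The sketch below is an illustration under stated assumptions, not the patch for this issue: `SketchStorageType` and `SketchNode` are hypothetical stand-ins for `StorageType` and `DatanodeInfoWithStorage`, and placing null storage types last is an arbitrary choice made here.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

// Hypothetical stand-ins for StorageType and DatanodeInfoWithStorage.
enum SketchStorageType { RAM_DISK, SSD, DISK }

final class SketchNode {
  final String name;
  final SketchStorageType storageType; // may be null

  SketchNode(String name, SketchStorageType storageType) {
    this.name = name;
    this.storageType = storageType;
  }

  SketchStorageType getStorageType() {
    return storageType;
  }
}

final class NullSafeSortSketch {
  // Returns a copy sorted by storage type (enum declaration order),
  // with nodes whose storage type is null placed last.
  static List<SketchNode> sortByStorageType(List<SketchNode> nodes) {
    Comparator<SketchNode> comp = Comparator.comparing(
        SketchNode::getStorageType,
        Comparator.nullsLast(Comparator.naturalOrder()));
    List<SketchNode> sorted = new ArrayList<>(nodes);
    sorted.sort(comp);
    return sorted;
  }
}
```

The single-argument `Comparator.comparing(keyExtractor)` calls `compareTo` on the extracted key, which is why a null key fails; the two-argument overload lets the key comparator decide how nulls are ordered.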






[jira] [Commented] (HDFS-16368) DFSAdmin supports refresh topology info without restarting namenode

2024-05-15 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-16368?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17846512#comment-17846512
 ] 

ASF GitHub Bot commented on HDFS-16368:
---

hfutatzhanghb commented on PR #3743:
URL: https://github.com/apache/hadoop/pull/3743#issuecomment-2111693845

   > Not sure why this PR has been picked up again. IIRC we have discussed adding 
tools to support refreshing topology online many times. I would like to share my 
concern again. This is a high-risk operation, especially for a large 
cluster. Consider the case where the cluster topology is changed arbitrarily and 
the namenode then fails over or restarts: there will be a block transfer flood, 
because the replicas of each block must be spread across enough racks. Until all 
blocks have enough racks (which takes a long time), the cluster could enter a 
high-load phase and be unable to process any client requests. In a word, be 
careful about refreshing topology online if there are no stability guarantees.
   
   @Hexiaoqiao Sir, thanks for the reminder. I understand your concern. I 
think this PR just provides a tool that can refresh topology without restarting 
the namenode. How to use this tool safely, without causing a transfer flood, 
is up to the end user. They can modify a small part of topologay_mappings.data 
each time, then run refreshTopology and wait for the transfers to complete, 
repeating this procedure until all parts of topologay_mappings.data are updated. 
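
The staged procedure described in this reply can be sketched as a simple batching loop. This is only an illustration under stated assumptions: `BatchApplier` is a hypothetical hook standing for "write this part of the mapping file, refresh the topology, and wait for transfers to finish", and nothing here reflects the actual DFSAdmin command added by the PR.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical hook: apply one part of the mapping changes, refresh, and wait.
interface BatchApplier<T> {
  void applyAndWait(List<T> batch);
}

final class StagedRefreshSketch {
  // Splits changes into consecutive batches of at most batchSize and applies
  // them one at a time. Returns the number of batches applied.
  static <T> int applyInBatches(List<T> changes, int batchSize, BatchApplier<T> applier) {
    if (batchSize <= 0) {
      throw new IllegalArgumentException("batchSize must be positive");
    }
    int batches = 0;
    for (int start = 0; start < changes.size(); start += batchSize) {
      int end = Math.min(start + batchSize, changes.size());
      // Copy the slice so the applier cannot modify the caller's list.
      applier.applyAndWait(new ArrayList<>(changes.subList(start, end)));
      batches++;
    }
    return batches;
  }
}
```

Keeping the batch size small bounds how many blocks can become mis-replicated at once, which is the risk raised in the quoted concern.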




>  DFSAdmin supports refresh topology info without restarting namenode
> 
>
> Key: HDFS-16368
> URL: https://issues.apache.org/jira/browse/HDFS-16368
> Project: Hadoop HDFS
>  Issue Type: New Feature
>  Components: dfsadmin, namanode
>Affects Versions: 2.7.7, 3.3.1
>Reporter: farmmamba
>Assignee: farmmamba
>Priority: Major
>  Labels: features, pull-request-available
> Attachments: 0001.patch
>
>  Time Spent: 40m
>  Remaining Estimate: 0h
>
> Currently in HDFS, if we update the rack info for rack-awareness, we may need 
> to do a rolling restart of the namenodes for it to take effect. If the cluster 
> is large, a rolling restart of the namenodes takes a very long time. So, we 
> developed a method to refresh topology info without a rolling restart.






[jira] [Commented] (HDFS-17520) TestDFSAdmin.testAllDatanodesReconfig and TestDFSAdmin.testDecommissionDataNodesReconfig failed

2024-05-14 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-17520?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17846464#comment-17846464
 ] 

ASF GitHub Bot commented on HDFS-17520:
---

slfan1989 merged PR #6812:
URL: https://github.com/apache/hadoop/pull/6812




> TestDFSAdmin.testAllDatanodesReconfig and 
> TestDFSAdmin.testDecommissionDataNodesReconfig failed
> ---
>
> Key: HDFS-17520
> URL: https://issues.apache.org/jira/browse/HDFS-17520
> Project: Hadoop HDFS
>  Issue Type: Bug
>Reporter: ZanderXu
>Assignee: ZanderXu
>Priority: Major
>  Labels: pull-request-available
>
> {code:java}
> [ERROR] Tests run: 21, Failures: 3, Errors: 0, Skipped: 0, Time elapsed: 
> 44.521 s <<< FAILURE! - in org.apache.hadoop.hdfs.tools.TestDFSAdmin
> [ERROR] testAllDatanodesReconfig(org.apache.hadoop.hdfs.tools.TestDFSAdmin)  
> Time elapsed: 2.086 s  <<< FAILURE!
> java.lang.AssertionError: 
> Expecting:
>  <["Reconfiguring status for node [127.0.0.1:43731]: SUCCESS: Changed 
> property dfs.datanode.peer.stats.enabled",
> " From: "false"",
> " To: "true"",
> "started at Fri May 10 13:02:51 UTC 2024 and finished at Fri May 10 
> 13:02:51 UTC 2024."]>
> to contain subsequence:
>  <["SUCCESS: Changed property dfs.datanode.peer.stats.enabled",
> " From: "false"",
> " To: "true""]>
>   at 
> org.apache.hadoop.hdfs.tools.TestDFSAdmin.testAllDatanodesReconfig(TestDFSAdmin.java:1286)
>   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>   at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>   at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>   at java.lang.reflect.Method.invoke(Method.java:498)
>   at org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:59)
>   at org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
>   at org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:56)
>   at org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
>   at org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:26)
>   at org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:27)
>   at org.junit.runners.ParentRunner$3.evaluate(ParentRunner.java:306)
>   at org.junit.runners.BlockJUnit4ClassRunner$1.evaluate(BlockJUnit4ClassRunner.java:100)
>   at org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:366)
>   at org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:103)
>   at org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:63)
>   at org.junit.runners.ParentRunner$4.run(ParentRunner.java:331)
>   at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:79)
>   at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:329)
>   at org.junit.runners.ParentRunner.access$100(ParentRunner.java:66)
>   at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:293)
>   at org.junit.runners.ParentRunner$3.evaluate(ParentRunner.java:306)
>   at org.junit.runners.ParentRunner.run(ParentRunner.java:413)
>   at org.apache.maven.surefire.junit4.JUnit4Provider.execute(JUnit4Provider.java:365)
>   at org.apache.maven.surefire.junit4.JUnit4Provider.executeWithRerun(JUnit4Provider.java:273)
>   at org.apache.maven.surefire.junit4.JUnit4Provider.executeTestSet(JUnit4Provider.java:238)
>   at org.apache.maven.surefire.junit4.JUnit4Provider.invoke(JUnit4Provider.java:159)
>   at org.apache.maven.surefire.booter.ForkedBooter.invokeProviderInSameClassLoader(ForkedBooter.java:384)
>   at org.apache.maven.surefire.booter.ForkedBooter.runSuitesInProcess(ForkedBooter.java:345)
>   at org.apache.maven.surefire.booter.ForkedBooter.execute(ForkedBooter.java:126)
>   at org.apache.maven.surefire.booter.ForkedBooter.main(ForkedBooter.java:418)
> {code}



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-17528) FsImageValidation: set txid when saving a new image

2024-05-14 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-17528?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated HDFS-17528:
--
Labels: pull-request-available  (was: )

> FsImageValidation: set txid when saving a new image
> ---
>
> Key: HDFS-17528
> URL: https://issues.apache.org/jira/browse/HDFS-17528
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: tools
>Reporter: Tsz-wo Sze
>Assignee: Tsz-wo Sze
>Priority: Major
>  Labels: pull-request-available
>
> - When the fsimage is specified as a file and the FsImageValidation tool 
> saves a new image (for removing inaccessible inodes), the txid is not set.  
> The resulting image then has 0 as its txid.
> - When the fsimage is specified as a directory, the txid is set.  However, the 
> tool then hits an NPE because the NameNode metrics are uninitialized (although 
> the metrics are not used by FsImageValidation).






[jira] [Commented] (HDFS-17528) FsImageValidation: set txid when saving a new image

2024-05-14 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-17528?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17846440#comment-17846440
 ] 

ASF GitHub Bot commented on HDFS-17528:
---

szetszwo opened a new pull request, #6828:
URL: https://github.com/apache/hadoop/pull/6828

   ### Description of PR
   
   - When the fsimage is specified as a file and the FsImageValidation tool 
saves a new image (for removing inaccessible inodes), the txid is not set. 
The resulting image then has 0 as its txid.
   - When the fsimage is specified as a directory, the txid is set. However, the 
tool then hits an NPE because the NameNode metrics are uninitialized (although 
the metrics are not used by FsImageValidation).
   
   ### How was this patch tested?
   
   Tested manually
   - before: the output file is `fsimage.ckpt_000` (i.e. txid 
is 0)
   > 2024-05-14 13:37:27,531 [main] INFO  namenode.FSImageFormatProtobuf 
(FSImageFormatProtobuf.java:save(732)) - Saving image file 
/Users/szetszwo/hadoop/fsimage/current/newFsImage5968764763996132609/current/fsimage.ckpt_000
 using no compression
   > 2024-05-14 13:37:30,522 [main] INFO  namenode.FSImageFormatProtobuf 
(FSImageFormatProtobuf.java:save(736)) - Image file 
/Users/szetszwo/hadoop/fsimage/current/newFsImage5968764763996132609/current/fsimage.ckpt_000
 of size 200392059 bytes saved in 2 seconds .
   
   - after: the output file is `fsimage.ckpt_23945925442` with correct 
txid
   > 2024-05-14 13:38:32,414 [main] INFO  namenode.FSImage 
(FSImage.java:save(1223)) - save fsimage with txid=23945925442 to 
/Users/szetszwo/hadoop/fsimage/current/newFsImage4409944859316006440
   > 2024-05-14 13:38:32,436 [main] INFO  namenode.FSImageFormatProtobuf 
(FSImageFormatProtobuf.java:save(732)) - Saving image file 
/Users/szetszwo/hadoop/fsimage/current/newFsImage4409944859316006440/current/fsimage.ckpt_23945925442
 using no compression
   > 2024-05-14 13:38:35,437 [main] INFO  namenode.FSImageFormatProtobuf 
(FSImageFormatProtobuf.java:save(736)) - Image file 
/Users/szetszwo/hadoop/fsimage/current/newFsImage4409944859316006440/current/fsimage.ckpt_23945925442
 of size 200392062 bytes saved in 3 seconds .
   
   ### For code changes:
   
   - [X] Does the title of this PR start with the corresponding JIRA issue id 
(e.g. 'HADOOP-17799. Your PR title ...')?
   - [NA] Object storage: have the integration tests been executed and the 
endpoint declared according to the connector-specific documentation?
   - [NA] If adding new dependencies to the code, are these dependencies 
licensed in a way that is compatible for inclusion under [ASF 
2.0](http://www.apache.org/legal/resolved.html#category-a)?
   - [NA] If applicable, have you updated the `LICENSE`, `LICENSE-binary`, 
`NOTICE-binary` files?
   
   




> FsImageValidation: set txid when saving a new image
> ---
>
> Key: HDFS-17528
> URL: https://issues.apache.org/jira/browse/HDFS-17528
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: tools
>Reporter: Tsz-wo Sze
>Assignee: Tsz-wo Sze
>Priority: Major
>
> - When the fsimage is specified as a file and the FsImageValidation tool 
> saves a new image (for removing inaccessible inodes), the txid is not set.  
> The resulting image then has 0 as its txid.
> - When the fsimage is specified as a directory, the txid is set.  However, the 
> tool then hits an NPE because the NameNode metrics are uninitialized (although 
> the metrics are not used by FsImageValidation).






[jira] [Commented] (HDFS-17518) In the lease monitor, if a file is closed, we should sync the editslog

2024-05-14 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-17518?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17846350#comment-17846350
 ] 

ASF GitHub Bot commented on HDFS-17518:
---

ThinkerLei commented on code in PR #6809:
URL: https://github.com/apache/hadoop/pull/6809#discussion_r1600246934


##
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/LeaseManager.java:
##
@@ -626,7 +626,8 @@ private synchronized boolean checkLeases(Collection 
leasesToCheck) {
 }
   }
   // If a lease recovery happened, we need to sync later.

Review Comment:
   @Hexiaoqiao @vinayakumarb Thank you very much for your comments. On the one 
hand, we may indeed not need to invoke `logSync()` immediately. The purpose of 
this modification is to ensure that the `editlog` is synced in a timely manner, 
like other write operations, so as to prevent loss of the `editlog` in some 
extreme cases. On the other hand, @vinayakumarb, I'm still a little confused by 
what you're saying. The current modification
   ```
   boolean isClosed = !lastINode.isUnderConstruction();
   if (!needSync && (!completed || isClosed)) {
     needSync = true;
   }
   ```
already ensures that the lease monitor invokes `logSync()` both when the file 
gets closed and when the lease is reassigned. When the file gets closed, 
`isClosed` is true. When the lease is reassigned, `completed` is false. The 
initial value of `needSync` is false.
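
For readers following the thread, the condition quoted above can be exercised in isolation. This is only a sketch: the class name `LeaseSyncSketch` and the helper `nextNeedSync` are hypothetical and do not exist in LeaseManager; only the boolean expression is taken from the comment.

```java
// Hypothetical, standalone sketch of the needSync decision quoted above.
// It is not the LeaseManager code; only the boolean condition is copied.
final class LeaseSyncSketch {

  // Returns the value of needSync after one lease has been checked.
  // completed: whether lease recovery for the file completed immediately.
  // isClosed:  whether the last inode is no longer under construction.
  static boolean nextNeedSync(boolean needSync, boolean completed, boolean isClosed) {
    if (!needSync && (!completed || isClosed)) {
      needSync = true;
    }
    return needSync;
  }

  public static void main(String[] args) {
    // File closed by the lease monitor.
    System.out.println("closed:     " + nextNeedSync(false, true, true));
    // Lease reassigned, recovery still in progress.
    System.out.println("reassigned: " + nextNeedSync(false, false, false));
    // Neither case applies.
    System.out.println("neither:    " + nextNeedSync(false, true, false));
  }
}
```

Once `needSync` is true it stays true, so the check only matters for the first lease that triggers it.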





> In the lease monitor, if a file is closed, we should sync the editslog
> --
>
> Key: HDFS-17518
> URL: https://issues.apache.org/jira/browse/HDFS-17518
> Project: Hadoop HDFS
>  Issue Type: Bug
>Reporter: lei w
>Assignee: lei w
>Priority: Minor
>  Labels: pull-request-available
>
> In the lease monitor, if a file is closed, the checkLeases method returns 
> true, and the edits log is then not synced. In my opinion, we should sync 
> the edits log so that the state does not stay unsynchronized to the standby 
> NameNode for a long time.






[jira] [Commented] (HDFS-17514) RBF: Routers keep using cached stateID even when active NN returns unset header

2024-05-14 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-17514?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17846343#comment-17846343
 ] 

ASF GitHub Bot commented on HDFS-17514:
---

simbadzina merged PR #6804:
URL: https://github.com/apache/hadoop/pull/6804




> RBF: Routers keep using cached stateID even when active NN returns unset 
> header
> ---
>
> Key: HDFS-17514
> URL: https://issues.apache.org/jira/browse/HDFS-17514
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: rbf
>Reporter: Simbarashe Dzinamarira
>Assignee: Simbarashe Dzinamarira
>Priority: Minor
>  Labels: pull-request-available
>
> When a namenode that had "dfs.namenode.state.context.enabled" set to true is 
> restarted with the configuration set to false, routers will keep using a 
> previously cached state ID.
> Without RBF
> * clients that fetched the old stateID could have stale reads even after 
> msyncing
> * new clients will go to the active.
> With RBF
> * clients that fetched the old stateID could have stale reads, as above.
> * New clients will also fetch the stale stateID and potentially have stale 
> reads
> New clients that are created after the restart should not fetch the stale 
> state ID.






[jira] [Commented] (HDFS-17514) RBF: Routers keep using cached stateID even when active NN returns unset header

2024-05-14 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-17514?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17846340#comment-17846340
 ] 

ASF GitHub Bot commented on HDFS-17514:
---

simbadzina commented on PR #6804:
URL: https://github.com/apache/hadoop/pull/6804#issuecomment-2110488809

   Failing tests in continuous-integration unrelated to my changes
   ```
Failed junit tests  
   |  
hadoop.hdfs.server.federation.router.security.token.TestSQLDelegationTokenSecretManagerImpl
 
   |  hadoop.hdfs.server.federation.store.driver.TestStateStoreMySQL 
   ```




> RBF: Routers keep using cached stateID even when active NN returns unset 
> header
> ---
>
> Key: HDFS-17514
> URL: https://issues.apache.org/jira/browse/HDFS-17514
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: rbf
>Reporter: Simbarashe Dzinamarira
>Assignee: Simbarashe Dzinamarira
>Priority: Minor
>  Labels: pull-request-available
>
> When a namenode that had "dfs.namenode.state.context.enabled" set to true is 
> restarted with the configuration set to false, routers will keep using a 
> previously cached state ID.
> Without RBF
> * clients that fetched the old stateID could have stale reads even after 
> msyncing
> * new clients will go to the active.
> With RBF
> * clients that fetched the old stateID could have stale reads, as above.
> * New clients will also fetch the stale stateID and potentially have stale 
> reads
> New clients that are created after the restart should not fetch the stale 
> state ID.






[jira] [Commented] (HDFS-17476) fix: False positive "Observer Node is too far behind" due to long overflow.

2024-05-14 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-17476?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17846316#comment-17846316
 ] 

ASF GitHub Bot commented on HDFS-17476:
---

KeeProMise commented on PR #6747:
URL: https://github.com/apache/hadoop/pull/6747#issuecomment-2110243175

   > Thanks for involving me.
   > 
   > @KeeProMise I'm just wondering what scenarios could cause a negative 
`clientStateId`?
   
   @ZanderXu thanks for your review. Under normal circumstances this problem 
does not occur; I hit it because I forgot to pass the transaction id when 
writing a unit test.
   Still, I think the code logic here does not account for long overflow (in 
our environment, if the transaction id is not passed, the default is 
`Long.MIN_VALUE`), which may cause problems in future iterations.




> fix: False positive "Observer Node is too far behind" due to long overflow.
> ---
>
> Key: HDFS-17476
> URL: https://issues.apache.org/jira/browse/HDFS-17476
> Project: Hadoop HDFS
>  Issue Type: Bug
>Reporter: Jian Zhang
>Assignee: Jian Zhang
>Priority: Critical
>  Labels: pull-request-available
> Attachments: HDFS-17476.patch, image-2024-04-18-10-57-10-481.png
>
>
> In the code GlobalStateIdContext#receiveRequestState(), if clientStateId is a 
> very small negative number, clientStateId - serverStateId may overflow and 
> become greater than
> (ESTIMATED_TRANSACTIONS_PER_SECOND
>                   * TimeUnit.MILLISECONDS.toSeconds(clientWaitTime)
>                   * ESTIMATED_SERVER_TIME_MULTIPLIER),
> resulting in a false positive report that the Observer Node is too far behind.
> !image-2024-04-18-10-57-10-481.png|width=742,height=110!
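
The overflow described in this issue can be reproduced with plain `long` arithmetic. The sketch below is illustrative only and is not the code in GlobalStateIdContext or the patch in PR #6747: the class name, both method names, and the `threshold` parameter are assumptions made for the example. The overflow-aware variant uses `Math.subtractExact`, which is one possible way to detect the wrap-around.

```java
// Illustrative sketch only; names and threshold are hypothetical.
final class StateIdOverflowSketch {

  // Naive check: the subtraction silently wraps around when clientStateId
  // is very negative and serverStateId is positive.
  static boolean naiveTooFarBehind(long clientStateId, long serverStateId, long threshold) {
    return clientStateId - serverStateId > threshold;
  }

  // Overflow-aware check: Math.subtractExact throws ArithmeticException when
  // the difference does not fit in a long. In that case the sign of the true
  // difference is the same as the ordering of the two operands.
  static boolean safeTooFarBehind(long clientStateId, long serverStateId, long threshold) {
    try {
      return Math.subtractExact(clientStateId, serverStateId) > threshold;
    } catch (ArithmeticException overflow) {
      return clientStateId > serverStateId;
    }
  }

  public static void main(String[] args) {
    long client = Long.MIN_VALUE; // e.g. a transaction id that was never set
    long server = 100L;
    long threshold = 1000L;
    System.out.println("naive: " + naiveTooFarBehind(client, server, threshold));
    System.out.println("safe:  " + safeTooFarBehind(client, server, threshold));
  }
}
```

With these inputs the naive check reports that the observer is too far behind, while the overflow-aware check does not.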






[jira] [Commented] (HDFS-17521) EC: Fix calculation errors caused by special index order

2024-05-14 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-17521?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17846310#comment-17846310
 ] 

ASF GitHub Bot commented on HDFS-17521:
---

zhengchenyu commented on PR #6813:
URL: https://github.com/apache/hadoop/pull/6813#issuecomment-2110181645

   @zhangshuyan0 
   Thanks for your review! 
   I don't mean that the parity index is smaller. The parity indexes and data 
indexes are fixed numbers; we can't change them.
   To reproduce: 
   Call RawErasureDecoder::decode with the parameter `erasedIndexes` in a 
special order, namely one where a parity index precedes a data index. For 
example, erasedIndexes = [8,0] reproduces this problem.
   You can also run the unit tests directly to reproduce this easily.
   I printed erasedIndexes for all failures in the unit test in 
[wrongindex.txt](https://github.com/apache/hadoop/files/15308714/wrongindex.txt),
 and all of them have this characteristic.
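
The "special order" described in this comment can be checked mechanically. The helper below is a hypothetical sketch and not part of RawErasureDecoder: the class and method names are invented, and the value 6 for the number of data units is an assumption chosen so that index 8 is a parity index and index 0 is a data index, as in the [8,0] example.

```java
// Hypothetical helper; not part of the Hadoop erasure coding API.
final class ErasedIndexOrderSketch {

  // Returns true if some parity index appears before a data index in
  // erasedIndexes. Indexes below numDataUnits are treated as data indexes,
  // all others as parity indexes.
  static boolean parityPrecedesData(int[] erasedIndexes, int numDataUnits) {
    boolean seenParity = false;
    for (int index : erasedIndexes) {
      if (index >= numDataUnits) {
        seenParity = true;
      } else if (seenParity) {
        return true; // a data index follows an earlier parity index
      }
    }
    return false;
  }

  public static void main(String[] args) {
    int numDataUnits = 6; // assumption: a layout with 6 data units
    System.out.println(parityPrecedesData(new int[] {8, 0}, numDataUnits));
    System.out.println(parityPrecedesData(new int[] {0, 8}, numDataUnits));
  }
}
```

This only detects the ordering that triggers the reported bug; it does not change how decoding is performed.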
   
   
   




> EC: Fix calculation errors caused by special index order
> 
>
> Key: HDFS-17521
> URL: https://issues.apache.org/jira/browse/HDFS-17521
> Project: Hadoop HDFS
>  Issue Type: Bug
>Reporter: Chenyu Zheng
>Assignee: Chenyu Zheng
>Priority: Critical
>  Labels: pull-request-available
>
> I found that if erasedIndexes is ordered such that a parity index comes 
> before a data index, EC will produce wrong results when decoding.
> In fact, HDFS-15186 has described this problem, but does not fundamentally 
> solve it.
> The reason is that the code assumes that in erasedIndexes the data indexes 
> come first, followed by the parity indexes. If a parity index is placed in 
> front of a data index, a calculation error will occur.






[jira] [Commented] (HDFS-17514) RBF: Routers keep using cached stateID even when active NN returns unset header

2024-05-14 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-17514?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17846239#comment-17846239
 ] 

ASF GitHub Bot commented on HDFS-17514:
---

hadoop-yetus commented on PR #6804:
URL: https://github.com/apache/hadoop/pull/6804#issuecomment-2109700134

   :confetti_ball: **+1 overall**
   
   
   
   
   
   
   | Vote | Subsystem | Runtime |  Logfile | Comment |
   |:----:|----------:|--------:|:--------:|:-------:|
   |||| _ Prechecks _ |
   | +1 :green_heart: |  dupname  |   0m 00s |  |  No case conflicting files found.  |
   | +0 :ok: |  spotbugs  |   0m 01s |  |  spotbugs executables are not available.  |
   | +0 :ok: |  codespell  |   0m 01s |  |  codespell was not available.  |
   | +0 :ok: |  detsecrets  |   0m 01s |  |  detect-secrets was not available.  |
   | +1 :green_heart: |  @author  |   0m 00s |  |  The patch does not contain any @author tags.  |
   | +1 :green_heart: |  test4tests  |   0m 00s |  |  The patch appears to include 2 new or modified test files.  |
   |||| _ trunk Compile Tests _ |
   | +1 :green_heart: |  mvninstall  |  95m 50s |  |  trunk passed  |
   | +1 :green_heart: |  compile  |   5m 42s |  |  trunk passed  |
   | +1 :green_heart: |  checkstyle  |   4m 53s |  |  trunk passed  |
   | +1 :green_heart: |  mvnsite  |   5m 33s |  |  trunk passed  |
   | +1 :green_heart: |  javadoc  |   5m 08s |  |  trunk passed  |
   | +1 :green_heart: |  shadedclient  | 157m 21s |  |  branch has no errors when building and testing our client artifacts.  |
   |||| _ Patch Compile Tests _ |
   | +1 :green_heart: |  mvninstall  |   3m 02s |  |  the patch passed  |
   | +1 :green_heart: |  compile  |   2m 30s |  |  the patch passed  |
   | +1 :green_heart: |  javac  |   2m 30s |  |  the patch passed  |
   | +1 :green_heart: |  blanks  |   0m 00s |  |  The patch has no blanks issues.  |
   | +1 :green_heart: |  checkstyle  |   2m 13s |  |  the patch passed  |
   | +1 :green_heart: |  mvnsite  |   2m 44s |  |  the patch passed  |
   | +1 :green_heart: |  javadoc  |   2m 17s |  |  the patch passed  |
   | +1 :green_heart: |  shadedclient  | 163m 47s |  |  patch has no errors when building and testing our client artifacts.  |
   |||| _ Other Tests _ |
   | +1 :green_heart: |  asflicense  |   5m 47s |  |  The patch does not generate ASF License warnings.  |
   |  |   | 441m 52s |  |  |
   
   
   | Subsystem | Report/Notes |
   |----------:|:-------------|
   | GITHUB PR | https://github.com/apache/hadoop/pull/6804 |
   | Optional Tests | dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient spotbugs checkstyle codespell detsecrets |
   | uname | MINGW64_NT-10.0-17763 3998dbde031f 3.4.10-87d57229.x86_64 2024-02-14 20:17 UTC x86_64 Msys |
   | Build tool | maven |
   | Personality | /c/hadoop/dev-support/bin/hadoop.sh |
   | git revision | trunk / 14ab9f7e6a8d3d3680c496a44c1ec5635596770b |
   | Default Java | Azul Systems, Inc.-1.8.0_332-b09 |
   | Test Results | https://ci-hadoop.apache.org/job/hadoop-multibranch-windows-10/job/PR-6804/7/testReport/ |
   | modules | C: hadoop-hdfs-project/hadoop-hdfs-rbf U: hadoop-hdfs-project/hadoop-hdfs-rbf |
   | Console output | https://ci-hadoop.apache.org/job/hadoop-multibranch-windows-10/job/PR-6804/7/console |
   | versions | git=2.45.0.windows.1 |
   | Powered by | Apache Yetus 0.14.0 https://yetus.apache.org |
   
   
   This message was automatically generated.
   
   




> RBF: Routers keep using cached stateID even when active NN returns unset 
> header
> ---
>
> Key: HDFS-17514
> URL: https://issues.apache.org/jira/browse/HDFS-17514
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: rbf
>Reporter: Simbarashe Dzinamarira
>Assignee: Simbarashe Dzinamarira
>Priority: Minor
>  Labels: pull-request-available
>
> When a namenode that had "dfs.namenode.state.context.enabled" set to true is 
> restarted with the configuration set to false, routers will keep using a 
> previously cached state ID.
> Without RBF
> * clients that fetched the old stateID could have stale reads even after 
> msyncing
> * new clients will go to the active.
> With RBF
> * clients that fetched the old stateID could have stale reads, as above.
> * New clients will also fetch the stale stateID and potentially have stale 
> reads
> New clients that are created after the restart should not fetch the stale 
> state ID.






[jira] [Commented] (HDFS-17526) getMetadataInputStream should use getShareDeleteFileInputStream for windows

2024-05-14 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-17526?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17846234#comment-17846234
 ] 

ASF GitHub Bot commented on HDFS-17526:
---

hadoop-yetus commented on PR #6826:
URL: https://github.com/apache/hadoop/pull/6826#issuecomment-2109685586

   :broken_heart: **-1 overall**
   
   
   
   
   
   
   | Vote | Subsystem | Runtime |  Logfile | Comment |
   |:----:|----------:|--------:|:--------:|:-------:|
   |||| _ Prechecks _ |
   | +1 :green_heart: |  dupname  |   0m 00s |  |  No case conflicting files found.  |
   | +0 :ok: |  spotbugs  |   0m 00s |  |  spotbugs executables are not available.  |
   | +0 :ok: |  codespell  |   0m 01s |  |  codespell was not available.  |
   | +0 :ok: |  detsecrets  |   0m 01s |  |  detect-secrets was not available.  |
   | +1 :green_heart: |  @author  |   0m 00s |  |  The patch does not contain any @author tags.  |
   | -1 :x: |  test4tests  |   0m 00s |  |  The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch.  |
   |||| _ trunk Compile Tests _ |
   | +1 :green_heart: |  mvninstall  |  91m 52s |  |  trunk passed  |
   | +1 :green_heart: |  compile  |   6m 52s |  |  trunk passed  |
   | +1 :green_heart: |  checkstyle  |   5m 03s |  |  trunk passed  |
   | +1 :green_heart: |  mvnsite  |   7m 02s |  |  trunk passed  |
   | +1 :green_heart: |  javadoc  |   6m 05s |  |  trunk passed  |
   | +1 :green_heart: |  shadedclient  | 155m 06s |  |  branch has no errors when building and testing our client artifacts.  |
   |||| _ Patch Compile Tests _ |
   | +1 :green_heart: |  mvninstall  |   4m 43s |  |  the patch passed  |
   | +1 :green_heart: |  compile  |   3m 35s |  |  the patch passed  |
   | +1 :green_heart: |  javac  |   3m 35s |  |  the patch passed  |
   | +1 :green_heart: |  blanks  |   0m 00s |  |  The patch has no blanks issues.  |
   | +1 :green_heart: |  checkstyle  |   2m 21s |  |  the patch passed  |
   | +1 :green_heart: |  mvnsite  |   4m 15s |  |  the patch passed  |
   | +1 :green_heart: |  javadoc  |   3m 38s |  |  the patch passed  |
   | +1 :green_heart: |  shadedclient  | 159m 24s |  |  patch has no errors when building and testing our client artifacts.  |
   |||| _ Other Tests _ |
   | +1 :green_heart: |  asflicense  |   5m 31s |  |  The patch does not generate ASF License warnings.  |
   |  |   | 434m 15s |  |  |
   
   
   | Subsystem | Report/Notes |
   |----------:|:-------------|
   | GITHUB PR | https://github.com/apache/hadoop/pull/6826 |
   | Optional Tests | dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient spotbugs checkstyle codespell detsecrets |
   | uname | MINGW64_NT-10.0-17763 42a9d20631d9 3.4.10-87d57229.x86_64 2024-02-14 20:17 UTC x86_64 Msys |
   | Build tool | maven |
   | Personality | /c/hadoop/dev-support/bin/hadoop.sh |
   | git revision | trunk / 5f042ded4b7bb9ce826cd9e8b65c08a16e79aea9 |
   | Default Java | Azul Systems, Inc.-1.8.0_332-b09 |
   | Test Results | https://ci-hadoop.apache.org/job/hadoop-multibranch-windows-10/job/PR-6826/1/testReport/ |
   | modules | C: hadoop-hdfs-project/hadoop-hdfs U: hadoop-hdfs-project/hadoop-hdfs |
   | Console output | https://ci-hadoop.apache.org/job/hadoop-multibranch-windows-10/job/PR-6826/1/console |
   | versions | git=2.45.0.windows.1 |
   | Powered by | Apache Yetus 0.14.0 https://yetus.apache.org |
   
   
   This message was automatically generated.
   
   




> getMetadataInputStream should use getShareDeleteFileInputStream for windows
> ---
>
> Key: HDFS-17526
> URL: https://issues.apache.org/jira/browse/HDFS-17526
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: datanode
>Affects Versions: 3.3.4
>Reporter: Danny Becker
>Priority: Major
>  Labels: pull-request-available
>
> In HDFS-10636, the getDataInputStream method uses the 
> getShareDeleteFileInputStream for windows, but the getMetaDataInputStream 
> does not use this. The following error can happen when a DataNode is trying 
> to update the genstamp on a block in Windows.
> DataNode Logs:
> {{Caused by: java.io.IOException: Failed to rename 
> G:\data\hdfs\data\current\BP-1\current\finalized\subdir5\subdir16\blk_1_1.meta
>  to 
> G:\data\hdfs\data\current\BP-1\current\finalized\subdir5\subdir16\blk_1_2.meta
>  due to failure in native rename. 32: The process cannot access the file 
> because it is being used by another process.}}




[jira] [Commented] (HDFS-17476) fix: False positive "Observer Node is too far behind" due to long overflow.

2024-05-14 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-17476?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17846212#comment-17846212
 ] 

ASF GitHub Bot commented on HDFS-17476:
---

ZanderXu commented on PR #6747:
URL: https://github.com/apache/hadoop/pull/6747#issuecomment-2109505953

   Thanks for involving me. 
   
   @KeeProMise I'm just wondering what scenarios could cause a negative 
`clientStateId`?
   




> fix: False positive "Observer Node is too far behind" due to long overflow.
> ---
>
> Key: HDFS-17476
> URL: https://issues.apache.org/jira/browse/HDFS-17476
> Project: Hadoop HDFS
>  Issue Type: Bug
>Reporter: Jian Zhang
>Assignee: Jian Zhang
>Priority: Critical
>  Labels: pull-request-available
> Attachments: HDFS-17476.patch, image-2024-04-18-10-57-10-481.png
>
>
> In the code GlobalStateIdContext#receiveRequestState(), if clientStateId is a 
> very small negative number, clientStateId - serverStateId may overflow and 
> become greater than
> (ESTIMATED_TRANSACTIONS_PER_SECOND
>                   * TimeUnit.MILLISECONDS.toSeconds(clientWaitTime)
>                   * ESTIMATED_SERVER_TIME_MULTIPLIER),
> resulting in a false positive report that the Observer Node is too far behind.
> !image-2024-04-18-10-57-10-481.png|width=742,height=110!






[jira] [Commented] (HDFS-17521) EC: Fix calculation errors caused by special index order

2024-05-14 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-17521?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17846195#comment-17846195
 ] 

ASF GitHub Bot commented on HDFS-17521:
---

slfan1989 commented on PR #6813:
URL: https://github.com/apache/hadoop/pull/6813#issuecomment-2109448492

   @zhangshuyan0 @haiyang1987 Could you help review this PR? I'm not very 
familiar with EC, but I've noticed that you have submitted quite a few 
improvements related to EC. Thank you very much!




> EC: Fix calculation errors caused by special index order
> 
>
> Key: HDFS-17521
> URL: https://issues.apache.org/jira/browse/HDFS-17521
> Project: Hadoop HDFS
>  Issue Type: Bug
>Reporter: Chenyu Zheng
>Assignee: Chenyu Zheng
>Priority: Critical
>  Labels: pull-request-available
>
> I found that if erasedIndexes is ordered such that a parity index comes 
> before a data index, EC will produce wrong results when decoding.
> In fact, HDFS-15186 has described this problem, but does not fundamentally 
> solve it.
> The reason is that the code assumes that in erasedIndexes the data indexes 
> come first, followed by the parity indexes. If a parity index is placed in 
> front of a data index, a calculation error will occur.






[jira] [Commented] (HDFS-17476) fix: False positive "Observer Node is too far behind" due to long overflow.

2024-05-13 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-17476?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17846166#comment-17846166
 ] 

ASF GitHub Bot commented on HDFS-17476:
---

Hexiaoqiao commented on PR #6747:
URL: https://github.com/apache/hadoop/pull/6747#issuecomment-2109219912

   Hi @haiyang1987 @ZanderXu, would you mind taking a look and reviewing this?




> fix: False positive "Observer Node is too far behind" due to long overflow.
> ---
>
> Key: HDFS-17476
> URL: https://issues.apache.org/jira/browse/HDFS-17476
> Project: Hadoop HDFS
>  Issue Type: Bug
>Reporter: Jian Zhang
>Assignee: Jian Zhang
>Priority: Critical
>  Labels: pull-request-available
> Attachments: HDFS-17476.patch, image-2024-04-18-10-57-10-481.png
>
>
> In the code GlobalStateIdContext#receiveRequestState(), if clientStateId is a 
> very small negative number, clientStateId - serverStateId may overflow and 
> become greater than
> (ESTIMATED_TRANSACTIONS_PER_SECOND
>                   * TimeUnit.MILLISECONDS.toSeconds(clientWaitTime)
>                   * ESTIMATED_SERVER_TIME_MULTIPLIER),
> resulting in a false positive report that the Observer Node is too far behind.
> !image-2024-04-18-10-57-10-481.png|width=742,height=110!






[jira] [Commented] (HDFS-17099) Null Pointer Exception when stop namesystem in HDFS

2024-05-13 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-17099?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17846149#comment-17846149
 ] 

ASF GitHub Bot commented on HDFS-17099:
---

Hexiaoqiao commented on PR #6034:
URL: https://github.com/apache/hadoop/pull/6034#issuecomment-2109197470

   Sorry for being so late in pushing this PR forward.
   Committed to trunk. Thanks @teamconfx and @ayushtkn 




> Null Pointer Exception when stop namesystem in HDFS
> ---
>
> Key: HDFS-17099
> URL: https://issues.apache.org/jira/browse/HDFS-17099
> Project: Hadoop HDFS
>  Issue Type: Bug
>Reporter: ConfX
>Assignee: ConfX
>Priority: Critical
>  Labels: pull-request-available
> Attachments: reproduce.sh
>
>
> h2. What happened:
> Got a NullPointerException when stopping the namesystem in HDFS.
> h2. Buggy code:
>  
> {code:java}
>   void stopActiveServices() {
>     ...
>     if (dir != null && getFSImage() != null) {
>       if (getFSImage().editLog != null) {    // <--- Check whether editLog is null
>         getFSImage().editLog.close();
>       }
>       // Update the fsimage with the last txid that we wrote
>       // so that the tailer starts from the right spot.
>       getFSImage().updateLastAppliedTxIdFromWritten(); // <--- BUG: Even if editLog is null, this line will still be executed and cause a NullPointerException
>     }
>     ...
>   }
>
>   public void updateLastAppliedTxIdFromWritten() {
>     this.lastAppliedTxId = editLog.getLastWrittenTxId();  // <--- This will cause a NullPointerException if editLog is null
>   }
> {code}
> h2. StackTrace:
>  
> {code:java}
> java.lang.NullPointerException
>         at org.apache.hadoop.hdfs.server.namenode.FSImage.updateLastAppliedTxIdFromWritten(FSImage.java:1553)
>         at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.stopActiveServices(FSNamesystem.java:1463)
>         at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.close(FSNamesystem.java:1815)
>         at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.<init>(FSNamesystem.java:1017)
>         at org.apache.hadoop.hdfs.server.namenode.SecondaryNameNode.initialize(SecondaryNameNode.java:248)
>         at org.apache.hadoop.hdfs.server.namenode.SecondaryNameNode.<init>(SecondaryNameNode.java:194)
>         at org.apache.hadoop.hdfs.server.namenode.SecondaryNameNode.<init>(SecondaryNameNode.java:181)
>  {code}
> h2. How to reproduce:
> (1) Set {{dfs.namenode.top.windows.minutes}} to {{37914516,32,0}}; or set 
> {{dfs.namenode.top.window.num.buckets}} to {{244111242}}.
> (2) Run test: 
> {{org.apache.hadoop.hdfs.server.namenode.TestNameNodeHttpServerXFrame#testSecondaryNameNodeXFrame}}
> h2. What's more:
> I'm still investigating how the parameter 
> {{dfs.namenode.top.windows.minutes}} triggered the buggy code.
>  
> For an easy reproduction, run the reproduce.sh in the attachment.
> We are happy to provide a patch if this issue is confirmed.
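
One way to avoid the NullPointerException described above is to touch the edit log only when it exists. The sketch below illustrates that guard with simplified stand-in classes. `EditLogSketch`, `ImageSketch` and `StopActiveServicesSketch` are hypothetical names invented for this example, not the real Hadoop classes, and this is not necessarily the change that was committed in the PR.

```java
// Simplified, hypothetical stand-ins for FSEditLog / FSImage (not Hadoop code).
class EditLogSketch {
    private final long lastWrittenTxId;

    EditLogSketch(long lastWrittenTxId) {
        this.lastWrittenTxId = lastWrittenTxId;
    }

    long getLastWrittenTxId() {
        return lastWrittenTxId;
    }

    void close() {
        // Nothing to release in this sketch.
    }
}

class ImageSketch {
    EditLogSketch editLog;      // may still be null if start-up failed early
    long lastAppliedTxId = -1;

    void updateLastAppliedTxIdFromWritten() {
        // Same shape as the method in the report: dereferences editLog.
        this.lastAppliedTxId = editLog.getLastWrittenTxId();
    }
}

class StopActiveServicesSketch {
    /** Guarded variant: close the log and read its txid only if it is non-null. */
    static void stopActiveServices(ImageSketch image) {
        if (image != null && image.editLog != null) {
            image.editLog.close();
            image.updateLastAppliedTxIdFromWritten();
        }
    }

    public static void main(String[] args) {
        ImageSketch halfInitialized = new ImageSketch();   // editLog is null
        stopActiveServices(halfInitialized);               // no exception
        System.out.println(halfInitialized.lastAppliedTxId); // prints -1

        ImageSketch healthy = new ImageSketch();
        healthy.editLog = new EditLogSketch(42L);
        stopActiveServices(healthy);
        System.out.println(healthy.lastAppliedTxId);       // prints 42
    }
}
```

The guard keeps both edit-log operations under the same null check, so a half-initialized image is simply skipped during shutdown.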



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-17099) Null Pointer Exception when stop namesystem in HDFS

2024-05-13 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-17099?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17846148#comment-17846148
 ] 

ASF GitHub Bot commented on HDFS-17099:
---

Hexiaoqiao merged PR #6034:
URL: https://github.com/apache/hadoop/pull/6034










[jira] [Commented] (HDFS-17518) In the lease monitor, if a file is closed, we should sync the editslog

2024-05-13 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-17518?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17846144#comment-17846144
 ] 

ASF GitHub Bot commented on HDFS-17518:
---

Hexiaoqiao commented on code in PR #6809:
URL: https://github.com/apache/hadoop/pull/6809#discussion_r1599314271


##
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/LeaseManager.java:
##
@@ -626,7 +626,8 @@ private synchronized boolean checkLeases(Collection 
leasesToCheck) {
 }
   }
   // If a lease recovery happened, we need to sync later.

Review Comment:
   Thanks, I totally agree that we should write and sync the edit log here. The 
write is already done for both `close` and `reassign lease`, but the sync is 
missed in some corner cases. My question is whether we need to invoke 
`logSync()` right away. What happens if the edit is instead synced by a later 
write operation? Edits from LeaseManager.Monitor are asynchronous and do not 
have to be in order, IMO. Could the edit be lost if no other write operation 
follows before the NameNode shuts down? 





> In the lease monitor, if a file is closed, we should sync the editslog
> --
>
> Key: HDFS-17518
> URL: https://issues.apache.org/jira/browse/HDFS-17518
> Project: Hadoop HDFS
>  Issue Type: Bug
>Reporter: lei w
>Assignee: lei w
>Priority: Minor
>  Labels: pull-request-available
>
> In the lease monitor, if a file is closed, the method checkLeases returns 
> true, and the edit log is then not synced. In my opinion, we should sync 
> the edit log so that the state does not stay unsynchronized to the standby 
> NameNode for a long time.






[jira] [Commented] (HDFS-17438) RBF: The newest STANDBY and UNAVAILABLE nn should be the lowest priority.

2024-05-13 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-17438?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17846143#comment-17846143
 ] 

ASF GitHub Bot commented on HDFS-17438:
---

KeeProMise commented on PR #6655:
URL: https://github.com/apache/hadoop/pull/6655#issuecomment-2109177969

   @goiri The failed unit tests should not be caused by my changes.




> RBF: The newest STANDBY and UNAVAILABLE nn should be the lowest priority.
> -
>
> Key: HDFS-17438
> URL: https://issues.apache.org/jira/browse/HDFS-17438
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Reporter: Jian Zhang
>Assignee: Jian Zhang
>Priority: Major
>  Labels: pull-request-available
> Attachments: HDFS-17438.001.patch
>
>
> At present, when all namenodes of an ns in the router have the same state, 
> the most recently reported namenode is placed at the top of the cache. When 
> a client accesses the ns through the router, it tries that namenode first.
> If multiple namenodes in this router are in the active state, or multiple 
> namenodes are in the observer state, the existing logic is not a problem, 
> because the most recently reported active or observer namenode is more 
> likely to truly be active or observer than a namenode that reported that 
> state a long time ago.
> Similarly, the most recently reported namenode with a status of standby or 
> unavailable is more likely to truly be standby or unavailable than a 
> namenode that reported that status a long time ago. Therefore, the newest 
> nn reported as standby or unavailable should have a lower priority for 
> access, and the oldest nn reported as standby or unavailable should have a 
> higher priority for access.
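
The ordering proposed above can be sketched as a comparator. This is only an illustration: `NamenodePrioritySketch`, `Report` and the timestamp field are hypothetical names invented for this example, and the state order ACTIVE, OBSERVER, STANDBY, UNAVAILABLE is an assumption rather than something taken from the patch.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

class NamenodePrioritySketch {
    // Assumed priority order between states: earlier constants are tried first.
    enum State { ACTIVE, OBSERVER, STANDBY, UNAVAILABLE }

    /** Hypothetical, minimal view of one namenode report. */
    static final class Report {
        final String id;
        final State state;
        final long reportedAtMs;

        Report(String id, State state, long reportedAtMs) {
            this.id = id;
            this.state = state;
            this.reportedAtMs = reportedAtMs;
        }
    }

    static final Comparator<Report> PRIORITY = (a, b) -> {
        int byState = a.state.compareTo(b.state);
        if (byState != 0) {
            return byState;
        }
        boolean usable = a.state == State.ACTIVE || a.state == State.OBSERVER;
        return usable
            ? Long.compare(b.reportedAtMs, a.reportedAtMs)   // newest report first
            : Long.compare(a.reportedAtMs, b.reportedAtMs);  // oldest report first
    };

    /** Returns namenode ids in the order a client would try them. */
    static List<String> order(List<Report> reports) {
        List<Report> sorted = new ArrayList<>(reports);
        sorted.sort(PRIORITY);
        List<String> ids = new ArrayList<>();
        for (Report r : sorted) {
            ids.add(r.id);
        }
        return ids;
    }

    public static void main(String[] args) {
        List<Report> reports = Arrays.asList(
            new Report("nn1", State.STANDBY, 100L),
            new Report("nn2", State.STANDBY, 200L),
            new Report("nn3", State.ACTIVE, 100L),
            new Report("nn4", State.ACTIVE, 200L));
        System.out.println(order(reports)); // prints [nn4, nn3, nn1, nn2]
    }
}
```

Within ACTIVE the newest report (nn4) wins, while within STANDBY the newest report (nn2) is tried last, which matches the reasoning in the description.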






[jira] [Commented] (HDFS-17497) Logic for committed blocks is mixed when computing file size

2024-05-13 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-17497?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17846130#comment-17846130
 ] 

ASF GitHub Bot commented on HDFS-17497:
---

haiyang1987 commented on code in PR #6765:
URL: https://github.com/apache/hadoop/pull/6765#discussion_r1599287673


##
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/blockmanagement/BlockInfo.java:
##
@@ -387,6 +387,19 @@ public boolean isUnderRecovery() {
 return getBlockUCState().equals(BlockUCState.UNDER_RECOVERY);
   }
 
+  /**
+   * Is this block still under construction or recoery.

Review Comment:
Leave some small comment.
   
   `recoery` update to `recovery`





> Logic for committed blocks is mixed when computing file size
> 
>
> Key: HDFS-17497
> URL: https://issues.apache.org/jira/browse/HDFS-17497
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Reporter: ZanderXu
>Priority: Major
>  Labels: pull-request-available
>
> An HDFS file that is being written may contain multiple committed blocks, as 
> follows (assuming the file contains three blocks):
> || ||Block 1||Block 2||Block 3||
> |Case 1|Complete|Commit|UnderConstruction|
> |Case 2|Complete|Commit|Commit|
> |Case 3|Commit|Commit|Commit|
>  
> However, the logic for committed blocks is inconsistent when computing the 
> file size: it ignores the bytes of the last committed block but includes the 
> bytes of the other committed blocks.
> {code:java}
> public final long computeFileSize(boolean includesLastUcBlock,
>     boolean usePreferredBlockSize4LastUcBlock) {
>   if (blocks.length == 0) {
>     return 0;
>   }
>   final int last = blocks.length - 1;
>   //check if the last block is BlockInfoUnderConstruction
>   BlockInfo lastBlk = blocks[last];
>   long size = lastBlk.getNumBytes();
>   // the last committed block is not complete, so its bytes may be ignored.
>   if (!lastBlk.isComplete()) {
>     if (!includesLastUcBlock) {
>       size = 0;
>     } else if (usePreferredBlockSize4LastUcBlock) {
>       size = isStriped()?
>           getPreferredBlockSize() *
>               ((BlockInfoStriped)lastBlk).getDataBlockNum() :
>           getPreferredBlockSize();
>     }
>   }
>   // The bytes of other committed blocks are calculated into the file length.
>   for (int i = 0; i < last; i++) {
>     size += blocks[i].getNumBytes();
>   }
>   return size;
> } {code}
> The number of bytes in a committed block no longer changes, so the bytes of 
> the last committed block should be counted in the file length too.
>  
> The logic for committed blocks is also inconsistent when computing the file 
> length in DFSInputStream. Normally, DFSInputStream does not need to fetch the 
> visible length of a committed block, regardless of whether that block is the 
> last block or not.
>  
> -HDFS-10843- encountered a bug that was actually caused by the committed 
> block, but -HDFS-10843- fixed that bug by updating quota usage when 
> completing the block. The number of bytes in a committed block no longer 
> changes, so we should update the quota usage when the block is committed, 
> which reduces the quota usage delta in a timely manner.
>  
> So there are several things we need to do:
>  * Unify the calculation logic for all committed blocks in 
> {{computeFileSize}} of {{INodeFile}}
>  * Unify the calculation logic for all committed blocks in {{getFileLength}} 
> of {{DFSInputStream}}
>  * Update quota usage when committing block
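
To make the inconsistency concrete, the sketch below compares the behavior described above with a unified rule in which committed blocks always count and only a last block that is still under construction can be skipped. `FileSizeSketch`, `Block` and `BlockState` are hypothetical names invented for this example; striping and the `usePreferredBlockSize4LastUcBlock` flag are deliberately left out, and the unified rule is one reading of the proposal, not the patch itself.

```java
class FileSizeSketch {
    enum BlockState { COMPLETE, COMMITTED, UNDER_CONSTRUCTION }

    /** Hypothetical, minimal block: a state and a byte count. */
    static final class Block {
        final BlockState state;
        final long numBytes;

        Block(BlockState state, long numBytes) {
            this.state = state;
            this.numBytes = numBytes;
        }
    }

    /** Behavior described in the issue: any non-complete last block may be dropped. */
    static long currentSize(Block[] blocks, boolean includesLastUcBlock) {
        if (blocks.length == 0) {
            return 0;
        }
        int last = blocks.length - 1;
        long size = blocks[last].numBytes;
        if (blocks[last].state != BlockState.COMPLETE && !includesLastUcBlock) {
            size = 0;
        }
        for (int i = 0; i < last; i++) {
            size += blocks[i].numBytes;
        }
        return size;
    }

    /** Unified rule: only a last block still under construction may be dropped. */
    static long unifiedSize(Block[] blocks, boolean includesLastUcBlock) {
        long size = 0;
        for (int i = 0; i < blocks.length; i++) {
            boolean lastUc = i == blocks.length - 1
                && blocks[i].state == BlockState.UNDER_CONSTRUCTION;
            if (lastUc && !includesLastUcBlock) {
                continue;
            }
            size += blocks[i].numBytes;
        }
        return size;
    }

    public static void main(String[] args) {
        // Case 2 from the table: Complete, Commit, Commit.
        Block[] case2 = {
            new Block(BlockState.COMPLETE, 128),
            new Block(BlockState.COMMITTED, 128),
            new Block(BlockState.COMMITTED, 64)
        };
        System.out.println(currentSize(case2, false)); // prints 256
        System.out.println(unifiedSize(case2, false)); // prints 320
    }
}
```

For Case 1 (Complete, Commit, UnderConstruction) both methods agree, because the last block really is under construction; the difference only shows up when the last block is committed.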






[jira] [Commented] (HDFS-17526) getMetadataInputStream should use getShareDeleteFileInputStream for windows

2024-05-13 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-17526?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17846127#comment-17846127
 ] 

ASF GitHub Bot commented on HDFS-17526:
---

dannytbecker opened a new pull request, #6826:
URL: https://github.com/apache/hadoop/pull/6826

   ### Description of PR
   In HDFS-10636, the getDataInputStream method uses the 
getShareDeleteFileInputStream for windows, but the getMetaDataInputStream does 
not use this. The following error can happen when a DataNode is trying to 
update the genstamp on a block in Windows.
   
   DataNode Logs:
   ```
   Caused by: java.io.IOException: Failed to rename G:\data\hdfs\data\current\BP-1\current\finalized\subdir5\subdir16\blk_1_1.meta to G:\data\hdfs\data\current\BP-1\current\finalized\subdir5\subdir16\blk_1_2.meta due to failure in native rename. 32: The process cannot access the file because it is being used by another process.
   ```
   
   ### How was this patch tested?
   
   
   ### For code changes:
   
   - [X] Does the title or this PR starts with the corresponding JIRA issue id 
(e.g. 'HADOOP-17799. Your PR title ...')?
   - [ ] Object storage: have the integration tests been executed and the 
endpoint declared according to the connector-specific documentation?
   - [ ] If adding new dependencies to the code, are these dependencies 
licensed in a way that is compatible for inclusion under [ASF 
2.0](http://www.apache.org/legal/resolved.html#category-a)?
   - [ ] If applicable, have you updated the `LICENSE`, `LICENSE-binary`, 
`NOTICE-binary` files?
   
   




> getMetadataInputStream should use getShareDeleteFileInputStream for windows
> ---
>
> Key: HDFS-17526
> URL: https://issues.apache.org/jira/browse/HDFS-17526
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: datanode
>Affects Versions: 3.3.4
>Reporter: Danny Becker
>Priority: Major
>
> In HDFS-10636, the getDataInputStream method uses the 
> getShareDeleteFileInputStream for windows, but the getMetaDataInputStream 
> does not use this. The following error can happen when a DataNode is trying 
> to update the genstamp on a block in Windows.
> DataNode Logs:
> {{Caused by: java.io.IOException: Failed to rename 
> G:\data\hdfs\data\current\BP-1\current\finalized\subdir5\subdir16\blk_1_1.meta
>  to 
> G:\data\hdfs\data\current\BP-1\current\finalized\subdir5\subdir16\blk_1_2.meta
>  due to failure in native rename. 32: The process cannot access the file 
> because it is being used by another process.}}






[jira] [Updated] (HDFS-17526) getMetadataInputStream should use getShareDeleteFileInputStream for windows

2024-05-13 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-17526?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated HDFS-17526:
--
Labels: pull-request-available  (was: )

> getMetadataInputStream should use getShareDeleteFileInputStream for windows
> ---
>
> Key: HDFS-17526
> URL: https://issues.apache.org/jira/browse/HDFS-17526
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: datanode
>Affects Versions: 3.3.4
>Reporter: Danny Becker
>Priority: Major
>  Labels: pull-request-available
>
> In HDFS-10636, the getDataInputStream method uses the 
> getShareDeleteFileInputStream for windows, but the getMetaDataInputStream 
> does not use this. The following error can happen when a DataNode is trying 
> to update the genstamp on a block in Windows.
> DataNode Logs:
> {{Caused by: java.io.IOException: Failed to rename 
> G:\data\hdfs\data\current\BP-1\current\finalized\subdir5\subdir16\blk_1_1.meta
>  to 
> G:\data\hdfs\data\current\BP-1\current\finalized\subdir5\subdir16\blk_1_2.meta
>  due to failure in native rename. 32: The process cannot access the file 
> because it is being used by another process.}}






[jira] [Commented] (HDFS-17514) RBF: Routers keep using cached stateID even when active NN returns unset header

2024-05-13 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-17514?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17846121#comment-17846121
 ] 

ASF GitHub Bot commented on HDFS-17514:
---

hadoop-yetus commented on PR #6804:
URL: https://github.com/apache/hadoop/pull/6804#issuecomment-2109071666

   :confetti_ball: **+1 overall**
   
   
   
   
   
   
   | Vote | Subsystem | Runtime |  Logfile | Comment |
   |:----:|----------:|--------:|:--------:|:-------:|
   |||| _ Prechecks _ |
   | +1 :green_heart: |  dupname  |   0m 00s |  |  No case conflicting files found.  |
   | +0 :ok: |  spotbugs  |   0m 01s |  |  spotbugs executables are not available.  |
   | +0 :ok: |  codespell  |   0m 01s |  |  codespell was not available.  |
   | +0 :ok: |  detsecrets  |   0m 01s |  |  detect-secrets was not available.  |
   | +1 :green_heart: |  @author  |   0m 00s |  |  The patch does not contain any @author tags.  |
   | +1 :green_heart: |  test4tests  |   0m 00s |  |  The patch appears to include 2 new or modified test files.  |
   |||| _ trunk Compile Tests _ |
   | +1 :green_heart: |  mvninstall  |  93m 47s |  |  trunk passed  |
   | +1 :green_heart: |  compile  |   5m 18s |  |  trunk passed  |
   | +1 :green_heart: |  checkstyle  |   4m 53s |  |  trunk passed  |
   | +1 :green_heart: |  mvnsite  |   5m 36s |  |  trunk passed  |
   | +1 :green_heart: |  javadoc  |   4m 55s |  |  trunk passed  |
   | +1 :green_heart: |  shadedclient  | 151m 19s |  |  branch has no errors when building and testing our client artifacts.  |
   |||| _ Patch Compile Tests _ |
   | +1 :green_heart: |  mvninstall  |   3m 03s |  |  the patch passed  |
   | +1 :green_heart: |  compile  |   2m 36s |  |  the patch passed  |
   | +1 :green_heart: |  javac  |   2m 36s |  |  the patch passed  |
   | +1 :green_heart: |  blanks  |   0m 01s |  |  The patch has no blanks issues.  |
   | +1 :green_heart: |  checkstyle  |   2m 18s |  |  the patch passed  |
   | +1 :green_heart: |  mvnsite  |   2m 40s |  |  the patch passed  |
   | +1 :green_heart: |  javadoc  |   2m 19s |  |  the patch passed  |
   | +1 :green_heart: |  shadedclient  | 162m 22s |  |  patch has no errors when building and testing our client artifacts.  |
   |||| _ Other Tests _ |
   | +1 :green_heart: |  asflicense  |   5m 40s |  |  The patch does not generate ASF License warnings.  |
   |  |   | 431m 44s |  |  |
   
   
   | Subsystem | Report/Notes |
   |----------:|:-------------|
   | GITHUB PR | https://github.com/apache/hadoop/pull/6804 |
   | Optional Tests | dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient spotbugs checkstyle codespell detsecrets |
   | uname | MINGW64_NT-10.0-17763 14da4b14ae4a 3.4.10-87d57229.x86_64 2024-02-14 20:17 UTC x86_64 Msys |
   | Build tool | maven |
   | Personality | /c/hadoop/dev-support/bin/hadoop.sh |
   | git revision | trunk / afb800fd9c78343d772f45c77a3430409b515034 |
   | Default Java | Azul Systems, Inc.-1.8.0_332-b09 |
   |  Test Results | https://ci-hadoop.apache.org/job/hadoop-multibranch-windows-10/job/PR-6804/5/testReport/ |
   | modules | C: hadoop-hdfs-project/hadoop-hdfs-rbf U: hadoop-hdfs-project/hadoop-hdfs-rbf |
   | Console output | https://ci-hadoop.apache.org/job/hadoop-multibranch-windows-10/job/PR-6804/5/console |
   | versions | git=2.45.0.windows.1 |
   | Powered by | Apache Yetus 0.14.0 https://yetus.apache.org |
   
   
   This message was automatically generated.
   
   




> RBF: Routers keep using cached stateID even when active NN returns unset 
> header
> ---
>
> Key: HDFS-17514
> URL: https://issues.apache.org/jira/browse/HDFS-17514
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: rbf
>Reporter: Simbarashe Dzinamarira
>Assignee: Simbarashe Dzinamarira
>Priority: Minor
>  Labels: pull-request-available
>
> When a namenode that had "dfs.namenode.state.context.enabled" set to true is 
> restarted with the configuration set to false, routers will keep using a 
> previously cached state ID.
> Without RBF
> * clients that fetched the old stateID could have stale reads even after 
> msyncing
> * new clients will go to the active.
> With RBF
> * clients that fetched the old stateID could have stale reads, as above.
> * New clients will also fetch the stale stateID and potentially have stale 
> reads
> New clients that are created after the restart should not fetch the stale 
> state ID.






[jira] [Commented] (HDFS-17514) RBF: Routers keep using cached stateID even when active NN returns unset header

2024-05-13 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-17514?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17846111#comment-17846111
 ] 

ASF GitHub Bot commented on HDFS-17514:
---

hadoop-yetus commented on PR #6804:
URL: https://github.com/apache/hadoop/pull/6804#issuecomment-2109013347

   :broken_heart: **-1 overall**
   
   
   
   
   
   
   | Vote | Subsystem | Runtime |  Logfile | Comment |
   |:----:|----------:|--------:|:--------:|:-------:|
   | +0 :ok: |  reexec  |   0m 19s |  |  Docker mode activated.  |
   |||| _ Prechecks _ |
   | +1 :green_heart: |  dupname  |   0m  0s |  |  No case conflicting files found.  |
   | +0 :ok: |  codespell  |   0m  0s |  |  codespell was not available.  |
   | +0 :ok: |  detsecrets  |   0m  0s |  |  detect-secrets was not available.  |
   | +1 :green_heart: |  @author  |   0m  0s |  |  The patch does not contain any @author tags.  |
   | +1 :green_heart: |  test4tests  |   0m  0s |  |  The patch appears to include 2 new or modified test files.  |
   |||| _ trunk Compile Tests _ |
   | +1 :green_heart: |  mvninstall  |  33m 24s |  |  trunk passed  |
   | +1 :green_heart: |  compile  |   0m 25s |  |  trunk passed with JDK Ubuntu-11.0.22+7-post-Ubuntu-0ubuntu220.04.1  |
   | +1 :green_heart: |  compile  |   0m 20s |  |  trunk passed with JDK Private Build-1.8.0_402-8u402-ga-2ubuntu1~20.04-b06  |
   | +1 :green_heart: |  checkstyle  |   0m 18s |  |  trunk passed  |
   | +1 :green_heart: |  mvnsite  |   0m 28s |  |  trunk passed  |
   | +1 :green_heart: |  javadoc  |   0m 30s |  |  trunk passed with JDK Ubuntu-11.0.22+7-post-Ubuntu-0ubuntu220.04.1  |
   | +1 :green_heart: |  javadoc  |   0m 20s |  |  trunk passed with JDK Private Build-1.8.0_402-8u402-ga-2ubuntu1~20.04-b06  |
   | +1 :green_heart: |  spotbugs  |   0m 54s |  |  trunk passed  |
   | +1 :green_heart: |  shadedclient  |  20m  9s |  |  branch has no errors when building and testing our client artifacts.  |
   |||| _ Patch Compile Tests _ |
   | +1 :green_heart: |  mvninstall  |   0m 18s |  |  the patch passed  |
   | +1 :green_heart: |  compile  |   0m 18s |  |  the patch passed with JDK Ubuntu-11.0.22+7-post-Ubuntu-0ubuntu220.04.1  |
   | +1 :green_heart: |  javac  |   0m 18s |  |  the patch passed  |
   | +1 :green_heart: |  compile  |   0m 16s |  |  the patch passed with JDK Private Build-1.8.0_402-8u402-ga-2ubuntu1~20.04-b06  |
   | +1 :green_heart: |  javac  |   0m 16s |  |  the patch passed  |
   | +1 :green_heart: |  blanks  |   0m  0s |  |  The patch has no blanks issues.  |
   | +1 :green_heart: |  checkstyle  |   0m 12s |  |  the patch passed  |
   | +1 :green_heart: |  mvnsite  |   0m 20s |  |  the patch passed  |
   | +1 :green_heart: |  javadoc  |   0m 17s |  |  the patch passed with JDK Ubuntu-11.0.22+7-post-Ubuntu-0ubuntu220.04.1  |
   | +1 :green_heart: |  javadoc  |   0m 16s |  |  the patch passed with JDK Private Build-1.8.0_402-8u402-ga-2ubuntu1~20.04-b06  |
   | +1 :green_heart: |  spotbugs  |   0m 50s |  |  the patch passed  |
   | +1 :green_heart: |  shadedclient  |  20m 23s |  |  patch has no errors when building and testing our client artifacts.  |
   |||| _ Other Tests _ |
   | -1 :x: |  unit  |  26m 30s | [/patch-unit-hadoop-hdfs-project_hadoop-hdfs-rbf.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-6804/5/artifact/out/patch-unit-hadoop-hdfs-project_hadoop-hdfs-rbf.txt) |  hadoop-hdfs-rbf in the patch passed.  |
   | +1 :green_heart: |  asflicense  |   0m 23s |  |  The patch does not generate ASF License warnings.  |
   |  |   | 110m 39s |  |  |
   
   
   | Reason | Tests |
   |-------:|:------|
   | Failed junit tests | hadoop.hdfs.server.federation.router.security.token.TestSQLDelegationTokenSecretManagerImpl |
   |   | hadoop.hdfs.server.federation.store.driver.TestStateStoreMySQL |
   
   
   | Subsystem | Report/Notes |
   |----------:|:-------------|
   | Docker | ClientAPI=1.44 ServerAPI=1.44 base: https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-6804/5/artifact/out/Dockerfile |
   | GITHUB PR | https://github.com/apache/hadoop/pull/6804 |
   | Optional Tests | dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient spotbugs checkstyle codespell detsecrets |
   | uname | Linux 01d54fcf6746 5.15.0-94-generic #104-Ubuntu SMP Tue Jan 9 15:25:40 UTC 2024 x86_64 x86_64 x86_64 GNU/Linux |
   | Build tool | maven |
   | Personality | dev-support/bin/hadoop.sh |
   | git revision | trunk / 14ab9f7e6a8d3d3680c496a44c1ec5635596770b |
   | Default Java | Private Build-1.8.0_402-8u402-ga-2ubuntu1~20.04-b06 |
   | Multi-JDK versions | /usr/lib/jvm/java-11-openjdk-amd64:Ubuntu-11.0.22+7-post-Ubuntu-0ubuntu220.04.1 /usr/lib/jvm/java-8-openjdk-amd64:Private Build-1.8.0_402-8u402-ga-2ubuntu1~20.04-b06 |
   |  Test Results | https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-6804/5/testReport/ |
   | Max. process+thread count | 4246 (vs. ulimit 

[jira] [Commented] (HDFS-17438) RBF: The newest STANDBY and UNAVAILABLE nn should be the lowest priority.

2024-05-13 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-17438?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17846073#comment-17846073
 ] 

ASF GitHub Bot commented on HDFS-17438:
---

hadoop-yetus commented on PR #6655:
URL: https://github.com/apache/hadoop/pull/6655#issuecomment-2108803506

   :broken_heart: **-1 overall**
   
   
   
   
   
   
   | Vote | Subsystem | Runtime |  Logfile | Comment |
   |:----:|----------:|--------:|:--------:|:-------:|
   | +0 :ok: |  reexec  |   0m 46s |  |  Docker mode activated.  |
   |||| _ Prechecks _ |
   | +1 :green_heart: |  dupname  |   0m  0s |  |  No case conflicting files found.  |
   | +0 :ok: |  codespell  |   0m  0s |  |  codespell was not available.  |
   | +0 :ok: |  detsecrets  |   0m  0s |  |  detect-secrets was not available.  |
   | +1 :green_heart: |  @author  |   0m  0s |  |  The patch does not contain any @author tags.  |
   | +1 :green_heart: |  test4tests  |   0m  0s |  |  The patch appears to include 2 new or modified test files.  |
   |||| _ trunk Compile Tests _ |
   | +0 :ok: |  mvndep  |  15m 23s |  |  Maven dependency ordering for branch  |
   | +1 :green_heart: |  mvninstall  |  36m 50s |  |  trunk passed  |
   | +1 :green_heart: |  compile  |  19m 11s |  |  trunk passed with JDK Ubuntu-11.0.22+7-post-Ubuntu-0ubuntu220.04.1  |
   | +1 :green_heart: |  compile  |  17m 24s |  |  trunk passed with JDK Private Build-1.8.0_402-8u402-ga-2ubuntu1~20.04-b06  |
   | +1 :green_heart: |  checkstyle  |   4m 42s |  |  trunk passed  |
   | +1 :green_heart: |  mvnsite  |   2m 35s |  |  trunk passed  |
   | +1 :green_heart: |  javadoc  |   2m  1s |  |  trunk passed with JDK Ubuntu-11.0.22+7-post-Ubuntu-0ubuntu220.04.1  |
   | +1 :green_heart: |  javadoc  |   1m 28s |  |  trunk passed with JDK Private Build-1.8.0_402-8u402-ga-2ubuntu1~20.04-b06  |
   | +1 :green_heart: |  spotbugs  |   4m  4s |  |  trunk passed  |
   | +1 :green_heart: |  shadedclient  |  39m 14s |  |  branch has no errors when building and testing our client artifacts.  |
   |||| _ Patch Compile Tests _ |
   | +0 :ok: |  mvndep  |   0m 31s |  |  Maven dependency ordering for patch  |
   | +1 :green_heart: |  mvninstall  |   1m 26s |  |  the patch passed  |
   | +1 :green_heart: |  compile  |  18m  7s |  |  the patch passed with JDK Ubuntu-11.0.22+7-post-Ubuntu-0ubuntu220.04.1  |
   | +1 :green_heart: |  javac  |  18m  7s |  |  the patch passed  |
   | +1 :green_heart: |  compile  |  17m 37s |  |  the patch passed with JDK Private Build-1.8.0_402-8u402-ga-2ubuntu1~20.04-b06  |
   | +1 :green_heart: |  javac  |  17m 37s |  |  the patch passed  |
   | +1 :green_heart: |  blanks  |   0m  0s |  |  The patch has no blanks issues.  |
   | +1 :green_heart: |  checkstyle  |   4m 40s |  |  the patch passed  |
   | +1 :green_heart: |  mvnsite  |   2m 30s |  |  the patch passed  |
   | +1 :green_heart: |  javadoc  |   1m 53s |  |  the patch passed with JDK Ubuntu-11.0.22+7-post-Ubuntu-0ubuntu220.04.1  |
   | +1 :green_heart: |  javadoc  |   1m 33s |  |  the patch passed with JDK Private Build-1.8.0_402-8u402-ga-2ubuntu1~20.04-b06  |
   | +1 :green_heart: |  spotbugs  |   4m 23s |  |  the patch passed  |
   | +1 :green_heart: |  shadedclient  |  39m 53s |  |  patch has no errors when building and testing our client artifacts.  |
   |||| _ Other Tests _ |
   | +1 :green_heart: |  unit  |  20m 20s |  |  hadoop-common in the patch passed.  |
   | -1 :x: |  unit  |  32m 30s | [/patch-unit-hadoop-hdfs-project_hadoop-hdfs-rbf.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-6655/16/artifact/out/patch-unit-hadoop-hdfs-project_hadoop-hdfs-rbf.txt) |  hadoop-hdfs-rbf in the patch passed.  |
   | +1 :green_heart: |  asflicense  |   1m  2s |  |  The patch does not generate ASF License warnings.  |
   |  |   | 296m 36s |  |  |
   
   
   | Reason | Tests |
   |-------:|:------|
   | Failed junit tests | hadoop.hdfs.server.federation.store.driver.TestStateStoreMySQL |
   |   | hadoop.hdfs.server.federation.router.security.token.TestSQLDelegationTokenSecretManagerImpl |
   
   
   | Subsystem | Report/Notes |
   |----------:|:-------------|
   | Docker | ClientAPI=1.44 ServerAPI=1.44 base: https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-6655/16/artifact/out/Dockerfile |
   | GITHUB PR | https://github.com/apache/hadoop/pull/6655 |
   | Optional Tests | dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient spotbugs checkstyle codespell detsecrets |
   | uname | Linux ad13d93774b6 5.15.0-94-generic #104-Ubuntu SMP Tue Jan 9 15:25:40 UTC 2024 x86_64 x86_64 x86_64 GNU/Linux |
   | Build tool | maven |
   | Personality | dev-support/bin/hadoop.sh |
   | git revision | trunk / a128abc09be0ed7e2174ff28947ed8f9835e905f |
   | Default Java | Private Build-1.8.0_402-8u402-ga-2ubuntu1~20.04-b06 |
   | Multi-JDK versions | 

[jira] [Commented] (HDFS-17504) DN process should exit when BPServiceActor exit

2024-05-13 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-17504?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17846060#comment-17846060
 ] 

ASF GitHub Bot commented on HDFS-17504:
---

vinayakumarb commented on PR #6792:
URL: https://github.com/apache/hadoop/pull/6792#issuecomment-2108716673

   > @zhuzilong2013 Thanks for your report and contribution! IMO, the 
BPServiceActors are independent of each other. If the DN process exits because 
of an issue in one BPServiceActor, the number of dead DataNodes increases from 
the whole-cluster view, which I don't think is proper in a federation 
architecture. On the other hand, maybe we could add a BPServiceActor count 
metric to monitor whether the BPServiceActors are working fine? Thanks again.
   
   +1
   One BPServiceActor reports to one NameNode. In case of HA, if one of the 
NameNodes cannot be reached for some reason, the DN can continue to report to 
the available NameNode.
   
   Moreover, if all BPServiceActors of a BPOfferService (i.e. the connections to 
all NameNodes belonging to the same namespace) have exited, the BPOfferService 
also shuts down.
   
   When all such BPOfferServices (in case of federation, there will be 
multiple) have shut down, the DataNode automatically initiates its own shutdown.
   
   Refer to the `DataNode.join()` method.
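
   The cascade described above can be pictured with a small, self-contained model. This is only a sketch under stated assumptions, not Hadoop's code: `Actor`, `OfferService`, and `DataNodeModel` are hypothetical stand-ins for `BPServiceActor`, `BPOfferService`, and `DataNode`, and the real classes do considerably more.

```java
import java.util.ArrayList;
import java.util.List;

public class ShutdownCascadeSketch {

  /** Hypothetical stand-in for one BPServiceActor (one connection to one NameNode). */
  public static final class Actor {
    private boolean running = true;

    public void exit() {
      running = false;
    }

    public boolean isRunning() {
      return running;
    }
  }

  /** Hypothetical stand-in for a BPOfferService (all actors of one namespace). */
  public static final class OfferService {
    public final List<Actor> actors = new ArrayList<>();

    public OfferService(int namenodes) {
      for (int i = 0; i < namenodes; i++) {
        actors.add(new Actor());
      }
    }

    /** The service stays up while at least one of its actors is running. */
    public boolean isRunning() {
      for (Actor actor : actors) {
        if (actor.isRunning()) {
          return true;
        }
      }
      return false;
    }
  }

  /** Hypothetical stand-in for the DataNode (one offer service per namespace). */
  public static final class DataNodeModel {
    public final List<OfferService> services = new ArrayList<>();

    /** The DataNode keeps running while at least one offer service is up. */
    public boolean shouldKeepRunning() {
      for (OfferService service : services) {
        if (service.isRunning()) {
          return true;
        }
      }
      return false;
    }
  }

  /** Builds a model with the given number of namespaces and NameNodes per namespace. */
  public static DataNodeModel build(int namespaces, int namenodesPerNamespace) {
    DataNodeModel dn = new DataNodeModel();
    for (int i = 0; i < namespaces; i++) {
      dn.services.add(new OfferService(namenodesPerNamespace));
    }
    return dn;
  }

  public static void main(String[] args) {
    DataNodeModel dn = build(2, 2);
    // One actor exits: the DataNode keeps serving the remaining NameNodes.
    dn.services.get(0).actors.get(0).exit();
    System.out.println(dn.shouldKeepRunning());
    // Every actor of every namespace exits: only now does the DataNode stop.
    for (OfferService service : dn.services) {
      for (Actor actor : service.actors) {
        actor.exit();
      }
    }
    System.out.println(dn.shouldKeepRunning());
  }
}
```

   In this model a single failed actor never stops the process; only the loss of every actor in every namespace does, which is the behaviour the comment attributes to the DataNode.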




> DN process should exit when BPServiceActor exit
> ---
>
> Key: HDFS-17504
> URL: https://issues.apache.org/jira/browse/HDFS-17504
> Project: Hadoop HDFS
>  Issue Type: Bug
>Reporter: Zilong Zhu
>Assignee: Zilong Zhu
>Priority: Major
>  Labels: pull-request-available
>
> BPServiceActor is a very important thread. In a non-HA cluster, the exit of 
> the BPServiceActor thread causes the DN process to exit. However, in an HA 
> cluster, this is not the case.
> I found that HDFS-15651 causes the BPServiceActor thread to exit and sets 
> "runningState" from "RunningState.FAILED" to "RunningState.EXITED", which can 
> be confusing during troubleshooting.
> I believe that the DN process should exit when the flag of the BPServiceActor 
> is set to RunningState.FAILED, because at this point the DN is unable to 
> recover and establish a heartbeat connection with the ANN on its own.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-17514) RBF: Routers keep using cached stateID even when active NN returns unset header

2024-05-13 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-17514?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17846053#comment-17846053
 ] 

ASF GitHub Bot commented on HDFS-17514:
---

ctrezzo commented on code in PR #6804:
URL: https://github.com/apache/hadoop/pull/6804#discussion_r1599008233


##
hadoop-hdfs-project/hadoop-hdfs-rbf/src/main/java/org/apache/hadoop/hdfs/server/federation/router/PoolAlignmentContext.java:
##
@@ -64,7 +64,11 @@ public void 
updateResponseState(RpcHeaderProtos.RpcResponseHeaderProto.Builder h
*/
   @Override
   public void receiveResponseState(RpcHeaderProtos.RpcResponseHeaderProto 
header) {
-sharedGlobalStateId.accumulate(header.getStateId());
+if (header.getStateId() == 0 && sharedGlobalStateId.get() > 0) {
+  sharedGlobalStateId.reset();
+} else {
+  sharedGlobalStateId.accumulate(header.getStateId());

Review Comment:
   Ah makes sense. Thanks.





> RBF: Routers keep using cached stateID even when active NN returns unset 
> header
> ---
>
> Key: HDFS-17514
> URL: https://issues.apache.org/jira/browse/HDFS-17514
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: rbf
>Reporter: Simbarashe Dzinamarira
>Assignee: Simbarashe Dzinamarira
>Priority: Minor
>  Labels: pull-request-available
>
> When a namenode that had "dfs.namenode.state.context.enabled" set to true is 
> restarted with the configuration set to false, routers will keep using a 
> previously cached state ID.
> Without RBF
> * clients that fetched the old stateID could have stale reads even after 
> msyncing
> * new clients will go to the active.
> With RBF
> * client that fetched the old stateID could have stale reads like above.
> * New clients will also fetch the stale stateID and potentially have stale 
> reads
> New clients that are created after the restart should not fetch the stale 
> state ID.






[jira] [Commented] (HDFS-17518) In the lease monitor, if a file is closed, we should sync the editslog

2024-05-13 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-17518?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17846051#comment-17846051
 ] 

ASF GitHub Bot commented on HDFS-17518:
---

vinayakumarb commented on code in PR #6809:
URL: https://github.com/apache/hadoop/pull/6809#discussion_r1599005240


##
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/LeaseManager.java:
##
@@ -626,7 +626,8 @@ private synchronized boolean checkLeases(Collection 
leasesToCheck) {
 }
   }
   // If a lease recovery happened, we need to sync later.

Review Comment:
   On the other hand, looking at the complete logic of `internalReleaseLease()`, 
we always need to call logSync().
   
   There are two possibilities overall.
   1. The file gets closed.
   2. Lease recovery gets initiated with reassignment of the lease.
   
   In both of the above cases an edit txn is logged, so logSync() needs to be 
called.





> In the lease monitor, if a file is closed, we should sync the editslog
> --
>
> Key: HDFS-17518
> URL: https://issues.apache.org/jira/browse/HDFS-17518
> Project: Hadoop HDFS
>  Issue Type: Bug
>Reporter: lei w
>Assignee: lei w
>Priority: Minor
>  Labels: pull-request-available
>
> In the lease monitor, if a file is closed, the checkLeases method returns 
> true, and then the edits log is not synced. In my opinion, we should sync 
> the edits log so that the state does not stay unsynchronized with the standby 
> NameNode for a long time.






[jira] [Commented] (HDFS-17509) RBF: Fix ClientProtocol.concat will throw NPE if tgr is a empty file.

2024-05-13 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-17509?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17846028#comment-17846028
 ] 

ASF GitHub Bot commented on HDFS-17509:
---

hadoop-yetus commented on PR #6784:
URL: https://github.com/apache/hadoop/pull/6784#issuecomment-2108370233

   :confetti_ball: **+1 overall**
   
   
   
   
   
   
   | Vote | Subsystem | Runtime |  Logfile | Comment |
   |:----:|----------:|--------:|:--------:|:-------:|
   |||| _ Prechecks _ |
   | +1 :green_heart: |  dupname  |   0m 01s |  |  No case conflicting files found.  |
   | +0 :ok: |  spotbugs  |   0m 00s |  |  spotbugs executables are not available.  |
   | +0 :ok: |  codespell  |   0m 00s |  |  codespell was not available.  |
   | +0 :ok: |  detsecrets  |   0m 01s |  |  detect-secrets was not available.  |
   | +1 :green_heart: |  @author  |   0m 00s |  |  The patch does not contain any @author tags.  |
   | +1 :green_heart: |  test4tests  |   0m 00s |  |  The patch appears to include 1 new or modified test files.  |
   |||| _ trunk Compile Tests _ |
   | +1 :green_heart: |  mvninstall  |  93m 29s |  |  trunk passed  |
   | +1 :green_heart: |  compile  |   5m 23s |  |  trunk passed  |
   | +1 :green_heart: |  checkstyle  |   4m 56s |  |  trunk passed  |
   | +1 :green_heart: |  mvnsite  |   5m 19s |  |  trunk passed  |
   | +1 :green_heart: |  javadoc  |   4m 59s |  |  trunk passed  |
   | +1 :green_heart: |  shadedclient  | 152m 41s |  |  branch has no errors when building and testing our client artifacts.  |
   |||| _ Patch Compile Tests _ |
   | +1 :green_heart: |  mvninstall  |   3m 10s |  |  the patch passed  |
   | +1 :green_heart: |  compile  |   2m 20s |  |  the patch passed  |
   | +1 :green_heart: |  javac  |   2m 20s |  |  the patch passed  |
   | +1 :green_heart: |  blanks  |   0m 00s |  |  The patch has no blanks issues.  |
   | +1 :green_heart: |  checkstyle  |   2m 05s |  |  the patch passed  |
   | +1 :green_heart: |  mvnsite  |   2m 38s |  |  the patch passed  |
   | +1 :green_heart: |  javadoc  |   2m 23s |  |  the patch passed  |
   | +1 :green_heart: |  shadedclient  | 161m 35s |  |  patch has no errors when building and testing our client artifacts.  |
   |||| _ Other Tests _ |
   | +1 :green_heart: |  asflicense  |   7m 49s |  |  The patch does not generate ASF License warnings.  |
   |  |   | 434m 01s |  |  |
   
   
   | Subsystem | Report/Notes |
   |----------:|:-------------|
   | GITHUB PR | https://github.com/apache/hadoop/pull/6784 |
   | Optional Tests | dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient spotbugs checkstyle codespell detsecrets |
   | uname | MINGW64_NT-10.0-17763 e1b5e8df9373 3.4.10-87d57229.x86_64 2024-02-14 20:17 UTC x86_64 Msys |
   | Build tool | maven |
   | Personality | /c/hadoop/dev-support/bin/hadoop.sh |
   | git revision | trunk / 0c3ea013638be3e91a7972c80cf694a1447ea1a0 |
   | Default Java | Azul Systems, Inc.-1.8.0_332-b09 |
   |  Test Results | https://ci-hadoop.apache.org/job/hadoop-multibranch-windows-10/job/PR-6784/5/testReport/ |
   | modules | C: hadoop-hdfs-project/hadoop-hdfs-rbf U: hadoop-hdfs-project/hadoop-hdfs-rbf |
   | Console output | https://ci-hadoop.apache.org/job/hadoop-multibranch-windows-10/job/PR-6784/5/console |
   | versions | git=2.45.0.windows.1 |
   | Powered by | Apache Yetus 0.14.0 https://yetus.apache.org |
   
   
   This message was automatically generated.
   
   




> RBF: Fix ClientProtocol.concat  will throw NPE if tgr is a empty file.
> --
>
> Key: HDFS-17509
> URL: https://issues.apache.org/jira/browse/HDFS-17509
> Project: Hadoop HDFS
>  Issue Type: Bug
>Reporter: liuguanghua
>Priority: Minor
>  Labels: pull-request-available
>
> hdfs dfs -concat  /tmp/merge /tmp/t1 /tmp/t2
> When /tmp/merge is an empty file, this command throws an NPE via DFSRouter. 
>  
>  






[jira] [Commented] (HDFS-17514) RBF: Routers keep using cached stateID even when active NN returns unset header

2024-05-13 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-17514?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17846015#comment-17846015
 ] 

ASF GitHub Bot commented on HDFS-17514:
---

simbadzina commented on code in PR #6804:
URL: https://github.com/apache/hadoop/pull/6804#discussion_r1598782895


##
hadoop-hdfs-project/hadoop-hdfs-rbf/src/main/java/org/apache/hadoop/hdfs/server/federation/router/PoolAlignmentContext.java:
##
@@ -64,7 +64,11 @@ public void 
updateResponseState(RpcHeaderProtos.RpcResponseHeaderProto.Builder h
*/
   @Override
   public void receiveResponseState(RpcHeaderProtos.RpcResponseHeaderProto 
header) {
-sharedGlobalStateId.accumulate(header.getStateId());
+if (header.getStateId() == 0 && sharedGlobalStateId.get() > 0) {
+  sharedGlobalStateId.reset();
+} else {
+  sharedGlobalStateId.accumulate(header.getStateId());

Review Comment:
   The sharedGlobalStateID is created as [`new LongAccumulator(Math::max, 
Long.MIN_VALUE)`](https://github.com/apache/hadoop/blob/trunk/hadoop-hdfs-project/hadoop-hdfs-rbf/src/main/java/org/apache/hadoop/hdfs/server/federation/router/RouterStateIdContext.java#L111)
   So accumulate either keeps the current value or moves it forward.
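
   A small standalone check of that behaviour, using only `java.util.concurrent.atomic.LongAccumulator`. The class name `StateIdAccumulatorSketch` and the `receive` helper are hypothetical; `receive` simply mirrors the branch shown in the hunk above rather than reproducing `PoolAlignmentContext`.

```java
import java.util.concurrent.atomic.LongAccumulator;

public class StateIdAccumulatorSketch {

  /** Same construction as quoted in the comment: max function, MIN_VALUE identity. */
  public static LongAccumulator newSharedStateId() {
    return new LongAccumulator(Math::max, Long.MIN_VALUE);
  }

  /**
   * Hypothetical helper mirroring the patched branch: an unset (0) state ID
   * clears a previously positive value; anything else is accumulated.
   */
  public static void receive(LongAccumulator shared, long headerStateId) {
    if (headerStateId == 0 && shared.get() > 0) {
      shared.reset();
    } else {
      shared.accumulate(headerStateId);
    }
  }

  public static void main(String[] args) {
    LongAccumulator shared = newSharedStateId();
    shared.accumulate(10);
    shared.accumulate(5);               // lower value: max keeps 10
    System.out.println(shared.get());   // 10
    shared.accumulate(0);               // plain accumulate cannot move it back
    System.out.println(shared.get());   // 10
    receive(shared, 0);                 // patched path resets to the identity
    System.out.println(shared.get() == Long.MIN_VALUE); // true
  }
}
```

   Because the accumulator function is `Math::max`, a plain `accumulate(0)` can never lower a positive cached value, which is why the patch needs the explicit `reset()` branch.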











[jira] [Commented] (HDFS-17514) RBF: Routers keep using cached stateID even when active NN returns unset header

2024-05-13 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-17514?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17846011#comment-17846011
 ] 

ASF GitHub Bot commented on HDFS-17514:
---

simbadzina commented on PR #6804:
URL: https://github.com/apache/hadoop/pull/6804#issuecomment-2108218576

   > @simbadzina Should we also add a test case to 
[TestPoolAlignmentContext.java](https://github.com/apache/hadoop/blob/trunk/hadoop-hdfs-project/hadoop-hdfs-rbf/src/test/java/org/apache/hadoop/hdfs/server/federation/router/TestPoolAlignmentContext.java)
 to ensure the sharedGlobalStateId moves under the conditions we expect it to?
   
   Good call. Let me add that.










[jira] [Commented] (HDFS-17438) RBF: The newest STANDBY and UNAVAILABLE nn should be the lowest priority.

2024-05-13 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-17438?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17846001#comment-17846001
 ] 

ASF GitHub Bot commented on HDFS-17438:
---

KeeProMise commented on code in PR #6655:
URL: https://github.com/apache/hadoop/pull/6655#discussion_r1598727611


##
hadoop-common-project/hadoop-common/src/test/java/org/apache/hadoop/test/GenericTestUtils.java:
##
@@ -404,6 +404,28 @@ public static void waitFor(final Supplier check,
 }
   }
 
+  /**
+   * Wait at least {@code atLeastWaitForMillis} from start to end of the test.
+   *
+   * @param check the test to perform.
+   * @param atLeastWaitForMillis the minimum waiting time from the beginning
+   * to the end of the test.
+   * @throws InterruptedException if the method is interrupted while waiting.
+   */
+  public static <T> T atLeastWaitFor(final Supplier<T> check,
+  long atLeastWaitForMillis) throws InterruptedException {
+if (atLeastWaitForMillis < 0) {
+  atLeastWaitForMillis = 0;
+}
+long st = Time.monotonicNow();
+T result = check.get();

Review Comment:
   Your suggestion is very good; I have changed the name of the method.
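
   The hunk above is cut off after `check.get()`. As a rough illustration of what a "wait at least N milliseconds" helper could look like, here is a minimal sketch. It is an assumption, not the code in the PR: it keeps the method name from the hunk (the comment says it was later renamed, and the new name is not shown here), and it uses `System.nanoTime()` in place of Hadoop's `Time.monotonicNow()` so that it stays self-contained.

```java
import java.util.function.Supplier;

public class AtLeastWaitSketch {

  /**
   * Runs the check once, then sleeps for whatever is left of the minimum
   * duration, so the call takes roughly at least atLeastWaitForMillis overall.
   */
  public static <T> T atLeastWaitFor(final Supplier<T> check,
      long atLeastWaitForMillis) throws InterruptedException {
    if (atLeastWaitForMillis < 0) {
      atLeastWaitForMillis = 0;
    }
    long startNanos = System.nanoTime();
    T result = check.get();
    // Elapsed time is rounded down, so the remaining sleep is never too short.
    long elapsedMillis = (System.nanoTime() - startNanos) / 1_000_000L;
    long remainingMillis = atLeastWaitForMillis - elapsedMillis;
    if (remainingMillis > 0) {
      Thread.sleep(remainingMillis);
    }
    return result;
  }

  public static void main(String[] args) throws InterruptedException {
    long start = System.nanoTime();
    String value = atLeastWaitFor(() -> "done", 50);
    long tookMillis = (System.nanoTime() - start) / 1_000_000L;
    System.out.println(value + " after about " + tookMillis + " ms");
  }
}
```

   A helper of this shape is useful in tests that must prove something did not happen within a time window, as opposed to `waitFor`, which returns as soon as a condition becomes true.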





> RBF: The newest STANDBY and UNAVAILABLE nn should be the lowest priority.
> -
>
> Key: HDFS-17438
> URL: https://issues.apache.org/jira/browse/HDFS-17438
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Reporter: Jian Zhang
>Assignee: Jian Zhang
>Priority: Major
>  Labels: pull-request-available
> Attachments: HDFS-17438.001.patch
>
>
> At present, when all namenodes of an ns have the same state in the router, 
> the most recently reported namenode is placed at the top of the cache. When 
> a client accesses the ns through the router, that namenode is tried first.
> If multiple namenodes in the router are in the active state, or multiple 
> namenodes are in the observer state, the existing logic is not a problem, 
> because the most recently reported active or observer namenode has a higher 
> probability of truly being active or observer than a namenode that reported 
> that state a long time ago.
> Similarly, the most recently reported namenode with a standby or unavailable 
> status has a higher probability of truly being standby or unavailable than a 
> namenode that reported that status a long time ago. Therefore, the newest nn 
> reported as standby or unavailable should have a lower access priority, and 
> the oldest nn reported as standby or unavailable should have a higher access 
> priority.
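
The ordering rule in the description above can be sketched as a comparator. This is an illustrative model only: the `State` enum, the `Report` class, and the assumption that states rank ACTIVE, OBSERVER, STANDBY, UNAVAILABLE are hypothetical and are not taken from the router code or from the patch.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

public class NamenodePrioritySketch {

  /** Assumed ranking of states, highest priority first. */
  public enum State { ACTIVE, OBSERVER, STANDBY, UNAVAILABLE }

  /** Hypothetical record of one namenode's last report to the router. */
  public static final class Report {
    final String id;
    final State state;
    final long reportedAtMillis;

    public Report(String id, State state, long reportedAtMillis) {
      this.id = id;
      this.state = state;
      this.reportedAtMillis = reportedAtMillis;
    }
  }

  /**
   * Orders by state first. Within ACTIVE or OBSERVER the newest report wins;
   * within STANDBY or UNAVAILABLE the oldest report wins.
   */
  public static final Comparator<Report> PRIORITY = (a, b) -> {
    int byState = a.state.compareTo(b.state);
    if (byState != 0) {
      return byState;
    }
    boolean newestFirst = a.state == State.ACTIVE || a.state == State.OBSERVER;
    return newestFirst
        ? Long.compare(b.reportedAtMillis, a.reportedAtMillis)
        : Long.compare(a.reportedAtMillis, b.reportedAtMillis);
  };

  /** Returns the namenode ids in the order a client would try them. */
  public static List<String> order(Report... reports) {
    List<Report> sorted = new ArrayList<>(Arrays.asList(reports));
    sorted.sort(PRIORITY);
    List<String> ids = new ArrayList<>();
    for (Report report : sorted) {
      ids.add(report.id);
    }
    return ids;
  }

  public static void main(String[] args) {
    System.out.println(order(
        new Report("nn1", State.STANDBY, 200),
        new Report("nn2", State.STANDBY, 100)));   // [nn2, nn1]
    System.out.println(order(
        new Report("nn1", State.ACTIVE, 100),
        new Report("nn2", State.ACTIVE, 200)));    // [nn2, nn1]
  }
}
```

With two standby reports the older one is tried first, on the reasoning that it is the one more likely to have changed state since it last reported.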






[jira] [Commented] (HDFS-17524) OIV: add Transformed processor which reconstructs an fsimage from another fsimage file

2024-05-13 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-17524?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17845990#comment-17845990
 ] 

ASF GitHub Bot commented on HDFS-17524:
---

hadoop-yetus commented on PR #6820:
URL: https://github.com/apache/hadoop/pull/6820#issuecomment-2108085535

   :broken_heart: **-1 overall**
   
   
   
   
   
   
   | Vote | Subsystem | Runtime |  Logfile | Comment |
   |:----:|----------:|--------:|:--------:|:-------:|
   |||| _ Prechecks _ |
   | +1 :green_heart: |  dupname  |   0m 01s |  |  No case conflicting files found.  |
   | +0 :ok: |  spotbugs  |   0m 01s |  |  spotbugs executables are not available.  |
   | +0 :ok: |  codespell  |   0m 01s |  |  codespell was not available.  |
   | +0 :ok: |  detsecrets  |   0m 01s |  |  detect-secrets was not available.  |
   | +0 :ok: |  markdownlint  |   0m 01s |  |  markdownlint was not available.  |
   | +1 :green_heart: |  @author  |   0m 00s |  |  The patch does not contain any @author tags.  |
   | +1 :green_heart: |  test4tests  |   0m 00s |  |  The patch appears to include 1 new or modified test files.  |
   |||| _ trunk Compile Tests _ |
   | +1 :green_heart: |  mvninstall  |  91m 53s |  |  trunk passed  |
   | +1 :green_heart: |  compile  |   6m 45s |  |  trunk passed  |
   | +1 :green_heart: |  checkstyle  |   5m 11s |  |  trunk passed  |
   | +1 :green_heart: |  mvnsite  |   7m 07s |  |  trunk passed  |
   | +1 :green_heart: |  javadoc  |   6m 13s |  |  trunk passed  |
   | +1 :green_heart: |  shadedclient  | 155m 20s |  |  branch has no errors when building and testing our client artifacts.  |
   |||| _ Patch Compile Tests _ |
   | +1 :green_heart: |  mvninstall  |   4m 49s |  |  the patch passed  |
   | +1 :green_heart: |  compile  |   3m 44s |  |  the patch passed  |
   | +1 :green_heart: |  javac  |   3m 44s |  |  the patch passed  |
   | -1 :x: |  blanks  |   0m 01s | [/blanks-eol.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch-windows-10/job/PR-6820/2/artifact/out/blanks-eol.txt) |  The patch has 2 line(s) that end in blanks. Use git apply --whitespace=fix <>. Refer https://git-scm.com/docs/git-apply  |
   | +1 :green_heart: |  checkstyle  |   2m 28s |  |  the patch passed  |
   | +1 :green_heart: |  mvnsite  |   4m 21s |  |  the patch passed  |
   | +1 :green_heart: |  javadoc  |   3m 52s |  |  the patch passed  |
   | +1 :green_heart: |  shadedclient  | 163m 59s |  |  patch has no errors when building and testing our client artifacts.  |
   |||| _ Other Tests _ |
   | +1 :green_heart: |  asflicense  |   5m 43s |  |  The patch does not generate ASF License warnings.  |
   |  |   | 439m 29s |  |  |
   
   
   | Subsystem | Report/Notes |
   |----------:|:-------------|
   | GITHUB PR | https://github.com/apache/hadoop/pull/6820 |
   | Optional Tests | dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient spotbugs checkstyle codespell detsecrets markdownlint |
   | uname | MINGW64_NT-10.0-17763 cbd8002785da 3.4.10-87d57229.x86_64 2024-02-14 20:17 UTC x86_64 Msys |
   | Build tool | maven |
   | Personality | /c/hadoop/dev-support/bin/hadoop.sh |
   | git revision | trunk / 6fe2b6fe1df0eb5c1c3958dece6a370b698e2721 |
   | Default Java | Azul Systems, Inc.-1.8.0_332-b09 |
   |  Test Results | https://ci-hadoop.apache.org/job/hadoop-multibranch-windows-10/job/PR-6820/2/testReport/ |
   | modules | C: hadoop-hdfs-project/hadoop-hdfs U: hadoop-hdfs-project/hadoop-hdfs |
   | Console output | https://ci-hadoop.apache.org/job/hadoop-multibranch-windows-10/job/PR-6820/2/console |
   | versions | git=2.44.0.windows.1 |
   | Powered by | Apache Yetus 0.14.0 https://yetus.apache.org |
   
   
   This message was automatically generated.
   
   




> OIV: add Transformed processor which reconstructs an fsimage from another 
> fsimage file
> --
>
> Key: HDFS-17524
> URL: https://issues.apache.org/jira/browse/HDFS-17524
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: tools
>Affects Versions: 3.2.0, 3.3.4
>Reporter: Xiaobao Wu
>Priority: Major
>  Labels: pull-request-available
>
> *Background:*
> The fsimage file generated by the existing Hadoop 3.3.4 version is not 
> forward compatible. In the higher version of HDFS, an fsimage conversion 
> tool is provided to generate a forward-compatible fsimage file and so 
> support the downgrade operation.
> {*}Description{*}:
> There are differences in the structure and loading methods of some 
> Sections between higher and lower versions of fsimage files, especially the 
> StringTable Section. This makes it impossible to downgrade from a higher 
> version of HDFS (e.g., 3.3.4) to a lower version (e.g., 3.1.1), because 
> when the lower version of 

[jira] [Commented] (HDFS-17509) RBF: Fix ClientProtocol.concat will throw NPE if tgr is a empty file.

2024-05-13 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-17509?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17845981#comment-17845981
 ] 

ASF GitHub Bot commented on HDFS-17509:
---

hadoop-yetus commented on PR #6784:
URL: https://github.com/apache/hadoop/pull/6784#issuecomment-2107945602

   :broken_heart: **-1 overall**
   
   
   
   
   
   
   | Vote | Subsystem | Runtime |  Logfile | Comment |
   |:----:|----------:|--------:|:--------:|:-------:|
   | +0 :ok: |  reexec  |   0m 20s |  |  Docker mode activated.  |
   |||| _ Prechecks _ |
   | +1 :green_heart: |  dupname  |   0m  0s |  |  No case conflicting files found.  |
   | +0 :ok: |  codespell  |   0m  0s |  |  codespell was not available.  |
   | +0 :ok: |  detsecrets  |   0m  0s |  |  detect-secrets was not available.  |
   | +1 :green_heart: |  @author  |   0m  0s |  |  The patch does not contain any @author tags.  |
   | +1 :green_heart: |  test4tests  |   0m  0s |  |  The patch appears to include 1 new or modified test files.  |
   |||| _ trunk Compile Tests _ |
   | +1 :green_heart: |  mvninstall  |  33m  0s |  |  trunk passed  |
   | +1 :green_heart: |  compile  |   0m 23s |  |  trunk passed with JDK Ubuntu-11.0.22+7-post-Ubuntu-0ubuntu220.04.1  |
   | +1 :green_heart: |  compile  |   0m 21s |  |  trunk passed with JDK Private Build-1.8.0_402-8u402-ga-2ubuntu1~20.04-b06  |
   | +1 :green_heart: |  checkstyle  |   0m 20s |  |  trunk passed  |
   | +1 :green_heart: |  mvnsite  |   0m 27s |  |  trunk passed  |
   | +1 :green_heart: |  javadoc  |   0m 31s |  |  trunk passed with JDK Ubuntu-11.0.22+7-post-Ubuntu-0ubuntu220.04.1  |
   | +1 :green_heart: |  javadoc  |   0m 20s |  |  trunk passed with JDK Private Build-1.8.0_402-8u402-ga-2ubuntu1~20.04-b06  |
   | +1 :green_heart: |  spotbugs  |   0m 48s |  |  trunk passed  |
   | +1 :green_heart: |  shadedclient  |  20m 37s |  |  branch has no errors when building and testing our client artifacts.  |
   |||| _ Patch Compile Tests _ |
   | +1 :green_heart: |  mvninstall  |   0m 18s |  |  the patch passed  |
   | +1 :green_heart: |  compile  |   0m 20s |  |  the patch passed with JDK Ubuntu-11.0.22+7-post-Ubuntu-0ubuntu220.04.1  |
   | +1 :green_heart: |  javac  |   0m 20s |  |  the patch passed  |
   | +1 :green_heart: |  compile  |   0m 16s |  |  the patch passed with JDK Private Build-1.8.0_402-8u402-ga-2ubuntu1~20.04-b06  |
   | +1 :green_heart: |  javac  |   0m 16s |  |  the patch passed  |
   | +1 :green_heart: |  blanks  |   0m  0s |  |  The patch has no blanks issues.  |
   | -0 :warning: |  checkstyle  |   0m 10s | [/results-checkstyle-hadoop-hdfs-project_hadoop-hdfs-rbf.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-6784/9/artifact/out/results-checkstyle-hadoop-hdfs-project_hadoop-hdfs-rbf.txt) |  hadoop-hdfs-project/hadoop-hdfs-rbf: The patch generated 1 new + 3 unchanged - 0 fixed = 4 total (was 3)  |
   | +1 :green_heart: |  mvnsite  |   0m 18s |  |  the patch passed  |
   | +1 :green_heart: |  javadoc  |   0m 18s |  |  the patch passed with JDK Ubuntu-11.0.22+7-post-Ubuntu-0ubuntu220.04.1  |
   | +1 :green_heart: |  javadoc  |   0m 18s |  |  the patch passed with JDK Private Build-1.8.0_402-8u402-ga-2ubuntu1~20.04-b06  |
   | +1 :green_heart: |  spotbugs  |   0m 50s |  |  the patch passed  |
   | +1 :green_heart: |  shadedclient  |  20m 28s |  |  patch has no errors when building and testing our client artifacts.  |
   |||| _ Other Tests _ |
   | -1 :x: |  unit  |  25m 59s | [/patch-unit-hadoop-hdfs-project_hadoop-hdfs-rbf.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-6784/9/artifact/out/patch-unit-hadoop-hdfs-project_hadoop-hdfs-rbf.txt) |  hadoop-hdfs-rbf in the patch passed.  |
   | +1 :green_heart: |  asflicense  |   0m 24s |  |  The patch does not generate ASF License warnings.  |
   |  |   | 109m 41s |  |  |
   
   
   | Reason | Tests |
   |-------:|:------|
   | Failed junit tests | hadoop.hdfs.server.federation.router.security.token.TestSQLDelegationTokenSecretManagerImpl |
   |   | hadoop.hdfs.server.federation.store.driver.TestStateStoreMySQL |
   
   
   | Subsystem | Report/Notes |
   |----------:|:-------------|
   | Docker | ClientAPI=1.45 ServerAPI=1.45 base: https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-6784/9/artifact/out/Dockerfile |
   | GITHUB PR | https://github.com/apache/hadoop/pull/6784 |
   | Optional Tests | dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient spotbugs checkstyle codespell detsecrets |
   | uname | Linux 26082e0026f7 5.15.0-106-generic #116-Ubuntu SMP Wed Apr 17 09:17:56 UTC 2024 x86_64 x86_64 x86_64 GNU/Linux |
   | Build tool | maven |
   | Personality | dev-support/bin/hadoop.sh |
   | git revision | trunk / 7a591b04da617ed30dab681782dc378af6ce527c |
   | Default Java | Private Build-1.8.0_402-8u402-ga-2ubuntu1~20.04-b06 |
   | Multi-JDK versions | 

[jira] [Commented] (HDFS-17522) JournalNode web interfaces lack configs for X-FRAME-OPTIONS protection

2024-05-13 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-17522?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17845952#comment-17845952
 ] 

ASF GitHub Bot commented on HDFS-17522:
---

Hexiaoqiao commented on PR #6814:
URL: https://github.com/apache/hadoop/pull/6814#issuecomment-2107680415

   Committed to trunk. Thanks @hiwangzhihui and @vinayakumarb .




> JournalNode web interfaces lack configs for X-FRAME-OPTIONS protection
> --
>
> Key: HDFS-17522
> URL: https://issues.apache.org/jira/browse/HDFS-17522
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: journal-node
>Affects Versions: 3.0.0-alpha1, 3.5.0
>Reporter: wangzhihui
>Priority: Major
>  Labels: pull-request-available
>
> [HDFS-10579 |https://issues.apache.org/jira/browse/HDFS-10579] added 
> X-FRAME-OPTIONS protection for NameNode and DataNode, but protection for the 
> JournalNode web interfaces is still missing.
>  






[jira] [Commented] (HDFS-17522) JournalNode web interfaces lack configs for X-FRAME-OPTIONS protection

2024-05-13 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-17522?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17845951#comment-17845951
 ] 

ASF GitHub Bot commented on HDFS-17522:
---

Hexiaoqiao merged PR #6814:
URL: https://github.com/apache/hadoop/pull/6814










[jira] [Commented] (HDFS-17518) In the lease monitor, if a file is closed, we should sync the editslog

2024-05-13 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-17518?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17845949#comment-17845949
 ] 

ASF GitHub Bot commented on HDFS-17518:
---

Hexiaoqiao commented on code in PR #6809:
URL: https://github.com/apache/hadoop/pull/6809#discussion_r1598536353


##
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/LeaseManager.java:
##
@@ -626,7 +626,8 @@ private synchronized boolean checkLeases(Collection 
leasesToCheck) {
 }
   }
   // If a lease recovery happened, we need to sync later.

Review Comment:
   > I would recommend you to change the return type of internalReleaseLease() 
to ImmutablePair to include both completed and needSync 
values. 
   needSync will be true in both cases of file closed and lease re-assignment.
   
   +1. If we plan to improve it, we should fix both together.
   BTW, what happens if the edits are not synced in time? LeaseManager.Monitor is 
asynchronous logic, so it cannot ensure that edits are synced in a particular 
order, right?
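
   To make the quoted suggestion concrete, here is a minimal sketch of returning both flags from one call. Everything in it is hypothetical: `ReleaseResult` stands in for the suggested pair type, and the `internalReleaseLease` and `checkLeases` methods below are simplified models, not the NameNode or LeaseManager code.

```java
public class LeaseReleaseResultSketch {

  /** Hypothetical stand-in for the suggested pair of (completed, needSync). */
  public static final class ReleaseResult {
    public final boolean completed;
    public final boolean needSync;

    public ReleaseResult(boolean completed, boolean needSync) {
      this.completed = completed;
      this.needSync = needSync;
    }
  }

  /**
   * Simplified model of the two outcomes discussed in the review: the file is
   * closed, or lease recovery starts with a lease reassignment. Both are
   * assumed to log an edit, so needSync is true either way.
   */
  public static ReleaseResult internalReleaseLease(boolean fileCanBeClosed) {
    if (fileCanBeClosed) {
      return new ReleaseResult(true, true);
    }
    return new ReleaseResult(false, true);
  }

  /** Returns true if any processed lease produced an edit that needs a sync. */
  public static boolean checkLeases(boolean... fileCanBeClosed) {
    boolean needSync = false;
    for (boolean closable : fileCanBeClosed) {
      needSync |= internalReleaseLease(closable).needSync;
    }
    return needSync;
  }

  public static void main(String[] args) {
    System.out.println(checkLeases(true));   // closed file still needs a sync
    System.out.println(checkLeases(false));  // recovery started, needs a sync
    System.out.println(checkLeases());       // nothing processed, nothing to sync
  }
}
```

   Separating the two flags lets the caller decide about syncing without inferring it from whether the file was closed.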











[jira] [Commented] (HDFS-16368) DFSAdmin supports refresh topology info without restarting namenode

2024-05-13 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-16368?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17845928#comment-17845928
 ] 

ASF GitHub Bot commented on HDFS-16368:
---

Hexiaoqiao commented on PR #3743:
URL: https://github.com/apache/hadoop/pull/3743#issuecomment-2107611900

   Not sure why this PR has been picked up again. IIRC we have discussed adding 
tools to support refreshing the topology online many times. I would like to 
share my concern again. This is a high-risk operation, especially for a large 
cluster. Consider the case where the cluster topology is changed arbitrarily 
and then the namenode fails over or restarts: there will be a flood of block 
transfers, because the replicas of each block must be moved until they span 
enough racks. Until all blocks span enough racks (which takes a long time), the 
cluster could enter a high-load phase and be unable to process any client 
requests. In a word, be careful about refreshing the topology online if there 
are no measures to guarantee stability.




>  DFSAdmin supports refresh topology info without restarting namenode
> 
>
> Key: HDFS-16368
> URL: https://issues.apache.org/jira/browse/HDFS-16368
> Project: Hadoop HDFS
>  Issue Type: New Feature
>  Components: dfsadmin, namanode
>Affects Versions: 2.7.7, 3.3.1
>Reporter: farmmamba
>Assignee: farmmamba
>Priority: Major
>  Labels: features, pull-request-available
> Attachments: 0001.patch
>
>  Time Spent: 40m
>  Remaining Estimate: 0h
>
> Currently in HDFS, if we update the rack info for rack-awareness, we may need 
> to rolling-restart the namenodes for it to take effect. If the cluster is 
> large, a rolling restart of the namenodes takes a very long time. So, we 
> developed a method to refresh topology info without a rolling restart of the 
> namenodes.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org


