[jira] [Commented] (HDFS-17453) IncrementalBlockReport can have race condition with Edit Log Tailer

2024-04-04 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-17453?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17834124#comment-17834124
 ] 

ASF GitHub Bot commented on HDFS-17453:
---

hadoop-yetus commented on PR #6708:
URL: https://github.com/apache/hadoop/pull/6708#issuecomment-2038639280

   :broken_heart: **-1 overall**
   
   
   
   
   
   
   | Vote | Subsystem | Runtime |  Logfile | Comment |
   |:----:|----------:|--------:|:--------:|:-------:|
   | +0 :ok: |  reexec  |   0m 28s |  |  Docker mode activated.  |
    _ Prechecks _ |
   | +1 :green_heart: |  dupname  |   0m  0s |  |  No case conflicting files 
found.  |
   | +0 :ok: |  codespell  |   0m  0s |  |  codespell was not available.  |
   | +0 :ok: |  detsecrets  |   0m  0s |  |  detect-secrets was not available.  
|
   | +1 :green_heart: |  @author  |   0m  0s |  |  The patch does not contain 
any @author tags.  |
   | +1 :green_heart: |  test4tests  |   0m  0s |  |  The patch appears to 
include 1 new or modified test files.  |
    _ trunk Compile Tests _ |
   | +1 :green_heart: |  mvninstall  |  45m 30s |  |  trunk passed  |
   | +1 :green_heart: |  compile  |   1m 22s |  |  trunk passed with JDK 
Ubuntu-11.0.22+7-post-Ubuntu-0ubuntu220.04.1  |
   | +1 :green_heart: |  compile  |   1m 15s |  |  trunk passed with JDK 
Private Build-1.8.0_402-8u402-ga-2ubuntu1~20.04-b06  |
   | +1 :green_heart: |  checkstyle  |   1m 17s |  |  trunk passed  |
   | +1 :green_heart: |  mvnsite  |   1m 21s |  |  trunk passed  |
   | +1 :green_heart: |  javadoc  |   1m  7s |  |  trunk passed with JDK 
Ubuntu-11.0.22+7-post-Ubuntu-0ubuntu220.04.1  |
   | +1 :green_heart: |  javadoc  |   1m 41s |  |  trunk passed with JDK 
Private Build-1.8.0_402-8u402-ga-2ubuntu1~20.04-b06  |
   | +1 :green_heart: |  spotbugs  |   3m 16s |  |  trunk passed  |
   | +1 :green_heart: |  shadedclient  |  36m 52s |  |  branch has no errors 
when building and testing our client artifacts.  |
    _ Patch Compile Tests _ |
   | +1 :green_heart: |  mvninstall  |   1m 15s |  |  the patch passed  |
   | +1 :green_heart: |  compile  |   1m 17s |  |  the patch passed with JDK 
Ubuntu-11.0.22+7-post-Ubuntu-0ubuntu220.04.1  |
   | +1 :green_heart: |  javac  |   1m 17s |  |  the patch passed  |
   | +1 :green_heart: |  compile  |   1m 14s |  |  the patch passed with JDK 
Private Build-1.8.0_402-8u402-ga-2ubuntu1~20.04-b06  |
   | +1 :green_heart: |  javac  |   1m 14s |  |  the patch passed  |
   | +1 :green_heart: |  blanks  |   0m  0s |  |  The patch has no blanks 
issues.  |
   | -0 :warning: |  checkstyle  |   1m  3s | 
[/results-checkstyle-hadoop-hdfs-project_hadoop-hdfs.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-6708/2/artifact/out/results-checkstyle-hadoop-hdfs-project_hadoop-hdfs.txt)
 |  hadoop-hdfs-project/hadoop-hdfs: The patch generated 1 new + 86 unchanged - 
0 fixed = 87 total (was 86)  |
   | +1 :green_heart: |  mvnsite  |   1m 17s |  |  the patch passed  |
   | +1 :green_heart: |  javadoc  |   0m 51s |  |  the patch passed with JDK 
Ubuntu-11.0.22+7-post-Ubuntu-0ubuntu220.04.1  |
   | +1 :green_heart: |  javadoc  |   1m 34s |  |  the patch passed with JDK 
Private Build-1.8.0_402-8u402-ga-2ubuntu1~20.04-b06  |
   | +1 :green_heart: |  spotbugs  |   3m 18s |  |  the patch passed  |
   | +1 :green_heart: |  shadedclient  |  37m 52s |  |  patch has no errors 
when building and testing our client artifacts.  |
    _ Other Tests _ |
   | -1 :x: |  unit  | 231m 45s | 
[/patch-unit-hadoop-hdfs-project_hadoop-hdfs.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-6708/2/artifact/out/patch-unit-hadoop-hdfs-project_hadoop-hdfs.txt)
 |  hadoop-hdfs in the patch passed.  |
   | +1 :green_heart: |  asflicense  |   0m 46s |  |  The patch does not 
generate ASF License warnings.  |
   |  |   | 377m 44s |  |  |
   
   
   | Reason | Tests |
   |---:|:--|
   | Failed junit tests | 
hadoop.hdfs.server.blockmanagement.TestPendingDataNodeMessages |
   
   
   | Subsystem | Report/Notes |
   |--:|:-|
   | Docker | ClientAPI=1.45 ServerAPI=1.45 base: 
https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-6708/2/artifact/out/Dockerfile
 |
   | GITHUB PR | https://github.com/apache/hadoop/pull/6708 |
   | Optional Tests | dupname asflicense compile javac javadoc mvninstall 
mvnsite unit shadedclient spotbugs checkstyle codespell detsecrets |
   | uname | Linux bf6537361823 5.15.0-94-generic #104-Ubuntu SMP Tue Jan 9 
15:25:40 UTC 2024 x86_64 x86_64 x86_64 GNU/Linux |
   | Build tool | maven |
   | Personality | dev-support/bin/hadoop.sh |
   | git revision | trunk / 96cbeddc700d96b57c1d818a173657a91efcc8cc |
   | Default Java | Private Build-1.8.0_402-8u402-ga-2ubuntu1~20.04-b06 |
   | Multi-JDK versions | 
/usr/lib/jvm/java-11-openjdk-amd64:Ubuntu-11.0.22+7-post-Ubuntu-0ubuntu220.04.1 
/usr/lib/jvm/java-8-openjdk-amd64:Private 

[jira] [Commented] (HDFS-17453) IncrementalBlockReport can have race condition with Edit Log Tailer

2024-04-04 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-17453?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17834121#comment-17834121
 ] 

ASF GitHub Bot commented on HDFS-17453:
---

hadoop-yetus commented on PR #6708:
URL: https://github.com/apache/hadoop/pull/6708#issuecomment-2038632485

   :broken_heart: **-1 overall**
   
   
   
   
   
   
   | Vote | Subsystem | Runtime |  Logfile | Comment |
   |:----:|----------:|--------:|:--------:|:-------:|
   | +0 :ok: |  reexec  |   0m 32s |  |  Docker mode activated.  |
    _ Prechecks _ |
   | +1 :green_heart: |  dupname  |   0m  0s |  |  No case conflicting files 
found.  |
   | +0 :ok: |  codespell  |   0m  0s |  |  codespell was not available.  |
   | +0 :ok: |  detsecrets  |   0m  0s |  |  detect-secrets was not available.  
|
   | +1 :green_heart: |  @author  |   0m  0s |  |  The patch does not contain 
any @author tags.  |
   | -1 :x: |  test4tests  |   0m  0s |  |  The patch doesn't appear to include 
any new or modified tests. Please justify why no new tests are needed for this 
patch. Also please list what manual steps were performed to verify this patch.  
|
    _ trunk Compile Tests _ |
   | +1 :green_heart: |  mvninstall  |  45m 15s |  |  trunk passed  |
   | +1 :green_heart: |  compile  |   1m 20s |  |  trunk passed with JDK 
Ubuntu-11.0.22+7-post-Ubuntu-0ubuntu220.04.1  |
   | +1 :green_heart: |  compile  |   1m 16s |  |  trunk passed with JDK 
Private Build-1.8.0_402-8u402-ga-2ubuntu1~20.04-b06  |
   | +1 :green_heart: |  checkstyle  |   1m  9s |  |  trunk passed  |
   | +1 :green_heart: |  mvnsite  |   1m 23s |  |  trunk passed  |
   | +1 :green_heart: |  javadoc  |   1m  5s |  |  trunk passed with JDK 
Ubuntu-11.0.22+7-post-Ubuntu-0ubuntu220.04.1  |
   | +1 :green_heart: |  javadoc  |   1m 40s |  |  trunk passed with JDK 
Private Build-1.8.0_402-8u402-ga-2ubuntu1~20.04-b06  |
   | +1 :green_heart: |  spotbugs  |   3m 28s |  |  trunk passed  |
   | +1 :green_heart: |  shadedclient  |  36m 24s |  |  branch has no errors 
when building and testing our client artifacts.  |
    _ Patch Compile Tests _ |
   | +1 :green_heart: |  mvninstall  |   1m  9s |  |  the patch passed  |
   | +1 :green_heart: |  compile  |   1m 13s |  |  the patch passed with JDK 
Ubuntu-11.0.22+7-post-Ubuntu-0ubuntu220.04.1  |
   | +1 :green_heart: |  javac  |   1m 13s |  |  the patch passed  |
   | +1 :green_heart: |  compile  |   1m  9s |  |  the patch passed with JDK 
Private Build-1.8.0_402-8u402-ga-2ubuntu1~20.04-b06  |
   | +1 :green_heart: |  javac  |   1m  9s |  |  the patch passed  |
   | +1 :green_heart: |  blanks  |   0m  0s |  |  The patch has no blanks 
issues.  |
   | +1 :green_heart: |  checkstyle  |   1m  5s |  |  the patch passed  |
   | +1 :green_heart: |  mvnsite  |   1m 17s |  |  the patch passed  |
   | +1 :green_heart: |  javadoc  |   0m 52s |  |  the patch passed with JDK 
Ubuntu-11.0.22+7-post-Ubuntu-0ubuntu220.04.1  |
   | +1 :green_heart: |  javadoc  |   1m 35s |  |  the patch passed with JDK 
Private Build-1.8.0_402-8u402-ga-2ubuntu1~20.04-b06  |
   | +1 :green_heart: |  spotbugs  |   3m 33s |  |  the patch passed  |
   | +1 :green_heart: |  shadedclient  |  37m 20s |  |  patch has no errors 
when building and testing our client artifacts.  |
    _ Other Tests _ |
   | -1 :x: |  unit  | 232m 19s | 
[/patch-unit-hadoop-hdfs-project_hadoop-hdfs.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-6708/1/artifact/out/patch-unit-hadoop-hdfs-project_hadoop-hdfs.txt)
 |  hadoop-hdfs in the patch passed.  |
   | +1 :green_heart: |  asflicense  |   0m 47s |  |  The patch does not 
generate ASF License warnings.  |
   |  |   | 376m 37s |  |  |
   
   
   | Reason | Tests |
   |---:|:--|
   | Failed junit tests | 
hadoop.hdfs.server.blockmanagement.TestPendingDataNodeMessages |
   
   
   | Subsystem | Report/Notes |
   |--:|:-|
   | Docker | ClientAPI=1.45 ServerAPI=1.45 base: 
https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-6708/1/artifact/out/Dockerfile
 |
   | GITHUB PR | https://github.com/apache/hadoop/pull/6708 |
   | Optional Tests | dupname asflicense compile javac javadoc mvninstall 
mvnsite unit shadedclient spotbugs checkstyle codespell detsecrets |
   | uname | Linux b3e715c91d4a 5.15.0-94-generic #104-Ubuntu SMP Tue Jan 9 
15:25:40 UTC 2024 x86_64 x86_64 x86_64 GNU/Linux |
   | Build tool | maven |
   | Personality | dev-support/bin/hadoop.sh |
   | git revision | trunk / 7556dbe46d7e3e7a555110b58756785755e55b32 |
   | Default Java | Private Build-1.8.0_402-8u402-ga-2ubuntu1~20.04-b06 |
   | Multi-JDK versions | 
/usr/lib/jvm/java-11-openjdk-amd64:Ubuntu-11.0.22+7-post-Ubuntu-0ubuntu220.04.1 
/usr/lib/jvm/java-8-openjdk-amd64:Private 
Build-1.8.0_402-8u402-ga-2ubuntu1~20.04-b06 |
   |  Test Results | 
https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-6708/1/testReport/ |
   | Max. process+thread count | 4659 

[jira] [Commented] (HDFS-17451) RBF: fix spotbugs for redundant nullcheck of dns.

2024-04-04 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-17451?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17834109#comment-17834109
 ] 

ASF GitHub Bot commented on HDFS-17451:
---

KeeProMise commented on code in PR #6697:
URL: https://github.com/apache/hadoop/pull/6697#discussion_r1552627925


##########
hadoop-hdfs-project/hadoop-hdfs-rbf/src/main/java/org/apache/hadoop/hdfs/server/federation/router/RouterRpcServer.java:
##########
@@ -1089,13 +1089,7 @@ public DatanodeInfo[] getDatanodeReport(DatanodeReportType type)
   DatanodeInfo[] getCachedDatanodeReport(DatanodeReportType type)
       throws IOException {
     try {
-      DatanodeInfo[] dns = this.dnCache.get(type);
-      if (dns == null) {
-        LOG.debug("Get null DN report from cache");
-        dns = getCachedDatanodeReportImpl(type);
-        this.dnCache.put(type, dns);
-      }
-      return dns;
+      return this.dnCache.get(type);

Review Comment:
   @slfan1989 @ayushtkn Thanks for your review. This modification is just to 
test whether spotbugs will pass. Why do I still get a spotbugs exception after 
deleting the relevant code?





> RBF: fix spotbugs for redundant nullcheck of dns.
> -
>
> Key: HDFS-17451
> URL: https://issues.apache.org/jira/browse/HDFS-17451
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Reporter: Jian Zhang
>Assignee: Jian Zhang
>Priority: Major
>  Labels: pull-request-available
>
> h2. Dodgy code Warnings
> ||Code||Warning||
> |RCN|Redundant nullcheck of dns, which is known to be non-null in 
> org.apache.hadoop.hdfs.server.federation.router.RouterRpcServer.getCachedDatanodeReport(HdfsConstants$DatanodeReportType)|
> | |[Bug type RCN_REDUNDANT_NULLCHECK_OF_NONNULL_VALUE (click for 
> details)|https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-6655/8/artifact/out/branch-spotbugs-hadoop-hdfs-project_hadoop-hdfs-rbf-warnings.html#RCN_REDUNDANT_NULLCHECK_OF_NONNULL_VALUE]
> In class org.apache.hadoop.hdfs.server.federation.router.RouterRpcServer
> In method 
> org.apache.hadoop.hdfs.server.federation.router.RouterRpcServer.getCachedDatanodeReport(HdfsConstants$DatanodeReportType)
> Value loaded from dns
> Return value of 
> org.apache.hadoop.thirdparty.com.google.common.cache.LoadingCache.get(Object) 
> of type Object
> Redundant null check at RouterRpcServer.java:[line 1093]|
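The warning above flags the caller-side null check as dead code because Guava's LoadingCache.get either returns the loader's value or throws (InvalidCacheLoadException when the loader produces null); it never returns null to the caller. A minimal stand-in sketch of that contract, using only java.util rather than the Guava or Hadoop API (all names here are illustrative):

``` java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Function;

// Hand-rolled model of the LoadingCache contract: get() loads on miss and
// never returns null. Illustrative only; not the Guava or Hadoop API.
class LoadingCacheSketch<K, V> {
    private final Map<K, V> cache = new HashMap<>();
    private final Function<K, V> loader;

    LoadingCacheSketch(Function<K, V> loader) {
        this.loader = loader;
    }

    // Mirrors LoadingCache.get(K): Guava throws InvalidCacheLoadException
    // when a loader yields null; we throw IllegalStateException to model that.
    V get(K key) {
        V value = cache.computeIfAbsent(key, loader);
        if (value == null) {
            throw new IllegalStateException("loader returned null for " + key);
        }
        return value;
    }

    public static void main(String[] args) {
        LoadingCacheSketch<String, String[]> dnCache =
                new LoadingCacheSketch<>(type -> new String[] {"dn1", "dn2"});
        String[] dns = dnCache.get("LIVE");
        // A null check on dns here would be dead code -- exactly the RCN warning.
        System.out.println(dns.length);
    }
}
```

Under this contract, deleting the `if (dns == null)` branch does not change behavior; a lingering spotbugs hit after the deletion usually means the build analyzed a stale artifact rather than the patched class.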



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org




[jira] [Commented] (HDFS-17453) IncrementalBlockReport can have race condition with Edit Log Tailer

2024-04-04 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-17453?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17834069#comment-17834069
 ] 

ASF GitHub Bot commented on HDFS-17453:
---

dannytbecker commented on PR #6708:
URL: https://github.com/apache/hadoop/pull/6708#issuecomment-2038110112

   I am planning to add another unit test that simulates this race condition.




> IncrementalBlockReport can have race condition with Edit Log Tailer
> ---
>
> Key: HDFS-17453
> URL: https://issues.apache.org/jira/browse/HDFS-17453
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: auto-failover, ha, hdfs, namenode
>Affects Versions: 3.3.0, 3.3.1, 2.10.2, 3.3.2, 3.3.5, 3.3.4, 3.3.6
>Reporter: Danny Becker
>Assignee: Danny Becker
>Priority: Major
>  Labels: pull-request-available
>
> h2. Summary
> There is a race condition between IncrementalBlockReports (IBR) and 
> EditLogTailer in Standby NameNode (SNN) which can lead to leaked IBRs and 
> false corrupt blocks after HA Failover. The race condition occurs when the 
> SNN loads the edit logs before it receives the block reports from DataNode 
> (DN).
> h2. Example
> In the following example there is a block (b1) with 3 generation stamps (gs1, 
> gs2, gs3).
>  # SNN1 loads edit logs for b1gs1 and b1gs2.
>  # DN1 sends the IBR for b1gs1 to SNN1.
>  # SNN1 will determine that the reported block b1gs1 from DN1 is corrupt and 
> it will be queued for later. 
> [BlockManager.java|https://github.com/apache/hadoop/blob/6ed73896f6e8b4b7c720eff64193cb30b3e77fb2/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/blockmanagement/BlockManager.java#L3447C1-L3464C6]
> {code:java}
>     BlockToMarkCorrupt c = checkReplicaCorrupt(
>         block, reportedState, storedBlock, ucState, dn);
>     if (c != null) {
>       if (shouldPostponeBlocksFromFuture) {
>         // If the block is an out-of-date generation stamp or state,
>         // but we're the standby, we shouldn't treat it as corrupt,
>         // but instead just queue it for later processing.
>         // Storing the reported block for later processing, as that is what
>         // comes from the IBR / FBR and hence what we should use to compare
>         // against the memory state.
>         // See HDFS-6289 and HDFS-15422 for more context.
>         queueReportedBlock(storageInfo, block, reportedState,
>             QUEUE_REASON_CORRUPT_STATE);
>       } else {
>         toCorrupt.add(c);
>       }
>       return storedBlock;
>     } {code}
>  # DN1 sends IBR for b1gs2 and b1gs3 to SNN1.
>  # SNN1 processes b1gs2 and updates the blocks map.
>  # SNN1 queues b1gs3 for later because it determines that b1gs3 is a future 
> genstamp.
>  # SNN1 loads b1gs3 edit logs and processes the queued reports for b1.
>  # SNN1 processes b1gs1 first and puts it back in the queue.
>  # SNN1 processes b1gs3 next and updates the blocks map.
>  # Later, SNN1 becomes the Active NameNode (ANN) during an HA Failover.
>  # SNN1 will catch up to the latest edit logs, then process all queued block 
> reports to become the ANN.
>  # ANN1 will process b1gs1 and mark it as corrupt.
> If the example above happens for every DN which stores b1, then when the HA 
> failover happens, b1 will be incorrectly marked as corrupt. This will be 
> fixed when the first DN sends a FullBlockReport or an IBR for b1.
> h2. Logs from Active Cluster
> I added the following logs to confirm this issue in an active cluster:
> {code:java}
> BlockToMarkCorrupt c = checkReplicaCorrupt(
>     block, reportedState, storedBlock, ucState, dn);
> if (c != null) {
>   DatanodeStorageInfo storedStorageInfo = storedBlock.findStorageInfo(dn);
>   LOG.info("Found corrupt block {} [{}, {}] from DN {}. Stored block {} from DN {}",
>       block, reportedState.name(), ucState.name(), storageInfo, storedBlock,
>       storedStorageInfo);
>   if (storageInfo.equals(storedStorageInfo) &&
>       storedBlock.getGenerationStamp() > block.getGenerationStamp()) {
>     LOG.info("Stored Block {} from the same DN {} has a newer GenStamp.",
>         storedBlock, storedStorageInfo);
>   }
>   if (shouldPostponeBlocksFromFuture) {
>     // If the block is an out-of-date generation stamp or state,
>     // but we're the standby, we shouldn't treat it as corrupt,
>     // but instead just queue it for later processing.
>     // Storing the reported block for later processing, as that is what
>     // comes from the IBR / FBR and hence what we should use to compare
>     // against the memory state.
>     // See HDFS-6289 and HDFS-15422 for more context.
>     queueReportedBlock(storageInfo, block, reportedState,
>         QUEUE_REASON_CORRUPT_STATE);
>     LOG.info("Queueing the block {} for later processing", block);
>   } else {

[jira] [Updated] (HDFS-17453) IncrementalBlockReport can have race condition with Edit Log Tailer

2024-04-04 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-17453?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated HDFS-17453:
--
Labels: pull-request-available  (was: )


[jira] [Updated] (HDFS-17453) IncrementalBlockReport can have race condition with Edit Log Tailer

2024-04-04 Thread Danny Becker (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-17453?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Danny Becker updated HDFS-17453:

Description: 
h2. Summary

There is a race condition between IncrementalBlockReports (IBR) and 
EditLogTailer in Standby NameNode (SNN) which can lead to leaked IBRs and false 
corrupt blocks after HA Failover. The race condition occurs when the SNN loads 
the edit logs before it receives the block reports from DataNode (DN).
h2. Example

In the following example there is a block (b1) with 3 generation stamps (gs1, 
gs2, gs3).
 # SNN1 loads edit logs for b1gs1 and b1gs2.
 # DN1 sends the IBR for b1gs1 to SNN1.
 # SNN1 will determine that the reported block b1gs1 from DN1 is corrupt and it 
will be queued for later. 
[BlockManager.java|https://github.com/apache/hadoop/blob/6ed73896f6e8b4b7c720eff64193cb30b3e77fb2/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/blockmanagement/BlockManager.java#L3447C1-L3464C6]
{code:java}
    BlockToMarkCorrupt c = checkReplicaCorrupt(
        block, reportedState, storedBlock, ucState, dn);
    if (c != null) {
      if (shouldPostponeBlocksFromFuture) {
        // If the block is an out-of-date generation stamp or state,
        // but we're the standby, we shouldn't treat it as corrupt,
        // but instead just queue it for later processing.
        // Storing the reported block for later processing, as that is what
        // comes from the IBR / FBR and hence what we should use to compare
        // against the memory state.
        // See HDFS-6289 and HDFS-15422 for more context.
        queueReportedBlock(storageInfo, block, reportedState,
            QUEUE_REASON_CORRUPT_STATE);
      } else {
        toCorrupt.add(c);
      }
      return storedBlock;
    } {code}

 # DN1 sends IBR for b1gs2 and b1gs3 to SNN1.
 # SNN1 processes b1gs2 and updates the blocks map.
 # SNN1 queues b1gs3 for later because it determines that b1gs3 is a future 
genstamp.
 # SNN1 loads b1gs3 edit logs and processes the queued reports for b1.
 # SNN1 processes b1gs1 first and puts it back in the queue.
 # SNN1 processes b1gs3 next and updates the blocks map.
 # Later, SNN1 becomes the Active NameNode (ANN) during an HA Failover.
 # SNN1 will catch up to the latest edit logs, then process all queued block 
reports to become the ANN.
 # ANN1 will process b1gs1 and mark it as corrupt.

If the example above happens for every DN which stores b1, then when the HA 
failover happens, b1 will be incorrectly marked as corrupt. This will be fixed 
when the first DN sends a FullBlockReport or an IBR for b1.
h2. Logs from Active Cluster

I added the following logs to confirm this issue in an active cluster:
{code:java}
BlockToMarkCorrupt c = checkReplicaCorrupt(
    block, reportedState, storedBlock, ucState, dn);
if (c != null) {
  DatanodeStorageInfo storedStorageInfo = storedBlock.findStorageInfo(dn);
  LOG.info("Found corrupt block {} [{}, {}] from DN {}. Stored block {} from DN {}",
      block, reportedState.name(), ucState.name(), storageInfo, storedBlock,
      storedStorageInfo);
  if (storageInfo.equals(storedStorageInfo) &&
      storedBlock.getGenerationStamp() > block.getGenerationStamp()) {
    LOG.info("Stored Block {} from the same DN {} has a newer GenStamp.",
        storedBlock, storedStorageInfo);
  }
  if (shouldPostponeBlocksFromFuture) {
    // If the block is an out-of-date generation stamp or state,
    // but we're the standby, we shouldn't treat it as corrupt,
    // but instead just queue it for later processing.
    // Storing the reported block for later processing, as that is what
    // comes from the IBR / FBR and hence what we should use to compare
    // against the memory state.
    // See HDFS-6289 and HDFS-15422 for more context.
    queueReportedBlock(storageInfo, block, reportedState,
        QUEUE_REASON_CORRUPT_STATE);
    LOG.info("Queueing the block {} for later processing", block);
  } else {
    toCorrupt.add(c);
    LOG.info("Marking the block {} as corrupt", block);
  }
  return storedBlock;
} {code}
 

Logs from nn1 (Active):
{code:java}
2024-04-03T03:00:52.524-0700,INFO,[IPC Server handler 6 on default port 
443],org.apache.hadoop.hdfs.server.namenode.FSNamesystem,"updatePipeline(blk_66092666802_65700910634,
 newGS=65700925027, newLength=10485760, newNodes=[[DN1]:10010, [DN2]:10010, 
[DN3]:10010, client=client1)"
2024-04-03T03:00:52.539-0700,INFO,[IPC Server handler 6 on default port 
443],org.apache.hadoop.hdfs.server.namenode.FSNamesystem,"updatePipeline(blk_66092666802_65700910634
 => blk_66092666802_65700925027) success"
2024-04-03T03:01:07.413-0700,INFO,[IPC Server handler 6 on default port 
443],org.apache.hadoop.hdfs.server.namenode.FSNamesystem,"updatePipeline(blk_66092666802_65700925027,
 newGS=65700933553, newLength=20971520, newNodes=[[DN1]:10010, [DN2]:10010, 
[DN3]:10010, client=client1)"

[jira] [Commented] (HDFS-17453) IncrementalBlockReport can have race condition with Edit Log Tailer

2024-04-04 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-17453?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17834067#comment-17834067
 ] 

ASF GitHub Bot commented on HDFS-17453:
---

dannytbecker opened a new pull request, #6708:
URL: https://github.com/apache/hadoop/pull/6708

   
   
   # Description of PR
   ## Summary
   
   There is a race condition between IncrementalBlockReports (IBR) and 
EditLogTailer in Standby NameNode (SNN) which can lead to leaked IBRs and false 
corrupt blocks after HA Failover. The race condition occurs when the SNN loads 
the edit logs before it receives the block reports from DataNode (DN).
   ## Example
   
   In the following example there is a block (b1) with 3 generation stamps 
(gs1, gs2, gs3).
1. SNN1 loads edit logs for b1gs1 and b1gs2.
1. DN1 sends the IBR for b1gs1 to SNN1.
1. SNN1 will determine that the reported block b1gs1 from DN1 is corrupt 
and it will be queued for later. 
[BlockManager.java](https://github.com/apache/hadoop/blob/6ed73896f6e8b4b7c720eff64193cb30b3e77fb2/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/blockmanagement/BlockManager.java#L3447C1-L3464C6)
   ``` java
       BlockToMarkCorrupt c = checkReplicaCorrupt(
           block, reportedState, storedBlock, ucState, dn);
       if (c != null) {
         if (shouldPostponeBlocksFromFuture) {
           // If the block is an out-of-date generation stamp or state,
           // but we're the standby, we shouldn't treat it as corrupt,
           // but instead just queue it for later processing.
           // Storing the reported block for later processing, as that is what
           // comes from the IBR / FBR and hence what we should use to compare
           // against the memory state.
           // See HDFS-6289 and HDFS-15422 for more context.
           queueReportedBlock(storageInfo, block, reportedState,
               QUEUE_REASON_CORRUPT_STATE);
         } else {
           toCorrupt.add(c);
         }
         return storedBlock;
       }
   ```
1. DN1 sends IBR for b1gs2 and b1gs3 to SNN1.
1. SNN1 processes b1gs2 and updates the blocks map.
1. SNN1 queues b1gs3 for later because it determines that b1gs3 is a future 
genstamp.
1. SNN1 loads b1gs3 edit logs and processes the queued reports for b1.
1. SNN1 processes b1gs1 first and puts it back in the queue.
1. SNN1 processes b1gs3 next and updates the blocks map.
1. Later, SNN1 becomes the Active NameNode (ANN) during an HA Failover.
1. SNN1 will catch up to the latest edit logs, then process all queued block 
reports to become the ANN.
1. ANN1 will process b1gs1 and mark it as corrupt.
   
   If the example above happens for every DN which stores b1, then when the HA 
failover happens, b1 will be incorrectly marked as corrupt. This will be fixed 
when the first DN sends a FullBlockReport or an IBR for b1.
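   
   The numbered steps above can be condensed into a toy simulation. All names here (`IbrRaceSketch`, `processReport`) are hypothetical and invented for illustration, not the real HDFS classes; the point is only that a queued report whose genstamp is *behind* the blocks map gets re-enqueued instead of discarded, so it survives every drain of the queue until failover.
   
   ```java
   import java.util.ArrayDeque;
   import java.util.Queue;
   
   // Toy model of the standby NameNode's postponed-block queue (hypothetical
   // names, not the real HDFS classes). A report is just a generation stamp;
   // blocksMapGenStamp holds the genstamp learned from the edit logs.
   public class IbrRaceSketch {
     static int blocksMapGenStamp = 2;            // edits for b1gs1 and b1gs2 applied
     static final Queue<Integer> queued = new ArrayDeque<>();
   
     // Mirrors the postponement logic: both future and stale genstamps are
     // queued for later instead of being applied or discarded.
     static void processReport(int reportedGs) {
       if (reportedGs != blocksMapGenStamp) {
         queued.add(reportedGs);
       }
       // equal genstamp: would update the blocks map; nothing to do here
     }
   
     public static void main(String[] args) {
       processReport(1);                          // late IBR for b1gs1 -> queued
       processReport(3);                          // IBR for b1gs3 is "future" -> queued
       blocksMapGenStamp = 3;                     // edit log tailer applies b1gs3
       int drain = queued.size();
       for (int i = 0; i < drain; i++) {          // process the queued reports once
         processReport(queued.poll());
       }
       // b1gs1 was re-enqueued, not discarded: it leaks until the next FBR/IBR.
       System.out.println("leaked queued reports: " + queued);  // prints [1]
     }
   }
   ```
   
   After the drain, only the stale b1gs1 report remains queued, which is the "leaked IBR" that gets replayed and marked corrupt after the HA failover.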
   
   ## Logs from Active Cluster
   
   I added the following logs to confirm this issue in an active cluster:
   ``` java
    BlockToMarkCorrupt c = checkReplicaCorrupt(
        block, reportedState, storedBlock, ucState, dn);
    if (c != null) {
      DatanodeStorageInfo storedStorageInfo = storedBlock.findStorageInfo(dn);
      LOG.info("Found corrupt block {} [{}, {}] from DN {}. Stored block {} from DN {}",
          block, reportedState.name(), ucState.name(), storageInfo,
          storedBlock, storedStorageInfo);
      if (storageInfo.equals(storedStorageInfo) &&
          storedBlock.getGenerationStamp() > block.getGenerationStamp()) {
        LOG.info("Stored Block {} from the same DN {} has a newer GenStamp.",
            storedBlock, storedStorageInfo);
      }
      if (shouldPostponeBlocksFromFuture) {
        // If the block is an out-of-date generation stamp or state,
        // but we're the standby, we shouldn't treat it as corrupt,
        // but instead just queue it for later processing.
        // Storing the reported block for later processing, as that is what
        // comes from the IBR / FBR and hence what we should use to compare
        // against the memory state.
        // See HDFS-6289 and HDFS-15422 for more context.
        queueReportedBlock(storageInfo, block, reportedState,
            QUEUE_REASON_CORRUPT_STATE);
        LOG.info("Queueing the block {} for later processing", block);
      } else {
        toCorrupt.add(c);
        LOG.info("Marking the block {} as corrupt", block);
      }
      return storedBlock;
    }
   ```
   
   Logs from nn1 (Active):
   ``` java
    2024-04-03T03:00:52.524-0700,INFO,[IPC Server handler 6 on default port 443],org.apache.hadoop.hdfs.server.namenode.FSNamesystem,"updatePipeline(blk_66092666802_65700910634, newGS=65700925027, newLength=10485760, newNodes=[[DN1]:10010, [DN2]:10010, [DN3]:10010, client=client1)"
    2024-04-03T03:00:52.539-0700,INFO,[IPC Server handler 6 on default port

[jira] [Commented] (HDFS-17453) IncrementalBlockReport can have race condition with Edit Log Tailer

2024-04-04 Thread Danny Becker (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-17453?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17834063#comment-17834063
 ] 

Danny Becker commented on HDFS-17453:
-

These JIRAs mention the same race condition.

> IncrementalBlockReport can have race condition with Edit Log Tailer
> ---
>
> Key: HDFS-17453
> URL: https://issues.apache.org/jira/browse/HDFS-17453
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: auto-failover, ha, hdfs, namenode
>Affects Versions: 3.3.0, 3.3.1, 2.10.2, 3.3.2, 3.3.5, 3.3.4, 3.3.6
>Reporter: Danny Becker
>Assignee: Danny Becker
>Priority: Major
>
> h2. Summary
> There is a race condition between IncrementalBlockReports (IBR) and 
> EditLogTailer in Standby NameNode (SNN) which can lead to leaked IBRs and 
> false corrupt blocks after HA Failover. The race condition occurs when the 
> SNN loads the edit logs before it receives the block reports from DataNode 
> (DN).
> h2. Example
> In the following example there is a block (b1) with 3 generation stamps (gs1, 
> gs2, gs3).
>  # SNN1 loads edit logs for b1gs1 and b1gs2.
>  # DN1 sends the IBR for b1gs1 to SNN1.
>  # SNN1 will determine that the reported block b1gs1 from DN1 is corrupt and 
> it will be queued for later. 
> [BlockManager.java|https://github.com/apache/hadoop/blob/6ed73896f6e8b4b7c720eff64193cb30b3e77fb2/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/blockmanagement/BlockManager.java#L3447C1-L3464C6]
> {code:java}
>     BlockToMarkCorrupt c = checkReplicaCorrupt(
>         block, reportedState, storedBlock, ucState, dn);
>     if (c != null) {
>       if (shouldPostponeBlocksFromFuture) {
>         // If the block is an out-of-date generation stamp or state,
>         // but we're the standby, we shouldn't treat it as corrupt,
>         // but instead just queue it for later processing.
>         // Storing the reported block for later processing, as that is what
>         // comes from the IBR / FBR and hence what we should use to compare
>         // against the memory state.
>         // See HDFS-6289 and HDFS-15422 for more context.
>         queueReportedBlock(storageInfo, block, reportedState,
>             QUEUE_REASON_CORRUPT_STATE);
>       } else {
>         toCorrupt.add(c);
>       }
>       return storedBlock;
>     } {code}
>  # DN1 sends IBR for b1gs2 and b1gs3 to SNN1.
>  # SNN1 processes b1gs2 and updates the blocks map.
>  # SNN1 queues b1gs3 for later because it determines that b1gs3 is a future 
> genstamp.
>  # SNN1 loads b1gs3 edit logs and processes the queued reports for b1.
>  # SNN1 processes b1gs1 first and puts it back in the queue.
>  # SNN1 processes b1gs3 next and updates the blocks map.
>  # Later, SNN1 becomes the Active NameNode (ANN) during an HA Failover.
>  # SNN1 will catch up to the latest edit logs, then process all queued block 
> reports to become the ANN.
>  # ANN1 will process b1gs1 and mark it as corrupt.
> If the example above happens for every DN which stores b1, then when the HA 
> failover happens, b1 will be incorrectly marked as corrupt. This will be 
> fixed when the first DN sends a FullBlockReport or an IBR for b1.
> h2. Logs from Active Cluster
> I added the following logs to confirm this issue in an active cluster:
> {code:java}
> BlockToMarkCorrupt c = checkReplicaCorrupt(
>     block, reportedState, storedBlock, ucState, dn);
> if (c != null) {
>   DatanodeStorageInfo storedStorageInfo = storedBlock.findStorageInfo(dn);
>   LOG.info("Found corrupt block {} [{}, {}] from DN {}. Stored block {} from DN {}",
>       block, reportedState.name(), ucState.name(), storageInfo,
>       storedBlock, storedStorageInfo);
>   if (storageInfo.equals(storedStorageInfo) &&
>       storedBlock.getGenerationStamp() > block.getGenerationStamp()) {
>     LOG.info("Stored Block {} from the same DN {} has a newer GenStamp.",
>         storedBlock, storedStorageInfo);
>   }
>   if (shouldPostponeBlocksFromFuture) {
>     // If the block is an out-of-date generation stamp or state,
>     // but we're the standby, we shouldn't treat it as corrupt,
>     // but instead just queue it for later processing.
>     // Storing the reported block for later processing, as that is what
>     // comes from the IBR / FBR and hence what we should use to compare
>     // against the memory state.
>     // See HDFS-6289 and HDFS-15422 for more context.
>     queueReportedBlock(storageInfo, block, reportedState,
>         QUEUE_REASON_CORRUPT_STATE);
>     LOG.info("Queueing the block {} for later processing", block);
>   } else {
>     toCorrupt.add(c);
>     LOG.info("Marking the block {} as corrupt", block);
>   }
>   return storedBlock;
> } {code}
>  
> Logs from nn1 (Active):
> {code:java}
> 

[jira] [Updated] (HDFS-17453) IncrementalBlockReport can have race condition with Edit Log Tailer

2024-04-04 Thread Danny Becker (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-17453?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Danny Becker updated HDFS-17453:

Description: 
h2. Summary

There is a race condition between IncrementalBlockReports (IBR) and 
EditLogTailer in Standby NameNode (SNN) which can lead to leaked IBRs and false 
corrupt blocks after HA Failover. The race condition occurs when the SNN loads 
the edit logs before it receives the block reports from DataNode (DN).
h2. Example

In the following example there is a block (b1) with 3 generation stamps (gs1, 
gs2, gs3).
 # SNN1 loads edit logs for b1gs1 and b1gs2.
 # DN1 sends the IBR for b1gs1 to SNN1.
 # SNN1 will determine that the reported block b1gs1 from DN1 is corrupt and it 
will be queued for later. 
[BlockManager.java|https://github.com/apache/hadoop/blob/6ed73896f6e8b4b7c720eff64193cb30b3e77fb2/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/blockmanagement/BlockManager.java#L3447C1-L3464C6]
{code:java}
    BlockToMarkCorrupt c = checkReplicaCorrupt(
        block, reportedState, storedBlock, ucState, dn);
    if (c != null) {
      if (shouldPostponeBlocksFromFuture) {
        // If the block is an out-of-date generation stamp or state,
        // but we're the standby, we shouldn't treat it as corrupt,
        // but instead just queue it for later processing.
        // Storing the reported block for later processing, as that is what
        // comes from the IBR / FBR and hence what we should use to compare
        // against the memory state.
        // See HDFS-6289 and HDFS-15422 for more context.
        queueReportedBlock(storageInfo, block, reportedState,
            QUEUE_REASON_CORRUPT_STATE);
      } else {
        toCorrupt.add(c);
      }
      return storedBlock;
    } {code}

 # DN1 sends IBR for b1gs2 and b1gs3 to SNN1.
 # SNN1 processes b1gs2 and updates the blocks map.
 # SNN1 queues b1gs3 for later because it determines that b1gs3 is a future 
genstamp.
 # SNN1 loads b1gs3 edit logs and processes the queued reports for b1.
 # SNN1 processes b1gs1 first and puts it back in the queue.
 # SNN1 processes b1gs3 next and updates the blocks map.
 # Later, SNN1 becomes the Active NameNode (ANN) during an HA Failover.
 # SNN1 will catch up to the latest edit logs, then process all queued block 
reports to become the ANN.
 # ANN1 will process b1gs1 and mark it as corrupt.

If the example above happens for every DN which stores b1, then when the HA 
failover happens, b1 will be incorrectly marked as corrupt. This will be fixed 
when the first DN sends a FullBlockReport or an IBR for b1.
h2. Logs from Active Cluster

I added the following logs to confirm this issue in an active cluster:
{code:java}
BlockToMarkCorrupt c = checkReplicaCorrupt(
    block, reportedState, storedBlock, ucState, dn);
if (c != null) {
  DatanodeStorageInfo storedStorageInfo = storedBlock.findStorageInfo(dn);
  LOG.info("Found corrupt block {} [{}, {}] from DN {}. Stored block {} from DN {}",
      block, reportedState.name(), ucState.name(), storageInfo,
      storedBlock, storedStorageInfo);
  if (storageInfo.equals(storedStorageInfo) &&
      storedBlock.getGenerationStamp() > block.getGenerationStamp()) {
    LOG.info("Stored Block {} from the same DN {} has a newer GenStamp.",
        storedBlock, storedStorageInfo);
  }
  if (shouldPostponeBlocksFromFuture) {
    // If the block is an out-of-date generation stamp or state,
    // but we're the standby, we shouldn't treat it as corrupt,
    // but instead just queue it for later processing.
    // Storing the reported block for later processing, as that is what
    // comes from the IBR / FBR and hence what we should use to compare
    // against the memory state.
    // See HDFS-6289 and HDFS-15422 for more context.
    queueReportedBlock(storageInfo, block, reportedState,
        QUEUE_REASON_CORRUPT_STATE);
    LOG.info("Queueing the block {} for later processing", block);
  } else {
    toCorrupt.add(c);
    LOG.info("Marking the block {} as corrupt", block);
  }
  return storedBlock;
} {code}
 

Logs from nn1 (Active):
{code:java}
2024-04-03T03:00:52.524-0700,INFO,[IPC Server handler 6 on default port 443],org.apache.hadoop.hdfs.server.namenode.FSNamesystem,"updatePipeline(blk_66092666802_65700910634, newGS=65700925027, newLength=10485760, newNodes=[[DN1]:10010, [DN2]:10010, [DN3]:10010, client=client1)"
2024-04-03T03:00:52.539-0700,INFO,[IPC Server handler 6 on default port 443],org.apache.hadoop.hdfs.server.namenode.FSNamesystem,"updatePipeline(blk_66092666802_65700910634 => blk_66092666802_65700925027) success"
2024-04-03T03:01:07.413-0700,INFO,[IPC Server handler 6 on default port 443],org.apache.hadoop.hdfs.server.namenode.FSNamesystem,"updatePipeline(blk_66092666802_65700925027, newGS=65700933553, newLength=20971520, newNodes=[[DN1]:10010, [DN2]:10010, [DN3]:10010, client=client1)"


[jira] [Created] (HDFS-17453) IncrementalBlockReport can have race condition with Edit Log Tailer

2024-04-04 Thread Danny Becker (Jira)
Danny Becker created HDFS-17453:
---

 Summary: IncrementalBlockReport can have race condition with Edit 
Log Tailer
 Key: HDFS-17453
 URL: https://issues.apache.org/jira/browse/HDFS-17453
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: auto-failover, ha, hdfs, namenode
Affects Versions: 3.3.6, 3.3.4, 3.3.5, 3.3.2, 2.10.2, 3.3.1, 3.3.0
Reporter: Danny Becker
Assignee: Danny Becker


h2. Summary

There is a race condition between IncrementalBlockReports (IBR) and 
EditLogTailer in Standby NameNode (SNN) which can lead to leaked IBRs and false 
corrupt blocks after HA Failover. The race condition occurs when the SNN loads 
the edit logs before it receives the block reports from DataNode (DN).
h2. Example

In the following example there is a block (b1) with 3 generation stamps (gs1, 
gs2, gs3).
 # SNN1 loads edit logs for b1gs1 and b1gs2.
 # DN1 sends the IBR for b1gs1 to SNN1.
 # SNN1 will determine that the reported block b1gs1 from DN1 is corrupt and it 
will be queued for later. 
[BlockManager.java|https://github.com/apache/hadoop/blob/6ed73896f6e8b4b7c720eff64193cb30b3e77fb2/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/blockmanagement/BlockManager.java#L3447C1-L3464C6]
{code:java}
    BlockToMarkCorrupt c = checkReplicaCorrupt(
        block, reportedState, storedBlock, ucState, dn);
    if (c != null) {
      if (shouldPostponeBlocksFromFuture) {
        // If the block is an out-of-date generation stamp or state,
        // but we're the standby, we shouldn't treat it as corrupt,
        // but instead just queue it for later processing.
        // Storing the reported block for later processing, as that is what
        // comes from the IBR / FBR and hence what we should use to compare
        // against the memory state.
        // See HDFS-6289 and HDFS-15422 for more context.
        queueReportedBlock(storageInfo, block, reportedState,
            QUEUE_REASON_CORRUPT_STATE);
      } else {
        toCorrupt.add(c);
      }
      return storedBlock;
    } {code}

 # DN1 sends IBR for b1gs2 and b1gs3 to SNN1.
 # SNN1 processes b1gs2 and updates the blocks map.
 # SNN1 queues b1gs3 for later because it determines that b1gs3 is a future 
genstamp.
 # SNN1 loads b1gs3 edit logs and processes the queued reports for b1.
 # SNN1 processes b1gs1 first and puts it back in the queue.
 # SNN1 processes b1gs3 next and updates the blocks map.
 # Later, SNN1 becomes the Active NameNode (ANN) during an HA Failover.
 # SNN1 will catch up to the latest edit logs, then process all queued block 
reports to become the ANN.
 # ANN1 will process b1gs1 and mark it as corrupt.

If the example above happens for every DN which stores b1, then when the HA 
failover happens, b1 will be incorrectly marked as corrupt. This will be fixed 
when the first DN sends a FullBlockReport or an IBR for b1.
h2. Logs from Active Cluster

I added the following logs to confirm this issue in an active cluster:

 
{code:java}
BlockToMarkCorrupt c = checkReplicaCorrupt(
    block, reportedState, storedBlock, ucState, dn);
if (c != null) {
  DatanodeStorageInfo storedStorageInfo = storedBlock.findStorageInfo(dn);
  LOG.info("Found corrupt block {} [{}, {}] from DN {}. Stored block {} from DN {}",
      block, reportedState.name(), ucState.name(), storageInfo,
      storedBlock, storedStorageInfo);
  if (storageInfo.equals(storedStorageInfo) &&
      storedBlock.getGenerationStamp() > block.getGenerationStamp()) {
    LOG.info("Stored Block {} from the same DN {} has a newer GenStamp.",
        storedBlock, storedStorageInfo);
  }
  if (shouldPostponeBlocksFromFuture) {
    // If the block is an out-of-date generation stamp or state,
    // but we're the standby, we shouldn't treat it as corrupt,
    // but instead just queue it for later processing.
    // Storing the reported block for later processing, as that is what
    // comes from the IBR / FBR and hence what we should use to compare
    // against the memory state.
    // See HDFS-6289 and HDFS-15422 for more context.
    queueReportedBlock(storageInfo, block, reportedState,
        QUEUE_REASON_CORRUPT_STATE);
    LOG.info("Queueing the block {} for later processing", block);
  } else {
    toCorrupt.add(c);
    LOG.info("Marking the block {} as corrupt", block);
  }
  return storedBlock;
} {code}
Logs from nn1 (Active):

 
{code:java}
2024-04-03T03:00:52.524-0700,INFO,[IPC Server handler 6 on default port 443],org.apache.hadoop.hdfs.server.namenode.FSNamesystem,"updatePipeline(blk_66092666802_65700910634, newGS=65700925027, newLength=10485760, newNodes=[[DN1]:10010, [DN2]:10010, [DN3]:10010, client=client1)"
2024-04-03T03:00:52.539-0700,INFO,[IPC Server handler 6 on default port 443],org.apache.hadoop.hdfs.server.namenode.FSNamesystem,"updatePipeline(blk_66092666802_65700910634 => blk_66092666802_65700925027)

[jira] [Commented] (HDFS-17451) RBF: fix spotbugs for redundant nullcheck of dns.

2024-04-04 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-17451?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17833873#comment-17833873
 ] 

ASF GitHub Bot commented on HDFS-17451:
---

ayushtkn commented on code in PR #6697:
URL: https://github.com/apache/hadoop/pull/6697#discussion_r1551269035


##
hadoop-hdfs-project/hadoop-hdfs-rbf/src/main/java/org/apache/hadoop/hdfs/server/federation/router/RouterRpcServer.java:
##
@@ -1089,13 +1089,7 @@ public DatanodeInfo[] getDatanodeReport(DatanodeReportType type)
   DatanodeInfo[] getCachedDatanodeReport(DatanodeReportType type)
       throws IOException {
     try {
-      DatanodeInfo[] dns = this.dnCache.get(type);
-      if (dns == null) {
-        LOG.debug("Get null DN report from cache");
-        dns = getCachedDatanodeReportImpl(type);
-        this.dnCache.put(type, dns);
-      }
-      return dns;
+      return this.dnCache.get(type);

Review Comment:
   I am not sure if it is correct; at least I am not sure, and it makes my code 
look unsafe





> RBF: fix spotbugs for redundant nullcheck of dns.
> -
>
> Key: HDFS-17451
> URL: https://issues.apache.org/jira/browse/HDFS-17451
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Reporter: Jian Zhang
>Assignee: Jian Zhang
>Priority: Major
>  Labels: pull-request-available
>
> h2. Dodgy code Warnings
> ||Code||Warning||
> |RCN|Redundant nullcheck of dns, which is known to be non-null in 
> org.apache.hadoop.hdfs.server.federation.router.RouterRpcServer.getCachedDatanodeReport(HdfsConstants$DatanodeReportType)|
> | |[Bug type RCN_REDUNDANT_NULLCHECK_OF_NONNULL_VALUE (click for 
> details)|https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-6655/8/artifact/out/branch-spotbugs-hadoop-hdfs-project_hadoop-hdfs-rbf-warnings.html#RCN_REDUNDANT_NULLCHECK_OF_NONNULL_VALUE]
> In class org.apache.hadoop.hdfs.server.federation.router.RouterRpcServer
> In method 
> org.apache.hadoop.hdfs.server.federation.router.RouterRpcServer.getCachedDatanodeReport(HdfsConstants$DatanodeReportType)
> Value loaded from dns
> Return value of 
> org.apache.hadoop.thirdparty.com.google.common.cache.LoadingCache.get(Object) 
> of type Object
> Redundant null check at RouterRpcServer.java:[line 1093]|
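
For background on the warning: Guava's LoadingCache.get(key) computes missing values via the loader and never returns null (a loader that yields null raises an exception instead), so a null check after get() is dead code. Below is a minimal stdlib sketch of that contract; the class name LoadingCacheSketch is hypothetical and the behavior mirrored is the documented Guava contract, not the Guava source.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Function;

// Minimal analogue of a loading cache's get() contract (hypothetical class,
// not Guava itself): a missing value is computed by the loader, and get()
// never returns null -- a null load is surfaced as an exception instead.
public class LoadingCacheSketch<K, V> {
  private final Map<K, V> cache = new ConcurrentHashMap<>();
  private final Function<K, V> loader;

  public LoadingCacheSketch(Function<K, V> loader) {
    this.loader = loader;
  }

  public V get(K key) {
    V value = cache.computeIfAbsent(key, loader);
    if (value == null) {
      // Mirrors Guava, which throws InvalidCacheLoadException on a null load;
      // callers therefore never observe null, making their null checks redundant.
      throw new IllegalStateException("loader returned null for key: " + key);
    }
    return value;
  }
}
```

Under this contract the removed `if (dns == null)` branch in getCachedDatanodeReport can never execute, which is exactly what the RCN_REDUNDANT_NULLCHECK warning reports.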



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Moved] (HDFS-17452) DfsRouterAdmin RefreshCallQueue fails when authorization is enabled

2024-04-04 Thread Ananya Singh (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-17452?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ananya Singh moved HADOOP-19142 to HDFS-17452:
--

  Component/s: ipc
   (was: ipc)
  Key: HDFS-17452  (was: HADOOP-19142)
Affects Version/s: 3.3.6
   (was: 3.3.6)
  Project: Hadoop HDFS  (was: Hadoop Common)

> DfsRouterAdmin RefreshCallQueue fails when authorization is enabled
> ---
>
> Key: HDFS-17452
> URL: https://issues.apache.org/jira/browse/HDFS-17452
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: ipc
>Affects Versions: 3.3.6
>Reporter: Ananya Singh
>Assignee: Ananya Singh
>Priority: Major
>




--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org