[jira] [Commented] (HDFS-17451) RBF: fix spotbugs for redundant nullcheck of dns.

2024-04-09 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-17451?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17835584#comment-17835584
 ] 

ASF GitHub Bot commented on HDFS-17451:
---

KeeProMise commented on code in PR #6697:
URL: https://github.com/apache/hadoop/pull/6697#discussion_r1558805351


##
hadoop-hdfs-project/hadoop-hdfs-rbf/src/main/java/org/apache/hadoop/hdfs/server/federation/router/RouterRpcServer.java:
##
@@ -1090,7 +1090,7 @@ DatanodeInfo[] getCachedDatanodeReport(DatanodeReportType 
type)
   throws IOException {
 try {
   DatanodeInfo[] dns = this.dnCache.get(type);
-  if (dns == null) {
+  if (dns.length == 0) {

Review Comment:
   @simbadzina 
   [screenshot attachment: https://github.com/apache/hadoop/assets/38941777/d91a9e69-14a6-44ac-b983-65364ae4dd8c]
   





> RBF: fix spotbugs for redundant nullcheck of dns.
> -
>
> Key: HDFS-17451
> URL: https://issues.apache.org/jira/browse/HDFS-17451
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Reporter: Jian Zhang
>Assignee: Jian Zhang
>Priority: Major
>  Labels: pull-request-available
>
> h2. Dodgy code Warnings
> ||Code||Warning||
> |RCN|Redundant nullcheck of dns, which is known to be non-null in 
> org.apache.hadoop.hdfs.server.federation.router.RouterRpcServer.getCachedDatanodeReport(HdfsConstants$DatanodeReportType)|
> | |[Bug type RCN_REDUNDANT_NULLCHECK_OF_NONNULL_VALUE (click for 
> details)|https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-6655/8/artifact/out/branch-spotbugs-hadoop-hdfs-project_hadoop-hdfs-rbf-warnings.html#RCN_REDUNDANT_NULLCHECK_OF_NONNULL_VALUE]
> In class org.apache.hadoop.hdfs.server.federation.router.RouterRpcServer
> In method 
> org.apache.hadoop.hdfs.server.federation.router.RouterRpcServer.getCachedDatanodeReport(HdfsConstants$DatanodeReportType)
> Value loaded from dns
> Return value of 
> org.apache.hadoop.thirdparty.com.google.common.cache.LoadingCache.get(Object) 
> of type Object
> Redundant null check at RouterRpcServer.java:[line 1093]|
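
For context, a minimal sketch of the Guava LoadingCache contract behind this
warning. This is illustrative code with assumed names, not the actual
RouterRpcServer: per the documented contract, get(K) either returns the value
produced by the CacheLoader or throws (an ExecutionException, or an unchecked
error if the loader returns null), so it never returns null and the patched
dns.length == 0 check is the meaningful one.
{code:java}
import org.apache.hadoop.thirdparty.com.google.common.cache.CacheBuilder;
import org.apache.hadoop.thirdparty.com.google.common.cache.CacheLoader;
import org.apache.hadoop.thirdparty.com.google.common.cache.LoadingCache;

import java.util.concurrent.ExecutionException;

public class DnCacheSketch {
  // Stand-in for RouterRpcServer#dnCache; the loader never yields null,
  // and if it did, LoadingCache#get would throw rather than return null.
  private final LoadingCache<String, String[]> dnCache = CacheBuilder.newBuilder()
      .build(new CacheLoader<String, String[]>() {
        @Override
        public String[] load(String type) {
          return new String[0];
        }
      });

  public String[] getCachedReport(String type) throws ExecutionException {
    String[] dns = dnCache.get(type); // never null per the LoadingCache contract
    if (dns.length == 0) {            // mirrors the patched condition in PR #6697
      // refresh or fall back here
    }
    return dns;
  }
}
{code}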






[jira] [Commented] (HDFS-17453) IncrementalBlockReport can have race condition with Edit Log Tailer

2024-04-09 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-17453?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17835585#comment-17835585
 ] 

ASF GitHub Bot commented on HDFS-17453:
---

hadoop-yetus commented on PR #6708:
URL: https://github.com/apache/hadoop/pull/6708#issuecomment-2046497984

   :confetti_ball: **+1 overall**
   
   
   
   
   
   
   | Vote | Subsystem | Runtime |  Logfile | Comment |
   |::|--:|:|::|:---:|
   | +0 :ok: |  reexec  |  11m 37s |  |  Docker mode activated.  |
    _ Prechecks _ |
   | +1 :green_heart: |  dupname  |   0m  0s |  |  No case conflicting files 
found.  |
   | +0 :ok: |  codespell  |   0m  1s |  |  codespell was not available.  |
   | +0 :ok: |  detsecrets  |   0m  1s |  |  detect-secrets was not available.  
|
   | +1 :green_heart: |  @author  |   0m  0s |  |  The patch does not contain 
any @author tags.  |
   | +1 :green_heart: |  test4tests  |   0m  0s |  |  The patch appears to 
include 3 new or modified test files.  |
    _ trunk Compile Tests _ |
   | +1 :green_heart: |  mvninstall  |  46m  2s |  |  trunk passed  |
   | +1 :green_heart: |  compile  |   1m 22s |  |  trunk passed with JDK 
Ubuntu-11.0.22+7-post-Ubuntu-0ubuntu220.04.1  |
   | +1 :green_heart: |  compile  |   1m 14s |  |  trunk passed with JDK 
Private Build-1.8.0_402-8u402-ga-2ubuntu1~20.04-b06  |
   | +1 :green_heart: |  checkstyle  |   1m 12s |  |  trunk passed  |
   | +1 :green_heart: |  mvnsite  |   1m 26s |  |  trunk passed  |
   | +1 :green_heart: |  javadoc  |   1m  9s |  |  trunk passed with JDK 
Ubuntu-11.0.22+7-post-Ubuntu-0ubuntu220.04.1  |
   | +1 :green_heart: |  javadoc  |   1m 41s |  |  trunk passed with JDK 
Private Build-1.8.0_402-8u402-ga-2ubuntu1~20.04-b06  |
   | +1 :green_heart: |  spotbugs  |   3m 18s |  |  trunk passed  |
   | +1 :green_heart: |  shadedclient  |  35m 44s |  |  branch has no errors 
when building and testing our client artifacts.  |
    _ Patch Compile Tests _ |
   | +1 :green_heart: |  mvninstall  |   1m 13s |  |  the patch passed  |
   | +1 :green_heart: |  compile  |   1m 15s |  |  the patch passed with JDK 
Ubuntu-11.0.22+7-post-Ubuntu-0ubuntu220.04.1  |
   | +1 :green_heart: |  javac  |   1m 15s |  |  the patch passed  |
   | +1 :green_heart: |  compile  |   1m  7s |  |  the patch passed with JDK 
Private Build-1.8.0_402-8u402-ga-2ubuntu1~20.04-b06  |
   | +1 :green_heart: |  javac  |   1m  7s |  |  the patch passed  |
   | +1 :green_heart: |  blanks  |   0m  0s |  |  The patch has no blanks 
issues.  |
   | +1 :green_heart: |  checkstyle  |   0m 58s |  |  the patch passed  |
   | +1 :green_heart: |  mvnsite  |   1m 15s |  |  the patch passed  |
   | +1 :green_heart: |  javadoc  |   0m 53s |  |  the patch passed with JDK 
Ubuntu-11.0.22+7-post-Ubuntu-0ubuntu220.04.1  |
   | +1 :green_heart: |  javadoc  |   1m 37s |  |  the patch passed with JDK 
Private Build-1.8.0_402-8u402-ga-2ubuntu1~20.04-b06  |
   | +1 :green_heart: |  spotbugs  |   3m 17s |  |  the patch passed  |
   | +1 :green_heart: |  shadedclient  |  35m 59s |  |  patch has no errors 
when building and testing our client artifacts.  |
    _ Other Tests _ |
   | +1 :green_heart: |  unit  | 230m  8s |  |  hadoop-hdfs in the patch 
passed.  |
   | +1 :green_heart: |  asflicense  |   0m 46s |  |  The patch does not 
generate ASF License warnings.  |
   |  |   | 384m 30s |  |  |
   
   
   | Subsystem | Report/Notes |
   |--:|:-|
   | Docker | ClientAPI=1.45 ServerAPI=1.45 base: 
https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-6708/8/artifact/out/Dockerfile
 |
   | GITHUB PR | https://github.com/apache/hadoop/pull/6708 |
   | Optional Tests | dupname asflicense compile javac javadoc mvninstall 
mvnsite unit shadedclient spotbugs checkstyle codespell detsecrets |
   | uname | Linux 3d8c054a2c10 5.15.0-94-generic #104-Ubuntu SMP Tue Jan 9 
15:25:40 UTC 2024 x86_64 x86_64 x86_64 GNU/Linux |
   | Build tool | maven |
   | Personality | dev-support/bin/hadoop.sh |
   | git revision | trunk / 7e36db7affc66da101cfda141c9f35afc82aa606 |
   | Default Java | Private Build-1.8.0_402-8u402-ga-2ubuntu1~20.04-b06 |
   | Multi-JDK versions | 
/usr/lib/jvm/java-11-openjdk-amd64:Ubuntu-11.0.22+7-post-Ubuntu-0ubuntu220.04.1 
/usr/lib/jvm/java-8-openjdk-amd64:Private 
Build-1.8.0_402-8u402-ga-2ubuntu1~20.04-b06 |
   |  Test Results | 
https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-6708/8/testReport/ |
   | Max. process+thread count | 4658 (vs. ulimit of 5500) |
   | modules | C: hadoop-hdfs-project/hadoop-hdfs U: 
hadoop-hdfs-project/hadoop-hdfs |
   | Console output | 
https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-6708/8/console |
   | versions | git=2.25.1 maven=3.6.3 spotbugs=4.2.2 |
   | Powered by | Apache Yetus 0.14.0 https://yetus.apache.org |
   
   
   This message was automatically generated.
   
   




> IncrementalBlockReport can have race 

[jira] [Created] (HDFS-17459) [FGL] Summarize this feature

2024-04-09 Thread ZanderXu (Jira)
ZanderXu created HDFS-17459:
---

 Summary: [FGL] Summarize this feature 
 Key: HDFS-17459
 URL: https://issues.apache.org/jira/browse/HDFS-17459
 Project: Hadoop HDFS
  Issue Type: Sub-task
Reporter: ZanderXu
Assignee: ZanderXu


Write a doc to summarize this feature so we can merge it into the trunk.






[jira] [Resolved] (HDFS-17445) [FGL] All remaining operations support fine-grained locking

2024-04-09 Thread Hui Fei (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-17445?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hui Fei resolved HDFS-17445.

Resolution: Fixed

> [FGL] All remaining operations support fine-grained locking
> ---
>
> Key: HDFS-17445
> URL: https://issues.apache.org/jira/browse/HDFS-17445
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>Reporter: ZanderXu
>Assignee: ZanderXu
>Priority: Major
>  Labels: pull-request-available
>







[jira] [Commented] (HDFS-17445) [FGL] All remaining operations support fine-grained locking

2024-04-09 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-17445?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17835565#comment-17835565
 ] 

ASF GitHub Bot commented on HDFS-17445:
---

ferhui merged PR #6715:
URL: https://github.com/apache/hadoop/pull/6715




> [FGL] All remaining operations support fine-grained locking
> ---
>
> Key: HDFS-17445
> URL: https://issues.apache.org/jira/browse/HDFS-17445
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>Reporter: ZanderXu
>Assignee: ZanderXu
>Priority: Major
>  Labels: pull-request-available
>







[jira] [Commented] (HDFS-17445) [FGL] All remaining operations support fine-grained locking

2024-04-09 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-17445?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17835566#comment-17835566
 ] 

ASF GitHub Bot commented on HDFS-17445:
---

ferhui commented on PR #6715:
URL: https://github.com/apache/hadoop/pull/6715#issuecomment-2046331457

   Thanks for the contribution. Merged.




> [FGL] All remaining operations support fine-grained locking
> ---
>
> Key: HDFS-17445
> URL: https://issues.apache.org/jira/browse/HDFS-17445
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>Reporter: ZanderXu
>Assignee: ZanderXu
>Priority: Major
>  Labels: pull-request-available
>







[jira] [Commented] (HDFS-17451) RBF: fix spotbugs for redundant nullcheck of dns.

2024-04-09 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-17451?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17835543#comment-17835543
 ] 

ASF GitHub Bot commented on HDFS-17451:
---

hadoop-yetus commented on PR #6697:
URL: https://github.com/apache/hadoop/pull/6697#issuecomment-2046132702

   :broken_heart: **-1 overall**
   
   
   
   
   
   
   | Vote | Subsystem | Runtime |  Logfile | Comment |
   |::|--:|:|::|:---:|
   | +0 :ok: |  reexec  |  12m  3s |  |  Docker mode activated.  |
    _ Prechecks _ |
   | +1 :green_heart: |  dupname  |   0m  0s |  |  No case conflicting files 
found.  |
   | +0 :ok: |  codespell  |   0m  0s |  |  codespell was not available.  |
   | +0 :ok: |  detsecrets  |   0m  0s |  |  detect-secrets was not available.  
|
   | +1 :green_heart: |  @author  |   0m  0s |  |  The patch does not contain 
any @author tags.  |
   | -1 :x: |  test4tests  |   0m  0s |  |  The patch doesn't appear to include 
any new or modified tests. Please justify why no new tests are needed for this 
patch. Also please list what manual steps were performed to verify this patch.  
|
    _ trunk Compile Tests _ |
   | +1 :green_heart: |  mvninstall  |  56m  4s |  |  trunk passed  |
   | +1 :green_heart: |  compile  |   0m 44s |  |  trunk passed with JDK 
Ubuntu-11.0.22+7-post-Ubuntu-0ubuntu220.04.1  |
   | +1 :green_heart: |  compile  |   0m 39s |  |  trunk passed with JDK 
Private Build-1.8.0_402-8u402-ga-2ubuntu1~20.04-b06  |
   | +1 :green_heart: |  checkstyle  |   0m 31s |  |  trunk passed  |
   | +1 :green_heart: |  mvnsite  |   0m 46s |  |  trunk passed  |
   | +1 :green_heart: |  javadoc  |   0m 44s |  |  trunk passed with JDK 
Ubuntu-11.0.22+7-post-Ubuntu-0ubuntu220.04.1  |
   | +1 :green_heart: |  javadoc  |   0m 30s |  |  trunk passed with JDK 
Private Build-1.8.0_402-8u402-ga-2ubuntu1~20.04-b06  |
   | -1 :x: |  spotbugs  |   1m 28s | 
[/branch-spotbugs-hadoop-hdfs-project_hadoop-hdfs-rbf-warnings.html](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-6697/3/artifact/out/branch-spotbugs-hadoop-hdfs-project_hadoop-hdfs-rbf-warnings.html)
 |  hadoop-hdfs-project/hadoop-hdfs-rbf in trunk has 1 extant spotbugs 
warnings.  |
   | +1 :green_heart: |  shadedclient  |  36m 35s |  |  branch has no errors 
when building and testing our client artifacts.  |
    _ Patch Compile Tests _ |
   | +1 :green_heart: |  mvninstall  |   0m 32s |  |  the patch passed  |
   | +1 :green_heart: |  compile  |   0m 35s |  |  the patch passed with JDK 
Ubuntu-11.0.22+7-post-Ubuntu-0ubuntu220.04.1  |
   | +1 :green_heart: |  javac  |   0m 35s |  |  the patch passed  |
   | +1 :green_heart: |  compile  |   0m 30s |  |  the patch passed with JDK 
Private Build-1.8.0_402-8u402-ga-2ubuntu1~20.04-b06  |
   | +1 :green_heart: |  javac  |   0m 30s |  |  the patch passed  |
   | +1 :green_heart: |  blanks  |   0m  0s |  |  The patch has no blanks 
issues.  |
   | +1 :green_heart: |  checkstyle  |   0m 20s |  |  the patch passed  |
   | +1 :green_heart: |  mvnsite  |   0m 35s |  |  the patch passed  |
   | +1 :green_heart: |  javadoc  |   0m 29s |  |  the patch passed with JDK 
Ubuntu-11.0.22+7-post-Ubuntu-0ubuntu220.04.1  |
   | +1 :green_heart: |  javadoc  |   0m 27s |  |  the patch passed with JDK 
Private Build-1.8.0_402-8u402-ga-2ubuntu1~20.04-b06  |
   | +1 :green_heart: |  spotbugs  |   1m 33s |  |  
hadoop-hdfs-project/hadoop-hdfs-rbf generated 0 new + 0 unchanged - 1 fixed = 0 
total (was 1)  |
   | +1 :green_heart: |  shadedclient  |  38m 45s |  |  patch has no errors 
when building and testing our client artifacts.  |
    _ Other Tests _ |
   | +1 :green_heart: |  unit  |  29m 33s |  |  hadoop-hdfs-rbf in the patch 
passed.  |
   | +1 :green_heart: |  asflicense  |   0m 39s |  |  The patch does not 
generate ASF License warnings.  |
   |  |   | 189m 11s |  |  |
   
   
   | Subsystem | Report/Notes |
   |--:|:-|
   | Docker | ClientAPI=1.45 ServerAPI=1.45 base: 
https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-6697/3/artifact/out/Dockerfile
 |
   | GITHUB PR | https://github.com/apache/hadoop/pull/6697 |
   | Optional Tests | dupname asflicense compile javac javadoc mvninstall 
mvnsite unit shadedclient spotbugs checkstyle codespell detsecrets |
   | uname | Linux 2033f43410ac 5.15.0-94-generic #104-Ubuntu SMP Tue Jan 9 
15:25:40 UTC 2024 x86_64 x86_64 x86_64 GNU/Linux |
   | Build tool | maven |
   | Personality | dev-support/bin/hadoop.sh |
   | git revision | trunk / f43a1c75f464c081d3de3b807f23bec27fc173cb |
   | Default Java | Private Build-1.8.0_402-8u402-ga-2ubuntu1~20.04-b06 |
   | Multi-JDK versions | 
/usr/lib/jvm/java-11-openjdk-amd64:Ubuntu-11.0.22+7-post-Ubuntu-0ubuntu220.04.1 
/usr/lib/jvm/java-8-openjdk-amd64:Private 
Build-1.8.0_402-8u402-ga-2ubuntu1~20.04-b06 |
   |  Test Results | 

[jira] [Commented] (HDFS-17453) IncrementalBlockReport can have race condition with Edit Log Tailer

2024-04-09 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-17453?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17835540#comment-17835540
 ] 

ASF GitHub Bot commented on HDFS-17453:
---

dannytbecker commented on code in PR #6708:
URL: https://github.com/apache/hadoop/pull/6708#discussion_r1558331856


##
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/blockmanagement/PendingDataNodeMessages.java:
##
@@ -95,16 +95,27 @@ void removeAllMessagesForDatanode(DatanodeDescriptor dn) {
   
   void enqueueReportedBlock(DatanodeStorageInfo storageInfo, Block block,
   ReplicaState reportedState) {
+long genStamp = block.getGenerationStamp();
+Queue queue = null;
 if (BlockIdManager.isStripedBlockID(block.getBlockId())) {
   Block blkId = new Block(BlockIdManager.convertToStripedID(block
   .getBlockId()));
-  getBlockQueue(blkId).add(
-  new ReportedBlockInfo(storageInfo, new Block(block), reportedState));
+  queue = getBlockQueue(blkId);
 } else {
   block = new Block(block);
-  getBlockQueue(block).add(
-  new ReportedBlockInfo(storageInfo, block, reportedState));
+  queue = getBlockQueue(block);
 }
+// We only want the latest non-future reported block to be queued for each
+// DataNode. Otherwise, there can be a race condition that causes an old
+// reported block to be kept in the queue until the SNN switches to ANN and
+// the old reported block will be processed and marked as corrupt by the 
ANN.
+// See HDFS-17453
+int size = queue.size();
+if (queue.removeIf(rbi -> rbi.storageInfo.equals(storageInfo) &&

Review Comment:
   I added these null checks, but I needed to remove the reportedState null 
check because the null value is used by removeStoredBlock.
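
   For illustration, a simplified, self-contained sketch of the de-duplication
idea discussed in this thread (hypothetical types and names, not the actual
PendingDataNodeMessages code): keep at most one pending report per storage so
a stale generation stamp cannot linger in the queue until failover.
   ```java
import java.util.ArrayDeque;
import java.util.Objects;
import java.util.Queue;

class IbrQueueSketch {
  static final class ReportedBlock {
    final String storageId; // stands in for DatanodeStorageInfo
    final long genStamp;
    ReportedBlock(String storageId, long genStamp) {
      this.storageId = storageId;
      this.genStamp = genStamp;
    }
  }

  private final Queue<ReportedBlock> queue = new ArrayDeque<>();

  // Drop any older queued report from the same storage before enqueueing
  // the new one, so a stale genstamp (e.g. b1gs1) cannot survive until the
  // SNN becomes ANN and wrongly marks the block corrupt (HDFS-17453).
  void enqueue(String storageId, long genStamp) {
    queue.removeIf(rbi -> Objects.equals(rbi.storageId, storageId)
        && rbi.genStamp <= genStamp);
    queue.add(new ReportedBlock(storageId, genStamp));
  }
}
   ```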





> IncrementalBlockReport can have race condition with Edit Log Tailer
> ---
>
> Key: HDFS-17453
> URL: https://issues.apache.org/jira/browse/HDFS-17453
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: auto-failover, ha, hdfs, namenode
>Affects Versions: 3.3.0, 3.3.1, 2.10.2, 3.3.2, 3.3.5, 3.3.4, 3.3.6
>Reporter: Danny Becker
>Assignee: Danny Becker
>Priority: Major
>  Labels: pull-request-available
>
> h2. Summary
> There is a race condition between IncrementalBlockReports (IBR) and 
> EditLogTailer in Standby NameNode (SNN) which can lead to leaked IBRs and 
> false corrupt blocks after HA Failover. The race condition occurs when the 
> SNN loads the edit logs before it receives the block reports from DataNode 
> (DN).
> h2. Example
> In the following example there is a block (b1) with 3 generation stamps (gs1, 
> gs2, gs3).
>  # SNN1 loads edit logs for b1gs1 and b1gs2.
>  # DN1 sends the IBR for b1gs1 to SNN1.
>  # SNN1 will determine that the reported block b1gs1 from DN1 is corrupt and 
> it will be queued for later. 
> [BlockManager.java|https://github.com/apache/hadoop/blob/6ed73896f6e8b4b7c720eff64193cb30b3e77fb2/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/blockmanagement/BlockManager.java#L3447C1-L3464C6]
> {code:java}
>     BlockToMarkCorrupt c = checkReplicaCorrupt(
>         block, reportedState, storedBlock, ucState, dn);
>     if (c != null) {
>       if (shouldPostponeBlocksFromFuture) {
>         // If the block is an out-of-date generation stamp or state,
>         // but we're the standby, we shouldn't treat it as corrupt,
>         // but instead just queue it for later processing.
>         // Storing the reported block for later processing, as that is what
>         // comes from the IBR / FBR and hence what we should use to compare
>         // against the memory state.
>         // See HDFS-6289 and HDFS-15422 for more context.
>         queueReportedBlock(storageInfo, block, reportedState,
>             QUEUE_REASON_CORRUPT_STATE);
>       } else {
>         toCorrupt.add(c);
>       }
>       return storedBlock;
>     } {code}
>  # DN1 sends IBR for b1gs2 and b1gs3 to SNN1.
>  # SNN1 processes b1gs2 and updates the blocks map.
>  # SNN1 queues b1gs3 for later because it determines that b1gs3 is a future 
> genstamp.
>  # SNN1 loads b1gs3 edit logs and processes the queued reports for b1.
>  # SNN1 processes b1gs1 first and puts it back in the queue.
>  # SNN1 processes b1gs3 next and updates the blocks map.
>  # Later, SNN1 becomes the Active NameNode (ANN) during an HA Failover.
>  # SNN1 will catch up to the latest edit logs, then process all queued block 
> reports to become the ANN.
>  # ANN1 will process b1gs1 and mark it as corrupt.
> If the example above happens for every DN which stores b1, then when the HA 
> failover happens, b1 will be incorrectly marked as corrupt. This will be 
> fixed when the first DN sends a 

[jira] [Commented] (HDFS-17451) RBF: fix spotbugs for redundant nullcheck of dns.

2024-04-09 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-17451?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17835515#comment-17835515
 ] 

ASF GitHub Bot commented on HDFS-17451:
---

simbadzina commented on code in PR #6697:
URL: https://github.com/apache/hadoop/pull/6697#discussion_r1558177674


##
hadoop-hdfs-project/hadoop-hdfs-rbf/src/main/java/org/apache/hadoop/hdfs/server/federation/router/RouterRpcServer.java:
##
@@ -1090,7 +1090,7 @@ DatanodeInfo[] getCachedDatanodeReport(DatanodeReportType 
type)
   throws IOException {
 try {
   DatanodeInfo[] dns = this.dnCache.get(type);
-  if (dns == null) {
+  if (dns.length == 0) {

Review Comment:
   @KeeProMise where do you see that `get` has a `non-null` annotation?





> RBF: fix spotbugs for redundant nullcheck of dns.
> -
>
> Key: HDFS-17451
> URL: https://issues.apache.org/jira/browse/HDFS-17451
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Reporter: Jian Zhang
>Assignee: Jian Zhang
>Priority: Major
>  Labels: pull-request-available
>
> h2. Dodgy code Warnings
> ||Code||Warning||
> |RCN|Redundant nullcheck of dns, which is known to be non-null in 
> org.apache.hadoop.hdfs.server.federation.router.RouterRpcServer.getCachedDatanodeReport(HdfsConstants$DatanodeReportType)|
> | |[Bug type RCN_REDUNDANT_NULLCHECK_OF_NONNULL_VALUE (click for 
> details)|https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-6655/8/artifact/out/branch-spotbugs-hadoop-hdfs-project_hadoop-hdfs-rbf-warnings.html#RCN_REDUNDANT_NULLCHECK_OF_NONNULL_VALUE]
> In class org.apache.hadoop.hdfs.server.federation.router.RouterRpcServer
> In method 
> org.apache.hadoop.hdfs.server.federation.router.RouterRpcServer.getCachedDatanodeReport(HdfsConstants$DatanodeReportType)
> Value loaded from dns
> Return value of 
> org.apache.hadoop.thirdparty.com.google.common.cache.LoadingCache.get(Object) 
> of type Object
> Redundant null check at RouterRpcServer.java:[line 1093]|






[jira] [Commented] (HDFS-17455) Fix Client throw IndexOutOfBoundsException in DFSInputStream#fetchBlockAt

2024-04-09 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-17455?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17835412#comment-17835412
 ] 

ASF GitHub Bot commented on HDFS-17455:
---

Hexiaoqiao commented on code in PR #6710:
URL: https://github.com/apache/hadoop/pull/6710#discussion_r1557595628


##
hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/TestDFSInputStream.java:
##
@@ -287,4 +300,69 @@ public void testReadWithoutPreferredCachingReplica() 
throws IOException {
   cluster.shutdown();
 }
   }
+
+  @Test
+  public void testCreateBlockReaderWhenInvalidBlockTokenException() throws
+  IOException, InterruptedException, TimeoutException {
+GenericTestUtils.setLogLevel(DFSClient.LOG, Level.DEBUG);
+Configuration conf = new Configuration();
+DFSClientFaultInjector oldFaultInjector = DFSClientFaultInjector.get();
+try (MiniDFSCluster cluster = new 
MiniDFSCluster.Builder(conf).numDataNodes(3).build()) {
+  cluster.waitActive();
+  DistributedFileSystem fs = cluster.getFileSystem();
+  String file = "/testfile";
+  Path path = new Path(file);
+  long fileLen = 1024 * 64;
+  EnumSet createFlags = EnumSet.of(CREATE);
+  FSDataOutputStream out =  fs.create(path, FsPermission.getFileDefault(), 
createFlags,

Review Comment:
   +1





> Fix Client throw IndexOutOfBoundsException in DFSInputStream#fetchBlockAt
> -
>
> Key: HDFS-17455
> URL: https://issues.apache.org/jira/browse/HDFS-17455
> Project: Hadoop HDFS
>  Issue Type: Bug
>Reporter: Haiyang Hu
>Assignee: Haiyang Hu
>Priority: Major
>  Labels: pull-request-available
>
> When the client reads data and connects to the datanode, an 
> InvalidBlockTokenException is thrown because the datanode access token is 
> invalid at that time. The subsequent call to the fetchBlockAt method then 
> throws java.lang.IndexOutOfBoundsException, causing the read to fail.
> *Root cause:*
> * The HDFS file contains only one RBW block, with a block data size of 2048KB.
> * The client opens this file and seeks to the offset of 1024KB to read data.
> * The DFSInputStream#getBlockReader call connects to the datanode; because 
> the datanode access token is invalid at that time, it throws 
> InvalidBlockTokenException, and the subsequent call to 
> DFSInputStream#fetchBlockAt throws java.lang.IndexOutOfBoundsException.
> {code:java}
> private synchronized DatanodeInfo blockSeekTo(long target)
>  throws IOException {
>if (target >= getFileLength()) {
>// the target size is smaller than fileLength (completeBlockSize + 
> lastBlockBeingWrittenLength),
>// here at this time target is 1024 and getFileLength is 2048
>  throw new IOException("Attempted to read past end of file");
>}
>...
>while (true) {
>  ...
>  try {
>blockReader = getBlockReader(targetBlock, offsetIntoBlock,
>targetBlock.getBlockSize() - offsetIntoBlock, targetAddr,
>storageType, chosenNode);
>if(connectFailedOnce) {
>  DFSClient.LOG.info("Successfully connected to " + targetAddr +
> " for " + targetBlock.getBlock());
>}
>return chosenNode;
>  } catch (IOException ex) {
>...
>} else if (refetchToken > 0 && tokenRefetchNeeded(ex, targetAddr)) {
>  refetchToken--;
>  // Here will catch InvalidBlockTokenException.
>  fetchBlockAt(target);
>} else {
>  ...
>}
>  }
>}
>  }
> private LocatedBlock fetchBlockAt(long offset, long length, boolean useCache)
>   throws IOException {
> maybeRegisterBlockRefresh();
> synchronized(infoLock) {
>   // Here the locatedBlocks only contains one locatedBlock, at this time 
> the offset is 1024 and fileLength is 0,
>   // so the targetBlockIdx is -2
>   int targetBlockIdx = locatedBlocks.findBlock(offset);
>   if (targetBlockIdx < 0) { // block is not cached
> targetBlockIdx = LocatedBlocks.getInsertIndex(targetBlockIdx);
> // Here the targetBlockIdx is 1;
> useCache = false;
>   }
>   if (!useCache) { // fetch blocks
> final LocatedBlocks newBlocks = (length == 0)
> ? dfsClient.getLocatedBlocks(src, offset)
> : dfsClient.getLocatedBlocks(src, offset, length);
> if (newBlocks == null || newBlocks.locatedBlockCount() == 0) {
>   throw new EOFException("Could not find target position " + offset);
> }
> // Update the LastLocatedBlock, if offset is for last block.
> if (offset >= locatedBlocks.getFileLength()) {
>   setLocatedBlocksFields(newBlocks, getLastBlockLength(newBlocks));
> } else {
>   locatedBlocks.insertRange(targetBlockIdx,
>   

[jira] [Commented] (HDFS-17424) [FGL] DelegationTokenSecretManager supports fine-grained lock

2024-04-09 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-17424?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17835402#comment-17835402
 ] 

ASF GitHub Bot commented on HDFS-17424:
---

hadoop-yetus commented on PR #6696:
URL: https://github.com/apache/hadoop/pull/6696#issuecomment-2045093108

   :broken_heart: **-1 overall**
   
   
   
   
   
   
   | Vote | Subsystem | Runtime |  Logfile | Comment |
   |::|--:|:|::|:---:|
   | +0 :ok: |  reexec  |   0m 19s |  |  Docker mode activated.  |
    _ Prechecks _ |
   | +1 :green_heart: |  dupname  |   0m  1s |  |  No case conflicting files 
found.  |
   | +0 :ok: |  codespell  |   0m  0s |  |  codespell was not available.  |
   | +0 :ok: |  detsecrets  |   0m  0s |  |  detect-secrets was not available.  
|
   | +1 :green_heart: |  @author  |   0m  0s |  |  The patch does not contain 
any @author tags.  |
   | -1 :x: |  test4tests  |   0m  0s |  |  The patch doesn't appear to include 
any new or modified tests. Please justify why no new tests are needed for this 
patch. Also please list what manual steps were performed to verify this patch.  
|
    _ HDFS-17384 Compile Tests _ |
   | +1 :green_heart: |  mvninstall  |  31m 49s |  |  HDFS-17384 passed  |
   | +1 :green_heart: |  compile  |   0m 43s |  |  HDFS-17384 passed with JDK 
Ubuntu-11.0.22+7-post-Ubuntu-0ubuntu220.04.1  |
   | +1 :green_heart: |  compile  |   0m 42s |  |  HDFS-17384 passed with JDK 
Private Build-1.8.0_402-8u402-ga-2ubuntu1~20.04-b06  |
   | +1 :green_heart: |  checkstyle  |   0m 38s |  |  HDFS-17384 passed  |
   | +1 :green_heart: |  mvnsite  |   0m 44s |  |  HDFS-17384 passed  |
   | +1 :green_heart: |  javadoc  |   0m 46s |  |  HDFS-17384 passed with JDK 
Ubuntu-11.0.22+7-post-Ubuntu-0ubuntu220.04.1  |
   | +1 :green_heart: |  javadoc  |   1m 10s |  |  HDFS-17384 passed with JDK 
Private Build-1.8.0_402-8u402-ga-2ubuntu1~20.04-b06  |
   | +1 :green_heart: |  spotbugs  |   1m 45s |  |  HDFS-17384 passed  |
   | +1 :green_heart: |  shadedclient  |  21m 23s |  |  branch has no errors 
when building and testing our client artifacts.  |
    _ Patch Compile Tests _ |
   | +1 :green_heart: |  mvninstall  |   0m 37s |  |  the patch passed  |
   | +1 :green_heart: |  compile  |   0m 40s |  |  the patch passed with JDK 
Ubuntu-11.0.22+7-post-Ubuntu-0ubuntu220.04.1  |
   | +1 :green_heart: |  javac  |   0m 40s |  |  the patch passed  |
   | +1 :green_heart: |  compile  |   0m 36s |  |  the patch passed with JDK 
Private Build-1.8.0_402-8u402-ga-2ubuntu1~20.04-b06  |
   | +1 :green_heart: |  javac  |   0m 36s |  |  the patch passed  |
   | +1 :green_heart: |  blanks  |   0m  0s |  |  The patch has no blanks 
issues.  |
   | +1 :green_heart: |  checkstyle  |   0m 31s |  |  the patch passed  |
   | +1 :green_heart: |  mvnsite  |   0m 41s |  |  the patch passed  |
   | +1 :green_heart: |  javadoc  |   0m 31s |  |  the patch passed with JDK 
Ubuntu-11.0.22+7-post-Ubuntu-0ubuntu220.04.1  |
   | +1 :green_heart: |  javadoc  |   1m  1s |  |  the patch passed with JDK 
Private Build-1.8.0_402-8u402-ga-2ubuntu1~20.04-b06  |
   | +1 :green_heart: |  spotbugs  |   1m 45s |  |  the patch passed  |
   | +1 :green_heart: |  shadedclient  |  25m  9s |  |  patch has no errors 
when building and testing our client artifacts.  |
    _ Other Tests _ |
   | -1 :x: |  unit  | 207m 17s | 
[/patch-unit-hadoop-hdfs-project_hadoop-hdfs.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-6696/3/artifact/out/patch-unit-hadoop-hdfs-project_hadoop-hdfs.txt)
 |  hadoop-hdfs in the patch passed.  |
   | +1 :green_heart: |  asflicense  |   0m 32s |  |  The patch does not 
generate ASF License warnings.  |
   |  |   | 300m 27s |  |  |
   
   
   | Reason | Tests |
   |---:|:--|
   | Failed junit tests | hadoop.hdfs.tools.TestDFSAdmin |
   
   
   | Subsystem | Report/Notes |
   |--:|:-|
   | Docker | ClientAPI=1.45 ServerAPI=1.45 base: 
https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-6696/3/artifact/out/Dockerfile
 |
   | GITHUB PR | https://github.com/apache/hadoop/pull/6696 |
   | Optional Tests | dupname asflicense compile javac javadoc mvninstall 
mvnsite unit shadedclient spotbugs checkstyle codespell detsecrets |
   | uname | Linux c8f24b173cce 5.15.0-94-generic #104-Ubuntu SMP Tue Jan 9 
15:25:40 UTC 2024 x86_64 x86_64 x86_64 GNU/Linux |
   | Build tool | maven |
   | Personality | dev-support/bin/hadoop.sh |
   | git revision | HDFS-17384 / 50f97bf7cf488948649f5b41e934abec74f14928 |
   | Default Java | Private Build-1.8.0_402-8u402-ga-2ubuntu1~20.04-b06 |
   | Multi-JDK versions | 
/usr/lib/jvm/java-11-openjdk-amd64:Ubuntu-11.0.22+7-post-Ubuntu-0ubuntu220.04.1 
/usr/lib/jvm/java-8-openjdk-amd64:Private 
Build-1.8.0_402-8u402-ga-2ubuntu1~20.04-b06 |
   |  Test Results | 
https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-6696/3/testReport/ |
   | Max. 

[jira] [Commented] (HDFS-17455) Fix Client throw IndexOutOfBoundsException in DFSInputStream#fetchBlockAt

2024-04-09 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-17455?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17835355#comment-17835355
 ] 

ASF GitHub Bot commented on HDFS-17455:
---

Hexiaoqiao commented on PR #6710:
URL: https://github.com/apache/hadoop/pull/6710#issuecomment-2044900199

   Got it. My bad, my first thought was that an out-of-range offset would 
return -1 from the binary search, but that is actually not the case.




> Fix Client throw IndexOutOfBoundsException in DFSInputStream#fetchBlockAt
> -
>
> Key: HDFS-17455
> URL: https://issues.apache.org/jira/browse/HDFS-17455
> Project: Hadoop HDFS
>  Issue Type: Bug
>Reporter: Haiyang Hu
>Assignee: Haiyang Hu
>Priority: Major
>  Labels: pull-request-available
>
> When the client reads data and connects to the datanode, an 
> InvalidBlockTokenException is thrown because the datanode access token is 
> invalid at that time. The subsequent call to the fetchBlockAt method then 
> throws java.lang.IndexOutOfBoundsException, causing the read to fail.
> *Root cause:*
> * The HDFS file contains only one RBW block, with a block data size of 2048KB.
> * The client opens this file and seeks to the offset of 1024KB to read data.
> * The DFSInputStream#getBlockReader call connects to the datanode; because 
> the datanode access token is invalid at that time, it throws 
> InvalidBlockTokenException, and the subsequent call to 
> DFSInputStream#fetchBlockAt throws java.lang.IndexOutOfBoundsException.
> {code:java}
> private synchronized DatanodeInfo blockSeekTo(long target)
>  throws IOException {
>if (target >= getFileLength()) {
>// the target size is smaller than fileLength (completeBlockSize + 
> lastBlockBeingWrittenLength),
>// here at this time target is 1024 and getFileLength is 2048
>  throw new IOException("Attempted to read past end of file");
>}
>...
>while (true) {
>  ...
>  try {
>blockReader = getBlockReader(targetBlock, offsetIntoBlock,
>targetBlock.getBlockSize() - offsetIntoBlock, targetAddr,
>storageType, chosenNode);
>if(connectFailedOnce) {
>  DFSClient.LOG.info("Successfully connected to " + targetAddr +
> " for " + targetBlock.getBlock());
>}
>return chosenNode;
>  } catch (IOException ex) {
>...
>} else if (refetchToken > 0 && tokenRefetchNeeded(ex, targetAddr)) {
>  refetchToken--;
>  // Here will catch InvalidBlockTokenException.
>  fetchBlockAt(target);
>} else {
>  ...
>}
>  }
>}
>  }
> private LocatedBlock fetchBlockAt(long offset, long length, boolean useCache)
>   throws IOException {
> maybeRegisterBlockRefresh();
> synchronized(infoLock) {
>   // Here the locatedBlocks only contains one locatedBlock, at this time 
> the offset is 1024 and fileLength is 0,
>   // so the targetBlockIdx is -2
>   int targetBlockIdx = locatedBlocks.findBlock(offset);
>   if (targetBlockIdx < 0) { // block is not cached
> targetBlockIdx = LocatedBlocks.getInsertIndex(targetBlockIdx);
> // Here the targetBlockIdx is 1;
> useCache = false;
>   }
>   if (!useCache) { // fetch blocks
> final LocatedBlocks newBlocks = (length == 0)
> ? dfsClient.getLocatedBlocks(src, offset)
> : dfsClient.getLocatedBlocks(src, offset, length);
> if (newBlocks == null || newBlocks.locatedBlockCount() == 0) {
>   throw new EOFException("Could not find target position " + offset);
> }
> // Update the LastLocatedBlock, if offset is for last block.
> if (offset >= locatedBlocks.getFileLength()) {
>   setLocatedBlocksFields(newBlocks, getLastBlockLength(newBlocks));
> } else {
>   locatedBlocks.insertRange(targetBlockIdx,
>   newBlocks.getLocatedBlocks());
> }
>   }
>   // Here the locatedBlocks only contains one locatedBlock, so will throw 
> java.lang.IndexOutOfBoundsException: Index 1 out of bounds for length 1
>   return locatedBlocks.get(targetBlockIdx);
> }
>   }
> {code}
> The client exception:
> {code:java}
> java.lang.IndexOutOfBoundsException: Index 1 out of bounds for length 1
> at 
> java.base/jdk.internal.util.Preconditions.outOfBounds(Preconditions.java:64)
> at 
> java.base/jdk.internal.util.Preconditions.outOfBoundsCheckIndex(Preconditions.java:70)
> at 
> java.base/jdk.internal.util.Preconditions.checkIndex(Preconditions.java:266)
> at java.base/java.util.Objects.checkIndex(Objects.java:359)
> at java.base/java.util.ArrayList.get(ArrayList.java:427)
> at 
> org.apache.hadoop.hdfs.protocol.LocatedBlocks.get(LocatedBlocks.java:87)
> at 
> 

[jira] [Commented] (HDFS-17424) [FGL] DelegationTokenSecretManager supports fine-grained lock

2024-04-09 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-17424?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17835340#comment-17835340
 ] 

ASF GitHub Bot commented on HDFS-17424:
---

ZanderXu commented on PR #6696:
URL: https://github.com/apache/hadoop/pull/6696#issuecomment-2044641581

   > For sync edit logging, it may cause corruption by interspersing edits with 
the end/start segment edits.
   
   As HDFS-13112 said, rollEdits is not thread-safe, so it added hasReadLock(). 
But if we can make rollEdits thread-safe, DelegationTokenSecretManager would no 
longer need to use the FS lock. We can mark this as an improvement and complete 
it in milestone II.




> [FGL] DelegationTokenSecretManager supports fine-grained lock
> -
>
> Key: HDFS-17424
> URL: https://issues.apache.org/jira/browse/HDFS-17424
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>Reporter: ZanderXu
>Assignee: Yuanbo Liu
>Priority: Major
>  Labels: pull-request-available
>
> DelegationTokenSecretManager supports fine-grained lock






[jira] [Commented] (HDFS-17424) [FGL] DelegationTokenSecretManager supports fine-grained lock

2024-04-09 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-17424?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17835334#comment-17835334
 ] 

ASF GitHub Bot commented on HDFS-17424:
---

ZanderXu commented on code in PR #6696:
URL: https://github.com/apache/hadoop/pull/6696#discussion_r1557344652


##
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/security/token/delegation/DelegationTokenSecretManager.java:
##
@@ -401,7 +402,10 @@ protected void logExpireToken(final 
DelegationTokenIdentifier dtId)
   // closes the edit log files. Doing this inside the
   // fsn lock will prevent being interrupted when stopping
   // the secret manager.
-  namesystem.readLockInterruptibly();
+  // TODO: delegation token is a very independent system, so
+  // it's proper to use a separate r/w lock instead of the fs lock
+  // for getting/renewing/expiring/canceling tokens or updating the master key.

Review Comment:
   `logUpdateMasterKey` and `logExpireDelegationToken` need to write the edit 
log, so HDFS-13112 added `hasReadLock` for these methods. 
   
   So I think this FS lock is needed until we understand why `rollEdits` is not 
thread-safe, as HDFS-13112 said.
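
   As a side note, a minimal sketch of the separate read/write lock idea from
the TODO above (hypothetical class and method names, not the actual
DelegationTokenSecretManager; it deliberately ignores the edit-log ordering
concern from HDFS-13112, which is exactly why the FS lock cannot be dropped
yet):
   ```java
import java.util.concurrent.locks.ReentrantReadWriteLock;

class TokenManagerSketch {
  // Dedicated lock for token state, independent of the namesystem lock.
  private final ReentrantReadWriteLock tokenLock = new ReentrantReadWriteLock();

  void expireToken(String tokenId) {
    tokenLock.writeLock().lock();
    try {
      // remove the token and log the expiration
    } finally {
      tokenLock.writeLock().unlock();
    }
  }

  boolean verifyToken(String tokenId) {
    tokenLock.readLock().lock();
    try {
      return tokenId != null; // look up the token under the read lock
    } finally {
      tokenLock.readLock().unlock();
    }
  }
}
   ```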





> [FGL] DelegationTokenSecretManager supports fine-grained lock
> -
>
> Key: HDFS-17424
> URL: https://issues.apache.org/jira/browse/HDFS-17424
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>Reporter: ZanderXu
>Assignee: Yuanbo Liu
>Priority: Major
>  Labels: pull-request-available
>
> DelegationTokenSecretManager supports fine-grained lock






[jira] [Commented] (HDFS-17445) [FGL] All remaining operations support fine-grained locking

2024-04-09 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-17445?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17835333#comment-17835333
 ] 

ASF GitHub Bot commented on HDFS-17445:
---

ZanderXu commented on PR #6715:
URL: https://github.com/apache/hadoop/pull/6715#issuecomment-2044575542

   The failed UT `hadoop.hdfs.tools.TestDFSAdmin` passes locally.




> [FGL] All remaining operations support fine-grained locking
> ---
>
> Key: HDFS-17445
> URL: https://issues.apache.org/jira/browse/HDFS-17445
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>Reporter: ZanderXu
>Assignee: ZanderXu
>Priority: Major
>  Labels: pull-request-available
>







[jira] [Commented] (HDFS-17445) [FGL] All remaining operations support fine-grained locking

2024-04-09 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-17445?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17835326#comment-17835326
 ] 

ASF GitHub Bot commented on HDFS-17445:
---

hadoop-yetus commented on PR #6715:
URL: https://github.com/apache/hadoop/pull/6715#issuecomment-2044556675

   :broken_heart: **-1 overall**
   
   
   
   
   
   
   | Vote | Subsystem | Runtime |  Logfile | Comment |
   |::|--:|:|::|:---:|
   | +0 :ok: |  reexec  |   0m 46s |  |  Docker mode activated.  |
    _ Prechecks _ |
   | +1 :green_heart: |  dupname  |   0m  0s |  |  No case conflicting files 
found.  |
   | +0 :ok: |  codespell  |   0m  0s |  |  codespell was not available.  |
   | +0 :ok: |  detsecrets  |   0m  0s |  |  detect-secrets was not available.  
|
   | +1 :green_heart: |  @author  |   0m  0s |  |  The patch does not contain 
any @author tags.  |
   | +1 :green_heart: |  test4tests  |   0m  0s |  |  The patch appears to 
include 1 new or modified test files.  |
    _ HDFS-17384 Compile Tests _ |
   | +1 :green_heart: |  mvninstall  |  52m 22s |  |  HDFS-17384 passed  |
   | +1 :green_heart: |  compile  |   1m 22s |  |  HDFS-17384 passed with JDK 
Ubuntu-11.0.22+7-post-Ubuntu-0ubuntu220.04.1  |
   | +1 :green_heart: |  compile  |   1m 12s |  |  HDFS-17384 passed with JDK 
Private Build-1.8.0_402-8u402-ga-2ubuntu1~20.04-b06  |
   | +1 :green_heart: |  checkstyle  |   1m 14s |  |  HDFS-17384 passed  |
   | +1 :green_heart: |  mvnsite  |   1m 24s |  |  HDFS-17384 passed  |
   | +1 :green_heart: |  javadoc  |   1m  9s |  |  HDFS-17384 passed with JDK 
Ubuntu-11.0.22+7-post-Ubuntu-0ubuntu220.04.1  |
   | +1 :green_heart: |  javadoc  |   1m 42s |  |  HDFS-17384 passed with JDK 
Private Build-1.8.0_402-8u402-ga-2ubuntu1~20.04-b06  |
   | +1 :green_heart: |  spotbugs  |   3m 20s |  |  HDFS-17384 passed  |
   | +1 :green_heart: |  shadedclient  |  40m 27s |  |  branch has no errors 
when building and testing our client artifacts.  |
    _ Patch Compile Tests _ |
   | +1 :green_heart: |  mvninstall  |   1m 12s |  |  the patch passed  |
   | +1 :green_heart: |  compile  |   1m 17s |  |  the patch passed with JDK 
Ubuntu-11.0.22+7-post-Ubuntu-0ubuntu220.04.1  |
   | +1 :green_heart: |  javac  |   1m 17s |  |  the patch passed  |
   | +1 :green_heart: |  compile  |   1m  7s |  |  the patch passed with JDK 
Private Build-1.8.0_402-8u402-ga-2ubuntu1~20.04-b06  |
   | +1 :green_heart: |  javac  |   1m  7s |  |  the patch passed  |
   | +1 :green_heart: |  blanks  |   0m  0s |  |  The patch has no blanks 
issues.  |
   | +1 :green_heart: |  checkstyle  |   1m  4s |  |  
hadoop-hdfs-project/hadoop-hdfs: The patch generated 0 new + 414 unchanged - 5 
fixed = 414 total (was 419)  |
   | +1 :green_heart: |  mvnsite  |   1m 12s |  |  the patch passed  |
   | +1 :green_heart: |  javadoc  |   0m 57s |  |  the patch passed with JDK 
Ubuntu-11.0.22+7-post-Ubuntu-0ubuntu220.04.1  |
   | +1 :green_heart: |  javadoc  |   1m 34s |  |  the patch passed with JDK 
Private Build-1.8.0_402-8u402-ga-2ubuntu1~20.04-b06  |
   | +1 :green_heart: |  spotbugs  |   3m 18s |  |  the patch passed  |
   | +1 :green_heart: |  shadedclient  |  40m 18s |  |  patch has no errors 
when building and testing our client artifacts.  |
    _ Other Tests _ |
   | -1 :x: |  unit  | 266m 17s | 
[/patch-unit-hadoop-hdfs-project_hadoop-hdfs.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-6715/3/artifact/out/patch-unit-hadoop-hdfs-project_hadoop-hdfs.txt)
 |  hadoop-hdfs in the patch passed.  |
   | +1 :green_heart: |  asflicense  |   0m 42s |  |  The patch does not 
generate ASF License warnings.  |
   |  |   | 425m 19s |  |  |
   
   
   | Reason | Tests |
   |---:|:--|
   | Failed junit tests | hadoop.hdfs.tools.TestDFSAdmin |
   
   
   | Subsystem | Report/Notes |
   |--:|:-|
   | Docker | ClientAPI=1.44 ServerAPI=1.44 base: 
https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-6715/3/artifact/out/Dockerfile
 |
   | GITHUB PR | https://github.com/apache/hadoop/pull/6715 |
   | Optional Tests | dupname asflicense compile javac javadoc mvninstall 
mvnsite unit shadedclient spotbugs checkstyle codespell detsecrets |
   | uname | Linux 65d8160b42e7 5.15.0-94-generic #104-Ubuntu SMP Tue Jan 9 
15:25:40 UTC 2024 x86_64 x86_64 x86_64 GNU/Linux |
   | Build tool | maven |
   | Personality | dev-support/bin/hadoop.sh |
   | git revision | HDFS-17384 / 3f65a1e1117d75ba75394e6591960548a52db170 |
   | Default Java | Private Build-1.8.0_402-8u402-ga-2ubuntu1~20.04-b06 |
   | Multi-JDK versions | 
/usr/lib/jvm/java-11-openjdk-amd64:Ubuntu-11.0.22+7-post-Ubuntu-0ubuntu220.04.1 
/usr/lib/jvm/java-8-openjdk-amd64:Private 
Build-1.8.0_402-8u402-ga-2ubuntu1~20.04-b06 |
   |  Test Results | 
https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-6715/3/testReport/ |
   | Max. process+thread count | 2790 (vs. ulimit of 

[jira] [Commented] (HDFS-17455) Fix Client throw IndexOutOfBoundsException in DFSInputStream#fetchBlockAt

2024-04-09 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-17455?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17835310#comment-17835310
 ] 

ASF GitHub Bot commented on HDFS-17455:
---

ZanderXu commented on PR #6710:
URL: https://github.com/apache/hadoop/pull/6710#issuecomment-2044493479

   > I am a little confused about why the binary search returns -2 for a 
collection that contains only one element; IIUC, it should return -1 in this 
case. Please correct me if I missed something.
   
   The binary search will return -1 if the offset is less than the smallest 
offset in the list, and it will return -2 if the offset is greater than the 
maximum offset in the list.
   
   We can reproduce this with the following steps:
   
   1. Assume one file contains 20 completed blocks and 1 UC block.
   2. The current `locatedBlocks` only caches blocks 10 ~ 20.
   3. The binary search will return -1 if the offset is less than the offset of 
the first cached block, such as an offset within the first ten blocks.
   4. The binary search will return -2 if the offset is greater than the offset 
of the last cached block, such as an offset within the last UC block.
   
   
   Another bug is that the `currentBlock` should be relocated after 
`locatedBlocks` is refreshed.
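
   For reference, a minimal self-contained sketch (plain java.util, not the
DFSClient code) of the insertion-point encoding described above:
   ```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

public class InsertIndexSketch {
  public static void main(String[] args) {
    // One cached block whose start offset is 0 (cf. a file with a single
    // located block), searched with a target offset of 1024.
    List<Long> startOffsets = Arrays.asList(0L);

    // Collections.binarySearch returns -(insertionPoint) - 1 on a miss.
    // 1024 is greater than every cached offset, so the insertion point
    // is 1 and the result is -2; an offset below the first cached block
    // would give insertion point 0, i.e. -1.
    int idx = Collections.binarySearch(startOffsets, 1024L);
    System.out.println(idx);        // -2
    System.out.println(-idx - 1);   // 1, the insert index used by fetchBlockAt
  }
}
   ```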




> Fix Client throw IndexOutOfBoundsException in DFSInputStream#fetchBlockAt
> -
>
> Key: HDFS-17455
> URL: https://issues.apache.org/jira/browse/HDFS-17455
> Project: Hadoop HDFS
>  Issue Type: Bug
>Reporter: Haiyang Hu
>Assignee: Haiyang Hu
>Priority: Major
>  Labels: pull-request-available
>
> When the client reads data and connects to the datanode, an 
> InvalidBlockTokenException is thrown because the datanode access token is 
> invalid at that time. The subsequent call to the fetchBlockAt method then 
> throws java.lang.IndexOutOfBoundsException, causing the read to fail.
> *Root cause:*
> * The HDFS file contains only one RBW block, with a block data size of 2048KB.
> * The client opens this file and seeks to the offset of 1024KB to read data.
> * The DFSInputStream#getBlockReader call connects to the datanode; because 
> the datanode access token is invalid at that time, it throws 
> InvalidBlockTokenException, and the subsequent call to 
> DFSInputStream#fetchBlockAt throws java.lang.IndexOutOfBoundsException.
> {code:java}
> private synchronized DatanodeInfo blockSeekTo(long target)
>  throws IOException {
>if (target >= getFileLength()) {
>// the target size is smaller than fileLength (completeBlockSize + 
> lastBlockBeingWrittenLength),
>// here at this time target is 1024 and getFileLength is 2048
>  throw new IOException("Attempted to read past end of file");
>}
>...
>while (true) {
>  ...
>  try {
>blockReader = getBlockReader(targetBlock, offsetIntoBlock,
>targetBlock.getBlockSize() - offsetIntoBlock, targetAddr,
>storageType, chosenNode);
>if(connectFailedOnce) {
>  DFSClient.LOG.info("Successfully connected to " + targetAddr +
> " for " + targetBlock.getBlock());
>}
>return chosenNode;
>  } catch (IOException ex) {
>...
>} else if (refetchToken > 0 && tokenRefetchNeeded(ex, targetAddr)) {
>  refetchToken--;
>  // Here will catch InvalidBlockTokenException.
>  fetchBlockAt(target);
>} else {
>  ...
>}
>  }
>}
>  }
> private LocatedBlock fetchBlockAt(long offset, long length, boolean useCache)
>   throws IOException {
> maybeRegisterBlockRefresh();
> synchronized(infoLock) {
>   // Here the locatedBlocks only contains one locatedBlock, at this time 
> the offset is 1024 and fileLength is 0,
>   // so the targetBlockIdx is -2
>   int targetBlockIdx = locatedBlocks.findBlock(offset);
>   if (targetBlockIdx < 0) { // block is not cached
> targetBlockIdx = LocatedBlocks.getInsertIndex(targetBlockIdx);
> // Here the targetBlockIdx is 1;
> useCache = false;
>   }
>   if (!useCache) { // fetch blocks
> final LocatedBlocks newBlocks = (length == 0)
> ? dfsClient.getLocatedBlocks(src, offset)
> : dfsClient.getLocatedBlocks(src, offset, length);
> if (newBlocks == null || newBlocks.locatedBlockCount() == 0) {
>   throw new EOFException("Could not find target position " + offset);
> }
> // Update the LastLocatedBlock, if offset is for last block.
> if (offset >= locatedBlocks.getFileLength()) {
>   setLocatedBlocksFields(newBlocks, getLastBlockLength(newBlocks));
> } else {
>   locatedBlocks.insertRange(targetBlockIdx,
>   newBlocks.getLocatedBlocks());
> }
>   }
>   // 

[jira] [Commented] (HDFS-17455) Fix Client throw IndexOutOfBoundsException in DFSInputStream#fetchBlockAt

2024-04-09 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-17455?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17835294#comment-17835294
 ] 

ASF GitHub Bot commented on HDFS-17455:
---

Hexiaoqiao commented on PR #6710:
URL: https://github.com/apache/hadoop/pull/6710#issuecomment-2044371352

   @haiyang1987 Thanks for your work. In the description I see that you mentioned,
   ```
 // Here the locatedBlocks only contains one locatedBlock, at this time 
the offset is 1024 and fileLength is 0,
 // so the targetBlockIdx is -2
 int targetBlockIdx = locatedBlocks.findBlock(offset);
   ```
   This is the root cause here, right? I am a little confused about why the 
binary search returns -2 for a collection that contains only one element; IIUC, 
it should return -1 in this case. Please correct me if I missed something.




> Fix Client throw IndexOutOfBoundsException in DFSInputStream#fetchBlockAt
> -
>
> Key: HDFS-17455
> URL: https://issues.apache.org/jira/browse/HDFS-17455
> Project: Hadoop HDFS
>  Issue Type: Bug
>Reporter: Haiyang Hu
>Assignee: Haiyang Hu
>Priority: Major
>  Labels: pull-request-available
>
> When the client reads data and connects to the datanode, an 
> InvalidBlockTokenException is thrown because the datanode access token is 
> invalid at that time. The subsequent call to the fetchBlockAt method then 
> throws java.lang.IndexOutOfBoundsException, causing the read to fail.
> *Root cause:*
> * The HDFS file contains only one RBW block, with a block data size of 2048KB.
> * The client opens this file and seeks to the offset of 1024KB to read data.
> * The DFSInputStream#getBlockReader call connects to the datanode; because 
> the datanode access token is invalid at that time, it throws 
> InvalidBlockTokenException, and the subsequent call to 
> DFSInputStream#fetchBlockAt throws java.lang.IndexOutOfBoundsException.
> {code:java}
> private synchronized DatanodeInfo blockSeekTo(long target)
>  throws IOException {
>if (target >= getFileLength()) {
>// the target size is smaller than fileLength (completeBlockSize + 
> lastBlockBeingWrittenLength),
>// here at this time target is 1024 and getFileLength is 2048
>  throw new IOException("Attempted to read past end of file");
>}
>...
>while (true) {
>  ...
>  try {
>blockReader = getBlockReader(targetBlock, offsetIntoBlock,
>targetBlock.getBlockSize() - offsetIntoBlock, targetAddr,
>storageType, chosenNode);
>if(connectFailedOnce) {
>  DFSClient.LOG.info("Successfully connected to " + targetAddr +
> " for " + targetBlock.getBlock());
>}
>return chosenNode;
>  } catch (IOException ex) {
>...
>} else if (refetchToken > 0 && tokenRefetchNeeded(ex, targetAddr)) {
>  refetchToken--;
>  // Here will catch InvalidBlockTokenException.
>  fetchBlockAt(target);
>} else {
>  ...
>}
>  }
>}
>  }
> private LocatedBlock fetchBlockAt(long offset, long length, boolean useCache)
>   throws IOException {
> maybeRegisterBlockRefresh();
> synchronized(infoLock) {
>   // Here the locatedBlocks only contains one locatedBlock, at this time 
> the offset is 1024 and fileLength is 0,
>   // so the targetBlockIdx is -2
>   int targetBlockIdx = locatedBlocks.findBlock(offset);
>   if (targetBlockIdx < 0) { // block is not cached
> targetBlockIdx = LocatedBlocks.getInsertIndex(targetBlockIdx);
> // Here the targetBlockIdx is 1;
> useCache = false;
>   }
>   if (!useCache) { // fetch blocks
> final LocatedBlocks newBlocks = (length == 0)
> ? dfsClient.getLocatedBlocks(src, offset)
> : dfsClient.getLocatedBlocks(src, offset, length);
> if (newBlocks == null || newBlocks.locatedBlockCount() == 0) {
>   throw new EOFException("Could not find target position " + offset);
> }
> // Update the LastLocatedBlock, if offset is for last block.
> if (offset >= locatedBlocks.getFileLength()) {
>   setLocatedBlocksFields(newBlocks, getLastBlockLength(newBlocks));
> } else {
>   locatedBlocks.insertRange(targetBlockIdx,
>   newBlocks.getLocatedBlocks());
> }
>   }
>   // Here the locatedBlocks only contains one locatedBlock, so will throw 
> java.lang.IndexOutOfBoundsException: Index 1 out of bounds for length 1
>   return locatedBlocks.get(targetBlockIdx);
> }
>   }
> {code}
> The client exception:
> {code:java}
> java.lang.IndexOutOfBoundsException: Index 1 out of bounds for length 1
> at 
> 

[jira] [Updated] (HDFS-17458) Remove unnecessary BP lock in ReplicaMap

2024-04-09 Thread farmmamba (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-17458?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

farmmamba updated HDFS-17458:
-
Description: 
In HDFS-16429 we made LightWeightResizableGSet thread-safe, and in 
HDFS-16511 we changed some methods in ReplicaMap to acquire the read lock 
instead of the write lock.

This PR tries to remove the unnecessary block pool read lock further.

Recently, I performed stress tests on datanodes to measure their read/write 
operations per second.

Before removing these locks, a datanode could only achieve ~2K write ops. After 
optimizing, it can achieve more than 5K write ops.

> Remove unnecessary BP lock in ReplicaMap
> 
>
> Key: HDFS-17458
> URL: https://issues.apache.org/jira/browse/HDFS-17458
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: datanode
>Affects Versions: 3.4.0
>Reporter: farmmamba
>Assignee: farmmamba
>Priority: Major
>
> In HDFS-16429 we made LightWeightResizableGSet thread-safe, and in 
> HDFS-16511 we changed some methods in ReplicaMap to acquire the read lock 
> instead of the write lock.
> This PR tries to remove the unnecessary block pool read lock further.
> Recently, I performed stress tests on datanodes to measure their read/write 
> operations per second.
> Before removing these locks, a datanode could only achieve ~2K write ops. 
> After optimizing, it can achieve more than 5K write ops.
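
As a side note, a minimal self-contained sketch (hypothetical names, not the
actual ReplicaMap or LightWeightResizableGSet code) of why a map with internal
synchronization makes an external block pool read lock redundant for single
lookups:
{code:java}
import java.util.HashMap;
import java.util.Map;

class SyncedGSetSketch<K, V> {
  private final Map<K, V> inner = new HashMap<>();

  // Each operation is atomic on its own, analogous to the synchronized
  // methods HDFS-16429 introduced in LightWeightResizableGSet.
  public synchronized V get(K key) { return inner.get(key); }
  public synchronized V put(K key, V value) { return inner.put(key, value); }
}

class ReplicaMapSketch {
  private final SyncedGSetSketch<Long, String> replicas = new SyncedGSetSketch<>();

  // No external readLock()/unlock() pair is needed around a single lookup:
  // the set's own monitor already yields a consistent view of that entry.
  String getReplica(long blockId) {
    return replicas.get(blockId);
  }
}
{code}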






[jira] [Created] (HDFS-17458) Remove unnecessary BP lock in ReplicaMap

2024-04-09 Thread farmmamba (Jira)
farmmamba created HDFS-17458:


 Summary: Remove unnecessary BP lock in ReplicaMap
 Key: HDFS-17458
 URL: https://issues.apache.org/jira/browse/HDFS-17458
 Project: Hadoop HDFS
  Issue Type: Improvement
  Components: datanode
Affects Versions: 3.4.0
Reporter: farmmamba
Assignee: farmmamba





