[jira] [Commented] (HDFS-17460) Make All HDFS Server side component uniform use the value of `dfs.internal.nameservices` to override `dfs.nameservices`

2024-04-10 Thread xiaojunxiang (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-17460?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17835988#comment-17835988
 ] 

xiaojunxiang commented on HDFS-17460:
-

[~Keepromise] Hi, I have already added my reasons. Feel free to share your 
thoughts.

> Make All HDFS Server side component uniform use the value of 
> `dfs.internal.nameservices` to override `dfs.nameservices` 
> 
>
> Key: HDFS-17460
> URL: https://issues.apache.org/jira/browse/HDFS-17460
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: hdfs, rbf
>Reporter: xiaojunxiang
>Priority: Major
>
>   In HDFS, `dfs.internal.nameservices` is a server-side configuration, while 
> `dfs.nameservices` is both a server-side and HDFS client-side configuration.
>  
>   The "internal" in the name implies that it is meant for internal systems. On 
> the DataNode and JournalNode side, `dfs.internal.nameservices` already takes 
> priority over `dfs.nameservices`, which makes sense. *However,* on the ZKFC 
> and NameNode side, the internal configuration is not considered at all.
>  
>   In such a scenario, when I enable RBF, the NameNode side expects 
> `dfs.nameservices`=ns1,ns2 (it currently ignores `dfs.internal.nameservices`), 
> while the HDFS client side expects `dfs.nameservices`=ns1,ns2,nsRouter (it 
> also ignores `dfs.internal.nameservices`). This means the servers and the 
> client cannot share the same configuration file, which is inconvenient.
>  
>   If all server-side components are unified to use `dfs.internal.nameservices`, 
> that is, the value of `dfs.internal.nameservices` is used to override 
> `dfs.nameservices` at process startup, it brings the following benefits:
>   1. Both the servers and the HDFS client can use the same configuration: 
> `dfs.nameservices`=ns1,ns2,nsRouter, `dfs.internal.nameservices`=ns1,ns2.
>   2. Previously, the priority of `dfs.internal.nameservices` over 
> `dfs.nameservices` was only honored by the DataNode and JournalNode, while 
> the NameNode ignored `dfs.internal.nameservices`. This inconsistency among 
> internal services can be avoided, and unifying them better matches the 
> original semantics of the word "internal" as it pertains to internal systems.
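
For reference, a minimal sketch of the proposed startup-time override, assuming 
the existing Configuration API and DFSConfigKeys constants; the helper name is 
hypothetical and this is only an illustration of the idea, not the actual patch:

{code:java}
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hdfs.DFSConfigKeys;
import org.apache.hadoop.hdfs.HdfsConfiguration;

public final class InternalNameservicesOverride {
  private InternalNameservicesOverride() {}

  // Hypothetical helper: if dfs.internal.nameservices is set, let it replace
  // dfs.nameservices for this (server-side) process only.
  public static void applyInternalNameservices(Configuration conf) {
    String internal =
        conf.getTrimmed(DFSConfigKeys.DFS_INTERNAL_NAMESERVICES_KEY);
    if (internal != null && !internal.isEmpty()) {
      conf.set(DFSConfigKeys.DFS_NAMESERVICES, internal);
    }
  }

  public static void main(String[] args) {
    // Shared site config: dfs.nameservices=ns1,ns2,nsRouter
    //                     dfs.internal.nameservices=ns1,ns2
    Configuration conf = new HdfsConfiguration();
    applyInternalNameservices(conf);
    System.out.println(conf.get(DFSConfigKeys.DFS_NAMESERVICES));
  }
}
{code}

With such an override, the NameNode and ZKFC would resolve only ns1 and ns2, 
while an HDFS client reading the same configuration file would still see ns1, 
ns2 and nsRouter.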






[jira] [Updated] (HDFS-17460) Make All HDFS Server side component uniform use the value of `dfs.internal.nameservices` to override `dfs.nameservices`

2024-04-10 Thread xiaojunxiang (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-17460?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

xiaojunxiang updated HDFS-17460:

Description: 
  In HDFS, `dfs.internal.nameservices` is a server-side configuration, while 
`dfs.nameservices` is both a server-side and HDFS client-side configuration.
 
  The "internal" in the name implies that it is meant for internal systems. On 
the DataNode and JournalNode side, `dfs.internal.nameservices` already takes 
priority over `dfs.nameservices`, which makes sense. *However,* on the ZKFC 
and NameNode side, the internal configuration is not considered at all.
 
  In such a scenario, when I enable RBF, the NameNode side expects 
`dfs.nameservices`=ns1,ns2 (it currently ignores `dfs.internal.nameservices`), 
while the HDFS client side expects `dfs.nameservices`=ns1,ns2,nsRouter (it 
also ignores `dfs.internal.nameservices`). This means the servers and the 
client cannot share the same configuration file, which is inconvenient.
 
  If all server-side components are unified to use `dfs.internal.nameservices`, 
that is, the value of `dfs.internal.nameservices` is used to override 
`dfs.nameservices` at process startup, it brings the following benefits:
  1. Both the servers and the HDFS client can use the same configuration: 
`dfs.nameservices`=ns1,ns2,nsRouter, `dfs.internal.nameservices`=ns1,ns2.
  2. Previously, the priority of `dfs.internal.nameservices` over 
`dfs.nameservices` was only honored by the DataNode and JournalNode, while 
the NameNode ignored `dfs.internal.nameservices`. This inconsistency among 
internal services can be avoided, and unifying them better matches the 
original semantics of the word "internal" as it pertains to internal systems.

  was:
  In HDFS, dfs.internal.nameservices is a server-side configuration, while 
dfs.nameservices is both a server-side and client-side configuration.
 
  The "internal" in the name implies that it is meant for internal systems. On 
the DataNode and JournalNode side, dfs.internal.nameservices takes priority 
over dfs.nameservices, which makes sense. However, on the ZKFC and NameNode 
side, the internal configuration is not considered.
 
  In such a scenario, when I enable RBF, the NameNode side expects 
dfs.nameservices=ns1,ns2 (ignoring dfs.internal.nameservices), while the HDFS 
client side expects dfs.nameservices=ns1,ns2,nsRouter (ignoring 
dfs.internal.nameservices). This means they cannot share the same 
configuration.
 
  If all server-side components are unified to use dfs.internal.nameservices, 
that is, the value of dfs.internal.nameservices is used to override 
dfs.nameservices at process startup, it brings the following benefits:
  1. Both the servers and the client can use the same configuration: 
dfs.nameservices=ns1,ns2,nsRouter, dfs.internal.nameservices=ns1,ns2.
  2. Previously, the priority of dfs.internal.nameservices over 
dfs.nameservices was only honored by the DataNode and JournalNode, while the 
NameNode ignored dfs.internal.nameservices. This inconsistency among internal 
services can be avoided, and unifying them better matches the original 
semantics of the word "internal" as it pertains to internal systems.


> Make All HDFS Server side component uniform use the value of 
> `dfs.internal.nameservices` to override `dfs.nameservices` 
> 
>
> Key: HDFS-17460
> URL: https://issues.apache.org/jira/browse/HDFS-17460
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: hdfs, rbf
>Reporter: xiaojunxiang
>Priority: Major
>
>   In HDFS, `dfs.internal.nameservices` is a server-side configuration, while 
> `dfs.nameservices` is both a server-side and HDFS client-side configuration.
>  
>   The "internal" in the name implies that it is meant for internal systems. On 
> the DataNode and JournalNode side, `dfs.internal.nameservices` already takes 
> priority over `dfs.nameservices`, which makes sense. *However,* on the ZKFC 
> and NameNode side, the internal configuration is not considered at all.
>  
>   In such a scenario, when I enable RBF, the NameNode side expects 
> `dfs.nameservices`=ns1,ns2 (it currently ignores `dfs.internal.nameservices`), 
> while the HDFS client side expects `dfs.nameservices`=ns1,ns2,nsRouter (it 
> also ignores `dfs.internal.nameservices`). This means the servers and the 
> client cannot share the same configuration file, which is inconvenient.
>  
>   If I unify all 

[jira] [Updated] (HDFS-17460) Make All HDFS Server side component uniform use the value of `dfs.internal.nameservices` to override `dfs.nameservices`

2024-04-10 Thread xiaojunxiang (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-17460?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

xiaojunxiang updated HDFS-17460:

Description: 
  In HDFS, dfs.internal.nameservices is a server-side configuration, while 
dfs.nameservices is both a server-side and client-side configuration.
 
  The "internal" in the name implies that it is meant for internal systems. On 
the DataNode and JournalNode side, dfs.internal.nameservices takes priority 
over dfs.nameservices, which makes sense. However, on the ZKFC and NameNode 
side, the internal configuration is not considered.
 
  In such a scenario, when I enable RBF, the NameNode side expects 
dfs.nameservices=ns1,ns2 (ignoring dfs.internal.nameservices), while the HDFS 
client side expects dfs.nameservices=ns1,ns2,nsRouter (ignoring 
dfs.internal.nameservices). This means they cannot share the same 
configuration.
 
  If all server-side components are unified to use dfs.internal.nameservices, 
that is, the value of dfs.internal.nameservices is used to override 
dfs.nameservices at process startup, it brings the following benefits:
  1. Both the servers and the client can use the same configuration: 
dfs.nameservices=ns1,ns2,nsRouter, dfs.internal.nameservices=ns1,ns2.
  2. Previously, the priority of dfs.internal.nameservices over 
dfs.nameservices was only honored by the DataNode and JournalNode, while the 
NameNode ignored dfs.internal.nameservices. This inconsistency among internal 
services can be avoided, and unifying them better matches the original 
semantics of the word "internal" as it pertains to internal systems.

> Make All HDFS Server side component uniform use the value of 
> `dfs.internal.nameservices` to override `dfs.nameservices` 
> 
>
> Key: HDFS-17460
> URL: https://issues.apache.org/jira/browse/HDFS-17460
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: hdfs, rbf
>Reporter: xiaojunxiang
>Priority: Major
>
>   In HDFS, dfs.internal.nameservices is a server-side configuration, while 
> dfs.nameservices is both a server-side and client-side configuration.
>  
>   The "internal" in the name implies that it is meant for internal systems. On 
> the DataNode and JournalNode side, dfs.internal.nameservices takes priority 
> over dfs.nameservices, which makes sense. However, on the ZKFC and NameNode 
> side, the internal configuration is not considered.
>  
>   In such a scenario, when I enable RBF, the NameNode side expects 
> dfs.nameservices=ns1,ns2 (ignoring dfs.internal.nameservices), while the HDFS 
> client side expects dfs.nameservices=ns1,ns2,nsRouter (ignoring 
> dfs.internal.nameservices). This means they cannot share the same 
> configuration.
>  
>   If all server-side components are unified to use dfs.internal.nameservices, 
> that is, the value of dfs.internal.nameservices is used to override 
> dfs.nameservices at process startup, it brings the following benefits:
>   1. Both the servers and the client can use the same configuration: 
> dfs.nameservices=ns1,ns2,nsRouter, dfs.internal.nameservices=ns1,ns2.
>   2. Previously, the priority of dfs.internal.nameservices over 
> dfs.nameservices was only honored by the DataNode and JournalNode, while the 
> NameNode ignored dfs.internal.nameservices. This inconsistency among internal 
> services can be avoided, and unifying them better matches the original 
> semantics of the word "internal" as it pertains to internal systems.






[jira] [Commented] (HDFS-17453) IncrementalBlockReport can have race condition with Edit Log Tailer

2024-04-10 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-17453?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17835976#comment-17835976
 ] 

ASF GitHub Bot commented on HDFS-17453:
---

yuanboliu commented on PR #6708:
URL: https://github.com/apache/hadoop/pull/6708#issuecomment-2048874743

   @dannytbecker thanks for your work. I'm just wondering in what case one 
block ends up with three GS.




> IncrementalBlockReport can have race condition with Edit Log Tailer
> ---
>
> Key: HDFS-17453
> URL: https://issues.apache.org/jira/browse/HDFS-17453
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: auto-failover, ha, hdfs, namenode
>Affects Versions: 3.3.0, 3.3.1, 2.10.2, 3.3.2, 3.3.5, 3.3.4, 3.3.6
>Reporter: Danny Becker
>Assignee: Danny Becker
>Priority: Major
>  Labels: pull-request-available
> Fix For: 3.5.0
>
>
> h2. Summary
> There is a race condition between IncrementalBlockReports (IBR) and 
> EditLogTailer in Standby NameNode (SNN) which can lead to leaked IBRs and 
> false corrupt blocks after HA Failover. The race condition occurs when the 
> SNN loads the edit logs before it receives the block reports from DataNode 
> (DN).
> h2. Example
> In the following example there is a block (b1) with 3 generation stamps (gs1, 
> gs2, gs3).
>  # SNN1 loads edit logs for b1gs1 and b1gs2.
>  # DN1 sends the IBR for b1gs1 to SNN1.
>  # SNN1 will determine that the reported block b1gs1 from DN1 is corrupt and 
> it will be queued for later. 
> [BlockManager.java|https://github.com/apache/hadoop/blob/6ed73896f6e8b4b7c720eff64193cb30b3e77fb2/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/blockmanagement/BlockManager.java#L3447C1-L3464C6]
> {code:java}
>     BlockToMarkCorrupt c = checkReplicaCorrupt(
>         block, reportedState, storedBlock, ucState, dn);
>     if (c != null) {
>       if (shouldPostponeBlocksFromFuture) {
>         // If the block is an out-of-date generation stamp or state,
>         // but we're the standby, we shouldn't treat it as corrupt,
>         // but instead just queue it for later processing.
>         // Storing the reported block for later processing, as that is what
>         // comes from the IBR / FBR and hence what we should use to compare
>         // against the memory state.
>         // See HDFS-6289 and HDFS-15422 for more context.
>         queueReportedBlock(storageInfo, block, reportedState,
>             QUEUE_REASON_CORRUPT_STATE);
>       } else {
>         toCorrupt.add(c);
>       }
>       return storedBlock;
>     } {code}
>  # DN1 sends IBR for b1gs2 and b1gs3 to SNN1.
>  # SNN1 processes b1gs2 and updates the blocks map.
>  # SNN1 queues b1gs3 for later because it determines that b1gs3 is a future 
> genstamp.
>  # SNN1 loads b1gs3 edit logs and processes the queued reports for b1.
>  # SNN1 processes b1gs1 first and puts it back in the queue.
>  # SNN1 processes b1gs3 next and updates the blocks map.
>  # Later, SNN1 becomes the Active NameNode (ANN) during an HA Failover.
>  # SNN1 will catch up to the latest edit logs, then process all queued block 
> reports to become the ANN.
>  # ANN1 will process b1gs1 and mark it as corrupt.
> If the example above happens for every DN which stores b1, then when the HA 
> failover happens, b1 will be incorrectly marked as corrupt. This will be 
> fixed when the first DN sends a FullBlockReport or an IBR for b1.
> h2. Logs from Active Cluster
> I added the following logs to confirm this issue in an active cluster:
> {code:java}
> BlockToMarkCorrupt c = checkReplicaCorrupt(
> block, reportedState, storedBlock, ucState, dn);
> if (c != null) {
>   DatanodeStorageInfo storedStorageInfo = storedBlock.findStorageInfo(dn);
>   LOG.info("Found corrupt block {} [{}, {}] from DN {}. Stored block {} from 
> DN {}",
>   block, reportedState.name(), ucState.name(), storageInfo, storedBlock, 
> storedStorageInfo);
>   if (storageInfo.equals(storedStorageInfo) &&
> storedBlock.getGenerationStamp() > block.getGenerationStamp()) {
> LOG.info("Stored Block {} from the same DN {} has a newer GenStamp." +
> storedBlock, storedStorageInfo);
>   }
>   if (shouldPostponeBlocksFromFuture) {
> // If the block is an out-of-date generation stamp or state,
> // but we're the standby, we shouldn't treat it as corrupt,
> // but instead just queue it for later processing.
> // Storing the reported block for later processing, as that is what
> // comes from the IBR / FBR and hence what we should use to compare
> // against the memory state.
> // See HDFS-6289 and HDFS-15422 for more context.
> queueReportedBlock(storageInfo, block, reportedState,
> QUEUE_REASON_CORRUPT_STATE);
> LOG.info("Queueing the block {} 

[jira] [Commented] (HDFS-17434) Selector.select in SocketIOWithTimeout.java has significant overhead

2024-04-10 Thread ZanderXu (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-17434?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17835950#comment-17835950
 ] 

ZanderXu commented on HDFS-17434:
-

[~qinyuren] Thanks for involving me. Can you share a screenshot of the send 
buffer of the connection on the DN side? We need to confirm whether the 
connection is writable.

> Selector.select in SocketIOWithTimeout.java has significant overhead
> 
>
> Key: HDFS-17434
> URL: https://issues.apache.org/jira/browse/HDFS-17434
> Project: Hadoop HDFS
>  Issue Type: Test
>Reporter: qinyuren
>Priority: Major
> Attachments: image-2024-03-20-19-10-13-016.png, 
> image-2024-03-20-19-22-29-829.png, image-2024-03-20-19-24-02-233.png, 
> image-2024-03-20-19-55-18-378.png
>
>
> In our cluster, the SendDataPacketBlockedOnNetworkNanosAvgTime metric ranges 
> from 5ms to 10ms, exceeding the usual disk read overhead. Our machine's 
> network card bandwidth is 2Mb/s.
> !image-2024-03-20-19-10-13-016.png|width=662,height=135!
> !image-2024-03-20-19-55-18-378.png!
> By adding log printing, we found that the Selector.select call has 
> significant overhead.
> !image-2024-03-20-19-22-29-829.png|width=474,height=262!
> !image-2024-03-20-19-24-02-233.png|width=445,height=181!
> I would like to know if this falls within the normal range or how we can 
> improve it.
>  
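
As a point of reference, a minimal, self-contained sketch of the kind of timing 
instrumentation described above (illustrative only; this is not the actual 
SocketIOWithTimeout code, and nothing is registered with the selector, so the 
call simply blocks for the full timeout):

{code:java}
import java.nio.channels.Selector;
import java.util.concurrent.TimeUnit;

public class SelectTimingDemo {
  public static void main(String[] args) throws Exception {
    try (Selector selector = Selector.open()) {
      // Time a bare Selector.select(timeout) call, the same kind of call that
      // SocketIOWithTimeout waits on while a channel is not ready.
      long start = System.nanoTime();
      int ready = selector.select(10); // 10 ms timeout
      long elapsedMs =
          TimeUnit.NANOSECONDS.toMillis(System.nanoTime() - start);
      System.out.println("select() returned " + ready
          + " ready channels after " + elapsedMs + " ms");
    }
  }
}
{code}

Logging the elapsed time around the real select call in the same way is one way 
to confirm whether the observed 5ms-10ms is spent waiting in the selector 
rather than on disk reads.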






[jira] [Commented] (HDFS-17455) Fix Client throw IndexOutOfBoundsException in DFSInputStream#fetchBlockAt

2024-04-10 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-17455?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17835947#comment-17835947
 ] 

ASF GitHub Bot commented on HDFS-17455:
---

haiyang1987 commented on PR #6710:
URL: https://github.com/apache/hadoop/pull/6710#issuecomment-2048812653

   > Hi @haiyang1987. There is one SpotBugs finding and one failed unit test in 
the latest Yetus report that are not related to this change; would you mind 
fixing them later? Thanks.
   
   Sure, I will fix them later. Thanks~




> Fix Client throw IndexOutOfBoundsException in DFSInputStream#fetchBlockAt
> -
>
> Key: HDFS-17455
> URL: https://issues.apache.org/jira/browse/HDFS-17455
> Project: Hadoop HDFS
>  Issue Type: Bug
>Reporter: Haiyang Hu
>Assignee: Haiyang Hu
>Priority: Major
>  Labels: pull-request-available
>
> When the client reads data and connects to the DataNode, the DataNode access 
> token may be invalid at that moment, which causes an 
> InvalidBlockTokenException. The subsequent call to the fetchBlockAt method 
> then throws a java.lang.IndexOutOfBoundsException, causing the read to fail.
> *Root cause:*
> * The HDFS file contains only one RBW block, with a block data size of 2048KB.
> * The client opens this file and seeks to the offset of 1024KB to read data.
> * DFSInputStream#getBlockReader connects to the DataNode; because the DataNode 
> access token is invalid at this time, an InvalidBlockTokenException is thrown, 
> and the subsequent call to DFSInputStream#fetchBlockAt throws a 
> java.lang.IndexOutOfBoundsException.
> {code:java}
> private synchronized DatanodeInfo blockSeekTo(long target)
>  throws IOException {
>if (target >= getFileLength()) {
>// the target size is smaller than fileLength (completeBlockSize + 
> lastBlockBeingWrittenLength),
>// here at this time target is 1024 and getFileLength is 2048
>  throw new IOException("Attempted to read past end of file");
>}
>...
>while (true) {
>  ...
>  try {
>blockReader = getBlockReader(targetBlock, offsetIntoBlock,
>targetBlock.getBlockSize() - offsetIntoBlock, targetAddr,
>storageType, chosenNode);
>if(connectFailedOnce) {
>  DFSClient.LOG.info("Successfully connected to " + targetAddr +
> " for " + targetBlock.getBlock());
>}
>return chosenNode;
>  } catch (IOException ex) {
>...
>} else if (refetchToken > 0 && tokenRefetchNeeded(ex, targetAddr)) {
>  refetchToken--;
>  // Here will catch InvalidBlockTokenException.
>  fetchBlockAt(target);
>} else {
>  ...
>}
>  }
>}
>  }
> private LocatedBlock fetchBlockAt(long offset, long length, boolean useCache)
>   throws IOException {
> maybeRegisterBlockRefresh();
> synchronized(infoLock) {
>   // Here the locatedBlocks only contains one locatedBlock, at this time 
> the offset is 1024 and fileLength is 0,
>   // so the targetBlockIdx is -2
>   int targetBlockIdx = locatedBlocks.findBlock(offset);
>   if (targetBlockIdx < 0) { // block is not cached
> targetBlockIdx = LocatedBlocks.getInsertIndex(targetBlockIdx);
> // Here the targetBlockIdx is 1;
> useCache = false;
>   }
>   if (!useCache) { // fetch blocks
> final LocatedBlocks newBlocks = (length == 0)
> ? dfsClient.getLocatedBlocks(src, offset)
> : dfsClient.getLocatedBlocks(src, offset, length);
> if (newBlocks == null || newBlocks.locatedBlockCount() == 0) {
>   throw new EOFException("Could not find target position " + offset);
> }
> // Update the LastLocatedBlock, if offset is for last block.
> if (offset >= locatedBlocks.getFileLength()) {
>   setLocatedBlocksFields(newBlocks, getLastBlockLength(newBlocks));
> } else {
>   locatedBlocks.insertRange(targetBlockIdx,
>   newBlocks.getLocatedBlocks());
> }
>   }
>   // Here the locatedBlocks only contains one locatedBlock, so will throw 
> java.lang.IndexOutOfBoundsException: Index 1 out of bounds for length 1
>   return locatedBlocks.get(targetBlockIdx);
> }
>   }
> {code}
> The client exception:
> {code:java}
> java.lang.IndexOutOfBoundsException: Index 1 out of bounds for length 1
> at 
> java.base/jdk.internal.util.Preconditions.outOfBounds(Preconditions.java:64)
> at 
> java.base/jdk.internal.util.Preconditions.outOfBoundsCheckIndex(Preconditions.java:70)
> at 
> java.base/jdk.internal.util.Preconditions.checkIndex(Preconditions.java:266)
> at java.base/java.util.Objects.checkIndex(Objects.java:359)
> at 
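
The index arithmetic quoted above can be reproduced in isolation. A minimal, 
self-contained sketch (illustrative only, not HDFS code) of why a binary-search 
miss for offset 1024 against a single cached block returns -2, and why 
LocatedBlocks.getInsertIndex then yields 1, which is out of bounds for a 
one-element list:

{code:java}
import java.util.Collections;
import java.util.List;

public class InsertIndexDemo {
  public static void main(String[] args) {
    // The client's cached view: a single located block starting at offset 0.
    List<Long> blockStartOffsets = List.of(0L);

    // binarySearch encodes a miss as -(insertionPoint) - 1, so searching for
    // offset 1024 returns -(1) - 1 = -2.
    int result = Collections.binarySearch(blockStartOffsets, 1024L);

    // getInsertIndex(-2) recovers the insertion point: -(-2 + 1) = 1.
    int insertIndex = -(result + 1);

    System.out.println("binarySearch = " + result
        + ", insert index = " + insertIndex);
    // locatedBlocks.get(1) on a one-element list is exactly the reported
    // IndexOutOfBoundsException: "Index 1 out of bounds for length 1".
  }
}
{code}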

[jira] [Resolved] (HDFS-17453) IncrementalBlockReport can have race condition with Edit Log Tailer

2024-04-10 Thread Jira


 [ 
https://issues.apache.org/jira/browse/HDFS-17453?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Íñigo Goiri resolved HDFS-17453.

Fix Version/s: 3.5.0
 Hadoop Flags: Reviewed
   Resolution: Fixed

> IncrementalBlockReport can have race condition with Edit Log Tailer
> ---
>
> Key: HDFS-17453
> URL: https://issues.apache.org/jira/browse/HDFS-17453
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: auto-failover, ha, hdfs, namenode
>Affects Versions: 3.3.0, 3.3.1, 2.10.2, 3.3.2, 3.3.5, 3.3.4, 3.3.6
>Reporter: Danny Becker
>Assignee: Danny Becker
>Priority: Major
>  Labels: pull-request-available
> Fix For: 3.5.0
>
>
> h2. Summary
> There is a race condition between IncrementalBlockReports (IBR) and 
> EditLogTailer in Standby NameNode (SNN) which can lead to leaked IBRs and 
> false corrupt blocks after HA Failover. The race condition occurs when the 
> SNN loads the edit logs before it receives the block reports from DataNode 
> (DN).
> h2. Example
> In the following example there is a block (b1) with 3 generation stamps (gs1, 
> gs2, gs3).
>  # SNN1 loads edit logs for b1gs1 and b1gs2.
>  # DN1 sends the IBR for b1gs1 to SNN1.
>  # SNN1 will determine that the reported block b1gs1 from DN1 is corrupt and 
> it will be queued for later. 
> [BlockManager.java|https://github.com/apache/hadoop/blob/6ed73896f6e8b4b7c720eff64193cb30b3e77fb2/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/blockmanagement/BlockManager.java#L3447C1-L3464C6]
> {code:java}
>     BlockToMarkCorrupt c = checkReplicaCorrupt(
>         block, reportedState, storedBlock, ucState, dn);
>     if (c != null) {
>       if (shouldPostponeBlocksFromFuture) {
>         // If the block is an out-of-date generation stamp or state,
>         // but we're the standby, we shouldn't treat it as corrupt,
>         // but instead just queue it for later processing.
>         // Storing the reported block for later processing, as that is what
>         // comes from the IBR / FBR and hence what we should use to compare
>         // against the memory state.
>         // See HDFS-6289 and HDFS-15422 for more context.
>         queueReportedBlock(storageInfo, block, reportedState,
>             QUEUE_REASON_CORRUPT_STATE);
>       } else {
>         toCorrupt.add(c);
>       }
>       return storedBlock;
>     } {code}
>  # DN1 sends IBR for b1gs2 and b1gs3 to SNN1.
>  # SNN1 processes b1gs2 and updates the blocks map.
>  # SNN1 queues b1gs3 for later because it determines that b1gs3 is a future 
> genstamp.
>  # SNN1 loads b1gs3 edit logs and processes the queued reports for b1.
>  # SNN1 processes b1gs1 first and puts it back in the queue.
>  # SNN1 processes b1gs3 next and updates the blocks map.
>  # Later, SNN1 becomes the Active NameNode (ANN) during an HA Failover.
>  # SNN1 will catch up to the latest edit logs, then process all queued block 
> reports to become the ANN.
>  # ANN1 will process b1gs1 and mark it as corrupt.
> If the example above happens for every DN which stores b1, then when the HA 
> failover happens, b1 will be incorrectly marked as corrupt. This will be 
> fixed when the first DN sends a FullBlockReport or an IBR for b1.
> h2. Logs from Active Cluster
> I added the following logs to confirm this issue in an active cluster:
> {code:java}
> BlockToMarkCorrupt c = checkReplicaCorrupt(
> block, reportedState, storedBlock, ucState, dn);
> if (c != null) {
>   DatanodeStorageInfo storedStorageInfo = storedBlock.findStorageInfo(dn);
>   LOG.info("Found corrupt block {} [{}, {}] from DN {}. Stored block {} from 
> DN {}",
>   block, reportedState.name(), ucState.name(), storageInfo, storedBlock, 
> storedStorageInfo);
>   if (storageInfo.equals(storedStorageInfo) &&
> storedBlock.getGenerationStamp() > block.getGenerationStamp()) {
> LOG.info("Stored Block {} from the same DN {} has a newer GenStamp." +
> storedBlock, storedStorageInfo);
>   }
>   if (shouldPostponeBlocksFromFuture) {
> // If the block is an out-of-date generation stamp or state,
> // but we're the standby, we shouldn't treat it as corrupt,
> // but instead just queue it for later processing.
> // Storing the reported block for later processing, as that is what
> // comes from the IBR / FBR and hence what we should use to compare
> // against the memory state.
> // See HDFS-6289 and HDFS-15422 for more context.
> queueReportedBlock(storageInfo, block, reportedState,
> QUEUE_REASON_CORRUPT_STATE);
> LOG.info("Queueing the block {} for later processing", block);
>   } else {
> toCorrupt.add(c);
> LOG.info("Marking the block {} as corrupt", block);
>   }
>   return storedBlock;
> } {code}
>  
> Logs 

[jira] [Commented] (HDFS-17453) IncrementalBlockReport can have race condition with Edit Log Tailer

2024-04-10 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-17453?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17835815#comment-17835815
 ] 

ASF GitHub Bot commented on HDFS-17453:
---

goiri merged PR #6708:
URL: https://github.com/apache/hadoop/pull/6708




> IncrementalBlockReport can have race condition with Edit Log Tailer
> ---
>
> Key: HDFS-17453
> URL: https://issues.apache.org/jira/browse/HDFS-17453
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: auto-failover, ha, hdfs, namenode
>Affects Versions: 3.3.0, 3.3.1, 2.10.2, 3.3.2, 3.3.5, 3.3.4, 3.3.6
>Reporter: Danny Becker
>Assignee: Danny Becker
>Priority: Major
>  Labels: pull-request-available
>
> h2. Summary
> There is a race condition between IncrementalBlockReports (IBR) and 
> EditLogTailer in Standby NameNode (SNN) which can lead to leaked IBRs and 
> false corrupt blocks after HA Failover. The race condition occurs when the 
> SNN loads the edit logs before it receives the block reports from DataNode 
> (DN).
> h2. Example
> In the following example there is a block (b1) with 3 generation stamps (gs1, 
> gs2, gs3).
>  # SNN1 loads edit logs for b1gs1 and b1gs2.
>  # DN1 sends the IBR for b1gs1 to SNN1.
>  # SNN1 will determine that the reported block b1gs1 from DN1 is corrupt and 
> it will be queued for later. 
> [BlockManager.java|https://github.com/apache/hadoop/blob/6ed73896f6e8b4b7c720eff64193cb30b3e77fb2/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/blockmanagement/BlockManager.java#L3447C1-L3464C6]
> {code:java}
>     BlockToMarkCorrupt c = checkReplicaCorrupt(
>         block, reportedState, storedBlock, ucState, dn);
>     if (c != null) {
>       if (shouldPostponeBlocksFromFuture) {
>         // If the block is an out-of-date generation stamp or state,
>         // but we're the standby, we shouldn't treat it as corrupt,
>         // but instead just queue it for later processing.
>         // Storing the reported block for later processing, as that is what
>         // comes from the IBR / FBR and hence what we should use to compare
>         // against the memory state.
>         // See HDFS-6289 and HDFS-15422 for more context.
>         queueReportedBlock(storageInfo, block, reportedState,
>             QUEUE_REASON_CORRUPT_STATE);
>       } else {
>         toCorrupt.add(c);
>       }
>       return storedBlock;
>     } {code}
>  # DN1 sends IBR for b1gs2 and b1gs3 to SNN1.
>  # SNN1 processes b1gs2 and updates the blocks map.
>  # SNN1 queues b1gs3 for later because it determines that b1gs3 is a future 
> genstamp.
>  # SNN1 loads b1gs3 edit logs and processes the queued reports for b1.
>  # SNN1 processes b1gs1 first and puts it back in the queue.
>  # SNN1 processes b1gs3 next and updates the blocks map.
>  # Later, SNN1 becomes the Active NameNode (ANN) during an HA Failover.
>  # SNN1 will catch up to the latest edit logs, then process all queued block 
> reports to become the ANN.
>  # ANN1 will process b1gs1 and mark it as corrupt.
> If the example above happens for every DN which stores b1, then when the HA 
> failover happens, b1 will be incorrectly marked as corrupt. This will be 
> fixed when the first DN sends a FullBlockReport or an IBR for b1.
> h2. Logs from Active Cluster
> I added the following logs to confirm this issue in an active cluster:
> {code:java}
> BlockToMarkCorrupt c = checkReplicaCorrupt(
> block, reportedState, storedBlock, ucState, dn);
> if (c != null) {
>   DatanodeStorageInfo storedStorageInfo = storedBlock.findStorageInfo(dn);
>   LOG.info("Found corrupt block {} [{}, {}] from DN {}. Stored block {} from 
> DN {}",
>   block, reportedState.name(), ucState.name(), storageInfo, storedBlock, 
> storedStorageInfo);
>   if (storageInfo.equals(storedStorageInfo) &&
> storedBlock.getGenerationStamp() > block.getGenerationStamp()) {
> LOG.info("Stored Block {} from the same DN {} has a newer GenStamp." +
> storedBlock, storedStorageInfo);
>   }
>   if (shouldPostponeBlocksFromFuture) {
> // If the block is an out-of-date generation stamp or state,
> // but we're the standby, we shouldn't treat it as corrupt,
> // but instead just queue it for later processing.
> // Storing the reported block for later processing, as that is what
> // comes from the IBR / FBR and hence what we should use to compare
> // against the memory state.
> // See HDFS-6289 and HDFS-15422 for more context.
> queueReportedBlock(storageInfo, block, reportedState,
> QUEUE_REASON_CORRUPT_STATE);
> LOG.info("Queueing the block {} for later processing", block);
>   } else {
> toCorrupt.add(c);
> LOG.info("Marking the block {} as corrupt", block);
>   }
>   return storedBlock;
> } {code}
>  

[jira] [Commented] (HDFS-17458) Remove unnecessary BP lock in ReplicaMap

2024-04-10 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-17458?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17835776#comment-17835776
 ] 

ASF GitHub Bot commented on HDFS-17458:
---

hadoop-yetus commented on PR #6717:
URL: https://github.com/apache/hadoop/pull/6717#issuecomment-2047785996

   :broken_heart: **-1 overall**
   
   
   
   
   
   
   | Vote | Subsystem | Runtime |  Logfile | Comment |
   |::|--:|:|::|:---:|
   | +0 :ok: |  reexec  |   6m 41s |  |  Docker mode activated.  |
    _ Prechecks _ |
   | +1 :green_heart: |  dupname  |   0m  0s |  |  No case conflicting files 
found.  |
   | +0 :ok: |  codespell  |   0m  0s |  |  codespell was not available.  |
   | +0 :ok: |  detsecrets  |   0m  0s |  |  detect-secrets was not available.  
|
   | +1 :green_heart: |  @author  |   0m  0s |  |  The patch does not contain 
any @author tags.  |
   | -1 :x: |  test4tests  |   0m  0s |  |  The patch doesn't appear to include 
any new or modified tests. Please justify why no new tests are needed for this 
patch. Also please list what manual steps were performed to verify this patch.  
|
    _ trunk Compile Tests _ |
   | +1 :green_heart: |  mvninstall  |  32m 30s |  |  trunk passed  |
   | +1 :green_heart: |  compile  |   0m 42s |  |  trunk passed with JDK 
Ubuntu-11.0.22+7-post-Ubuntu-0ubuntu220.04.1  |
   | +1 :green_heart: |  compile  |   0m 40s |  |  trunk passed with JDK 
Private Build-1.8.0_402-8u402-ga-2ubuntu1~20.04-b06  |
   | +1 :green_heart: |  checkstyle  |   0m 40s |  |  trunk passed  |
   | +1 :green_heart: |  mvnsite  |   0m 45s |  |  trunk passed  |
   | +1 :green_heart: |  javadoc  |   0m 44s |  |  trunk passed with JDK 
Ubuntu-11.0.22+7-post-Ubuntu-0ubuntu220.04.1  |
   | +1 :green_heart: |  javadoc  |   1m  7s |  |  trunk passed with JDK 
Private Build-1.8.0_402-8u402-ga-2ubuntu1~20.04-b06  |
   | +1 :green_heart: |  spotbugs  |   1m 42s |  |  trunk passed  |
   | +1 :green_heart: |  shadedclient  |  21m 31s |  |  branch has no errors 
when building and testing our client artifacts.  |
    _ Patch Compile Tests _ |
   | +1 :green_heart: |  mvninstall  |   0m 38s |  |  the patch passed  |
   | +1 :green_heart: |  compile  |   0m 41s |  |  the patch passed with JDK 
Ubuntu-11.0.22+7-post-Ubuntu-0ubuntu220.04.1  |
   | +1 :green_heart: |  javac  |   0m 41s |  |  the patch passed  |
   | +1 :green_heart: |  compile  |   0m 35s |  |  the patch passed with JDK 
Private Build-1.8.0_402-8u402-ga-2ubuntu1~20.04-b06  |
   | +1 :green_heart: |  javac  |   0m 35s |  |  the patch passed  |
   | +1 :green_heart: |  blanks  |   0m  0s |  |  The patch has no blanks 
issues.  |
   | +1 :green_heart: |  checkstyle  |   0m 28s |  |  the patch passed  |
   | +1 :green_heart: |  mvnsite  |   0m 39s |  |  the patch passed  |
   | +1 :green_heart: |  javadoc  |   0m 31s |  |  the patch passed with JDK 
Ubuntu-11.0.22+7-post-Ubuntu-0ubuntu220.04.1  |
   | +1 :green_heart: |  javadoc  |   1m  0s |  |  the patch passed with JDK 
Private Build-1.8.0_402-8u402-ga-2ubuntu1~20.04-b06  |
   | -1 :x: |  spotbugs  |   1m 46s | 
[/new-spotbugs-hadoop-hdfs-project_hadoop-hdfs.html](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-6717/1/artifact/out/new-spotbugs-hadoop-hdfs-project_hadoop-hdfs.html)
 |  hadoop-hdfs-project/hadoop-hdfs generated 1 new + 0 unchanged - 0 fixed = 1 
total (was 0)  |
   | +1 :green_heart: |  shadedclient  |  22m  9s |  |  patch has no errors 
when building and testing our client artifacts.  |
    _ Other Tests _ |
   | -1 :x: |  unit  | 204m  2s | 
[/patch-unit-hadoop-hdfs-project_hadoop-hdfs.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-6717/1/artifact/out/patch-unit-hadoop-hdfs-project_hadoop-hdfs.txt)
 |  hadoop-hdfs in the patch passed.  |
   | +1 :green_heart: |  asflicense  |   0m 32s |  |  The patch does not 
generate ASF License warnings.  |
   |  |   | 301m 12s |  |  |
   
   
   | Reason | Tests |
   |---:|:--|
   | SpotBugs | module:hadoop-hdfs-project/hadoop-hdfs |
   |  |  Return value of putIfAbsent is ignored, but curSet is reused in 
org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.ReplicaMap.mergeAll(ReplicaMap)
  At ReplicaMap.java:ignored, but curSet is reused in 
org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.ReplicaMap.mergeAll(ReplicaMap)
  At ReplicaMap.java:[line 178] |
   | Failed junit tests | hadoop.hdfs.tools.TestDFSAdmin |
   
   
   | Subsystem | Report/Notes |
   |--:|:-|
   | Docker | ClientAPI=1.45 ServerAPI=1.45 base: 
https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-6717/1/artifact/out/Dockerfile
 |
   | GITHUB PR | https://github.com/apache/hadoop/pull/6717 |
   | Optional Tests | dupname asflicense compile javac javadoc mvninstall 
mvnsite unit shadedclient spotbugs checkstyle codespell detsecrets |
   | uname | Linux bf96324ae525 5.15.0-94-generic #104-Ubuntu SMP Tue Jan 9 
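
The SpotBugs finding above ("Return value of putIfAbsent is ignored, but curSet 
is reused") refers to a general ConcurrentMap pitfall. A minimal sketch of the 
pattern SpotBugs expects, using generic names rather than the actual ReplicaMap 
fields (illustrative only):

{code:java}
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

public class PutIfAbsentDemo {
  public static void main(String[] args) {
    ConcurrentMap<String, Set<Long>> blocksByPool = new ConcurrentHashMap<>();

    Set<Long> curSet = ConcurrentHashMap.newKeySet();
    // putIfAbsent returns the previously mapped value, or null if none.
    Set<Long> previous = blocksByPool.putIfAbsent("bp-1", curSet);
    if (previous != null) {
      // Another thread won the race: keep using the set that is actually in
      // the map, otherwise later additions go to an orphaned set.
      curSet = previous;
    }
    curSet.add(1073741825L);

    System.out.println(blocksByPool.get("bp-1"));
  }
}
{code}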

[jira] [Commented] (HDFS-17455) Fix Client throw IndexOutOfBoundsException in DFSInputStream#fetchBlockAt

2024-04-10 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-17455?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17835773#comment-17835773
 ] 

ASF GitHub Bot commented on HDFS-17455:
---

Hexiaoqiao commented on PR #6710:
URL: https://github.com/apache/hadoop/pull/6710#issuecomment-2047772604

   Hi @haiyang1987. There is one SpotBugs finding and one failed unit test in 
the latest Yetus report that are not related to this change; would you mind 
fixing them later? Thanks.




> Fix Client throw IndexOutOfBoundsException in DFSInputStream#fetchBlockAt
> -
>
> Key: HDFS-17455
> URL: https://issues.apache.org/jira/browse/HDFS-17455
> Project: Hadoop HDFS
>  Issue Type: Bug
>Reporter: Haiyang Hu
>Assignee: Haiyang Hu
>Priority: Major
>  Labels: pull-request-available
>
> When the client reads data and connects to the DataNode, the DataNode access 
> token may be invalid at that moment, which causes an 
> InvalidBlockTokenException. The subsequent call to the fetchBlockAt method 
> then throws a java.lang.IndexOutOfBoundsException, causing the read to fail.
> *Root cause:*
> * The HDFS file contains only one RBW block, with a block data size of 2048KB.
> * The client opens this file and seeks to the offset of 1024KB to read data.
> * DFSInputStream#getBlockReader connects to the DataNode; because the DataNode 
> access token is invalid at this time, an InvalidBlockTokenException is thrown, 
> and the subsequent call to DFSInputStream#fetchBlockAt throws a 
> java.lang.IndexOutOfBoundsException.
> {code:java}
> private synchronized DatanodeInfo blockSeekTo(long target)
>  throws IOException {
>if (target >= getFileLength()) {
>// the target size is smaller than fileLength (completeBlockSize + 
> lastBlockBeingWrittenLength),
>// here at this time target is 1024 and getFileLength is 2048
>  throw new IOException("Attempted to read past end of file");
>}
>...
>while (true) {
>  ...
>  try {
>blockReader = getBlockReader(targetBlock, offsetIntoBlock,
>targetBlock.getBlockSize() - offsetIntoBlock, targetAddr,
>storageType, chosenNode);
>if(connectFailedOnce) {
>  DFSClient.LOG.info("Successfully connected to " + targetAddr +
> " for " + targetBlock.getBlock());
>}
>return chosenNode;
>  } catch (IOException ex) {
>...
>} else if (refetchToken > 0 && tokenRefetchNeeded(ex, targetAddr)) {
>  refetchToken--;
>  // Here will catch InvalidBlockTokenException.
>  fetchBlockAt(target);
>} else {
>  ...
>}
>  }
>}
>  }
> private LocatedBlock fetchBlockAt(long offset, long length, boolean useCache)
>   throws IOException {
> maybeRegisterBlockRefresh();
> synchronized(infoLock) {
>   // Here the locatedBlocks only contains one locatedBlock, at this time 
> the offset is 1024 and fileLength is 0,
>   // so the targetBlockIdx is -2
>   int targetBlockIdx = locatedBlocks.findBlock(offset);
>   if (targetBlockIdx < 0) { // block is not cached
> targetBlockIdx = LocatedBlocks.getInsertIndex(targetBlockIdx);
> // Here the targetBlockIdx is 1;
> useCache = false;
>   }
>   if (!useCache) { // fetch blocks
> final LocatedBlocks newBlocks = (length == 0)
> ? dfsClient.getLocatedBlocks(src, offset)
> : dfsClient.getLocatedBlocks(src, offset, length);
> if (newBlocks == null || newBlocks.locatedBlockCount() == 0) {
>   throw new EOFException("Could not find target position " + offset);
> }
> // Update the LastLocatedBlock, if offset is for last block.
> if (offset >= locatedBlocks.getFileLength()) {
>   setLocatedBlocksFields(newBlocks, getLastBlockLength(newBlocks));
> } else {
>   locatedBlocks.insertRange(targetBlockIdx,
>   newBlocks.getLocatedBlocks());
> }
>   }
>   // Here the locatedBlocks only contains one locatedBlock, so will throw 
> java.lang.IndexOutOfBoundsException: Index 1 out of bounds for length 1
>   return locatedBlocks.get(targetBlockIdx);
> }
>   }
> {code}
> The client exception:
> {code:java}
> java.lang.IndexOutOfBoundsException: Index 1 out of bounds for length 1
> at 
> java.base/jdk.internal.util.Preconditions.outOfBounds(Preconditions.java:64)
> at 
> java.base/jdk.internal.util.Preconditions.outOfBoundsCheckIndex(Preconditions.java:70)
> at 
> java.base/jdk.internal.util.Preconditions.checkIndex(Preconditions.java:266)
> at java.base/java.util.Objects.checkIndex(Objects.java:359)
> at java.base/java.util.ArrayList.get(ArrayList.java:427)
> at 
> 

[jira] [Commented] (HDFS-17455) Fix Client throw IndexOutOfBoundsException in DFSInputStream#fetchBlockAt

2024-04-10 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-17455?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17835752#comment-17835752
 ] 

ASF GitHub Bot commented on HDFS-17455:
---

hadoop-yetus commented on PR #6710:
URL: https://github.com/apache/hadoop/pull/6710#issuecomment-2047669969

   :broken_heart: **-1 overall**
   
   
   
   
   
   
   | Vote | Subsystem | Runtime |  Logfile | Comment |
   |::|--:|:|::|:---:|
   | +0 :ok: |  reexec  |   0m 34s |  |  Docker mode activated.  |
    _ Prechecks _ |
   | +1 :green_heart: |  dupname  |   0m  0s |  |  No case conflicting files 
found.  |
   | +0 :ok: |  codespell  |   0m  0s |  |  codespell was not available.  |
   | +0 :ok: |  detsecrets  |   0m  0s |  |  detect-secrets was not available.  
|
   | +1 :green_heart: |  @author  |   0m  0s |  |  The patch does not contain 
any @author tags.  |
   | +1 :green_heart: |  test4tests  |   0m  0s |  |  The patch appears to 
include 1 new or modified test files.  |
    _ trunk Compile Tests _ |
   | +0 :ok: |  mvndep  |  14m 17s |  |  Maven dependency ordering for branch  |
   | +1 :green_heart: |  mvninstall  |  32m 44s |  |  trunk passed  |
   | +1 :green_heart: |  compile  |   5m 31s |  |  trunk passed with JDK 
Ubuntu-11.0.22+7-post-Ubuntu-0ubuntu220.04.1  |
   | +1 :green_heart: |  compile  |   5m 19s |  |  trunk passed with JDK 
Private Build-1.8.0_402-8u402-ga-2ubuntu1~20.04-b06  |
   | +1 :green_heart: |  checkstyle  |   1m 26s |  |  trunk passed  |
   | +1 :green_heart: |  mvnsite  |   2m 25s |  |  trunk passed  |
   | +1 :green_heart: |  javadoc  |   1m 50s |  |  trunk passed with JDK 
Ubuntu-11.0.22+7-post-Ubuntu-0ubuntu220.04.1  |
   | +1 :green_heart: |  javadoc  |   2m 24s |  |  trunk passed with JDK 
Private Build-1.8.0_402-8u402-ga-2ubuntu1~20.04-b06  |
   | -1 :x: |  spotbugs  |   2m 35s | 
[/branch-spotbugs-hadoop-hdfs-project_hadoop-hdfs-client-warnings.html](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-6710/4/artifact/out/branch-spotbugs-hadoop-hdfs-project_hadoop-hdfs-client-warnings.html)
 |  hadoop-hdfs-project/hadoop-hdfs-client in trunk has 1 extant spotbugs 
warnings.  |
   | +1 :green_heart: |  shadedclient  |  35m 44s |  |  branch has no errors 
when building and testing our client artifacts.  |
    _ Patch Compile Tests _ |
   | +0 :ok: |  mvndep  |   0m 32s |  |  Maven dependency ordering for patch  |
   | +1 :green_heart: |  mvninstall  |   2m  1s |  |  the patch passed  |
   | +1 :green_heart: |  compile  |   5m 21s |  |  the patch passed with JDK 
Ubuntu-11.0.22+7-post-Ubuntu-0ubuntu220.04.1  |
   | +1 :green_heart: |  javac  |   5m 21s |  |  the patch passed  |
   | +1 :green_heart: |  compile  |   5m 11s |  |  the patch passed with JDK 
Private Build-1.8.0_402-8u402-ga-2ubuntu1~20.04-b06  |
   | +1 :green_heart: |  javac  |   5m 11s |  |  the patch passed  |
   | +1 :green_heart: |  blanks  |   0m  0s |  |  The patch has no blanks 
issues.  |
   | +1 :green_heart: |  checkstyle  |   1m 14s |  |  the patch passed  |
   | +1 :green_heart: |  mvnsite  |   2m  4s |  |  the patch passed  |
   | +1 :green_heart: |  javadoc  |   1m 29s |  |  the patch passed with JDK 
Ubuntu-11.0.22+7-post-Ubuntu-0ubuntu220.04.1  |
   | +1 :green_heart: |  javadoc  |   2m 12s |  |  the patch passed with JDK 
Private Build-1.8.0_402-8u402-ga-2ubuntu1~20.04-b06  |
   | +1 :green_heart: |  spotbugs  |   5m 55s |  |  the patch passed  |
   | +1 :green_heart: |  shadedclient  |  35m 32s |  |  patch has no errors 
when building and testing our client artifacts.  |
    _ Other Tests _ |
   | +1 :green_heart: |  unit  |   2m 28s |  |  hadoop-hdfs-client in the patch 
passed.  |
   | -1 :x: |  unit  | 231m  8s | 
[/patch-unit-hadoop-hdfs-project_hadoop-hdfs.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-6710/4/artifact/out/patch-unit-hadoop-hdfs-project_hadoop-hdfs.txt)
 |  hadoop-hdfs in the patch passed.  |
   | +1 :green_heart: |  asflicense  |   0m 47s |  |  The patch does not 
generate ASF License warnings.  |
   |  |   | 405m 41s |  |  |
   
   
   | Reason | Tests |
   |---:|:--|
   | Failed junit tests | 
hadoop.hdfs.server.balancer.TestBalancerWithHANameNodes |
   
   
   | Subsystem | Report/Notes |
   |--:|:-|
   | Docker | ClientAPI=1.45 ServerAPI=1.45 base: 
https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-6710/4/artifact/out/Dockerfile
 |
   | GITHUB PR | https://github.com/apache/hadoop/pull/6710 |
   | Optional Tests | dupname asflicense compile javac javadoc mvninstall 
mvnsite unit shadedclient spotbugs checkstyle codespell detsecrets |
   | uname | Linux eae96800de5f 5.15.0-94-generic #104-Ubuntu SMP Tue Jan 9 
15:25:40 UTC 2024 x86_64 x86_64 x86_64 GNU/Linux |
   | Build tool | maven |
   | Personality | dev-support/bin/hadoop.sh |
   | git revision | trunk / cc6f05f5edc79189eaa3c0bce002670044e55d4b |
   | Default 

[jira] [Commented] (HDFS-17460) Make All HDFS Server side component uniform use the value of `dfs.internal.nameservices` to override `dfs.nameservices`

2024-04-10 Thread Jian Zhang (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-17460?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17835678#comment-17835678
 ] 

Jian Zhang commented on HDFS-17460:
---

hi, can you explain why this is necessary?

> Make All HDFS Server side component uniform use the value of 
> `dfs.internal.nameservices` to override `dfs.nameservices` 
> 
>
> Key: HDFS-17460
> URL: https://issues.apache.org/jira/browse/HDFS-17460
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: hdfs, rbf
>Reporter: xiaojunxiang
>Priority: Major
>







[jira] [Updated] (HDFS-17458) Remove unnecessary BP lock in ReplicaMap

2024-04-10 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-17458?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated HDFS-17458:
--
Labels: pull-request-available  (was: )

> Remove unnecessary BP lock in ReplicaMap
> 
>
> Key: HDFS-17458
> URL: https://issues.apache.org/jira/browse/HDFS-17458
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: datanode
>Affects Versions: 3.4.0
>Reporter: farmmamba
>Assignee: farmmamba
>Priority: Major
>  Labels: pull-request-available
>
> In HDFS-16429 we made LightWeightResizableGSet thread safe, and in 
> HDFS-16511 we changed some methods in ReplicaMap to acquire the read lock 
> instead of the write lock.
> This PR tries to further remove the unnecessary Block_Pool read lock.
> Recently, I performed stress tests on DataNodes to measure their read/write 
> operations per second.
> Before removing these locks, a DataNode could only achieve ~2K write ops; 
> after the optimization, it can achieve more than 5K write ops.






[jira] [Commented] (HDFS-17458) Remove unnecessary BP lock in ReplicaMap

2024-04-10 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-17458?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17835669#comment-17835669
 ] 

ASF GitHub Bot commented on HDFS-17458:
---

hfutatzhanghb opened a new pull request, #6717:
URL: https://github.com/apache/hadoop/pull/6717

   ### Description of PR
   
   In [HDFS-16429](https://issues.apache.org/jira/browse/HDFS-16429) we made 
LightWeightResizableGSet thread safe, and in 
[HDFS-16511](https://issues.apache.org/jira/browse/HDFS-16511) we changed some 
methods in ReplicaMap to acquire the read lock instead of the write lock.
   
   This PR tries to further remove the unnecessary Block_Pool read lock.
   
   Recently, I performed stress tests on DataNodes to measure their read/write 
operations per second.
   
   Before removing some of the locking (createRbw, finalizeBlock, etc.), a 
DataNode could only achieve ~2K write ops; after the optimization, it can 
achieve more than 5K write ops.
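
To make the locking change concrete, here is a minimal, hypothetical sketch of 
the general idea, with simplified types and names rather than the real 
ReplicaMap: once the per-block-pool map is itself thread safe, wrapping a plain 
lookup in the block pool read lock adds contention without adding safety.

{code:java}
import java.util.concurrent.ConcurrentHashMap;

// Illustrative stand-in for a replica record.
class ReplicaStub {
  final long blockId;
  ReplicaStub(long blockId) { this.blockId = blockId; }
}

class SimplifiedReplicaMap {
  // One thread-safe map per block pool; the maps themselves tolerate
  // concurrent readers and writers.
  private final ConcurrentHashMap<String, ConcurrentHashMap<Long, ReplicaStub>>
      map = new ConcurrentHashMap<>();

  void add(String bpid, ReplicaStub replica) {
    map.computeIfAbsent(bpid, k -> new ConcurrentHashMap<>())
       .put(replica.blockId, replica);
  }

  // Before: acquire the block pool read lock around this lookup.
  // After: rely on the map's own thread safety for point reads.
  ReplicaStub get(String bpid, long blockId) {
    ConcurrentHashMap<Long, ReplicaStub> pool = map.get(bpid);
    return pool == null ? null : pool.get(blockId);
  }

  public static void main(String[] args) {
    SimplifiedReplicaMap replicas = new SimplifiedReplicaMap();
    replicas.add("bp-1", new ReplicaStub(1L));
    System.out.println(replicas.get("bp-1", 1L).blockId);
  }
}
{code}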




> Remove unnecessary BP lock in ReplicaMap
> 
>
> Key: HDFS-17458
> URL: https://issues.apache.org/jira/browse/HDFS-17458
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: datanode
>Affects Versions: 3.4.0
>Reporter: farmmamba
>Assignee: farmmamba
>Priority: Major
>
> In HDFS-16429 we made LightWeightResizableGSet thread safe, and in 
> HDFS-16511 we changed some methods in ReplicaMap to acquire the read lock 
> instead of the write lock.
> This PR tries to further remove the unnecessary Block_Pool read lock.
> Recently, I performed stress tests on DataNodes to measure their read/write 
> operations per second.
> Before removing these locks, a DataNode could only achieve ~2K write ops; 
> after the optimization, it can achieve more than 5K write ops.






[jira] [Created] (HDFS-17460) Make All HDFS Server side component uniform use the value of `dfs.internal.nameservices` to override `dfs.nameservices`

2024-04-10 Thread xiaojunxiang (Jira)
xiaojunxiang created HDFS-17460:
---

 Summary: Make All HDFS Server side component uniform use the value 
of `dfs.internal.nameservices` to override `dfs.nameservices` 
 Key: HDFS-17460
 URL: https://issues.apache.org/jira/browse/HDFS-17460
 Project: Hadoop HDFS
  Issue Type: Improvement
  Components: hdfs, rbf
Reporter: xiaojunxiang









[jira] [Updated] (HDFS-17442) Fix prompt information in StandbyCheckpointer

2024-04-10 Thread Xiaobao Wu (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-17442?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xiaobao Wu updated HDFS-17442:
--
Description: 
Fix prompt information in 
org.apache.hadoop.hdfs.server.namenode.NameNodeLayoutVersion
{code:java}
void prepareToStopStandbyServices() throws ServiceFailedException {
  if (standbyCheckpointer != null) {
standbyCheckpointer.cancelAndPreventCheckpoints(
"About to leave standby state");
  }
} {code}

  was:Fix annotation in 
org.apache.hadoop.hdfs.server.namenode.NameNodeLayoutVersion


> Fix prompt information in StandbyCheckpointer
> -
>
> Key: HDFS-17442
> URL: https://issues.apache.org/jira/browse/HDFS-17442
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: namenode
>Affects Versions: 3.3.0, 3.3.4
>Reporter: Xiaobao Wu
>Priority: Trivial
> Fix For: 3.3.4
>
>
> Fix prompt information in 
> org.apache.hadoop.hdfs.server.namenode.NameNodeLayoutVersion
> {code:java}
> void prepareToStopStandbyServices() throws ServiceFailedException {
>   if (standbyCheckpointer != null) {
> standbyCheckpointer.cancelAndPreventCheckpoints(
> "About to leave standby state");
>   }
> } {code}






[jira] [Updated] (HDFS-17442) Fix prompt information in StandbyCheckpointer

2024-04-10 Thread Xiaobao Wu (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-17442?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xiaobao Wu updated HDFS-17442:
--
Summary: Fix prompt information in StandbyCheckpointer  (was: Fix 
annotation in NameNodeLayoutVersion)

> Fix prompt information in StandbyCheckpointer
> -
>
> Key: HDFS-17442
> URL: https://issues.apache.org/jira/browse/HDFS-17442
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: namenode
>Affects Versions: 3.3.0, 3.3.4
>Reporter: Xiaobao Wu
>Priority: Trivial
> Fix For: 3.3.4
>
>
> Fix annotation in org.apache.hadoop.hdfs.server.namenode.NameNodeLayoutVersion






[jira] [Commented] (HDFS-17455) Fix Client throw IndexOutOfBoundsException in DFSInputStream#fetchBlockAt

2024-04-10 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-17455?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17835625#comment-17835625
 ] 

ASF GitHub Bot commented on HDFS-17455:
---

haiyang1987 commented on PR #6710:
URL: https://github.com/apache/hadoop/pull/6710#issuecomment-2046720852

   Thanks @ZanderXu @Hexiaoqiao for your detailed comments.
   
   I have updated the PR; please help review it again when you are free, thanks~




> Fix Client throw IndexOutOfBoundsException in DFSInputStream#fetchBlockAt
> -
>
> Key: HDFS-17455
> URL: https://issues.apache.org/jira/browse/HDFS-17455
> Project: Hadoop HDFS
>  Issue Type: Bug
>Reporter: Haiyang Hu
>Assignee: Haiyang Hu
>Priority: Major
>  Labels: pull-request-available
>
> When the client reads data and connects to the DataNode, the DataNode access 
> token may be invalid at that moment, which causes an 
> InvalidBlockTokenException. The subsequent call to the fetchBlockAt method 
> then throws a java.lang.IndexOutOfBoundsException, causing the read to fail.
> *Root cause:*
> * The HDFS file contains only one RBW block, with a block data size of 2048KB.
> * The client opens this file and seeks to the offset of 1024KB to read data.
> * DFSInputStream#getBlockReader connects to the DataNode; because the DataNode 
> access token is invalid at this time, an InvalidBlockTokenException is thrown, 
> and the subsequent call to DFSInputStream#fetchBlockAt throws a 
> java.lang.IndexOutOfBoundsException.
> {code:java}
> private synchronized DatanodeInfo blockSeekTo(long target)
>  throws IOException {
>if (target >= getFileLength()) {
>// the target size is smaller than fileLength (completeBlockSize + 
> lastBlockBeingWrittenLength),
>// here at this time target is 1024 and getFileLength is 2048
>  throw new IOException("Attempted to read past end of file");
>}
>...
>while (true) {
>  ...
>  try {
>blockReader = getBlockReader(targetBlock, offsetIntoBlock,
>targetBlock.getBlockSize() - offsetIntoBlock, targetAddr,
>storageType, chosenNode);
>if(connectFailedOnce) {
>  DFSClient.LOG.info("Successfully connected to " + targetAddr +
> " for " + targetBlock.getBlock());
>}
>return chosenNode;
>  } catch (IOException ex) {
>...
>} else if (refetchToken > 0 && tokenRefetchNeeded(ex, targetAddr)) {
>  refetchToken--;
>  // Here will catch InvalidBlockTokenException.
>  fetchBlockAt(target);
>} else {
>  ...
>}
>  }
>}
>  }
> private LocatedBlock fetchBlockAt(long offset, long length, boolean useCache)
>   throws IOException {
> maybeRegisterBlockRefresh();
> synchronized(infoLock) {
>   // Here the locatedBlocks only contains one locatedBlock, at this time 
> the offset is 1024 and fileLength is 0,
>   // so the targetBlockIdx is -2
>   int targetBlockIdx = locatedBlocks.findBlock(offset);
>   if (targetBlockIdx < 0) { // block is not cached
> targetBlockIdx = LocatedBlocks.getInsertIndex(targetBlockIdx);
> // Here the targetBlockIdx is 1;
> useCache = false;
>   }
>   if (!useCache) { // fetch blocks
> final LocatedBlocks newBlocks = (length == 0)
> ? dfsClient.getLocatedBlocks(src, offset)
> : dfsClient.getLocatedBlocks(src, offset, length);
> if (newBlocks == null || newBlocks.locatedBlockCount() == 0) {
>   throw new EOFException("Could not find target position " + offset);
> }
> // Update the LastLocatedBlock, if offset is for last block.
> if (offset >= locatedBlocks.getFileLength()) {
>   setLocatedBlocksFields(newBlocks, getLastBlockLength(newBlocks));
> } else {
>   locatedBlocks.insertRange(targetBlockIdx,
>   newBlocks.getLocatedBlocks());
> }
>   }
>   // Here the locatedBlocks only contains one locatedBlock, so will throw 
> java.lang.IndexOutOfBoundsException: Index 1 out of bounds for length 1
>   return locatedBlocks.get(targetBlockIdx);
> }
>   }
> {code}
> The client exception:
> {code:java}
> java.lang.IndexOutOfBoundsException: Index 1 out of bounds for length 1
> at 
> java.base/jdk.internal.util.Preconditions.outOfBounds(Preconditions.java:64)
> at 
> java.base/jdk.internal.util.Preconditions.outOfBoundsCheckIndex(Preconditions.java:70)
> at 
> java.base/jdk.internal.util.Preconditions.checkIndex(Preconditions.java:266)
> at java.base/java.util.Objects.checkIndex(Objects.java:359)
> at java.base/java.util.ArrayList.get(ArrayList.java:427)
> at 
> 
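
To make the index arithmetic described above concrete, here is a minimal, self-contained sketch (not the actual DFSInputStream code) of how a negative binary-search result decodes to an insertion index of 1 and then overruns a one-element block list; the helper toInsertIndex only mirrors what the quoted comments describe LocatedBlocks.getInsertIndex as doing.

{code:java}
import java.util.Collections;
import java.util.List;

public class InsertIndexSketch {
  // Mirrors the "-(result) - 1" decoding described for LocatedBlocks.getInsertIndex.
  static int toInsertIndex(int binarySearchResult) {
    return -(binarySearchResult) - 1;
  }

  public static void main(String[] args) {
    // One cached block starting at offset 0, like the single RBW block in the report.
    List<Long> blockStartOffsets = Collections.singletonList(0L);

    // Searching for an offset past the only known block returns -2
    // (insertion point 1 encoded as -(1) - 1).
    int searchResult = Collections.binarySearch(blockStartOffsets, 1024L);
    System.out.println("binarySearch result: " + searchResult);    // -2

    int targetBlockIdx = toInsertIndex(searchResult);
    System.out.println("decoded insert index: " + targetBlockIdx); // 1

    // If the refreshed block list still holds only one entry, get(1) fails
    // exactly like locatedBlocks.get(targetBlockIdx) in the report.
    try {
      blockStartOffsets.get(targetBlockIdx);
    } catch (IndexOutOfBoundsException e) {
      System.out.println("IndexOutOfBoundsException: " + e.getMessage());
    }
  }
}
{code}

The sketch reproduces only the failing index arithmetic; the actual fix is tracked in PR #6710.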

[jira] [Commented] (HDFS-17455) Fix Client throw IndexOutOfBoundsException in DFSInputStream#fetchBlockAt

2024-04-10 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-17455?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17835623#comment-17835623
 ] 

ASF GitHub Bot commented on HDFS-17455:
---

haiyang1987 commented on code in PR #6710:
URL: https://github.com/apache/hadoop/pull/6710#discussion_r1558977912


##
hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/TestDFSInputStream.java:
##
@@ -287,4 +300,69 @@ public void testReadWithoutPreferredCachingReplica() throws IOException {
   cluster.shutdown();
 }
   }
+
+  @Test
+  public void testCreateBlockReaderWhenInvalidBlockTokenException() throws
+      IOException, InterruptedException, TimeoutException {
+    GenericTestUtils.setLogLevel(DFSClient.LOG, Level.DEBUG);
+    Configuration conf = new Configuration();
+    DFSClientFaultInjector oldFaultInjector = DFSClientFaultInjector.get();
+    try (MiniDFSCluster cluster = new MiniDFSCluster.Builder(conf).numDataNodes(3).build()) {
+      cluster.waitActive();
+      DistributedFileSystem fs = cluster.getFileSystem();
+      String file = "/testfile";
+      Path path = new Path(file);
+      long fileLen = 1024 * 64;
+      EnumSet<CreateFlag> createFlags = EnumSet.of(CREATE);
+      FSDataOutputStream out = fs.create(path, FsPermission.getFileDefault(), createFlags,
+          fs.getConf().getInt(IO_FILE_BUFFER_SIZE_KEY, 4096), (short) 3,
+          fs.getDefaultBlockSize(path), null);
+      int bufferLen = 1024;
+      byte[] toWrite = new byte[bufferLen];
+      Random rb = new Random(0);
+      long bytesToWrite = fileLen;
+      while (bytesToWrite > 0) {

Review Comment:
   This will write 5KB of data, ensuring that the test can seek to 1024 and read 1KB of data.
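
For context, the truncated write loop in the diff above is typically completed along the following lines; this is a hedged sketch of the pattern (refill the buffer with random bytes and write in chunks until fileLen bytes are out), not the exact code from PR #6710.

{code:java}
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.util.Random;

public class WriteLoopSketch {
  /** Writes fileLen pseudo-random bytes to out in bufferLen-sized chunks. */
  static void writeRandomData(OutputStream out, long fileLen, int bufferLen) throws IOException {
    byte[] toWrite = new byte[bufferLen];
    Random rb = new Random(0);
    long bytesToWrite = fileLen;
    while (bytesToWrite > 0) {
      rb.nextBytes(toWrite);                                 // refill the buffer each pass
      int bytesToWriteNext = (int) Math.min(bufferLen, bytesToWrite);
      out.write(toWrite, 0, bytesToWriteNext);
      bytesToWrite -= bytesToWriteNext;
    }
  }

  public static void main(String[] args) throws IOException {
    ByteArrayOutputStream sink = new ByteArrayOutputStream();
    writeRandomData(sink, 5 * 1024, 1024);                   // 5KB, as mentioned in the review comment
    System.out.println("bytes written: " + sink.size());
  }
}
{code}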





> Fix Client throw IndexOutOfBoundsException in DFSInputStream#fetchBlockAt
> -
>
> Key: HDFS-17455
> URL: https://issues.apache.org/jira/browse/HDFS-17455
> Project: Hadoop HDFS
>  Issue Type: Bug
>Reporter: Haiyang Hu
>Assignee: Haiyang Hu
>Priority: Major
>  Labels: pull-request-available
>
> When the client reads data and connects to the datanode, an 
> InvalidBlockTokenException is thrown because the datanode access token is 
> invalid at that moment. The subsequent call to the fetchBlockAt method then 
> throws java.lang.IndexOutOfBoundsException, causing the read to fail.
> *Root cause:*
> * The HDFS file contains only one RBW block, with a block data size of 2048KB.
> * The client opens this file and seeks to the offset of 1024KB to read data.
> * The DFSInputStream#getBlockReader method connects to the datanode; because 
> the datanode access token is invalid at this time, an InvalidBlockTokenException 
> is thrown, and the subsequent call to DFSInputStream#fetchBlockAt throws 
> java.lang.IndexOutOfBoundsException.
> {code:java}
> private synchronized DatanodeInfo blockSeekTo(long target)
>  throws IOException {
>if (target >= getFileLength()) {
>// the target size is smaller than fileLength (completeBlockSize + 
> lastBlockBeingWrittenLength),
>// here at this time target is 1024 and getFileLength is 2048
>  throw new IOException("Attempted to read past end of file");
>}
>...
>while (true) {
>  ...
>  try {
>blockReader = getBlockReader(targetBlock, offsetIntoBlock,
>targetBlock.getBlockSize() - offsetIntoBlock, targetAddr,
>storageType, chosenNode);
>if(connectFailedOnce) {
>  DFSClient.LOG.info("Successfully connected to " + targetAddr +
> " for " + targetBlock.getBlock());
>}
>return chosenNode;
>  } catch (IOException ex) {
>...
>} else if (refetchToken > 0 && tokenRefetchNeeded(ex, targetAddr)) {
>  refetchToken--;
>  // Here will catch InvalidBlockTokenException.
>  fetchBlockAt(target);
>} else {
>  ...
>}
>  }
>}
>  }
> private LocatedBlock fetchBlockAt(long offset, long length, boolean useCache)
>   throws IOException {
> maybeRegisterBlockRefresh();
> synchronized(infoLock) {
>   // Here the locatedBlocks only contains one locatedBlock, at this time 
> the offset is 1024 and fileLength is 0,
>   // so the targetBlockIdx is -2
>   int targetBlockIdx = locatedBlocks.findBlock(offset);
>   if (targetBlockIdx < 0) { // block is not cached
> targetBlockIdx = LocatedBlocks.getInsertIndex(targetBlockIdx);
> // Here the targetBlockIdx is 1;
> useCache = false;
>   }
>   if (!useCache) { // fetch blocks
> final LocatedBlocks newBlocks = (length == 0)
> ? dfsClient.getLocatedBlocks(src, offset)
> : dfsClient.getLocatedBlocks(src, offset, length);
> if (newBlocks == null || newBlocks.locatedBlockCount() 

[jira] [Commented] (HDFS-17451) RBF: fix spotbugs for redundant nullcheck of dns.

2024-04-10 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-17451?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17835618#comment-17835618
 ] 

ASF GitHub Bot commented on HDFS-17451:
---

KeeProMise commented on code in PR #6697:
URL: https://github.com/apache/hadoop/pull/6697#discussion_r1558805351


##
hadoop-hdfs-project/hadoop-hdfs-rbf/src/main/java/org/apache/hadoop/hdfs/server/federation/router/RouterRpcServer.java:
##
@@ -1090,7 +1090,7 @@ DatanodeInfo[] getCachedDatanodeReport(DatanodeReportType type)
      throws IOException {
 try {
   DatanodeInfo[] dns = this.dnCache.get(type);
-  if (dns == null) {
+  if (dns.length == 0) {

Review Comment:
   @simbadzina IntelliJ IDEA flags it as well; screenshot: https://github.com/apache/hadoop/assets/38941777/d91a9e69-14a6-44ac-b983-65364ae4dd8c
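
For reference, Guava's LoadingCache.get(key) either returns the loaded value or throws (an ExecutionException for loader failures, or an unchecked InvalidCacheLoadException if the loader returns null); it never hands back null, which is why spotbugs flags the null check as redundant and the emptiness check is the meaningful guard. Below is a minimal sketch of that behavior using plain Guava imports; Hadoop itself uses the shaded org.apache.hadoop.thirdparty variant, and dnCache in RouterRpcServer maps a DatanodeReportType to a DatanodeInfo[].

{code:java}
import com.google.common.cache.CacheBuilder;
import com.google.common.cache.CacheLoader;
import com.google.common.cache.LoadingCache;
import java.util.concurrent.ExecutionException;

public class LoadingCacheNonNullSketch {
  public static void main(String[] args) throws ExecutionException {
    LoadingCache<String, String[]> cache = CacheBuilder.newBuilder().build(
        new CacheLoader<String, String[]>() {
          @Override
          public String[] load(String key) {
            // An empty array models "no datanodes cached yet"; returning null from a
            // loader would make get() throw rather than hand back null to the caller.
            return new String[0];
          }
        });

    String[] dns = cache.get("ALL");   // never null: the loader result or an exception
    if (dns.length == 0) {             // the emptiness check is the meaningful guard
      System.out.println("cache returned an empty report, fall back to a fresh fetch");
    }
  }
}
{code}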
   





> RBF: fix spotbugs for redundant nullcheck of dns.
> -
>
> Key: HDFS-17451
> URL: https://issues.apache.org/jira/browse/HDFS-17451
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Reporter: Jian Zhang
>Assignee: Jian Zhang
>Priority: Major
>  Labels: pull-request-available
>
> h2. Dodgy code Warnings
> ||Code||Warning||
> |RCN|Redundant nullcheck of dns, which is known to be non-null in 
> org.apache.hadoop.hdfs.server.federation.router.RouterRpcServer.getCachedDatanodeReport(HdfsConstants$DatanodeReportType)|
> | |[Bug type RCN_REDUNDANT_NULLCHECK_OF_NONNULL_VALUE (click for 
> details)|https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-6655/8/artifact/out/branch-spotbugs-hadoop-hdfs-project_hadoop-hdfs-rbf-warnings.html#RCN_REDUNDANT_NULLCHECK_OF_NONNULL_VALUE]
> In class org.apache.hadoop.hdfs.server.federation.router.RouterRpcServer
> In method 
> org.apache.hadoop.hdfs.server.federation.router.RouterRpcServer.getCachedDatanodeReport(HdfsConstants$DatanodeReportType)
> Value loaded from dns
> Return value of 
> org.apache.hadoop.thirdparty.com.google.common.cache.LoadingCache.get(Object) 
> of type Object
> Redundant null check at RouterRpcServer.java:[line 1093]|



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org