[jira] [Commented] (HDFS-17485) Fix SpotBug in RouterRpcServer.java

2024-04-22 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-17485?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17839514#comment-17839514
 ] 

ASF GitHub Bot commented on HDFS-17485:
---

ZanderXu opened a new pull request, #6757:
URL: https://github.com/apache/hadoop/pull/6757

   Fix SpotBug in RouterRpcServer.java.
   
   Jira: [HDFS-17485](https://issues.apache.org/jira/browse/HDFS-17485)
   




> Fix SpotBug in RouterRpcServer.java
> ---
>
> Key: HDFS-17485
> URL: https://issues.apache.org/jira/browse/HDFS-17485
> Project: Hadoop HDFS
>  Issue Type: Task
>Reporter: ZanderXu
>Assignee: ZanderXu
>Priority: Major
> Attachments: image-2024-04-22-15-02-33-725.png
>
>
> !image-2024-04-22-15-02-33-725.png|width=1566,height=265!






[jira] [Updated] (HDFS-17485) Fix SpotBug in RouterRpcServer.java

2024-04-22 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-17485?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated HDFS-17485:
--
Labels: pull-request-available  (was: )

> Fix SpotBug in RouterRpcServer.java
> ---
>
> Key: HDFS-17485
> URL: https://issues.apache.org/jira/browse/HDFS-17485
> Project: Hadoop HDFS
>  Issue Type: Task
>Reporter: ZanderXu
>Assignee: ZanderXu
>Priority: Major
>  Labels: pull-request-available
> Attachments: image-2024-04-22-15-02-33-725.png
>
>
> !image-2024-04-22-15-02-33-725.png|width=1566,height=265!






[jira] [Updated] (HDFS-17366) NameNode Fine-Grained Locking via Namespace Tree

2024-04-22 Thread ZanderXu (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-17366?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ZanderXu updated HDFS-17366:

Attachment: NameNode Fine-Grained Locking Based On Directory Tree.pdf

> NameNode Fine-Grained Locking via Namespace Tree
> 
>
> Key: HDFS-17366
> URL: https://issues.apache.org/jira/browse/HDFS-17366
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: hdfs, namenode
>Reporter: ZanderXu
>Assignee: ZanderXu
>Priority: Major
> Attachments: NameNode Fine-Grained Locking Based On Directory Tree.pdf
>
>
> As we all know, the write performance of the NameNode is limited by the global 
> lock. We aim to enable fine-grained locking based on the namespace tree to 
> improve the performance of NameNode write operations.
> There are multiple motivations for creating this ticket:
>  * We have implemented this fine-grained locking and gained nearly 7x 
> performance improvements in our prod environment.
>  * Other companies have made similar improvements based on their internal 
> branches. Internal branches are quite different from the community code, so 
> there has been little feedback or discussion in the community.
>  * The topic of fine-grained locking has been discussed for a very long time, 
> but still without any results.
>  
> We implemented this fine-grained locking based on the namespace tree to 
> maximize concurrency for disjoint or independent operations.
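To make the idea concrete, here is a minimal, hypothetical sketch (not the design in 
the attached PDF and not HDFS code; the class and method names are illustrative 
only): read-lock every ancestor directory and write-lock only the final path 
component, so operations under disjoint subtrees do not contend.

{code:java}
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.locks.Lock;
import java.util.concurrent.locks.ReadWriteLock;
import java.util.concurrent.locks.ReentrantReadWriteLock;

public class NamespaceLockSketch {
  // One lock per path; a real implementation would attach these to the INodes.
  private final Map<String, ReadWriteLock> locks = new ConcurrentHashMap<>();

  private ReadWriteLock lockFor(String path) {
    return locks.computeIfAbsent(path, p -> new ReentrantReadWriteLock());
  }

  /** Read-lock every ancestor, write-lock the last component, in path order. */
  public List<Lock> lockForWrite(String path) {
    List<Lock> held = new ArrayList<>();
    String[] parts = path.substring(1).split("/");
    StringBuilder current = new StringBuilder();
    for (int i = 0; i < parts.length; i++) {
      current.append('/').append(parts[i]);
      ReadWriteLock rw = lockFor(current.toString());
      Lock lock = (i == parts.length - 1) ? rw.writeLock() : rw.readLock();
      lock.lock();
      held.add(lock);
    }
    return held; // callers release in reverse order
  }
}
{code}

Under such a scheme, a create under /a/x and a delete under /b/y hold write locks on 
different subtrees and only share read locks on common ancestors, instead of 
serializing on a single global write lock.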






[jira] [Created] (HDFS-17484) Introduce redundancy.considerLoad.minLoad to avoid excluding nodes when they are not actually busy

2024-04-22 Thread farmmamba (Jira)
farmmamba created HDFS-17484:


 Summary: Introduce redundancy.considerLoad.minLoad to avoid excluding 
nodes when they are not actually busy
 Key: HDFS-17484
 URL: https://issues.apache.org/jira/browse/HDFS-17484
 Project: Hadoop HDFS
  Issue Type: Improvement
  Components: namenode
Affects Versions: 3.4.0
Reporter: farmmamba
Assignee: farmmamba


Currently, `dfs.namenode.redundancy.considerLoad` is true by default and 

`dfs.namenode.redundancy.considerLoad.factor` is 2.0 by default.

Consider the following situation: when we are doing a stress test, we may deploy the 
HDFS client onto a datanode, so this client will prefer to write to its local 
datanode and increase that machine's load. Suppose we have 3 datanodes whose loads 
are 5.0, 0.2 and 0.3.

 

The node whose load equals 5.0 will be excluded when choosing datanodes for a 
block, but a load of 5.0 is not actually slow for a machine with 80 CPU cores.

 

So we should add a new configuration entry, 
`dfs.namenode.redundancy.considerLoad.minLoad`, to indicate the minimum load at 
which considerLoad takes effect.
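
To illustrate the proposed behaviour, here is a hedged sketch (hypothetical method, 
not the actual BlockPlacementPolicy code; it only shows how a minLoad floor could be 
combined with the existing factor check):

{code:java}
/** Illustrative only; not actual HDFS code. */
public class ConsiderLoadSketch {

  /**
   * Existing rule: exclude a node whose load exceeds factor * average load.
   * Proposed addition: never exclude a node whose absolute load is still
   * below the configured considerLoad.minLoad floor.
   */
  static boolean shouldExclude(double nodeLoad, double avgLoad,
                               double considerLoadFactor, double minLoad) {
    boolean overFactor = nodeLoad > considerLoadFactor * avgLoad;
    return overFactor && nodeLoad >= minLoad;
  }

  public static void main(String[] args) {
    // Loads from the example above: 5.0, 0.2, 0.3 -> average is about 1.83.
    double avg = (5.0 + 0.2 + 0.3) / 3;
    System.out.println(shouldExclude(5.0, avg, 2.0, 0.0));  // true: excluded today
    System.out.println(shouldExclude(5.0, avg, 2.0, 10.0)); // false: kept with minLoad = 10
  }
}
{code}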






[jira] [Commented] (HDFS-17488) DN can fail IBRs with NPE when a volume is removed

2024-04-22 Thread farmmamba (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-17488?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17839558#comment-17839558
 ] 

farmmamba commented on HDFS-17488:
--

Hi [~coconut_icecream], of course. I have looked at your code roughly and 
will review it more carefully later.

> DN can fail IBRs with NPE when a volume is removed
> --
>
> Key: HDFS-17488
> URL: https://issues.apache.org/jira/browse/HDFS-17488
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: hdfs
>Reporter: Felix N
>Assignee: Felix N
>Priority: Major
>  Labels: pull-request-available
>
>  
> Error logs
> {code:java}
> 2024-04-22 15:46:33,422 [BP-1842952724-10.22.68.249-1713771988830 
> heartbeating to localhost/127.0.0.1:64977] ERROR datanode.DataNode 
> (BPServiceActor.java:run(922)) - Exception in BPOfferService for Block pool 
> BP-1842952724-10.22.68.249-1713771988830 (Datanode Uuid 
> 1659ffaf-1a80-4a8e-a542-643f6bd97ed4) service to localhost/127.0.0.1:64977
> java.lang.NullPointerException
>     at 
> org.apache.hadoop.hdfs.protocolPB.DatanodeProtocolClientSideTranslatorPB.blockReceivedAndDeleted(DatanodeProtocolClientSideTranslatorPB.java:246)
>     at 
> org.apache.hadoop.hdfs.server.datanode.IncrementalBlockReportManager.sendIBRs(IncrementalBlockReportManager.java:218)
>     at 
> org.apache.hadoop.hdfs.server.datanode.BPServiceActor.offerService(BPServiceActor.java:749)
>     at 
> org.apache.hadoop.hdfs.server.datanode.BPServiceActor.run(BPServiceActor.java:920)
>     at java.lang.Thread.run(Thread.java:748) {code}
> The root cause is in BPOfferService#notifyNamenodeBlock, happens when it's 
> called on a block belonging to a volume already removed prior. Because the 
> volume was already removed
>  
> {code:java}
> private void notifyNamenodeBlock(ExtendedBlock block, BlockStatus status,
> String delHint, String storageUuid, boolean isOnTransientStorage) {
>   checkBlock(block);
>   final ReceivedDeletedBlockInfo info = new ReceivedDeletedBlockInfo(
>   block.getLocalBlock(), status, delHint);
>   final DatanodeStorage storage = dn.getFSDataset().getStorage(storageUuid);
>   
>   // storage == null here because it's already removed earlier.
>   for (BPServiceActor actor : bpServices) {
> actor.getIbrManager().notifyNamenodeBlock(info, storage,
> isOnTransientStorage);
>   }
> } {code}
> so IBRs with a null storage are now pending.
> The reason why notifyNamenodeBlock can trigger on such blocks is up in 
> DirectoryScanner#reconcile
> {code:java}
>   public void reconcile() throws IOException {
>     LOG.debug("reconcile start DirectoryScanning");
>     scan();
> // If a volume is removed here after scan() already finished running,
> // diffs is stale and checkAndUpdate will run on a removed volume
>     // HDFS-14476: run checkAndUpdate with batch to avoid holding the lock too
>     // long
>     int loopCount = 0;
>     synchronized (diffs) {
>       for (final Map.Entry entry : diffs.getEntries()) {
>         dataset.checkAndUpdate(entry.getKey(), entry.getValue());        
>     ...
>   } {code}
> Inside checkAndUpdate, memBlockInfo is null because all the block meta in 
> memory is removed during the volume removal, but diskFile still exists. Then 
> DataNode#notifyNamenodeDeletedBlock (and further down the line, 
> notifyNamenodeBlock) is called on this block.
>  






[jira] [Created] (HDFS-17486) VIO: dumpXattrs logic optimization

2024-04-22 Thread wangzhihui (Jira)
wangzhihui created HDFS-17486:
-

 Summary: VIO: dumpXattrs logic optimization
 Key: HDFS-17486
 URL: https://issues.apache.org/jira/browse/HDFS-17486
 Project: Hadoop HDFS
  Issue Type: Improvement
  Components: hdfs
Affects Versions: 3.3.3, 3.2.0
Reporter: wangzhihui


The dumpXattrs logic in VIO should use FSImageFormatPBINode.Loader.loadXAttrs() 
to get the Xattrs attribute for easy maintenance.






[jira] [Commented] (HDFS-17484) Introduce redundancy.considerLoad.minLoad to avoid excluding nodes when they are not actually busy

2024-04-22 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-17484?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17839524#comment-17839524
 ] 

ASF GitHub Bot commented on HDFS-17484:
---

hfutatzhanghb opened a new pull request, #6758:
URL: https://github.com/apache/hadoop/pull/6758

   ### Description of PR
   Refer to HDFS-17484.
   
   Currently, `dfs.namenode.redundancy.considerLoad` is true by default and 
`dfs.namenode.redundancy.considerLoad.factor` is 2.0 by default.
   
   Consider the following situation: when we are doing a stress test, we may deploy 
the HDFS client onto a datanode, so this client will prefer to write to its local 
datanode and increase that machine's load. Suppose we have 3 datanodes whose loads 
are 5.0, 0.2 and 0.3.
   
   The node whose load equals 5.0 will be excluded when choosing datanodes for a 
block, but a load of 5.0 is not actually slow for a machine with 80 CPU cores.
   
   So we should add a new configuration entry, 
`dfs.namenode.redundancy.considerLoad.minLoad`, to indicate the minimum load at 
which considerLoad takes effect.
   
   ### How was this patch tested?
   Add a unit test.
   




> Introduce redundancy.considerLoad.minLoad to avoid excluding nodes when 
> they are not actually busy
> -
>
> Key: HDFS-17484
> URL: https://issues.apache.org/jira/browse/HDFS-17484
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: namenode
>Affects Versions: 3.4.0
>Reporter: farmmamba
>Assignee: farmmamba
>Priority: Minor
>
> Currently, we have `dfs.namenode.redundancy.considerLoad` equals true by 
> default, and 
> dfs.namenode.redundancy.considerLoad.factor equals 2.0 by default.
> Think about below situation. when we are doing stress test, we may deploy 
> hdfs client onto the datanode. So, this hdfs client will prefer to write to 
> its local datanode and increase this machine's load.  Suppose we have 3 
> datanodes, the load of them are as below:  5.0, 0.2, 0.3.
>  
> The load equals to 5.0 will be excluded when choose datanodes for a block. 
> But actually, it is not slow node when load equals to 5.0 for a machine with 
> 80 cpu cores.
>  
> So, we should better add a new configuration entry :  
> `dfs.namenode.redundancy.considerLoad.minLoad` to indicate the mininum factor 
> we will make considerLoad take effect.






[jira] [Updated] (HDFS-17484) Introduce redundancy.considerLoad.minLoad to avoid excluding nodes when they are not actually busy

2024-04-22 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-17484?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated HDFS-17484:
--
Labels: pull-request-available  (was: )

> Introduce redundancy.considerLoad.minLoad to avoid excluding nodes when 
> they are not actually busy
> -
>
> Key: HDFS-17484
> URL: https://issues.apache.org/jira/browse/HDFS-17484
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: namenode
>Affects Versions: 3.4.0
>Reporter: farmmamba
>Assignee: farmmamba
>Priority: Minor
>  Labels: pull-request-available
>
> Currently, we have `dfs.namenode.redundancy.considerLoad` equals true by 
> default, and 
> dfs.namenode.redundancy.considerLoad.factor equals 2.0 by default.
> Think about below situation. when we are doing stress test, we may deploy 
> hdfs client onto the datanode. So, this hdfs client will prefer to write to 
> its local datanode and increase this machine's load.  Suppose we have 3 
> datanodes, the load of them are as below:  5.0, 0.2, 0.3.
>  
> The load equals to 5.0 will be excluded when choose datanodes for a block. 
> But actually, it is not slow node when load equals to 5.0 for a machine with 
> 80 cpu cores.
>  
> So, we should better add a new configuration entry :  
> `dfs.namenode.redundancy.considerLoad.minLoad` to indicate the mininum factor 
> we will make considerLoad take effect.






[jira] [Commented] (HDFS-17488) DN can fail IBRs with NPE when a volume is removed

2024-04-22 Thread farmmamba (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-17488?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17839543#comment-17839543
 ] 

farmmamba commented on HDFS-17488:
--

[~coconut_icecream] Sir, it is a duplicate. Please refer to HDFS-17467.

> DN can fail IBRs with NPE when a volume is removed
> --
>
> Key: HDFS-17488
> URL: https://issues.apache.org/jira/browse/HDFS-17488
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: hdfs
>Reporter: Felix N
>Assignee: Felix N
>Priority: Major
>
>  
> Error logs
> {code:java}
> 2024-04-22 15:46:33,422 [BP-1842952724-10.22.68.249-1713771988830 
> heartbeating to localhost/127.0.0.1:64977] ERROR datanode.DataNode 
> (BPServiceActor.java:run(922)) - Exception in BPOfferService for Block pool 
> BP-1842952724-10.22.68.249-1713771988830 (Datanode Uuid 
> 1659ffaf-1a80-4a8e-a542-643f6bd97ed4) service to localhost/127.0.0.1:64977
> java.lang.NullPointerException
>     at 
> org.apache.hadoop.hdfs.protocolPB.DatanodeProtocolClientSideTranslatorPB.blockReceivedAndDeleted(DatanodeProtocolClientSideTranslatorPB.java:246)
>     at 
> org.apache.hadoop.hdfs.server.datanode.IncrementalBlockReportManager.sendIBRs(IncrementalBlockReportManager.java:218)
>     at 
> org.apache.hadoop.hdfs.server.datanode.BPServiceActor.offerService(BPServiceActor.java:749)
>     at 
> org.apache.hadoop.hdfs.server.datanode.BPServiceActor.run(BPServiceActor.java:920)
>     at java.lang.Thread.run(Thread.java:748) {code}
> The root cause is in BPOfferService#notifyNamenodeBlock, happens when it's 
> called on a block belonging to a volume already removed prior. Because the 
> volume was already removed
>  
> {code:java}
> private void notifyNamenodeBlock(ExtendedBlock block, BlockStatus status,
> String delHint, String storageUuid, boolean isOnTransientStorage) {
>   checkBlock(block);
>   final ReceivedDeletedBlockInfo info = new ReceivedDeletedBlockInfo(
>   block.getLocalBlock(), status, delHint);
>   final DatanodeStorage storage = dn.getFSDataset().getStorage(storageUuid);
>   
>   // storage == null here because it's already removed earlier.
>   for (BPServiceActor actor : bpServices) {
> actor.getIbrManager().notifyNamenodeBlock(info, storage,
> isOnTransientStorage);
>   }
> } {code}
> so IBRs with a null storage are now pending.
> The reason why notifyNamenodeBlock can trigger on such blocks is up in 
> DirectoryScanner#reconcile
> {code:java}
>   public void reconcile() throws IOException {
>     LOG.debug("reconcile start DirectoryScanning");
>     scan();
> // If a volume is removed here after scan() already finished running,
> // diffs is stale and checkAndUpdate will run on a removed volume
>     // HDFS-14476: run checkAndUpdate with batch to avoid holding the lock too
>     // long
>     int loopCount = 0;
>     synchronized (diffs) {
>       for (final Map.Entry entry : diffs.getEntries()) {
>         dataset.checkAndUpdate(entry.getKey(), entry.getValue());        
>     ...
>   } {code}
> Inside checkAndUpdate, memBlockInfo is null because all the block meta in 
> memory is removed during the volume removal, but diskFile still exists. Then 
> DataNode#notifyNamenodeDeletedBlock (and further down the line, 
> notifyNamenodeBlock) is called on this block.
>  






[jira] [Created] (HDFS-17488) DN can fail IBRs with NPE when a volume is removed

2024-04-22 Thread Felix N (Jira)
Felix N created HDFS-17488:
--

 Summary: DN can fail IBRs with NPE when a volume is removed
 Key: HDFS-17488
 URL: https://issues.apache.org/jira/browse/HDFS-17488
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: hdfs
Reporter: Felix N
Assignee: Felix N


 

Error logs
{code:java}
2024-04-22 15:46:33,422 [BP-1842952724-10.22.68.249-1713771988830 heartbeating 
to localhost/127.0.0.1:64977] ERROR datanode.DataNode 
(BPServiceActor.java:run(922)) - Exception in BPOfferService for Block pool 
BP-1842952724-10.22.68.249-1713771988830 (Datanode Uuid 
1659ffaf-1a80-4a8e-a542-643f6bd97ed4) service to localhost/127.0.0.1:64977
java.lang.NullPointerException
    at 
org.apache.hadoop.hdfs.protocolPB.DatanodeProtocolClientSideTranslatorPB.blockReceivedAndDeleted(DatanodeProtocolClientSideTranslatorPB.java:246)
    at 
org.apache.hadoop.hdfs.server.datanode.IncrementalBlockReportManager.sendIBRs(IncrementalBlockReportManager.java:218)
    at 
org.apache.hadoop.hdfs.server.datanode.BPServiceActor.offerService(BPServiceActor.java:749)
    at 
org.apache.hadoop.hdfs.server.datanode.BPServiceActor.run(BPServiceActor.java:920)
    at java.lang.Thread.run(Thread.java:748) {code}
The root cause is in BPOfferService#notifyNamenodeBlock; it happens when the method 
is called on a block belonging to a volume that was already removed. Because the 
volume was already removed:

 
{code:java}
private void notifyNamenodeBlock(ExtendedBlock block, BlockStatus status,
String delHint, String storageUuid, boolean isOnTransientStorage) {
  checkBlock(block);
  final ReceivedDeletedBlockInfo info = new ReceivedDeletedBlockInfo(
  block.getLocalBlock(), status, delHint);
  final DatanodeStorage storage = dn.getFSDataset().getStorage(storageUuid);
  
  // storage == null here because it's already removed earlier.

  for (BPServiceActor actor : bpServices) {
actor.getIbrManager().notifyNamenodeBlock(info, storage,
isOnTransientStorage);
  }
} {code}
so IBRs with a null storage are now pending.

The reason notifyNamenodeBlock can be triggered on such blocks lies in 
DirectoryScanner#reconcile:
{code:java}
  public void reconcile() throws IOException {
    LOG.debug("reconcile start DirectoryScanning");
    scan();

// If a volume is removed here after scan() already finished running,
// diffs is stale and checkAndUpdate will run on a removed volume

    // HDFS-14476: run checkAndUpdate with batch to avoid holding the lock too
    // long
    int loopCount = 0;
    synchronized (diffs) {
      for (final Map.Entry entry : diffs.getEntries()) {
        dataset.checkAndUpdate(entry.getKey(), entry.getValue());        
    ...
  } {code}
Inside checkAndUpdate, memBlockInfo is null because all block metadata in memory 
was removed during the volume removal, but diskFile still exists. Then 
DataNode#notifyNamenodeDeletedBlock (and, further down the line, 
notifyNamenodeBlock) is called on this block.

 






[jira] [Updated] (HDFS-17479) [FGL] Snapshot related operations still use global lock

2024-04-22 Thread ZanderXu (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-17479?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ZanderXu updated HDFS-17479:

Description: 
The snapshot feature is very useful in certain scenarios, but as far as I know 
very few companies use it in production environments. The implementation is 
complex, and it is difficult to support FGL with only minor modifications.

So we can still use the Global lock to make snapshot-related operations 
thread-safe.

 

Snapshots involve several access paths; let's analyze them and find a way to 
still use the global lock.

!image-2024-04-18-11-00-12-451.png|width=288,height=219!

The above picture shows a simple case; we can access the iNode foo through the 
following paths:
 # /abc/foo
 # /abc/.snapshot/s1/foo

If we want to delete the iNode foo, we need to lock /abc and /abc/.snapshot/s1 
(DirectoryWithSnapshotFeature on iNode abc).

If we want to change permission of the iNode foo, we need to lock /abc/foo and 
/abc/.snapshot/s1/foo (DirectoryWithSnapshotFeature on the iNode foo)

 

For this case, we can directly acquire the global lock when resolving the IIPs 
for the input path if there is an iNode that has DirectorySnapshottableFeature.

!image-2024-04-18-11-00-02-011.png|width=368,height=383!

After /abc/foo is renamed to /xyz/bar, the access paths change, as the above 
picture shows. We can access bar through the following paths:
 # /abc/.snapshot/s1/bar
 # /xyz/bar

For /abc/.snapshot/s1/bar, since the iNode abc has 
DirectorySnapshottableFeature, we can identify it and acquire the global 
lock.

For /xyz/bar, we can identify it through the Reference flag, since the iNode 
bar is a DstReference node.

 

So we can use DirectorySnapshottableFeature and Reference to determine if we 
need to acquire the Global lock when resolving the IIPs for input path.
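
A hedged sketch of that check (illustrative only, not the actual FGL implementation; 
it assumes INode accessors equivalent to isDirectory()/asDirectory().isSnapshottable() 
and isReference()):

{code:java}
// Illustrative helper only: decide whether path resolution should fall back
// to the global lock because the path touches snapshot state.
static boolean needsGlobalLock(INode[] inodesOnPath) {
  for (INode inode : inodesOnPath) {
    if (inode == null) {
      continue;
    }
    // Path goes through a snapshottable directory, e.g. /abc with snapshot s1.
    if (inode.isDirectory() && inode.asDirectory().isSnapshottable()) {
      return true;
    }
    // Rename target still referenced from a snapshot, e.g. /xyz/bar as a DstReference.
    if (inode.isReference()) {
      return true;
    }
  }
  return false; // no snapshot involvement: fine-grained locking is safe
}
{code}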

 

  was:
Snapshot feature is a very useful feature in certain scenarios. As far as I 
know, very few companies use this feature on the prod environment. The 
implementation is complex and it is difficult to support FGL with only a minor 
modifications.

So we can still use the Global lock to make snapshot-related operations 
thread-safe.

 

Snapshot has some access modules, let's analyze them and find a way to still 
use GlobalLock.

!image-2024-04-18-11-00-12-451.png|width=288,height=219!

The above picture shows a simple case, we can access the iNode foo through the 
following paths:
 # /abc/foo
 # /abc/.snapshot/s1/foo

If we want to delete the iNode foo, we need to lock /abc and /abc/.snapshot/s1 
(DirectoryWithSnapshotFeature on iNode abc).

If we want to change permission of the iNode foo, we need to lock /abc/foo and 
/abc/.snapshot/s1/foo (DirectoryWithSnapshotFeature on the iNode foo)

 

For this case, we can directly acquire the global lock when resolving the IIPs 
for the input path if there is an iNode that has DirectorySnapshottableFeature.

!image-2024-04-18-11-00-02-011.png|width=368,height=383!

After /abc/foo is renamed to /xyz/bar, the access modules will be changed, as 
the above picture shows. We can access this bar through the following path:
 # /abc/.snapshot/s1/bar
 # /xyz/bar

For /abc/.snapshot/s1/bar, since the iNode abc has 
DirectorySnapshottableFeature, so we can identify it and acquire the global 
lock.

For /xyz/bar, we can identify it through Reference flag, since the iNode bar is 
a DstReference Node.

 

So we can use DirectorySnapshottableFeature and Reference to determine if we 
need to acquire the Global lock when resolving the IIPs for input path.

 


> [FGL] Snapshot related operations still use global lock
> ---
>
> Key: HDFS-17479
> URL: https://issues.apache.org/jira/browse/HDFS-17479
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>Reporter: ZanderXu
>Assignee: ZanderXu
>Priority: Major
> Attachments: image-2024-04-18-11-00-02-011.png, 
> image-2024-04-18-11-00-12-451.png
>
>
> Snapshot feature is a very useful feature in certain scenarios. As far as I 
> know, very few companies use this feature in the prod environment. The 
> implementation is complex and it is difficult to support FGL with only minor 
> modifications.
> So we can still use the Global lock to make snapshot-related operations 
> thread-safe.
>  
> Snapshot has some access modules, let's analyze them and find a way to still 
> use GlobalLock.
> !image-2024-04-18-11-00-12-451.png|width=288,height=219!
> The above picture shows a simple case, we can access the iNode foo through 
> the following paths:
>  # /abc/foo
>  # /abc/.snapshot/s1/foo
> If we want to delete the iNode foo, we need to lock /abc and 
> /abc/.snapshot/s1 (DirectoryWithSnapshotFeature on iNode abc).
> If we want to change permission of the iNode foo, we need to lock /abc/foo 
> and /abc/.snapshot/s1/foo 

[jira] [Created] (HDFS-17485) Fix SpotBug in RouterRpcServer.java

2024-04-22 Thread ZanderXu (Jira)
ZanderXu created HDFS-17485:
---

 Summary: Fix SpotBug in RouterRpcServer.java
 Key: HDFS-17485
 URL: https://issues.apache.org/jira/browse/HDFS-17485
 Project: Hadoop HDFS
  Issue Type: Task
Reporter: ZanderXu
Assignee: ZanderXu
 Attachments: image-2024-04-22-15-02-33-725.png

!image-2024-04-22-15-02-33-725.png|width=1566,height=265!






[jira] [Commented] (HDFS-17486) VIO: dumpXattrs logic optimization

2024-04-22 Thread Xiaobao Wu (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-17486?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17839520#comment-17839520
 ] 

Xiaobao Wu commented on HDFS-17486:
---

[~hiwangzhihui] Okay, I will provide a patch as soon as possible.

> VIO: dumpXattrs logic optimization
> --
>
> Key: HDFS-17486
> URL: https://issues.apache.org/jira/browse/HDFS-17486
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: hdfs
>Affects Versions: 3.2.0, 3.3.3
>Reporter: wangzhihui
>Priority: Minor
>
> The dumpXattrs logic in VIO should use 
> FSImageFormatPBINode.Loader.loadXAttrs() to get the Xattrs attribute for easy 
> maintenance.






[jira] [Commented] (HDFS-17488) DN can fail IBRs with NPE when a volume is removed

2024-04-22 Thread Felix N (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-17488?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17839556#comment-17839556
 ] 

Felix N commented on HDFS-17488:


Hi [~zhanghaobo], thanks for letting me know. I think you can review my PR, 
since it should contain your patch, plus some extra steps to prevent the 
situation from appearing, plus unit tests.

> DN can fail IBRs with NPE when a volume is removed
> --
>
> Key: HDFS-17488
> URL: https://issues.apache.org/jira/browse/HDFS-17488
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: hdfs
>Reporter: Felix N
>Assignee: Felix N
>Priority: Major
>  Labels: pull-request-available
>
>  
> Error logs
> {code:java}
> 2024-04-22 15:46:33,422 [BP-1842952724-10.22.68.249-1713771988830 
> heartbeating to localhost/127.0.0.1:64977] ERROR datanode.DataNode 
> (BPServiceActor.java:run(922)) - Exception in BPOfferService for Block pool 
> BP-1842952724-10.22.68.249-1713771988830 (Datanode Uuid 
> 1659ffaf-1a80-4a8e-a542-643f6bd97ed4) service to localhost/127.0.0.1:64977
> java.lang.NullPointerException
>     at 
> org.apache.hadoop.hdfs.protocolPB.DatanodeProtocolClientSideTranslatorPB.blockReceivedAndDeleted(DatanodeProtocolClientSideTranslatorPB.java:246)
>     at 
> org.apache.hadoop.hdfs.server.datanode.IncrementalBlockReportManager.sendIBRs(IncrementalBlockReportManager.java:218)
>     at 
> org.apache.hadoop.hdfs.server.datanode.BPServiceActor.offerService(BPServiceActor.java:749)
>     at 
> org.apache.hadoop.hdfs.server.datanode.BPServiceActor.run(BPServiceActor.java:920)
>     at java.lang.Thread.run(Thread.java:748) {code}
> The root cause is in BPOfferService#notifyNamenodeBlock, happens when it's 
> called on a block belonging to a volume already removed prior. Because the 
> volume was already removed
>  
> {code:java}
> private void notifyNamenodeBlock(ExtendedBlock block, BlockStatus status,
> String delHint, String storageUuid, boolean isOnTransientStorage) {
>   checkBlock(block);
>   final ReceivedDeletedBlockInfo info = new ReceivedDeletedBlockInfo(
>   block.getLocalBlock(), status, delHint);
>   final DatanodeStorage storage = dn.getFSDataset().getStorage(storageUuid);
>   
>   // storage == null here because it's already removed earlier.
>   for (BPServiceActor actor : bpServices) {
> actor.getIbrManager().notifyNamenodeBlock(info, storage,
> isOnTransientStorage);
>   }
> } {code}
> so IBRs with a null storage are now pending.
> The reason why notifyNamenodeBlock can trigger on such blocks is up in 
> DirectoryScanner#reconcile
> {code:java}
>   public void reconcile() throws IOException {
>     LOG.debug("reconcile start DirectoryScanning");
>     scan();
> // If a volume is removed here after scan() already finished running,
> // diffs is stale and checkAndUpdate will run on a removed volume
>     // HDFS-14476: run checkAndUpdate with batch to avoid holding the lock too
>     // long
>     int loopCount = 0;
>     synchronized (diffs) {
>       for (final Map.Entry entry : diffs.getEntries()) {
>         dataset.checkAndUpdate(entry.getKey(), entry.getValue());        
>     ...
>   } {code}
> Inside checkAndUpdate, memBlockInfo is null because all the block meta in 
> memory is removed during the volume removal, but diskFile still exists. Then 
> DataNode#notifyNamenodeDeletedBlock (and further down the line, 
> notifyNamenodeBlock) is called on this block.
>  






[jira] [Commented] (HDFS-17457) [FGL] UTs support fine-grained locking

2024-04-22 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-17457?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17839507#comment-17839507
 ] 

ASF GitHub Bot commented on HDFS-17457:
---

ZanderXu commented on PR #6741:
URL: https://github.com/apache/hadoop/pull/6741#issuecomment-2068623064

   > @ZanderXu could you check the failed test cases and the spotbugs
   
   @ferhui All of the failed UTs have been fixed by HDFS-17435. And the spotbugs 
warning is not introduced by this PR; I will submit a PR to fix it based on the 
trunk branch.




> [FGL] UTs support fine-grained locking
> --
>
> Key: HDFS-17457
> URL: https://issues.apache.org/jira/browse/HDFS-17457
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>Reporter: ZanderXu
>Assignee: ZanderXu
>Priority: Major
>  Labels: pull-request-available
>
> [FGL] UTs support fine-grained locking






[jira] [Commented] (HDFS-15413) DFSStripedInputStream throws exception when datanodes close idle connections

2024-04-22 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-15413?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17839517#comment-17839517
 ] 

ASF GitHub Bot commented on HDFS-15413:
---

hadoop-yetus commented on PR #5829:
URL: https://github.com/apache/hadoop/pull/5829#issuecomment-2068640885

   :confetti_ball: **+1 overall**
   
   
   
   
   
   
   | Vote | Subsystem | Runtime |  Logfile | Comment |
   |::|--:|:|::|:---:|
   | +0 :ok: |  reexec  |   6m 49s |  |  Docker mode activated.  |
    _ Prechecks _ |
   | +1 :green_heart: |  dupname  |   0m  1s |  |  No case conflicting files 
found.  |
   | +0 :ok: |  codespell  |   0m  0s |  |  codespell was not available.  |
   | +0 :ok: |  detsecrets  |   0m  0s |  |  detect-secrets was not available.  
|
   | +0 :ok: |  xmllint  |   0m  0s |  |  xmllint was not available.  |
   | +1 :green_heart: |  @author  |   0m  0s |  |  The patch does not contain 
any @author tags.  |
   | +1 :green_heart: |  test4tests  |   0m  0s |  |  The patch appears to 
include 1 new or modified test files.  |
    _ trunk Compile Tests _ |
   | +0 :ok: |  mvndep  |  14m  4s |  |  Maven dependency ordering for branch  |
   | +1 :green_heart: |  mvninstall  |  19m 54s |  |  trunk passed  |
   | +1 :green_heart: |  compile  |   3m  0s |  |  trunk passed with JDK 
Ubuntu-11.0.22+7-post-Ubuntu-0ubuntu220.04.1  |
   | +1 :green_heart: |  compile  |   2m 48s |  |  trunk passed with JDK 
Private Build-1.8.0_402-8u402-ga-2ubuntu1~20.04-b06  |
   | +1 :green_heart: |  checkstyle  |   0m 47s |  |  trunk passed  |
   | +1 :green_heart: |  mvnsite  |   1m 18s |  |  trunk passed  |
   | +1 :green_heart: |  javadoc  |   1m  8s |  |  trunk passed with JDK 
Ubuntu-11.0.22+7-post-Ubuntu-0ubuntu220.04.1  |
   | +1 :green_heart: |  javadoc  |   1m 37s |  |  trunk passed with JDK 
Private Build-1.8.0_402-8u402-ga-2ubuntu1~20.04-b06  |
   | +1 :green_heart: |  spotbugs  |   3m  2s |  |  trunk passed  |
   | +1 :green_heart: |  shadedclient  |  20m 53s |  |  branch has no errors 
when building and testing our client artifacts.  |
   | -0 :warning: |  patch  |  21m  7s |  |  Used diff version of patch file. 
Binary files and potentially other changes not applied. Please rebase and 
squash commits if necessary.  |
    _ Patch Compile Tests _ |
   | +0 :ok: |  mvndep  |   0m 22s |  |  Maven dependency ordering for patch  |
   | +1 :green_heart: |  mvninstall  |   1m  4s |  |  the patch passed  |
   | +1 :green_heart: |  compile  |   2m 50s |  |  the patch passed with JDK 
Ubuntu-11.0.22+7-post-Ubuntu-0ubuntu220.04.1  |
   | +1 :green_heart: |  javac  |   2m 50s |  |  the patch passed  |
   | +1 :green_heart: |  compile  |   2m 51s |  |  the patch passed with JDK 
Private Build-1.8.0_402-8u402-ga-2ubuntu1~20.04-b06  |
   | +1 :green_heart: |  javac  |   2m 51s |  |  the patch passed  |
   | +1 :green_heart: |  blanks  |   0m  0s |  |  The patch has no blanks 
issues.  |
   | -0 :warning: |  checkstyle  |   0m 38s | 
[/results-checkstyle-hadoop-hdfs-project.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-5829/11/artifact/out/results-checkstyle-hadoop-hdfs-project.txt)
 |  hadoop-hdfs-project: The patch generated 1 new + 113 unchanged - 0 fixed = 
114 total (was 113)  |
   | +1 :green_heart: |  mvnsite  |   1m  8s |  |  the patch passed  |
   | +1 :green_heart: |  javadoc  |   0m 55s |  |  the patch passed with JDK 
Ubuntu-11.0.22+7-post-Ubuntu-0ubuntu220.04.1  |
   | +1 :green_heart: |  javadoc  |   1m 27s |  |  the patch passed with JDK 
Private Build-1.8.0_402-8u402-ga-2ubuntu1~20.04-b06  |
   | +1 :green_heart: |  spotbugs  |   3m  8s |  |  the patch passed  |
   | +1 :green_heart: |  shadedclient  |  21m  5s |  |  patch has no errors 
when building and testing our client artifacts.  |
    _ Other Tests _ |
   | +1 :green_heart: |  unit  |   1m 49s |  |  hadoop-hdfs-client in the patch 
passed.  |
   | +1 :green_heart: |  unit  | 214m 41s |  |  hadoop-hdfs in the patch 
passed.  |
   | +1 :green_heart: |  asflicense  |   0m 26s |  |  The patch does not 
generate ASF License warnings.  |
   |  |   | 329m 18s |  |  |
   
   
   | Subsystem | Report/Notes |
   |--:|:-|
   | Docker | ClientAPI=1.45 ServerAPI=1.45 base: 
https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-5829/11/artifact/out/Dockerfile
 |
   | GITHUB PR | https://github.com/apache/hadoop/pull/5829 |
   | Optional Tests | dupname asflicense compile javac javadoc mvninstall 
mvnsite unit shadedclient spotbugs checkstyle codespell detsecrets xmllint |
   | uname | Linux f0536dd0a688 5.15.0-94-generic #104-Ubuntu SMP Tue Jan 9 
15:25:40 UTC 2024 x86_64 x86_64 x86_64 GNU/Linux |
   | Build tool | maven |
   | Personality | dev-support/bin/hadoop.sh |
   | git revision | trunk / d698a9445b4a06fd8978ee4c5005964270d236d9 |
   | Default Java | Private Build-1.8.0_402-8u402-ga-2ubuntu1~20.04-b06 |
   | 

[jira] [Commented] (HDFS-17457) [FGL] UTs support fine-grained locking

2024-04-22 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-17457?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17839516#comment-17839516
 ] 

ASF GitHub Bot commented on HDFS-17457:
---

ZanderXu commented on PR #6741:
URL: https://github.com/apache/hadoop/pull/6741#issuecomment-2068640362

   The spotbugs will be fixed by HDFS-17485.




> [FGL] UTs support fine-grained locking
> --
>
> Key: HDFS-17457
> URL: https://issues.apache.org/jira/browse/HDFS-17457
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>Reporter: ZanderXu
>Assignee: ZanderXu
>Priority: Major
>  Labels: pull-request-available
>
> [FGL] UTs support fine-grained locking






[jira] [Updated] (HDFS-17488) DN can fail IBRs with NPE when a volume is removed

2024-04-22 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-17488?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated HDFS-17488:
--
Labels: pull-request-available  (was: )

> DN can fail IBRs with NPE when a volume is removed
> --
>
> Key: HDFS-17488
> URL: https://issues.apache.org/jira/browse/HDFS-17488
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: hdfs
>Reporter: Felix N
>Assignee: Felix N
>Priority: Major
>  Labels: pull-request-available
>
>  
> Error logs
> {code:java}
> 2024-04-22 15:46:33,422 [BP-1842952724-10.22.68.249-1713771988830 
> heartbeating to localhost/127.0.0.1:64977] ERROR datanode.DataNode 
> (BPServiceActor.java:run(922)) - Exception in BPOfferService for Block pool 
> BP-1842952724-10.22.68.249-1713771988830 (Datanode Uuid 
> 1659ffaf-1a80-4a8e-a542-643f6bd97ed4) service to localhost/127.0.0.1:64977
> java.lang.NullPointerException
>     at 
> org.apache.hadoop.hdfs.protocolPB.DatanodeProtocolClientSideTranslatorPB.blockReceivedAndDeleted(DatanodeProtocolClientSideTranslatorPB.java:246)
>     at 
> org.apache.hadoop.hdfs.server.datanode.IncrementalBlockReportManager.sendIBRs(IncrementalBlockReportManager.java:218)
>     at 
> org.apache.hadoop.hdfs.server.datanode.BPServiceActor.offerService(BPServiceActor.java:749)
>     at 
> org.apache.hadoop.hdfs.server.datanode.BPServiceActor.run(BPServiceActor.java:920)
>     at java.lang.Thread.run(Thread.java:748) {code}
> The root cause is in BPOfferService#notifyNamenodeBlock, happens when it's 
> called on a block belonging to a volume already removed prior. Because the 
> volume was already removed
>  
> {code:java}
> private void notifyNamenodeBlock(ExtendedBlock block, BlockStatus status,
> String delHint, String storageUuid, boolean isOnTransientStorage) {
>   checkBlock(block);
>   final ReceivedDeletedBlockInfo info = new ReceivedDeletedBlockInfo(
>   block.getLocalBlock(), status, delHint);
>   final DatanodeStorage storage = dn.getFSDataset().getStorage(storageUuid);
>   
>   // storage == null here because it's already removed earlier.
>   for (BPServiceActor actor : bpServices) {
> actor.getIbrManager().notifyNamenodeBlock(info, storage,
> isOnTransientStorage);
>   }
> } {code}
> so IBRs with a null storage are now pending.
> The reason why notifyNamenodeBlock can trigger on such blocks is up in 
> DirectoryScanner#reconcile
> {code:java}
>   public void reconcile() throws IOException {
>     LOG.debug("reconcile start DirectoryScanning");
>     scan();
> // If a volume is removed here after scan() already finished running,
> // diffs is stale and checkAndUpdate will run on a removed volume
>     // HDFS-14476: run checkAndUpdate with batch to avoid holding the lock too
>     // long
>     int loopCount = 0;
>     synchronized (diffs) {
>       for (final Map.Entry entry : diffs.getEntries()) {
>         dataset.checkAndUpdate(entry.getKey(), entry.getValue());        
>     ...
>   } {code}
> Inside checkAndUpdate, memBlockInfo is null because all the block meta in 
> memory is removed during the volume removal, but diskFile still exists. Then 
> DataNode#notifyNamenodeDeletedBlock (and further down the line, 
> notifyNamenodeBlock) is called on this block.
>  






[jira] [Comment Edited] (HDFS-17488) DN can fail IBRs with NPE when a volume is removed

2024-04-22 Thread farmmamba (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-17488?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17839543#comment-17839543
 ] 

farmmamba edited comment on HDFS-17488 at 4/22/24 8:15 AM:
---

[~coconut_icecream] Sir, thanks for your report; I think it is a duplicate. 
Please refer to HDFS-17467.


was (Author: zhanghaobo):
[~coconut_icecream] Sir, it is duplicated. Please refer to HDFS-17467.

> DN can fail IBRs with NPE when a volume is removed
> --
>
> Key: HDFS-17488
> URL: https://issues.apache.org/jira/browse/HDFS-17488
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: hdfs
>Reporter: Felix N
>Assignee: Felix N
>Priority: Major
>
>  
> Error logs
> {code:java}
> 2024-04-22 15:46:33,422 [BP-1842952724-10.22.68.249-1713771988830 
> heartbeating to localhost/127.0.0.1:64977] ERROR datanode.DataNode 
> (BPServiceActor.java:run(922)) - Exception in BPOfferService for Block pool 
> BP-1842952724-10.22.68.249-1713771988830 (Datanode Uuid 
> 1659ffaf-1a80-4a8e-a542-643f6bd97ed4) service to localhost/127.0.0.1:64977
> java.lang.NullPointerException
>     at 
> org.apache.hadoop.hdfs.protocolPB.DatanodeProtocolClientSideTranslatorPB.blockReceivedAndDeleted(DatanodeProtocolClientSideTranslatorPB.java:246)
>     at 
> org.apache.hadoop.hdfs.server.datanode.IncrementalBlockReportManager.sendIBRs(IncrementalBlockReportManager.java:218)
>     at 
> org.apache.hadoop.hdfs.server.datanode.BPServiceActor.offerService(BPServiceActor.java:749)
>     at 
> org.apache.hadoop.hdfs.server.datanode.BPServiceActor.run(BPServiceActor.java:920)
>     at java.lang.Thread.run(Thread.java:748) {code}
> The root cause is in BPOfferService#notifyNamenodeBlock, happens when it's 
> called on a block belonging to a volume already removed prior. Because the 
> volume was already removed
>  
> {code:java}
> private void notifyNamenodeBlock(ExtendedBlock block, BlockStatus status,
> String delHint, String storageUuid, boolean isOnTransientStorage) {
>   checkBlock(block);
>   final ReceivedDeletedBlockInfo info = new ReceivedDeletedBlockInfo(
>   block.getLocalBlock(), status, delHint);
>   final DatanodeStorage storage = dn.getFSDataset().getStorage(storageUuid);
>   
>   // storage == null here because it's already removed earlier.
>   for (BPServiceActor actor : bpServices) {
> actor.getIbrManager().notifyNamenodeBlock(info, storage,
> isOnTransientStorage);
>   }
> } {code}
> so IBRs with a null storage are now pending.
> The reason why notifyNamenodeBlock can trigger on such blocks is up in 
> DirectoryScanner#reconcile
> {code:java}
>   public void reconcile() throws IOException {
>     LOG.debug("reconcile start DirectoryScanning");
>     scan();
> // If a volume is removed here after scan() already finished running,
> // diffs is stale and checkAndUpdate will run on a removed volume
>     // HDFS-14476: run checkAndUpdate with batch to avoid holding the lock too
>     // long
>     int loopCount = 0;
>     synchronized (diffs) {
>       for (final Map.Entry entry : diffs.getEntries()) {
>         dataset.checkAndUpdate(entry.getKey(), entry.getValue());        
>     ...
>   } {code}
> Inside checkAndUpdate, memBlockInfo is null because all the block meta in 
> memory is removed during the volume removal, but diskFile still exists. Then 
> DataNode#notifyNamenodeDeletedBlock (and further down the line, 
> notifyNamenodeBlock) is called on this block.
>  






[jira] [Commented] (HDFS-17486) VIO: dumpXattrs logic optimization

2024-04-22 Thread wangzhihui (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-17486?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17839519#comment-17839519
 ] 

wangzhihui commented on HDFS-17486:
---

Hi [~wuxiaobao], please follow up on it.

> VIO: dumpXattrs logic optimization
> --
>
> Key: HDFS-17486
> URL: https://issues.apache.org/jira/browse/HDFS-17486
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: hdfs
>Affects Versions: 3.2.0, 3.3.3
>Reporter: wangzhihui
>Priority: Minor
>
> The dumpXattrs logic in VIO should use 
> FSImageFormatPBINode.Loader.loadXAttrs() to get the Xattrs attribute for easy 
> maintenance.






[jira] [Updated] (HDFS-17487) [FGL] Make rollEdits thread safe

2024-04-22 Thread ZanderXu (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-17487?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ZanderXu updated HDFS-17487:

Parent: HDFS-17385
Issue Type: Sub-task  (was: Improvement)

> [FGL] Make rollEdits thread safe
> 
>
> Key: HDFS-17487
> URL: https://issues.apache.org/jira/browse/HDFS-17487
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>Reporter: ZanderXu
>Assignee: ZanderXu
>Priority: Major
>
> rollEdits is a very common used RPCs. It is not thread-safe, so it still 
> needs to hold the global write lock. So it has a big impact on the 
> performance.
>  
> We need to make it thread-safe to let it hold the global read lock to improve 
> the performance.
>  






[jira] [Assigned] (HDFS-17385) [FGL] Replace the global FS write locking with directory tree-based fine-grained locking

2024-04-22 Thread ZanderXu (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-17385?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ZanderXu reassigned HDFS-17385:
---

Assignee: ZanderXu

> [FGL] Replace the global FS write locking with directory tree-based 
> fine-grained locking
> 
>
> Key: HDFS-17385
> URL: https://issues.apache.org/jira/browse/HDFS-17385
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Reporter: ZanderXu
>Assignee: ZanderXu
>Priority: Major
>  Labels: FGL
>
> Second, we can replace the global FS write locking with directory tree-based 
> fine-grained locking.
> This ticket doesn't remove this global FS locking, it just replaces the 
> global FS write lock with the global FS read lock and directory tree-based 
> fine-grained lock.
> Block-related operations and DN-related operations still need the global 
> BM lock.
>  
> The lock order should be:
>  * The global FS lock
>  * The global BM lock
>  * The directory tree-based fine-grained lock
>  
> This ticket should support:
>  * End-user can choose lock mode
>  ** One global lock mode
>  ** Global FS lock and global BM lock mode
>  ** Global FS read lock, global BM lock and tree-based fine-grained lock
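
A minimal illustration of the selectable modes listed above (the enum name and 
values are hypothetical, not part of this ticket):

{code:java}
/** Hypothetical naming of the three lock modes described in this ticket. */
public enum FglLockMode {
  GLOBAL,        // one global namesystem lock (current behaviour)
  FS_AND_BM,     // global FS lock plus a separate global BM lock
  FINE_GRAINED   // global FS read lock + global BM lock + directory-tree locks
}
{code}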






[jira] [Created] (HDFS-17487) [FGL] Make rollEdits thread safe

2024-04-22 Thread ZanderXu (Jira)
ZanderXu created HDFS-17487:
---

 Summary: [FGL] Make rollEdits thread safe
 Key: HDFS-17487
 URL: https://issues.apache.org/jira/browse/HDFS-17487
 Project: Hadoop HDFS
  Issue Type: Sub-task
Reporter: ZanderXu
Assignee: ZanderXu


rollEdits is a very commonly used RPC. It is not thread-safe, so it still needs 
to hold the global write lock, which has a big impact on performance.

 

We need to make it thread-safe so that it can hold the global read lock instead, 
to improve performance.

 






[jira] [Updated] (HDFS-17487) [FGL] Make rollEdits thread safe

2024-04-22 Thread ZanderXu (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-17487?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ZanderXu updated HDFS-17487:

Parent: (was: HDFS-17385)
Issue Type: Improvement  (was: Sub-task)

> [FGL] Make rollEdits thread safe
> 
>
> Key: HDFS-17487
> URL: https://issues.apache.org/jira/browse/HDFS-17487
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Reporter: ZanderXu
>Assignee: ZanderXu
>Priority: Major
>
> rollEdits is a very common used RPCs. It is not thread-safe, so it still 
> needs to hold the global write lock. So it has a big impact on the 
> performance.
>  
> We need to make it thread-safe to let it hold the global read lock to improve 
> the performance.
>  






[jira] [Commented] (HDFS-17488) DN can fail IBRs with NPE when a volume is removed

2024-04-22 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-17488?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17839544#comment-17839544
 ] 

ASF GitHub Bot commented on HDFS-17488:
---

kokonguyen191 opened a new pull request, #6759:
URL: https://github.com/apache/hadoop/pull/6759

   ### Description of PR
   
   Error logs
   
   ```
   2024-04-22 15:46:33,422 [BP-1842952724-10.22.68.249-1713771988830 
heartbeating to localhost/127.0.0.1:64977] ERROR datanode.DataNode 
(BPServiceActor.java:run(922)) - Exception in BPOfferService for Block pool 
BP-1842952724-10.22.68.249-1713771988830 (Datanode Uuid 
1659ffaf-1a80-4a8e-a542-643f6bd97ed4) service to localhost/127.0.0.1:64977
   java.lang.NullPointerException
   at 
org.apache.hadoop.hdfs.protocolPB.DatanodeProtocolClientSideTranslatorPB.blockReceivedAndDeleted(DatanodeProtocolClientSideTranslatorPB.java:246)
   at 
org.apache.hadoop.hdfs.server.datanode.IncrementalBlockReportManager.sendIBRs(IncrementalBlockReportManager.java:218)
   at 
org.apache.hadoop.hdfs.server.datanode.BPServiceActor.offerService(BPServiceActor.java:749)
   at 
org.apache.hadoop.hdfs.server.datanode.BPServiceActor.run(BPServiceActor.java:920)
   at java.lang.Thread.run(Thread.java:748) 
   ```
   
   The root cause is in `BPOfferService#notifyNamenodeBlock`; it happens when the 
method is called on a block belonging to a volume that was already removed. Because 
the volume was already removed:
   
   ```java
   private void notifyNamenodeBlock(ExtendedBlock block, BlockStatus status,
   String delHint, String storageUuid, boolean isOnTransientStorage) {
 checkBlock(block);
 final ReceivedDeletedBlockInfo info = new ReceivedDeletedBlockInfo(
 block.getLocalBlock(), status, delHint);
 final DatanodeStorage storage = dn.getFSDataset().getStorage(storageUuid);
 
 // storage == null here because it's already removed earlier.
   
 for (BPServiceActor actor : bpServices) {
   actor.getIbrManager().notifyNamenodeBlock(info, storage,
   isOnTransientStorage);
 }
   } 
   ```
   
   so IBRs with a null storage are now pending.
   
   The reason notifyNamenodeBlock can be triggered on such blocks lies in 
DirectoryScanner#reconcile:
   ```java
 public void reconcile() throws IOException {
   LOG.debug("reconcile start DirectoryScanning");
   scan();
   
   // If a volume is removed here after scan() already finished running,
   // diffs is stale and checkAndUpdate will run on a removed volume
   
   // HDFS-14476: run checkAndUpdate with batch to avoid holding the lock 
too
   // long
   int loopCount = 0;
   synchronized (diffs) {
 for (final Map.Entry entry : diffs.getEntries()) {
   dataset.checkAndUpdate(entry.getKey(), entry.getValue());
   ...
 } 
   ```
   
   Inside `checkAndUpdate`, `memBlockInfo` is null because all block metadata in 
memory was removed during the volume removal, but `diskFile` still exists. Then 
`DataNode#notifyNamenodeDeletedBlock` (and, further down the line, 
`notifyNamenodeBlock`) is called on this block.
   
   This patch effectively contains 2 parts:
   * Preventing processing IBRs with null storage
   * Preventing `FsDatasetImpl#checkAndUpdate` from running on a block 
belonging to a removed storage
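   
   For the first part, a hedged illustration of the kind of guard involved (this is 
not the actual patch; it only shows the shape of a null-storage check inside 
notifyNamenodeBlock):
   
   ```java
   // Illustrative only: skip queueing an IBR when the storage has already been
   // removed, instead of handing a null storage to the IBR manager.
   final DatanodeStorage storage = dn.getFSDataset().getStorage(storageUuid);
   if (storage == null) {
     LOG.warn("Ignoring block {} on removed storage {}", block, storageUuid);
     return;
   }
   for (BPServiceActor actor : bpServices) {
     actor.getIbrManager().notifyNamenodeBlock(info, storage, isOnTransientStorage);
   }
   ```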
   
   ### How was this patch tested?
   
   Unit tests. Partially on production cluster.
   
   ### For code changes:
   
   - [x] Does the title or this PR starts with the corresponding JIRA issue id 
(e.g. 'HADOOP-17799. Your PR title ...')?




> DN can fail IBRs with NPE when a volume is removed
> --
>
> Key: HDFS-17488
> URL: https://issues.apache.org/jira/browse/HDFS-17488
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: hdfs
>Reporter: Felix N
>Assignee: Felix N
>Priority: Major
>
>  
> Error logs
> {code:java}
> 2024-04-22 15:46:33,422 [BP-1842952724-10.22.68.249-1713771988830 
> heartbeating to localhost/127.0.0.1:64977] ERROR datanode.DataNode 
> (BPServiceActor.java:run(922)) - Exception in BPOfferService for Block pool 
> BP-1842952724-10.22.68.249-1713771988830 (Datanode Uuid 
> 1659ffaf-1a80-4a8e-a542-643f6bd97ed4) service to localhost/127.0.0.1:64977
> java.lang.NullPointerException
>     at 
> org.apache.hadoop.hdfs.protocolPB.DatanodeProtocolClientSideTranslatorPB.blockReceivedAndDeleted(DatanodeProtocolClientSideTranslatorPB.java:246)
>     at 
> org.apache.hadoop.hdfs.server.datanode.IncrementalBlockReportManager.sendIBRs(IncrementalBlockReportManager.java:218)
>     at 
> org.apache.hadoop.hdfs.server.datanode.BPServiceActor.offerService(BPServiceActor.java:749)
>     at 
> org.apache.hadoop.hdfs.server.datanode.BPServiceActor.run(BPServiceActor.java:920)
>     at java.lang.Thread.run(Thread.java:748) {code}
> The 

[jira] [Commented] (HDFS-17457) [FGL] UTs support fine-grained locking

2024-04-22 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-17457?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17839580#comment-17839580
 ] 

ASF GitHub Bot commented on HDFS-17457:
---

kokonguyen191 commented on code in PR #6741:
URL: https://github.com/apache/hadoop/pull/6741#discussion_r1574451782


##
hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/namenode/TestListOpenFiles.java:
##
@@ -48,6 +48,7 @@
 import org.apache.hadoop.hdfs.protocol.OpenFilesIterator;
 import org.apache.hadoop.hdfs.protocol.OpenFilesIterator.OpenFilesType;
 import org.apache.hadoop.hdfs.server.namenode.ha.HATestUtil;
+import org.apache.hadoop.hdfs.server.namenode.fgl.FSNamesystemLockMode;

Review Comment:
   `import org.apache.hadoop.hdfs.server.namenode.fgl.FSNamesystemLockMode;` 
should be above `import org.apache.hadoop.hdfs.server.namenode.ha.HATestUtil;`. 
Probably better to set up the import order in IDE and let the auto formatter 
handle it.





> [FGL] UTs support fine-grained locking
> --
>
> Key: HDFS-17457
> URL: https://issues.apache.org/jira/browse/HDFS-17457
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>Reporter: ZanderXu
>Assignee: ZanderXu
>Priority: Major
>  Labels: pull-request-available
>
> [FGL] UTs support fine-grained locking



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-17451) RBF: fix spotbugs for redundant nullcheck of dns.

2024-04-22 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-17451?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17839611#comment-17839611
 ] 

ASF GitHub Bot commented on HDFS-17451:
---

ZanderXu commented on PR #6697:
URL: https://github.com/apache/hadoop/pull/6697#issuecomment-2069092826

   As I described in HDFS-17485, this spotbug should be fixed.
   
   If there are no DNs of the input type, `dnCache` will return an empty 
result, not null. So I think this modification is ok.
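   
   For reference, a small standalone example (not from the PR, and using plain 
Guava rather than the shaded org.apache.hadoop.thirdparty package the Router 
uses) showing why the flagged null check is redundant: `LoadingCache.get(key)` 
either returns the loaded value or throws, it never returns null.
   
   ```java
   import com.google.common.cache.CacheBuilder;
   import com.google.common.cache.CacheLoader;
   import com.google.common.cache.LoadingCache;
   
   import java.util.Collections;
   import java.util.List;
   import java.util.concurrent.ExecutionException;
   
   public class DnCacheSketch {
     public static void main(String[] args) throws ExecutionException {
       LoadingCache<String, List<String>> dnCache = CacheBuilder.newBuilder()
           .build(new CacheLoader<String, List<String>>() {
             @Override
             public List<String> load(String reportType) {
               // If there are no datanodes of the requested type, return an
               // empty list rather than null (a null here would make Guava
               // throw InvalidCacheLoadException instead of returning null).
               return Collections.emptyList();
             }
           });
   
       List<String> dns = dnCache.get("LIVE");
       // dns is guaranteed non-null here; a "dns != null" test at this point
       // is exactly the redundant check SpotBugs reports.
       System.out.println("datanodes: " + dns.size());
     }
   }
   ```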




> RBF: fix spotbugs for redundant nullcheck of dns.
> -
>
> Key: HDFS-17451
> URL: https://issues.apache.org/jira/browse/HDFS-17451
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Reporter: Jian Zhang
>Assignee: Jian Zhang
>Priority: Major
>  Labels: pull-request-available
>
> h2. Dodgy code Warnings
> ||Code||Warning||
> |RCN|Redundant nullcheck of dns, which is known to be non-null in 
> org.apache.hadoop.hdfs.server.federation.router.RouterRpcServer.getCachedDatanodeReport(HdfsConstants$DatanodeReportType)|
> | |[Bug type RCN_REDUNDANT_NULLCHECK_OF_NONNULL_VALUE (click for 
> details)|https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-6655/8/artifact/out/branch-spotbugs-hadoop-hdfs-project_hadoop-hdfs-rbf-warnings.html#RCN_REDUNDANT_NULLCHECK_OF_NONNULL_VALUE]
> In class org.apache.hadoop.hdfs.server.federation.router.RouterRpcServer
> In method 
> org.apache.hadoop.hdfs.server.federation.router.RouterRpcServer.getCachedDatanodeReport(HdfsConstants$DatanodeReportType)
> Value loaded from dns
> Return value of 
> org.apache.hadoop.thirdparty.com.google.common.cache.LoadingCache.get(Object) 
> of type Object
> Redundant null check at RouterRpcServer.java:[line 1093]|



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-17484) Introduce redundancy.considerLoad.minLoad to avoiding excluding nodes when they are not busy actually

2024-04-22 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-17484?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17839679#comment-17839679
 ] 

ASF GitHub Bot commented on HDFS-17484:
---

hadoop-yetus commented on PR #6758:
URL: https://github.com/apache/hadoop/pull/6758#issuecomment-2069616190

   :broken_heart: **-1 overall**
   
   
   
   
   
   
   | Vote | Subsystem | Runtime |  Logfile | Comment |
   |::|--:|:|::|:---:|
   | +0 :ok: |  reexec  |   5m 32s |  |  Docker mode activated.  |
    _ Prechecks _ |
   | +1 :green_heart: |  dupname  |   0m  0s |  |  No case conflicting files 
found.  |
   | +0 :ok: |  codespell  |   0m  0s |  |  codespell was not available.  |
   | +0 :ok: |  detsecrets  |   0m  0s |  |  detect-secrets was not available.  
|
   | +0 :ok: |  xmllint  |   0m  0s |  |  xmllint was not available.  |
   | +1 :green_heart: |  @author  |   0m  0s |  |  The patch does not contain 
any @author tags.  |
   | +1 :green_heart: |  test4tests  |   0m  0s |  |  The patch appears to 
include 1 new or modified test files.  |
    _ trunk Compile Tests _ |
   | +1 :green_heart: |  mvninstall  |  47m 17s |  |  trunk passed  |
   | +1 :green_heart: |  compile  |   1m 20s |  |  trunk passed with JDK 
Ubuntu-11.0.22+7-post-Ubuntu-0ubuntu220.04.1  |
   | +1 :green_heart: |  compile  |   1m 16s |  |  trunk passed with JDK 
Private Build-1.8.0_402-8u402-ga-2ubuntu1~20.04-b06  |
   | +1 :green_heart: |  checkstyle  |   1m 13s |  |  trunk passed  |
   | +1 :green_heart: |  mvnsite  |   1m 25s |  |  trunk passed  |
   | +1 :green_heart: |  javadoc  |   1m  7s |  |  trunk passed with JDK 
Ubuntu-11.0.22+7-post-Ubuntu-0ubuntu220.04.1  |
   | +1 :green_heart: |  javadoc  |   1m 45s |  |  trunk passed with JDK 
Private Build-1.8.0_402-8u402-ga-2ubuntu1~20.04-b06  |
   | +1 :green_heart: |  spotbugs  |   3m 31s |  |  trunk passed  |
   | +1 :green_heart: |  shadedclient  |  35m 57s |  |  branch has no errors 
when building and testing our client artifacts.  |
    _ Patch Compile Tests _ |
   | +1 :green_heart: |  mvninstall  |   1m 15s |  |  the patch passed  |
   | +1 :green_heart: |  compile  |   1m 17s |  |  the patch passed with JDK 
Ubuntu-11.0.22+7-post-Ubuntu-0ubuntu220.04.1  |
   | +1 :green_heart: |  javac  |   1m 17s |  |  the patch passed  |
   | +1 :green_heart: |  compile  |   1m 15s |  |  the patch passed with JDK 
Private Build-1.8.0_402-8u402-ga-2ubuntu1~20.04-b06  |
   | +1 :green_heart: |  javac  |   1m 15s |  |  the patch passed  |
   | +1 :green_heart: |  blanks  |   0m  0s |  |  The patch has no blanks 
issues.  |
   | -0 :warning: |  checkstyle  |   1m  1s | 
[/results-checkstyle-hadoop-hdfs-project_hadoop-hdfs.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-6758/2/artifact/out/results-checkstyle-hadoop-hdfs-project_hadoop-hdfs.txt)
 |  hadoop-hdfs-project/hadoop-hdfs: The patch generated 2 new + 251 unchanged 
- 0 fixed = 253 total (was 251)  |
   | +1 :green_heart: |  mvnsite  |   1m 16s |  |  the patch passed  |
   | +1 :green_heart: |  javadoc  |   0m 55s |  |  the patch passed with JDK 
Ubuntu-11.0.22+7-post-Ubuntu-0ubuntu220.04.1  |
   | +1 :green_heart: |  javadoc  |   1m 37s |  |  the patch passed with JDK 
Private Build-1.8.0_402-8u402-ga-2ubuntu1~20.04-b06  |
   | +1 :green_heart: |  spotbugs  |   3m 31s |  |  the patch passed  |
   | +1 :green_heart: |  shadedclient  |  36m 19s |  |  patch has no errors 
when building and testing our client artifacts.  |
    _ Other Tests _ |
   | -1 :x: |  unit  | 244m 35s | 
[/patch-unit-hadoop-hdfs-project_hadoop-hdfs.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-6758/2/artifact/out/patch-unit-hadoop-hdfs-project_hadoop-hdfs.txt)
 |  hadoop-hdfs in the patch passed.  |
   | +1 :green_heart: |  asflicense  |   0m 43s |  |  The patch does not 
generate ASF License warnings.  |
   |  |   | 395m 57s |  |  |
   
   
   | Reason | Tests |
   |---:|:--|
   | Failed junit tests | hadoop.hdfs.TestDFSStripedOutputStream |
   |   | hadoop.hdfs.server.blockmanagement.TestReplicationPolicyConsiderLoad |
   |   | hadoop.hdfs.TestReconstructStripedFileWithRandomECPolicy |
   |   | hadoop.hdfs.server.namenode.TestCacheDirectives |
   |   | hadoop.hdfs.server.namenode.TestReconstructStripedBlocks |
   |   | hadoop.tools.TestHdfsConfigFields |
   
   
   | Subsystem | Report/Notes |
   |--:|:-|
   | Docker | ClientAPI=1.45 ServerAPI=1.45 base: 
https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-6758/2/artifact/out/Dockerfile
 |
   | GITHUB PR | https://github.com/apache/hadoop/pull/6758 |
   | Optional Tests | dupname asflicense compile javac javadoc mvninstall 
mvnsite unit shadedclient spotbugs checkstyle codespell detsecrets xmllint |
   | uname | Linux c90e17858d0b 5.15.0-101-generic #111-Ubuntu SMP Tue Mar 5 
20:16:58 UTC 2024 x86_64 x86_64 x86_64 GNU/Linux |
   | Build tool | maven |
 

[jira] [Commented] (HDFS-17488) DN can fail IBRs with NPE when a volume is removed

2024-04-22 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-17488?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17839723#comment-17839723
 ] 

ASF GitHub Bot commented on HDFS-17488:
---

hadoop-yetus commented on PR #6759:
URL: https://github.com/apache/hadoop/pull/6759#issuecomment-2069730206

   :broken_heart: **-1 overall**
   
   
   
   
   
   
   | Vote | Subsystem | Runtime |  Logfile | Comment |
   |::|--:|:|::|:---:|
   | +0 :ok: |  reexec  |  12m 25s |  |  Docker mode activated.  |
    _ Prechecks _ |
   | +1 :green_heart: |  dupname  |   0m  0s |  |  No case conflicting files 
found.  |
   | +0 :ok: |  codespell  |   0m  0s |  |  codespell was not available.  |
   | +0 :ok: |  detsecrets  |   0m  0s |  |  detect-secrets was not available.  
|
   | +1 :green_heart: |  @author  |   0m  0s |  |  The patch does not contain 
any @author tags.  |
   | +1 :green_heart: |  test4tests  |   0m  0s |  |  The patch appears to 
include 1 new or modified test files.  |
    _ trunk Compile Tests _ |
   | +1 :green_heart: |  mvninstall  |  44m 20s |  |  trunk passed  |
   | +1 :green_heart: |  compile  |   1m 23s |  |  trunk passed with JDK 
Ubuntu-11.0.22+7-post-Ubuntu-0ubuntu220.04.1  |
   | +1 :green_heart: |  compile  |   1m 14s |  |  trunk passed with JDK 
Private Build-1.8.0_402-8u402-ga-2ubuntu1~20.04-b06  |
   | +1 :green_heart: |  checkstyle  |   1m 12s |  |  trunk passed  |
   | +1 :green_heart: |  mvnsite  |   1m 24s |  |  trunk passed  |
   | +1 :green_heart: |  javadoc  |   1m  6s |  |  trunk passed with JDK 
Ubuntu-11.0.22+7-post-Ubuntu-0ubuntu220.04.1  |
   | +1 :green_heart: |  javadoc  |   1m 46s |  |  trunk passed with JDK 
Private Build-1.8.0_402-8u402-ga-2ubuntu1~20.04-b06  |
   | +1 :green_heart: |  spotbugs  |   3m 18s |  |  trunk passed  |
   | +1 :green_heart: |  shadedclient  |  35m 28s |  |  branch has no errors 
when building and testing our client artifacts.  |
    _ Patch Compile Tests _ |
   | +1 :green_heart: |  mvninstall  |   1m 12s |  |  the patch passed  |
   | +1 :green_heart: |  compile  |   1m 14s |  |  the patch passed with JDK 
Ubuntu-11.0.22+7-post-Ubuntu-0ubuntu220.04.1  |
   | +1 :green_heart: |  javac  |   1m 14s |  |  the patch passed  |
   | +1 :green_heart: |  compile  |   1m  7s |  |  the patch passed with JDK 
Private Build-1.8.0_402-8u402-ga-2ubuntu1~20.04-b06  |
   | +1 :green_heart: |  javac  |   1m  7s |  |  the patch passed  |
   | +1 :green_heart: |  blanks  |   0m  0s |  |  The patch has no blanks 
issues.  |
   | -0 :warning: |  checkstyle  |   1m  1s | 
[/results-checkstyle-hadoop-hdfs-project_hadoop-hdfs.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-6759/1/artifact/out/results-checkstyle-hadoop-hdfs-project_hadoop-hdfs.txt)
 |  hadoop-hdfs-project/hadoop-hdfs: The patch generated 1 new + 230 unchanged 
- 0 fixed = 231 total (was 230)  |
   | +1 :green_heart: |  mvnsite  |   1m 12s |  |  the patch passed  |
   | +1 :green_heart: |  javadoc  |   0m 53s |  |  the patch passed with JDK 
Ubuntu-11.0.22+7-post-Ubuntu-0ubuntu220.04.1  |
   | +1 :green_heart: |  javadoc  |   1m 34s |  |  the patch passed with JDK 
Private Build-1.8.0_402-8u402-ga-2ubuntu1~20.04-b06  |
   | +1 :green_heart: |  spotbugs  |   3m 18s |  |  the patch passed  |
   | +1 :green_heart: |  shadedclient  |  35m 53s |  |  patch has no errors 
when building and testing our client artifacts.  |
    _ Other Tests _ |
   | -1 :x: |  unit  | 235m 27s | 
[/patch-unit-hadoop-hdfs-project_hadoop-hdfs.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-6759/1/artifact/out/patch-unit-hadoop-hdfs-project_hadoop-hdfs.txt)
 |  hadoop-hdfs in the patch passed.  |
   | +1 :green_heart: |  asflicense  |   0m 44s |  |  The patch does not 
generate ASF License warnings.  |
   |  |   | 388m 51s |  |  |
   
   
   | Reason | Tests |
   |---:|:--|
   | Failed junit tests | hadoop.hdfs.server.datanode.TestBPOfferService |
   |   | hadoop.hdfs.server.datanode.TestLargeBlockReport |
   |   | hadoop.hdfs.server.datanode.TestDirectoryScanner |
   
   
   | Subsystem | Report/Notes |
   |--:|:-|
   | Docker | ClientAPI=1.45 ServerAPI=1.45 base: 
https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-6759/1/artifact/out/Dockerfile
 |
   | GITHUB PR | https://github.com/apache/hadoop/pull/6759 |
   | Optional Tests | dupname asflicense compile javac javadoc mvninstall 
mvnsite unit shadedclient spotbugs checkstyle codespell detsecrets |
   | uname | Linux 34352d4fc452 5.15.0-94-generic #104-Ubuntu SMP Tue Jan 9 
15:25:40 UTC 2024 x86_64 x86_64 x86_64 GNU/Linux |
   | Build tool | maven |
   | Personality | dev-support/bin/hadoop.sh |
   | git revision | trunk / d5a87e15db1c70fb0c3974832c75504ff184b0d1 |
   | Default Java | Private Build-1.8.0_402-8u402-ga-2ubuntu1~20.04-b06 |
   | Multi-JDK versions | 

[jira] [Created] (HDFS-17491) [FGL] Make getFullPathName in INode.java thread safe

2024-04-22 Thread ZanderXu (Jira)
ZanderXu created HDFS-17491:
---

 Summary: [FGL] Make getFullPathName in INode.java thread safe
 Key: HDFS-17491
 URL: https://issues.apache.org/jira/browse/HDFS-17491
 Project: Hadoop HDFS
  Issue Type: Sub-task
Reporter: ZanderXu


Make getFullPathName in INode.java thread safe, so that we can safely get the 
fullpath of an iNode.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Assigned] (HDFS-17491) [FGL] Make getFullPathName in INode.java thread safe

2024-04-22 Thread ZanderXu (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-17491?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ZanderXu reassigned HDFS-17491:
---

Assignee: ZanderXu

> [FGL] Make getFullPathName in INode.java thread safe
> 
>
> Key: HDFS-17491
> URL: https://issues.apache.org/jira/browse/HDFS-17491
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>Reporter: ZanderXu
>Assignee: ZanderXu
>Priority: Major
>
> Make getFullPathName in INode.java thread safe, so that we can safely get the 
> fullpath of an iNode.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-17493) [FGL] Create RPC supports fine-grained Locking

2024-04-22 Thread ZanderXu (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-17493?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ZanderXu updated HDFS-17493:

Description: Create RPC supports fine-grained Locking  (was: Make INodeMap 
thread safe, since it may be accessed or updated concurrently.)

> [FGL] Create RPC supports fine-grained Locking
> --
>
> Key: HDFS-17493
> URL: https://issues.apache.org/jira/browse/HDFS-17493
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>Reporter: ZanderXu
>Assignee: ZanderXu
>Priority: Major
>
> Create RPC supports fine-grained Locking



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-17493) [FGL] Create RPC supports fine-grained Locking

2024-04-22 Thread ZanderXu (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-17493?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ZanderXu updated HDFS-17493:

Summary: [FGL] Create RPC supports fine-grained Locking  (was: [FGL] Make 
INodeMap thread safe)

> [FGL] Create RPC supports fine-grained Locking
> --
>
> Key: HDFS-17493
> URL: https://issues.apache.org/jira/browse/HDFS-17493
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>Reporter: ZanderXu
>Assignee: ZanderXu
>Priority: Major
>
> Make INodeMap thread safe, since it may be accessed or updated concurrently.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Created] (HDFS-17489) [FGL] Implement a LockPool

2024-04-22 Thread ZanderXu (Jira)
ZanderXu created HDFS-17489:
---

 Summary: [FGL] Implement a LockPool
 Key: HDFS-17489
 URL: https://issues.apache.org/jira/browse/HDFS-17489
 Project: Hadoop HDFS
  Issue Type: Sub-task
Reporter: ZanderXu
Assignee: ZanderXu


A LockPool to manage all locks.

When acquiring a lock, it will allocate the lock and cache it if it is not 
already cached; when releasing a lock, it will evict the lock from memory once 
it is no longer used.

It should also support keeping locks persistent in memory.
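
A rough sketch of the idea (an illustration only, not the proposed 
implementation): a reference-counted map keyed by iNode id, where acquire 
allocates and pins a lock and release evicts it once unused. Setting the 
persistent flag is omitted here.

{code:java}
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.locks.ReentrantReadWriteLock;

public class LockPoolSketch {

  private static final class Entry {
    final ReentrantReadWriteLock lock = new ReentrantReadWriteLock();
    int refCount;        // holders still referencing this lock
    boolean persistent;  // persistent locks are never evicted
  }

  private final ConcurrentHashMap<Long, Entry> pool = new ConcurrentHashMap<>();

  /** Allocate and cache the lock for this iNode id if absent, then pin it. */
  public ReentrantReadWriteLock acquire(long inodeId) {
    while (true) {
      Entry e = pool.computeIfAbsent(inodeId, id -> new Entry());
      synchronized (e) {
        if (pool.get(inodeId) == e) { // still the live entry, safe to pin
          e.refCount++;
          return e.lock;
        }
        // lost a race with eviction; retry with a fresh entry
      }
    }
  }

  /** Unpin the lock; evict it once it is unused and not marked persistent. */
  public void release(long inodeId) {
    Entry e = pool.get(inodeId);
    if (e == null) {
      return;
    }
    synchronized (e) {
      if (--e.refCount <= 0 && !e.persistent) {
        pool.remove(inodeId, e);
      }
    }
  }
}
{code}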



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Created] (HDFS-17492) [FGL] Abstract a INodeLockManager to mange acquiring and releasing locks in the directory-tree

2024-04-22 Thread ZanderXu (Jira)
ZanderXu created HDFS-17492:
---

 Summary: [FGL] Abstract a INodeLockManager to mange acquiring and 
releasing locks in the directory-tree
 Key: HDFS-17492
 URL: https://issues.apache.org/jira/browse/HDFS-17492
 Project: Hadoop HDFS
  Issue Type: Sub-task
Reporter: ZanderXu
Assignee: ZanderXu


Abstract an INodeLockManager to manage acquiring and releasing locks in the 
directory-tree.
 # Abstract a lock type to cover all cases in NN
 # Acquire the full-path lock for the input path based on the input lock type
 # Acquire the full-path lock for the input iNodeId based on the input lock type
 # Acquire the full-path lock for some input paths, such as for rename and concat

INodeLockManager should return an IIP which contains both iNodes and locks.
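
One possible shape for such an API, shown only as an illustration of the four 
points above; all names and signatures are assumptions rather than a committed 
design.

{code:java}
import java.io.Closeable;

public interface INodeLockManagerSketch {

  /** A lock type intended to cover the cases needed by NameNode operations. */
  enum LockType { READ, WRITE }

  /**
   * Acquire the full-path locks for the given path based on the lock type and
   * return an IIP-like handle holding both the resolved iNodes and the locks;
   * closing the handle releases the locks.
   */
  LockedPath lock(String path, LockType type);

  /** Same as above, but resolving the path from an iNode id. */
  LockedPath lock(long inodeId, LockType type);

  /** Multi-path variant for operations such as rename and concat. */
  LockedPath lock(LockType type, String... paths);

  /** Stand-in for the IIP-style result that carries iNodes plus locks. */
  interface LockedPath extends Closeable {
    @Override
    void close(); // release every lock acquired for this path
  }
}
{code}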



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-17485) Fix SpotBug in RouterRpcServer.java

2024-04-22 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-17485?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17839608#comment-17839608
 ] 

ASF GitHub Bot commented on HDFS-17485:
---

ZanderXu closed pull request #6757: HDFS-17485. Remove redundant nullcheck in 
RouterRpcServer.java
URL: https://github.com/apache/hadoop/pull/6757




> Fix SpotBug in RouterRpcServer.java
> ---
>
> Key: HDFS-17485
> URL: https://issues.apache.org/jira/browse/HDFS-17485
> Project: Hadoop HDFS
>  Issue Type: Task
>Reporter: ZanderXu
>Assignee: ZanderXu
>Priority: Major
>  Labels: pull-request-available
> Attachments: image-2024-04-22-15-02-33-725.png
>
>
> !image-2024-04-22-15-02-33-725.png|width=1566,height=265!



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-17485) Fix SpotBug in RouterRpcServer.java

2024-04-22 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-17485?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17839596#comment-17839596
 ] 

ASF GitHub Bot commented on HDFS-17485:
---

hadoop-yetus commented on PR #6757:
URL: https://github.com/apache/hadoop/pull/6757#issuecomment-2069029513

   :broken_heart: **-1 overall**
   
   
   
   
   
   
   | Vote | Subsystem | Runtime |  Logfile | Comment |
   |::|--:|:|::|:---:|
   | +0 :ok: |  reexec  |  19m 11s |  |  Docker mode activated.  |
    _ Prechecks _ |
   | +1 :green_heart: |  dupname  |   0m  0s |  |  No case conflicting files 
found.  |
   | +0 :ok: |  codespell  |   0m  0s |  |  codespell was not available.  |
   | +0 :ok: |  detsecrets  |   0m  0s |  |  detect-secrets was not available.  
|
   | +1 :green_heart: |  @author  |   0m  0s |  |  The patch does not contain 
any @author tags.  |
   | -1 :x: |  test4tests  |   0m  0s |  |  The patch doesn't appear to include 
any new or modified tests. Please justify why no new tests are needed for this 
patch. Also please list what manual steps were performed to verify this patch.  
|
    _ trunk Compile Tests _ |
   | +1 :green_heart: |  mvninstall  |  49m 45s |  |  trunk passed  |
   | +1 :green_heart: |  compile  |   0m 42s |  |  trunk passed with JDK 
Ubuntu-11.0.22+7-post-Ubuntu-0ubuntu220.04.1  |
   | +1 :green_heart: |  compile  |   0m 38s |  |  trunk passed with JDK 
Private Build-1.8.0_402-8u402-ga-2ubuntu1~20.04-b06  |
   | +1 :green_heart: |  checkstyle  |   0m 31s |  |  trunk passed  |
   | +1 :green_heart: |  mvnsite  |   0m 42s |  |  trunk passed  |
   | +1 :green_heart: |  javadoc  |   0m 43s |  |  trunk passed with JDK 
Ubuntu-11.0.22+7-post-Ubuntu-0ubuntu220.04.1  |
   | +1 :green_heart: |  javadoc  |   0m 33s |  |  trunk passed with JDK 
Private Build-1.8.0_402-8u402-ga-2ubuntu1~20.04-b06  |
   | -1 :x: |  spotbugs  |   1m 22s | 
[/branch-spotbugs-hadoop-hdfs-project_hadoop-hdfs-rbf-warnings.html](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-6757/1/artifact/out/branch-spotbugs-hadoop-hdfs-project_hadoop-hdfs-rbf-warnings.html)
 |  hadoop-hdfs-project/hadoop-hdfs-rbf in trunk has 1 extant spotbugs 
warnings.  |
   | +1 :green_heart: |  shadedclient  |  39m 22s |  |  branch has no errors 
when building and testing our client artifacts.  |
    _ Patch Compile Tests _ |
   | +1 :green_heart: |  mvninstall  |   0m 32s |  |  the patch passed  |
   | +1 :green_heart: |  compile  |   0m 33s |  |  the patch passed with JDK 
Ubuntu-11.0.22+7-post-Ubuntu-0ubuntu220.04.1  |
   | +1 :green_heart: |  javac  |   0m 33s |  |  the patch passed  |
   | +1 :green_heart: |  compile  |   0m 30s |  |  the patch passed with JDK 
Private Build-1.8.0_402-8u402-ga-2ubuntu1~20.04-b06  |
   | +1 :green_heart: |  javac  |   0m 30s |  |  the patch passed  |
   | +1 :green_heart: |  blanks  |   0m  0s |  |  The patch has no blanks 
issues.  |
   | +1 :green_heart: |  checkstyle  |   0m 19s |  |  the patch passed  |
   | +1 :green_heart: |  mvnsite  |   0m 33s |  |  the patch passed  |
   | +1 :green_heart: |  javadoc  |   0m 29s |  |  the patch passed with JDK 
Ubuntu-11.0.22+7-post-Ubuntu-0ubuntu220.04.1  |
   | +1 :green_heart: |  javadoc  |   0m 24s |  |  the patch passed with JDK 
Private Build-1.8.0_402-8u402-ga-2ubuntu1~20.04-b06  |
   | +1 :green_heart: |  spotbugs  |   1m 22s |  |  
hadoop-hdfs-project/hadoop-hdfs-rbf generated 0 new + 0 unchanged - 1 fixed = 0 
total (was 1)  |
   | +1 :green_heart: |  shadedclient  |  39m 22s |  |  patch has no errors 
when building and testing our client artifacts.  |
    _ Other Tests _ |
   | +1 :green_heart: |  unit  |  31m 38s |  |  hadoop-hdfs-rbf in the patch 
passed.  |
   | +1 :green_heart: |  asflicense  |   0m 39s |  |  The patch does not 
generate ASF License warnings.  |
   |  |   | 195m 47s |  |  |
   
   
   | Subsystem | Report/Notes |
   |--:|:-|
   | Docker | ClientAPI=1.45 ServerAPI=1.45 base: 
https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-6757/1/artifact/out/Dockerfile
 |
   | GITHUB PR | https://github.com/apache/hadoop/pull/6757 |
   | Optional Tests | dupname asflicense compile javac javadoc mvninstall 
mvnsite unit shadedclient spotbugs checkstyle codespell detsecrets |
   | uname | Linux f89a44debf13 5.15.0-94-generic #104-Ubuntu SMP Tue Jan 9 
15:25:40 UTC 2024 x86_64 x86_64 x86_64 GNU/Linux |
   | Build tool | maven |
   | Personality | dev-support/bin/hadoop.sh |
   | git revision | trunk / a270576ad84fb985927883e82f2da12f64114fe2 |
   | Default Java | Private Build-1.8.0_402-8u402-ga-2ubuntu1~20.04-b06 |
   | Multi-JDK versions | 
/usr/lib/jvm/java-11-openjdk-amd64:Ubuntu-11.0.22+7-post-Ubuntu-0ubuntu220.04.1 
/usr/lib/jvm/java-8-openjdk-amd64:Private 
Build-1.8.0_402-8u402-ga-2ubuntu1~20.04-b06 |
   |  Test Results | 

[jira] [Created] (HDFS-17490) [FGL] Make INodesInPath closeable

2024-04-22 Thread ZanderXu (Jira)
ZanderXu created HDFS-17490:
---

 Summary: [FGL] Make INodesInPath closeable
 Key: HDFS-17490
 URL: https://issues.apache.org/jira/browse/HDFS-17490
 Project: Hadoop HDFS
  Issue Type: Sub-task
Reporter: ZanderXu
Assignee: ZanderXu


Add an array to store the locks corresponding to each iNode in INodesInPath, 
and make INodesInPath closeable
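
A rough illustration of what closeable could mean here, assuming the locks 
stored alongside the resolved iNodes are released leaf-to-root in close(); 
names are placeholders and the real INodesInPath is far more involved.

{code:java}
import java.util.concurrent.locks.Lock;

public class ClosableINodesInPathSketch implements AutoCloseable {
  private final long[] inodeIds; // stands in for the resolved iNode array
  private final Lock[] locks;    // lock held for each iNode on the path

  public ClosableINodesInPathSketch(long[] inodeIds, Lock[] locks) {
    this.inodeIds = inodeIds;
    this.locks = locks;
  }

  public long getINodeId(int i) {
    return inodeIds[i];
  }

  /** Release locks from leaf to root so try-with-resources cleans up safely. */
  @Override
  public void close() {
    for (int i = locks.length - 1; i >= 0; i--) {
      if (locks[i] != null) {
        locks[i].unlock();
      }
    }
  }
}
{code}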



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Created] (HDFS-17494) [FGL] GetFileInfo supports fine-grained locking

2024-04-22 Thread ZanderXu (Jira)
ZanderXu created HDFS-17494:
---

 Summary: [FGL] GetFileInfo supports fine-grained locking
 Key: HDFS-17494
 URL: https://issues.apache.org/jira/browse/HDFS-17494
 Project: Hadoop HDFS
  Issue Type: Sub-task
Reporter: ZanderXu
Assignee: ZanderXu


[FGL] GetFileInfo supports fine-grained locking



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Created] (HDFS-17493) [FGL] Make INodeMap thread safe

2024-04-22 Thread ZanderXu (Jira)
ZanderXu created HDFS-17493:
---

 Summary: [FGL] Make INodeMap thread safe
 Key: HDFS-17493
 URL: https://issues.apache.org/jira/browse/HDFS-17493
 Project: Hadoop HDFS
  Issue Type: Sub-task
Reporter: ZanderXu
Assignee: ZanderXu


Make INodeMap thread safe, since it may be accessed or updated concurrently.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Resolved] (HDFS-17485) Fix SpotBug in RouterRpcServer.java

2024-04-22 Thread ZanderXu (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-17485?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ZanderXu resolved HDFS-17485.
-
Resolution: Duplicate

> Fix SpotBug in RouterRpcServer.java
> ---
>
> Key: HDFS-17485
> URL: https://issues.apache.org/jira/browse/HDFS-17485
> Project: Hadoop HDFS
>  Issue Type: Task
>Reporter: ZanderXu
>Assignee: ZanderXu
>Priority: Major
>  Labels: pull-request-available
> Attachments: image-2024-04-22-15-02-33-725.png
>
>
> !image-2024-04-22-15-02-33-725.png|width=1566,height=265!



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-17484) Introduce redundancy.considerLoad.minLoad to avoiding excluding nodes when they are not busy actually

2024-04-22 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-17484?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17839680#comment-17839680
 ] 

ASF GitHub Bot commented on HDFS-17484:
---

hadoop-yetus commented on PR #6758:
URL: https://github.com/apache/hadoop/pull/6758#issuecomment-2069624972

   :broken_heart: **-1 overall**
   
   
   
   
   
   
   | Vote | Subsystem | Runtime |  Logfile | Comment |
   |::|--:|:|::|:---:|
   | +0 :ok: |  reexec  |  12m 17s |  |  Docker mode activated.  |
    _ Prechecks _ |
   | +1 :green_heart: |  dupname  |   0m  0s |  |  No case conflicting files 
found.  |
   | +0 :ok: |  codespell  |   0m  0s |  |  codespell was not available.  |
   | +0 :ok: |  detsecrets  |   0m  0s |  |  detect-secrets was not available.  
|
   | +0 :ok: |  xmllint  |   0m  0s |  |  xmllint was not available.  |
   | +1 :green_heart: |  @author  |   0m  0s |  |  The patch does not contain 
any @author tags.  |
   | +1 :green_heart: |  test4tests  |   0m  0s |  |  The patch appears to 
include 1 new or modified test files.  |
    _ trunk Compile Tests _ |
   | +1 :green_heart: |  mvninstall  |  46m 47s |  |  trunk passed  |
   | +1 :green_heart: |  compile  |   1m 23s |  |  trunk passed with JDK 
Ubuntu-11.0.22+7-post-Ubuntu-0ubuntu220.04.1  |
   | +1 :green_heart: |  compile  |   1m 19s |  |  trunk passed with JDK 
Private Build-1.8.0_402-8u402-ga-2ubuntu1~20.04-b06  |
   | +1 :green_heart: |  checkstyle  |   1m 15s |  |  trunk passed  |
   | +1 :green_heart: |  mvnsite  |   1m 29s |  |  trunk passed  |
   | +1 :green_heart: |  javadoc  |   1m  9s |  |  trunk passed with JDK 
Ubuntu-11.0.22+7-post-Ubuntu-0ubuntu220.04.1  |
   | +1 :green_heart: |  javadoc  |   1m 45s |  |  trunk passed with JDK 
Private Build-1.8.0_402-8u402-ga-2ubuntu1~20.04-b06  |
   | +1 :green_heart: |  spotbugs  |   3m 26s |  |  trunk passed  |
   | +1 :green_heart: |  shadedclient  |  36m 20s |  |  branch has no errors 
when building and testing our client artifacts.  |
    _ Patch Compile Tests _ |
   | +1 :green_heart: |  mvninstall  |   1m 16s |  |  the patch passed  |
   | +1 :green_heart: |  compile  |   1m 16s |  |  the patch passed with JDK 
Ubuntu-11.0.22+7-post-Ubuntu-0ubuntu220.04.1  |
   | +1 :green_heart: |  javac  |   1m 16s |  |  the patch passed  |
   | +1 :green_heart: |  compile  |   1m 10s |  |  the patch passed with JDK 
Private Build-1.8.0_402-8u402-ga-2ubuntu1~20.04-b06  |
   | +1 :green_heart: |  javac  |   1m 10s |  |  the patch passed  |
   | +1 :green_heart: |  blanks  |   0m  0s |  |  The patch has no blanks 
issues.  |
   | -0 :warning: |  checkstyle  |   1m  2s | 
[/results-checkstyle-hadoop-hdfs-project_hadoop-hdfs.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-6758/1/artifact/out/results-checkstyle-hadoop-hdfs-project_hadoop-hdfs.txt)
 |  hadoop-hdfs-project/hadoop-hdfs: The patch generated 2 new + 251 unchanged 
- 0 fixed = 253 total (was 251)  |
   | +1 :green_heart: |  mvnsite  |   1m 20s |  |  the patch passed  |
   | +1 :green_heart: |  javadoc  |   0m 55s |  |  the patch passed with JDK 
Ubuntu-11.0.22+7-post-Ubuntu-0ubuntu220.04.1  |
   | +1 :green_heart: |  javadoc  |   1m 35s |  |  the patch passed with JDK 
Private Build-1.8.0_402-8u402-ga-2ubuntu1~20.04-b06  |
   | +1 :green_heart: |  spotbugs  |   3m 34s |  |  the patch passed  |
   | +1 :green_heart: |  shadedclient  |  36m 14s |  |  patch has no errors 
when building and testing our client artifacts.  |
    _ Other Tests _ |
   | -1 :x: |  unit  | 246m 54s | 
[/patch-unit-hadoop-hdfs-project_hadoop-hdfs.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-6758/1/artifact/out/patch-unit-hadoop-hdfs-project_hadoop-hdfs.txt)
 |  hadoop-hdfs in the patch passed.  |
   | +1 :green_heart: |  asflicense  |   0m 45s |  |  The patch does not 
generate ASF License warnings.  |
   |  |   | 404m 20s |  |  |
   
   
   | Reason | Tests |
   |---:|:--|
   | Failed junit tests | hadoop.hdfs.TestDFSStripedOutputStream |
   |   | hadoop.hdfs.server.datanode.TestDataNodeReconfiguration |
   |   | hadoop.hdfs.TestReconstructStripedFileWithRandomECPolicy |
   |   | hadoop.hdfs.server.namenode.TestCacheDirectives |
   |   | hadoop.hdfs.server.namenode.TestReconstructStripedBlocks |
   |   | hadoop.tools.TestHdfsConfigFields |
   
   
   | Subsystem | Report/Notes |
   |--:|:-|
   | Docker | ClientAPI=1.45 ServerAPI=1.45 base: 
https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-6758/1/artifact/out/Dockerfile
 |
   | GITHUB PR | https://github.com/apache/hadoop/pull/6758 |
   | Optional Tests | dupname asflicense compile javac javadoc mvninstall 
mvnsite unit shadedclient spotbugs checkstyle codespell detsecrets xmllint |
   | uname | Linux 4e072519eda8 5.15.0-101-generic #111-Ubuntu SMP Tue Mar 5 
20:16:58 UTC 2024 x86_64 x86_64 x86_64 GNU/Linux |
   | Build tool | maven |
   | 

[jira] [Commented] (HDFS-17484) Introduce redundancy.considerLoad.minLoad to avoiding excluding nodes when they are not busy actually

2024-04-22 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-17484?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17839734#comment-17839734
 ] 

ASF GitHub Bot commented on HDFS-17484:
---

hadoop-yetus commented on PR #6758:
URL: https://github.com/apache/hadoop/pull/6758#issuecomment-2069821953

   :broken_heart: **-1 overall**
   
   
   
   
   
   
   | Vote | Subsystem | Runtime |  Logfile | Comment |
   |::|--:|:|::|:---:|
   | +0 :ok: |  reexec  |   0m 45s |  |  Docker mode activated.  |
    _ Prechecks _ |
   | +1 :green_heart: |  dupname  |   0m  1s |  |  No case conflicting files 
found.  |
   | +0 :ok: |  codespell  |   0m  0s |  |  codespell was not available.  |
   | +0 :ok: |  detsecrets  |   0m  0s |  |  detect-secrets was not available.  
|
   | +0 :ok: |  xmllint  |   0m  0s |  |  xmllint was not available.  |
   | +1 :green_heart: |  @author  |   0m  0s |  |  The patch does not contain 
any @author tags.  |
   | +1 :green_heart: |  test4tests  |   0m  0s |  |  The patch appears to 
include 1 new or modified test files.  |
    _ trunk Compile Tests _ |
   | +1 :green_heart: |  mvninstall  |  50m  8s |  |  trunk passed  |
   | +1 :green_heart: |  compile  |   1m 23s |  |  trunk passed with JDK 
Ubuntu-11.0.22+7-post-Ubuntu-0ubuntu220.04.1  |
   | +1 :green_heart: |  compile  |   1m 15s |  |  trunk passed with JDK 
Private Build-1.8.0_402-8u402-ga-2ubuntu1~20.04-b06  |
   | +1 :green_heart: |  checkstyle  |   1m 13s |  |  trunk passed  |
   | +1 :green_heart: |  mvnsite  |   1m 22s |  |  trunk passed  |
   | +1 :green_heart: |  javadoc  |   1m  8s |  |  trunk passed with JDK 
Ubuntu-11.0.22+7-post-Ubuntu-0ubuntu220.04.1  |
   | +1 :green_heart: |  javadoc  |   1m 40s |  |  trunk passed with JDK 
Private Build-1.8.0_402-8u402-ga-2ubuntu1~20.04-b06  |
   | +1 :green_heart: |  spotbugs  |   3m 18s |  |  trunk passed  |
   | +1 :green_heart: |  shadedclient  |  41m 57s |  |  branch has no errors 
when building and testing our client artifacts.  |
    _ Patch Compile Tests _ |
   | +1 :green_heart: |  mvninstall  |   1m 13s |  |  the patch passed  |
   | +1 :green_heart: |  compile  |   1m 15s |  |  the patch passed with JDK 
Ubuntu-11.0.22+7-post-Ubuntu-0ubuntu220.04.1  |
   | +1 :green_heart: |  javac  |   1m 15s |  |  the patch passed  |
   | +1 :green_heart: |  compile  |   1m  8s |  |  the patch passed with JDK 
Private Build-1.8.0_402-8u402-ga-2ubuntu1~20.04-b06  |
   | +1 :green_heart: |  javac  |   1m  8s |  |  the patch passed  |
   | -1 :x: |  blanks  |   0m  0s | 
[/blanks-eol.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-6758/3/artifact/out/blanks-eol.txt)
 |  The patch has 1 line(s) that end in blanks. Use git apply --whitespace=fix 
<>. Refer https://git-scm.com/docs/git-apply  |
   | -0 :warning: |  checkstyle  |   1m  2s | 
[/results-checkstyle-hadoop-hdfs-project_hadoop-hdfs.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-6758/3/artifact/out/results-checkstyle-hadoop-hdfs-project_hadoop-hdfs.txt)
 |  hadoop-hdfs-project/hadoop-hdfs: The patch generated 2 new + 252 unchanged 
- 0 fixed = 254 total (was 252)  |
   | +1 :green_heart: |  mvnsite  |   1m 13s |  |  the patch passed  |
   | +1 :green_heart: |  javadoc  |   0m 55s |  |  the patch passed with JDK 
Ubuntu-11.0.22+7-post-Ubuntu-0ubuntu220.04.1  |
   | +1 :green_heart: |  javadoc  |   1m 37s |  |  the patch passed with JDK 
Private Build-1.8.0_402-8u402-ga-2ubuntu1~20.04-b06  |
   | +1 :green_heart: |  spotbugs  |   3m 20s |  |  the patch passed  |
   | +1 :green_heart: |  shadedclient  |  41m 25s |  |  patch has no errors 
when building and testing our client artifacts.  |
    _ Other Tests _ |
   | -1 :x: |  unit  | 277m 15s | 
[/patch-unit-hadoop-hdfs-project_hadoop-hdfs.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-6758/3/artifact/out/patch-unit-hadoop-hdfs-project_hadoop-hdfs.txt)
 |  hadoop-hdfs in the patch passed.  |
   | +1 :green_heart: |  asflicense  |   0m 54s |  |  The patch does not 
generate ASF License warnings.  |
   |  |   | 437m  8s |  |  |
   
   
   | Reason | Tests |
   |---:|:--|
   | Failed junit tests | hadoop.tools.TestHdfsConfigFields |
   
   
   | Subsystem | Report/Notes |
   |--:|:-|
   | Docker | ClientAPI=1.44 ServerAPI=1.44 base: 
https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-6758/3/artifact/out/Dockerfile
 |
   | GITHUB PR | https://github.com/apache/hadoop/pull/6758 |
   | Optional Tests | dupname asflicense compile javac javadoc mvninstall 
mvnsite unit shadedclient spotbugs checkstyle codespell detsecrets xmllint |
   | uname | Linux fafa9f8a14b5 5.15.0-94-generic #104-Ubuntu SMP Tue Jan 9 
15:25:40 UTC 2024 x86_64 x86_64 x86_64 GNU/Linux |
   | Build tool | maven |
   | Personality | dev-support/bin/hadoop.sh |
   | git revision | trunk / b01d3bcf66844d7f7f66c04a496d8dcc39cddae6 |
   | Default 

[jira] [Created] (HDFS-17495) Change FSNamesystem.digest to use a configurable algorithm.

2024-04-22 Thread Tsz-wo Sze (Jira)
Tsz-wo Sze created HDFS-17495:
-

 Summary: Change FSNamesystem.digest to use a configurable 
algorithm.
 Key: HDFS-17495
 URL: https://issues.apache.org/jira/browse/HDFS-17495
 Project: Hadoop HDFS
  Issue Type: Improvement
  Components: namanode
Reporter: Tsz-wo Sze
Assignee: Tsz-wo Sze


FSNamesystem.digest currently is hardcoded to use the MD5 algorithm.  This Jira 
is to make it configurable.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-17457) [FGL] UTs support fine-grained locking

2024-04-22 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-17457?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17839793#comment-17839793
 ] 

ASF GitHub Bot commented on HDFS-17457:
---

hadoop-yetus commented on PR #6741:
URL: https://github.com/apache/hadoop/pull/6741#issuecomment-2070306218

   :broken_heart: **-1 overall**
   
   
   
   
   
   
   | Vote | Subsystem | Runtime |  Logfile | Comment |
   |::|--:|:|::|:---:|
   | +0 :ok: |  reexec  |   7m 37s |  |  Docker mode activated.  |
    _ Prechecks _ |
   | +1 :green_heart: |  dupname  |   0m  1s |  |  No case conflicting files 
found.  |
   | +0 :ok: |  codespell  |   0m  0s |  |  codespell was not available.  |
   | +0 :ok: |  detsecrets  |   0m  0s |  |  detect-secrets was not available.  
|
   | +1 :green_heart: |  @author  |   0m  0s |  |  The patch does not contain 
any @author tags.  |
   | +1 :green_heart: |  test4tests  |   0m  0s |  |  The patch appears to 
include 40 new or modified test files.  |
    _ HDFS-17384 Compile Tests _ |
   | +0 :ok: |  mvndep  |  13m 57s |  |  Maven dependency ordering for branch  |
   | +1 :green_heart: |  mvninstall  |  19m 48s |  |  HDFS-17384 passed  |
   | +1 :green_heart: |  compile  |   8m 58s |  |  HDFS-17384 passed with JDK 
Ubuntu-11.0.22+7-post-Ubuntu-0ubuntu220.04.1  |
   | +1 :green_heart: |  compile  |   8m 24s |  |  HDFS-17384 passed with JDK 
Private Build-1.8.0_402-8u402-ga-2ubuntu1~20.04-b06  |
   | +1 :green_heart: |  checkstyle  |   2m  8s |  |  HDFS-17384 passed  |
   | +1 :green_heart: |  mvnsite  |   2m  3s |  |  HDFS-17384 passed  |
   | +1 :green_heart: |  javadoc  |   1m 56s |  |  HDFS-17384 passed with JDK 
Ubuntu-11.0.22+7-post-Ubuntu-0ubuntu220.04.1  |
   | +1 :green_heart: |  javadoc  |   2m 16s |  |  HDFS-17384 passed with JDK 
Private Build-1.8.0_402-8u402-ga-2ubuntu1~20.04-b06  |
   | -1 :x: |  spotbugs  |   0m 54s | 
[/branch-spotbugs-hadoop-hdfs-project_hadoop-hdfs-rbf-warnings.html](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-6741/4/artifact/out/branch-spotbugs-hadoop-hdfs-project_hadoop-hdfs-rbf-warnings.html)
 |  hadoop-hdfs-project/hadoop-hdfs-rbf in HDFS-17384 has 1 extant spotbugs 
warnings.  |
   | +1 :green_heart: |  shadedclient  |  20m 18s |  |  branch has no errors 
when building and testing our client artifacts.  |
    _ Patch Compile Tests _ |
   | +0 :ok: |  mvndep  |   0m 22s |  |  Maven dependency ordering for patch  |
   | +1 :green_heart: |  mvninstall  |   1m 12s |  |  the patch passed  |
   | +1 :green_heart: |  compile  |   8m 26s |  |  the patch passed with JDK 
Ubuntu-11.0.22+7-post-Ubuntu-0ubuntu220.04.1  |
   | +1 :green_heart: |  javac  |   8m 26s |  |  the patch passed  |
   | +1 :green_heart: |  compile  |   8m 11s |  |  the patch passed with JDK 
Private Build-1.8.0_402-8u402-ga-2ubuntu1~20.04-b06  |
   | +1 :green_heart: |  javac  |   8m 11s |  |  the patch passed  |
   | +1 :green_heart: |  blanks  |   0m  0s |  |  The patch has no blanks 
issues.  |
   | +1 :green_heart: |  checkstyle  |   2m  6s |  |  root: The patch generated 
0 new + 356 unchanged - 1 fixed = 356 total (was 357)  |
   | +1 :green_heart: |  mvnsite  |   2m  0s |  |  the patch passed  |
   | +1 :green_heart: |  javadoc  |   1m 43s |  |  the patch passed with JDK 
Ubuntu-11.0.22+7-post-Ubuntu-0ubuntu220.04.1  |
   | +1 :green_heart: |  javadoc  |   2m 11s |  |  the patch passed with JDK 
Private Build-1.8.0_402-8u402-ga-2ubuntu1~20.04-b06  |
   | +1 :green_heart: |  spotbugs  |   3m 46s |  |  the patch passed  |
   | +1 :green_heart: |  shadedclient  |  20m 10s |  |  patch has no errors 
when building and testing our client artifacts.  |
    _ Other Tests _ |
   | +1 :green_heart: |  unit  | 198m 18s |  |  hadoop-hdfs in the patch 
passed.  |
   | -1 :x: |  unit  |  29m 28s | 
[/patch-unit-hadoop-hdfs-project_hadoop-hdfs-rbf.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-6741/4/artifact/out/patch-unit-hadoop-hdfs-project_hadoop-hdfs-rbf.txt)
 |  hadoop-hdfs-rbf in the patch passed.  |
   | +1 :green_heart: |  unit  |   0m 39s |  |  hadoop-fs2img in the patch 
passed.  |
   | +1 :green_heart: |  asflicense  |   0m 47s |  |  The patch does not 
generate ASF License warnings.  |
   |  |   | 375m 58s |  |  |
   
   
   | Reason | Tests |
   |---:|:--|
   | Failed junit tests | 
hadoop.hdfs.server.federation.router.TestRouterRpcMultiDestination |
   |   | hadoop.hdfs.server.federation.router.TestRouterRpc |
   
   
   | Subsystem | Report/Notes |
   |--:|:-|
   | Docker | ClientAPI=1.45 ServerAPI=1.45 base: 
https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-6741/4/artifact/out/Dockerfile
 |
   | GITHUB PR | https://github.com/apache/hadoop/pull/6741 |
   | Optional Tests | dupname asflicense compile javac javadoc mvninstall 
mvnsite unit shadedclient spotbugs checkstyle codespell detsecrets |
   | uname | Linux 

[jira] [Commented] (HDFS-17457) [FGL] UTs support fine-grained locking

2024-04-22 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-17457?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17839908#comment-17839908
 ] 

ASF GitHub Bot commented on HDFS-17457:
---

ferhui merged PR #6741:
URL: https://github.com/apache/hadoop/pull/6741




> [FGL] UTs support fine-grained locking
> --
>
> Key: HDFS-17457
> URL: https://issues.apache.org/jira/browse/HDFS-17457
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>Reporter: ZanderXu
>Assignee: ZanderXu
>Priority: Major
>  Labels: pull-request-available
>
> [FGL] UTs support fine-grained locking



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-17457) [FGL] UTs support fine-grained locking

2024-04-22 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-17457?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17839909#comment-17839909
 ] 

ASF GitHub Bot commented on HDFS-17457:
---

ferhui commented on PR #6741:
URL: https://github.com/apache/hadoop/pull/6741#issuecomment-2071256036

   @ZanderXu Thanks for contribution. @kokonguyen191 Thanks for review. Merged.




> [FGL] UTs support fine-grained locking
> --
>
> Key: HDFS-17457
> URL: https://issues.apache.org/jira/browse/HDFS-17457
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>Reporter: ZanderXu
>Assignee: ZanderXu
>Priority: Major
>  Labels: pull-request-available
>
> [FGL] UTs support fine-grained locking



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-17484) Introduce redundancy.considerLoad.minLoad to avoiding excluding nodes when they are not busy actually

2024-04-22 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-17484?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17839911#comment-17839911
 ] 

ASF GitHub Bot commented on HDFS-17484:
---

hadoop-yetus commented on PR #6758:
URL: https://github.com/apache/hadoop/pull/6758#issuecomment-2071256591

   :broken_heart: **-1 overall**
   
   
   
   
   
   
   | Vote | Subsystem | Runtime |  Logfile | Comment |
   |::|--:|:|::|:---:|
    _ Prechecks _ |
   | +1 :green_heart: |  dupname  |   0m 01s |  |  No case conflicting files 
found.  |
   | +0 :ok: |  spotbugs  |   0m 00s |  |  spotbugs executables are not 
available.  |
   | +0 :ok: |  codespell  |   0m 00s |  |  codespell was not available.  |
   | +0 :ok: |  detsecrets  |   0m 00s |  |  detect-secrets was not available.  
|
   | +0 :ok: |  xmllint  |   0m 00s |  |  xmllint was not available.  |
   | +1 :green_heart: |  @author  |   0m 00s |  |  The patch does not contain 
any @author tags.  |
   | +1 :green_heart: |  test4tests  |   0m 00s |  |  The patch appears to 
include 1 new or modified test files.  |
    _ trunk Compile Tests _ |
   | +1 :green_heart: |  mvninstall  |  89m 50s |  |  trunk passed  |
   | +1 :green_heart: |  compile  |   5m 55s |  |  trunk passed  |
   | +1 :green_heart: |  checkstyle  |   4m 58s |  |  trunk passed  |
   | +1 :green_heart: |  mvnsite  |   6m 30s |  |  trunk passed  |
   | +1 :green_heart: |  javadoc  |   5m 47s |  |  trunk passed  |
   | +1 :green_heart: |  shadedclient  | 148m 26s |  |  branch has no errors 
when building and testing our client artifacts.  |
    _ Patch Compile Tests _ |
   | +1 :green_heart: |  mvninstall  |   4m 49s |  |  the patch passed  |
   | +1 :green_heart: |  compile  |   3m 26s |  |  the patch passed  |
   | +1 :green_heart: |  javac  |   3m 26s |  |  the patch passed  |
   | -1 :x: |  blanks  |   0m 00s | 
[/blanks-eol.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch-windows-10/job/PR-6758/1/artifact/out/blanks-eol.txt)
 |  The patch has 1 line(s) that end in blanks. Use git apply --whitespace=fix 
<>. Refer https://git-scm.com/docs/git-apply  |
   | +1 :green_heart: |  checkstyle  |   2m 19s |  |  the patch passed  |
   | +1 :green_heart: |  mvnsite  |   4m 03s |  |  the patch passed  |
   | +1 :green_heart: |  javadoc  |   3m 27s |  |  the patch passed  |
   | +1 :green_heart: |  shadedclient  | 158m 13s |  |  patch has no errors 
when building and testing our client artifacts.  |
    _ Other Tests _ |
   | +1 :green_heart: |  asflicense  |   5m 29s |  |  The patch does not 
generate ASF License warnings.  |
   |  |   | 423m 09s |  |  |
   
   
   | Subsystem | Report/Notes |
   |--:|:-|
   | GITHUB PR | https://github.com/apache/hadoop/pull/6758 |
   | Optional Tests | dupname asflicense compile javac javadoc mvninstall 
mvnsite unit shadedclient spotbugs checkstyle codespell detsecrets xmllint |
   | uname | MINGW64_NT-10.0-17763 be352c545e0b 3.4.10-87d57229.x86_64 
2024-02-14 20:17 UTC x86_64 Msys |
   | Build tool | maven |
   | Personality | /c/hadoop/dev-support/bin/hadoop.sh |
   | git revision | trunk / b01d3bcf66844d7f7f66c04a496d8dcc39cddae6 |
   | Default Java | Azul Systems, Inc.-1.8.0_332-b09 |
   |  Test Results | 
https://ci-hadoop.apache.org/job/hadoop-multibranch-windows-10/job/PR-6758/1/testReport/
 |
   | modules | C: hadoop-hdfs-project/hadoop-hdfs U: 
hadoop-hdfs-project/hadoop-hdfs |
   | Console output | 
https://ci-hadoop.apache.org/job/hadoop-multibranch-windows-10/job/PR-6758/1/console
 |
   | versions | git=2.44.0.windows.1 |
   | Powered by | Apache Yetus 0.14.0 https://yetus.apache.org |
   
   
   This message was automatically generated.
   
   




> Introduce redundancy.considerLoad.minLoad to avoiding excluding nodes when 
> they are not busy actually
> -
>
> Key: HDFS-17484
> URL: https://issues.apache.org/jira/browse/HDFS-17484
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: namenode
>Affects Versions: 3.4.0
>Reporter: farmmamba
>Assignee: farmmamba
>Priority: Minor
>  Labels: pull-request-available
>
> Currently, `dfs.namenode.redundancy.considerLoad` is true by default, and 
> `dfs.namenode.redundancy.considerLoad.factor` is 2.0 by default.
> Consider the following situation: during a stress test we may deploy the hdfs 
> client onto a datanode, so the client will prefer to write to its local 
> datanode and increase that machine's load. Suppose we have 3 datanodes with 
> loads of 5.0, 0.2 and 0.3.
>  
> The datanode with load 5.0 will be excluded when choosing datanodes for a 
> block, but a load of 5.0 is not a sign of a slow node for a machine 

[jira] [Resolved] (HDFS-17457) [FGL] UTs support fine-grained locking

2024-04-22 Thread Hui Fei (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-17457?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hui Fei resolved HDFS-17457.

Resolution: Fixed

> [FGL] UTs support fine-grained locking
> --
>
> Key: HDFS-17457
> URL: https://issues.apache.org/jira/browse/HDFS-17457
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>Reporter: ZanderXu
>Assignee: ZanderXu
>Priority: Major
>  Labels: pull-request-available
>
> [FGL] UTs support fine-grained locking



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-17384) [FGL] Replace the global lock with global FS Lock and global BM lock

2024-04-22 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-17384?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17839915#comment-17839915
 ] 

ASF GitHub Bot commented on HDFS-17384:
---

ZanderXu opened a new pull request, #6762:
URL: https://github.com/apache/hadoop/pull/6762

   We plan to merge HDFS-17384 to the trunk branch. 
   
   This PR is used to review all changes in HDFS-17384.




> [FGL] Replace the global lock with global FS Lock and global BM lock
> 
>
> Key: HDFS-17384
> URL: https://issues.apache.org/jira/browse/HDFS-17384
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Reporter: ZanderXu
>Assignee: ZanderXu
>Priority: Major
>  Labels: FGL
>
> First, we can replace the current global lock with two locks, global FS lock 
> and global BM lock.
> The global FS lock is used to make directory tree-related operations 
> thread-safe.
> The global BM lock is used to make block-related operations and DN-related 
> operations thread-safe.
>  
> For some operations involving both directory tree and block or DN, the global 
> FS lock and the global BM lock are acquired.
>  
> The lock order should be:
>  * The global FS lock
>  * The global BM lock
>  
> There are some special requirements for this ticket.
>  * End-user can choose to use global lock or fine-grained lock through 
> configuration.
>  * Try not to modify the current implementation logic as much as possible.
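
As a reading aid only (not taken from the design itself), the two-lock split 
and the FS-before-BM ordering can be sketched with two ReentrantReadWriteLocks; 
lock names and granularity here are assumptions.

{code:java}
import java.util.concurrent.locks.ReentrantReadWriteLock;

public class FsThenBmLockOrderSketch {
  // Guards directory-tree operations.
  private final ReentrantReadWriteLock fsLock = new ReentrantReadWriteLock();
  // Guards block-related and DN-related operations.
  private final ReentrantReadWriteLock bmLock = new ReentrantReadWriteLock();

  /** Operation touching both the tree and blocks: FS lock first, then BM lock. */
  public void deleteFileWithBlocks() {
    fsLock.writeLock().lock();
    try {
      bmLock.writeLock().lock();
      try {
        // remove the iNode, then mark its blocks for deletion
      } finally {
        bmLock.writeLock().unlock();
      }
    } finally {
      fsLock.writeLock().unlock();
    }
  }

  /** Pure block-management operation: only the BM lock is needed. */
  public void processBlockReport() {
    bmLock.writeLock().lock();
    try {
      // update block to DN mappings
    } finally {
      bmLock.writeLock().unlock();
    }
  }
}
{code}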



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-17384) [FGL] Replace the global lock with global FS Lock and global BM lock

2024-04-22 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-17384?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated HDFS-17384:
--
Labels: FGL pull-request-available  (was: FGL)

> [FGL] Replace the global lock with global FS Lock and global BM lock
> 
>
> Key: HDFS-17384
> URL: https://issues.apache.org/jira/browse/HDFS-17384
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Reporter: ZanderXu
>Assignee: ZanderXu
>Priority: Major
>  Labels: FGL, pull-request-available
>
> First, we can replace the current global lock with two locks, global FS lock 
> and global BM lock.
> The global FS lock is used to make directory tree-related operations 
> thread-safe.
> The global BM lock is used to make block-related operations and DN-related 
> operations thread-safe.
>  
> For some operations involving both directory tree and block or DN, the global 
> FS lock and the global BM lock are acquired.
>  
> The lock order should be:
>  * The global FS lock
>  * The global BM lock
>  
> There are some special requirements for this ticket.
>  * End-user can choose to use global lock or fine-grained lock through 
> configuration.
>  * Try not to modify the current implementation logic as much as possible.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-17484) Introduce redundancy.considerLoad.minLoad to avoiding excluding nodes when they are not busy actually

2024-04-22 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-17484?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17839919#comment-17839919
 ] 

ASF GitHub Bot commented on HDFS-17484:
---

hfutatzhanghb commented on PR #6758:
URL: https://github.com/apache/hadoop/pull/6758#issuecomment-2071305686

   @Hexiaoqiao @zhangshuyan0 @haiyang1987 @tomscut Sir, could you please help 
me review this PR when you have free time? Thanks ahead.




> Introduce redundancy.considerLoad.minLoad to avoiding excluding nodes when 
> they are not busy actually
> -
>
> Key: HDFS-17484
> URL: https://issues.apache.org/jira/browse/HDFS-17484
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: namenode
>Affects Versions: 3.4.0
>Reporter: farmmamba
>Assignee: farmmamba
>Priority: Minor
>  Labels: pull-request-available
>
> Currently, `dfs.namenode.redundancy.considerLoad` is true by default, and 
> `dfs.namenode.redundancy.considerLoad.factor` is 2.0 by default.
> Consider the following situation: during a stress test we may deploy the hdfs 
> client onto a datanode, so the client will prefer to write to its local 
> datanode and increase that machine's load. Suppose we have 3 datanodes with 
> loads of 5.0, 0.2 and 0.3.
>  
> The datanode with load 5.0 will be excluded when choosing datanodes for a 
> block, but a load of 5.0 is not a sign of a slow node for a machine with 
> 80 cpu cores.
>  
> So we should add a new configuration entry, 
> `dfs.namenode.redundancy.considerLoad.minLoad`, to indicate the minimum load 
> at which considerLoad takes effect.
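
To make the numbers above concrete (a worked example, not code from the patch; 
the exact semantics of the proposed minLoad property are an assumption here): 
the average load is (5.0 + 0.2 + 0.3) / 3, roughly 1.83, so with factor 2.0 the 
exclusion threshold is roughly 3.67 and the node with load 5.0 gets excluded 
even though it is far from busy on an 80-core machine.

{code:java}
public class ConsiderLoadExample {
  public static void main(String[] args) {
    double[] loads = {5.0, 0.2, 0.3}; // load of the 3 datanodes
    double factor = 2.0;              // dfs.namenode.redundancy.considerLoad.factor

    double sum = 0;
    for (double l : loads) {
      sum += l;
    }
    double avg = sum / loads.length;  // ~1.83
    double threshold = factor * avg;  // ~3.67

    // Today: 5.0 > 3.67, so the busiest datanode is excluded from placement.
    System.out.printf("avg=%.2f threshold=%.2f excluded=%b%n",
        avg, threshold, loads[0] > threshold);

    // With a hypothetical absolute floor (say 10.0), a node would only be
    // excluded when it exceeds both the relative threshold and the floor, so
    // a load of 5.0 on an 80-core machine would no longer be excluded.
    double minLoad = 10.0;
    boolean excluded = loads[0] > threshold && loads[0] > minLoad;
    System.out.println("excluded with minLoad floor: " + excluded);
  }
}
{code}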



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-17488) DN can fail IBRs with NPE when a volume is removed

2024-04-22 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-17488?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17839929#comment-17839929
 ] 

ASF GitHub Bot commented on HDFS-17488:
---

hadoop-yetus commented on PR #6759:
URL: https://github.com/apache/hadoop/pull/6759#issuecomment-2071375244

   :confetti_ball: **+1 overall**
   
   
   
   
   
   
   | Vote | Subsystem | Runtime |  Logfile | Comment |
   |::|--:|:|::|:---:|
    _ Prechecks _ |
   | +1 :green_heart: |  dupname  |   0m 02s |  |  No case conflicting files 
found.  |
   | +0 :ok: |  spotbugs  |   0m 00s |  |  spotbugs executables are not 
available.  |
   | +0 :ok: |  codespell  |   0m 00s |  |  codespell was not available.  |
   | +0 :ok: |  detsecrets  |   0m 01s |  |  detect-secrets was not available.  
|
   | +1 :green_heart: |  @author  |   0m 00s |  |  The patch does not contain 
any @author tags.  |
   | +1 :green_heart: |  test4tests  |   0m 00s |  |  The patch appears to 
include 1 new or modified test files.  |
    _ trunk Compile Tests _ |
   | +1 :green_heart: |  mvninstall  | 118m 49s |  |  trunk passed  |
   | +1 :green_heart: |  compile  |   8m 15s |  |  trunk passed  |
   | +1 :green_heart: |  checkstyle  |   6m 34s |  |  trunk passed  |
   | +1 :green_heart: |  mvnsite  |   8m 28s |  |  trunk passed  |
   | +1 :green_heart: |  javadoc  |   7m 31s |  |  trunk passed  |
   | +1 :green_heart: |  shadedclient  | 190m 44s |  |  branch has no errors 
when building and testing our client artifacts.  |
    _ Patch Compile Tests _ |
   | +1 :green_heart: |  mvninstall  |   6m 57s |  |  the patch passed  |
   | +1 :green_heart: |  compile  |   4m 54s |  |  the patch passed  |
   | +1 :green_heart: |  javac  |   4m 54s |  |  the patch passed  |
   | +1 :green_heart: |  blanks  |   0m 00s |  |  The patch has no blanks 
issues.  |
   | +1 :green_heart: |  checkstyle  |   3m 25s |  |  the patch passed  |
   | +1 :green_heart: |  mvnsite  |   6m 07s |  |  the patch passed  |
   | +1 :green_heart: |  javadoc  |   4m 56s |  |  the patch passed  |
   | +1 :green_heart: |  shadedclient  | 207m 49s |  |  patch has no errors 
when building and testing our client artifacts.  |
    _ Other Tests _ |
   | +1 :green_heart: |  asflicense  |   9m 28s |  |  The patch does not 
generate ASF License warnings.  |
   |  |   | 557m 52s |  |  |
   
   
   | Subsystem | Report/Notes |
   |--:|:-|
   | GITHUB PR | https://github.com/apache/hadoop/pull/6759 |
   | Optional Tests | dupname asflicense compile javac javadoc mvninstall 
mvnsite unit shadedclient spotbugs checkstyle codespell detsecrets |
   | uname | MINGW64_NT-10.0-17763 9dbcaf4dc965 3.4.10-87d57229.x86_64 
2024-02-14 20:17 UTC x86_64 Msys |
   | Build tool | maven |
   | Personality | /c/hadoop/dev-support/bin/hadoop.sh |
   | git revision | trunk / d5a87e15db1c70fb0c3974832c75504ff184b0d1 |
   | Default Java | Azul Systems, Inc.-1.8.0_332-b09 |
   |  Test Results | 
https://ci-hadoop.apache.org/job/hadoop-multibranch-windows-10/job/PR-6759/1/testReport/
 |
   | modules | C: hadoop-hdfs-project/hadoop-hdfs U: 
hadoop-hdfs-project/hadoop-hdfs |
   | Console output | 
https://ci-hadoop.apache.org/job/hadoop-multibranch-windows-10/job/PR-6759/1/console
 |
   | versions | git=2.44.0.windows.1 |
   | Powered by | Apache Yetus 0.14.0 https://yetus.apache.org |
   
   
   This message was automatically generated.
   
   




> DN can fail IBRs with NPE when a volume is removed
> --
>
> Key: HDFS-17488
> URL: https://issues.apache.org/jira/browse/HDFS-17488
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: hdfs
>Reporter: Felix N
>Assignee: Felix N
>Priority: Major
>  Labels: pull-request-available
>
>  
> Error logs
> {code:java}
> 2024-04-22 15:46:33,422 [BP-1842952724-10.22.68.249-1713771988830 
> heartbeating to localhost/127.0.0.1:64977] ERROR datanode.DataNode 
> (BPServiceActor.java:run(922)) - Exception in BPOfferService for Block pool 
> BP-1842952724-10.22.68.249-1713771988830 (Datanode Uuid 
> 1659ffaf-1a80-4a8e-a542-643f6bd97ed4) service to localhost/127.0.0.1:64977
> java.lang.NullPointerException
>     at 
> org.apache.hadoop.hdfs.protocolPB.DatanodeProtocolClientSideTranslatorPB.blockReceivedAndDeleted(DatanodeProtocolClientSideTranslatorPB.java:246)
>     at 
> org.apache.hadoop.hdfs.server.datanode.IncrementalBlockReportManager.sendIBRs(IncrementalBlockReportManager.java:218)
>     at 
> org.apache.hadoop.hdfs.server.datanode.BPServiceActor.offerService(BPServiceActor.java:749)
>     at 
> org.apache.hadoop.hdfs.server.datanode.BPServiceActor.run(BPServiceActor.java:920)
>     at java.lang.Thread.run(Thread.java:748) {code}
> The root cause is in BPOfferService#notifyNamenodeBlock, happens when it's 
>