[jira] [Created] (HDFS-17662) Block recovery inter-datanode operations should have higher priority than DataXceiver

Shangshu Qian (Jira) Mon, 11 Nov 2024 17:17:07 -0800

Shangshu Qian created HDFS-17662:
------------------------------------

             Summary: Block recovery inter-datanode operations should have 
higher priority than DataXceiver
                 Key: HDFS-17662
                 URL: https://issues.apache.org/jira/browse/HDFS-17662
             Project: Hadoop HDFS
          Issue Type: Bug
          Components: datanode
            Reporter: Shangshu Qian



We found a potential feedback loop that can cause workload amplification of 
block transfers and block recovery. Currently, in the heartbeat (HB) response 
from the NN to the DN, block recovery commands have higher priority than other 
block operations. However, these two types of operations have the same 
priority in the InterDataNodeProtocol.

The feedback loop works as follows:
 # The pipeline rebuild process causes extra block transfer operations in the 
cluster.
 # The pipeline rebuild operations can contend with block recovery commands, 
which fail if none of the DNs succeeds.
 # The failed block recovery may trigger extra retries, leading to even higher 
load on the DNs.
 # The sendIBR in BPServiceActor can hit an IOException caused by network or 
CPU congestion. In that case the IBR is simply delayed until the next report 
cycle.
 # At the same time, the write pipeline may fail in 
FsDatasetImpl.checkBlock() with a ReplicaNotFoundException, resulting in more 
pipeline rebuild operations.

Giving the block-recovery-related inter-datanode commands a higher priority 
can reduce the chance of this feedback loop.
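
As a rough illustration of the proposed ordering (not the actual DataNode 
code), the sketch below queues inter-datanode work in a priority queue so that 
block recovery operations are drained before block transfers. The class 
InterDatanodePrioritySketch, the OpType enum, the Op class and drainOrder() 
are all hypothetical names made up for this example; none of them exist in 
HDFS, and a real change would have to fit into the DataXceiver and 
InterDatanodeProtocol handling.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.PriorityBlockingQueue;
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical sketch: none of these types exist in HDFS.
public class InterDatanodePrioritySketch {

    // Declaration order doubles as priority: earlier constants are served first.
    enum OpType { BLOCK_RECOVERY, BLOCK_TRANSFER }

    static final class Op implements Comparable<Op> {
        private static final AtomicLong SEQ = new AtomicLong();

        final OpType type;
        final String name;
        // Submission sequence number, used to keep FIFO order within one type.
        final long seq = SEQ.getAndIncrement();

        Op(OpType type, String name) {
            this.type = type;
            this.name = name;
        }

        @Override
        public int compareTo(Op other) {
            int byType = type.compareTo(other.type);
            return byType != 0 ? byType : Long.compare(seq, other.seq);
        }
    }

    // Returns the names of the operations in the order a worker would take them.
    static List<String> drainOrder(List<Op> submitted) {
        PriorityBlockingQueue<Op> queue = new PriorityBlockingQueue<>(submitted);
        List<String> order = new ArrayList<>();
        Op op;
        while ((op = queue.poll()) != null) {
            order.add(op.name);
        }
        return order;
    }

    public static void main(String[] args) {
        List<Op> ops = List.of(
            new Op(OpType.BLOCK_TRANSFER, "transfer-1"),
            new Op(OpType.BLOCK_TRANSFER, "transfer-2"),
            new Op(OpType.BLOCK_RECOVERY, "recovery-1"));
        // The recovery op is taken first even though it was submitted last.
        System.out.println(drainOrder(ops));
    }
}
```

With this ordering, a recovery command submitted during a burst of pipeline 
rebuild transfers does not wait behind them, which is the property the 
proposal aims for. Strict priority can starve transfers under sustained 
recovery load, so a real implementation might prefer weighted or 
reserved-capacity scheduling instead.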



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

---------------------------------------------------------------------
To unsubscribe, e-mail: hdfs-dev-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-dev-h...@hadoop.apache.org
