[
https://issues.apache.org/jira/browse/HDFS-17913?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
ZanderXu resolved HDFS-17913.
-----------------------------
Resolution: Fixed
> Dead DataNode in Host2NodesMap can break block location sorting
> ---------------------------------------------------------------
>
> Key: HDFS-17913
> URL: https://issues.apache.org/jira/browse/HDFS-17913
> Project: Hadoop HDFS
> Issue Type: Bug
> Components: hdfs
> Affects Versions: 3.4.3
> Reporter: Yue Wang
> Assignee: Yue Wang
> Priority: Major
> Labels: pull-request-available
>
> When HeartbeatManager#heartbeatCheck removes a dead DataNode via
> DatanodeManager#removeDeadDatanode, the node is removed from NetworkTopology,
> but it may still be returned by host2DatanodeMap.
> If an HDFS client is co-located on the same host/IP as that dead DataNode,
> DatanodeManager#sortLocatedBlock may treat the client as a DataNode reader.
> Since the descriptor has already been removed from NetworkTopology, its
> parent is null, and NetworkTopology#sortByDistance can compute incorrect
> weights for replicas. This may cause rack locality to be lost, especially
> when dfs.namenode.read.considerLoad=true.
> Expected behavior:
> A DataNode descriptor detached from NetworkTopology should not be treated as
> a DataNode reader.
> Proposed fix:
> In DatanodeManager#sortLocatedBlock, ignore a host-map hit whose topology
> parent is null.
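
The proposed guard can be illustrated with a minimal, self-contained sketch. It is a simplified model and not the actual Hadoop code or patch: ReaderLookupSketch, Descriptor, parentRack, register, markDead, and resolveReader are all hypothetical names made up for illustration. The assumption is that a descriptor removed from the topology stays reachable through the host map with a null parent, as the report describes.

```java
import java.util.HashMap;
import java.util.Map;

// Simplified stand-in model; these are NOT the real Hadoop classes.
class ReaderLookupSketch {

    /** Hypothetical stand-in for a DataNode descriptor. */
    static final class Descriptor {
        final String host;
        String parentRack; // null once the node is detached from the topology

        Descriptor(String host, String parentRack) {
            this.host = host;
            this.parentRack = parentRack;
        }
    }

    /** Hypothetical stand-in for the host-to-DataNode map. */
    private final Map<String, Descriptor> hostMap = new HashMap<>();

    void register(Descriptor d) {
        hostMap.put(d.host, d);
    }

    /** Models dead-node removal: detached from the topology, still in the host map. */
    void markDead(String host) {
        Descriptor d = hostMap.get(host);
        if (d != null) {
            d.parentRack = null;
        }
    }

    /**
     * Returns the descriptor to treat as a DataNode reader, or null when the
     * client should be handled as an ordinary (non-DataNode) client.
     */
    Descriptor resolveReader(String clientHost) {
        Descriptor d = hostMap.get(clientHost);
        if (d != null && d.parentRack == null) {
            return null; // stale host-map hit: ignore the detached descriptor
        }
        return d;
    }
}
```

In this model, a client on the same host as a dead node gets null from resolveReader, so the caller would take the same path it uses for any client that is not a DataNode, instead of sorting against a descriptor that has no position in the topology.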