[
https://issues.apache.org/jira/browse/HDFS-15605?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Jinglun updated HDFS-15605:
---------------------------
Attachment: HDFS-15605.003.patch
> DeadNodeDetector supports getting deadnode from NameNode.
> ---------------------------------------------------------
>
> Key: HDFS-15605
> URL: https://issues.apache.org/jira/browse/HDFS-15605
> Project: Hadoop HDFS
> Issue Type: Improvement
> Reporter: Jinglun
> Assignee: Jinglun
> Priority: Major
> Attachments: HDFS-15605.001.patch, HDFS-15605.002.patch,
> HDFS-15605.003.patch
>
>
> When we are using DeadNodeDetector, it sometimes marks too many nodes as dead
> and causes read failures. The DeadNodeDetector assumes that every datanode
> whose getDatanodeInfo rpc fails to return in time is dead, but that is not
> always true. A client-side error or a slow rpc in the DataNode can also get a
> node marked as dead. For example, a client-side delay in the rpcThreadPool
> might cause the getDatanodeInfo rpcs to time out, adding many datanodes to
> the dead list.
> We have a simple improvement for this: the NameNode already knows which
> datanodes are dead, so we can just update the dead list from the NameNode
> using DFSClient.datanodeReport().
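
The idea described above can be sketched roughly as follows. This is a minimal illustration, not the code in the attached patches: the class DeadNodeReconciler and its methods are hypothetical, datanodes are represented by plain strings, and the NameNode's dead set is passed in as a parameter instead of being fetched. In a real client that set would presumably be built from the result of DFSClient.datanodeReport(DatanodeReportType.DEAD). The sketch also assumes that a node the client timed out on, but the NameNode does not report as dead, is kept only as a suspect.

```java
import java.util.Collections;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical helper for illustration only; it is not part of Hadoop.
final class DeadNodeReconciler {
    private DeadNodeReconciler() {}

    // The dead list is taken from the NameNode's view (assumption: the
    // caller built nameNodeDead from a DEAD datanode report).
    static Set<String> confirmedDead(Set<String> nameNodeDead) {
        return Collections.unmodifiableSet(new TreeSet<>(nameNodeDead));
    }

    // Nodes that only timed out on the client side and are not reported
    // dead by the NameNode stay suspects instead of being marked dead.
    static Set<String> stillSuspect(Set<String> locallySuspected,
                                    Set<String> nameNodeDead) {
        Set<String> suspects = new TreeSet<>(locallySuspected);
        suspects.removeAll(nameNodeDead);
        return Collections.unmodifiableSet(suspects);
    }
}
```

With this split, a stalled client-side thread pool can grow the suspect set but cannot by itself grow the dead list, which is the failure mode the description is about.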