[
https://issues.apache.org/jira/browse/HDFS-15605?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17215809#comment-17215809
]
Ayush Saxena commented on HDFS-15605:
-------------------------------------
Just had a very quick look. Can we not leverage the existing class, rather than
having an abstract class and then implementing two different child classes?
Could we instead just add a configuration for this behaviour: if the
configuration is turned on, the client goes to the NameNode to confirm the
details, else it works as usual.
A point to note: getDatanodeReport is a very heavy call, so refetching block
locations might be cheaper in some cases. :)
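A minimal sketch of the config-gated behaviour suggested above, with all class and config names hypothetical (this is not HDFS code; the real DeadNodeDetector and its configuration keys live in the hadoop-hdfs-client module): when the flag is on, a datanode whose probe timed out is only treated as dead if the NameNode's dead list agrees.

```java
import java.util.HashSet;
import java.util.Set;

// Hedged sketch: class, method, and flag names are made up for illustration.
public class DeadNodeConfirmSketch {
    // Hypothetical config flag mirroring the suggestion in the comment.
    private final boolean confirmWithNameNode;
    // Dead nodes as reported by the NameNode (stubbed as a plain set here).
    private final Set<String> nameNodeDeadList;
    // Nodes whose client-side probe (getDatanodeInfo) timed out.
    private final Set<String> suspectNodes = new HashSet<>();

    DeadNodeConfirmSketch(boolean confirmWithNameNode, Set<String> nameNodeDeadList) {
        this.confirmWithNameNode = confirmWithNameNode;
        this.nameNodeDeadList = nameNodeDeadList;
    }

    void probeTimedOut(String datanode) {
        suspectNodes.add(datanode);
    }

    /** Treat a suspect as dead only if the NameNode agrees (when the flag is on). */
    boolean isDead(String datanode) {
        if (!suspectNodes.contains(datanode)) {
            return false;
        }
        return !confirmWithNameNode || nameNodeDeadList.contains(datanode);
    }
}
```

With the flag off, this degrades to the current behaviour (every timed-out probe marks the node dead), so existing users are unaffected.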
> DeadNodeDetector supports getting deadnode from NameNode.
> ---------------------------------------------------------
>
> Key: HDFS-15605
> URL: https://issues.apache.org/jira/browse/HDFS-15605
> Project: Hadoop HDFS
> Issue Type: Improvement
> Reporter: Jinglun
> Assignee: Jinglun
> Priority: Major
> Attachments: HDFS-15605.001.patch, HDFS-15605.002.patch,
> HDFS-15605.003.patch
>
>
> When we are using DeadNodeDetector, sometimes it marks too many nodes as dead
> and causes read failures. The DeadNodeDetector assumes that every datanode
> whose getDatanodeInfo rpc fails to return in time is dead, but that is not
> always true: a client-side error or a slow rpc on the DataNode might get a
> live node marked dead too. For example, client-side delay in the
> rpcThreadPool might make the getDatanodeInfo rpcs time out, adding many
> datanodes to the dead list.
> We have a simple improvement for this: the NameNode already knows which
> datanodes are dead, so just update the dead list from the NameNode using
> DFSClient.datanodeReport().
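A rough sketch of the refresh the description proposes, with the NameNode RPC stubbed out (the real call would be DFSClient.datanodeReport with the DEAD report type, which returns DatanodeInfo[]; everything below is hypothetical illustration, not patch code): periodically replace the locally guessed dead list with the NameNode's view, which both adds nodes the NameNode knows are dead and clears false positives the client guessed wrong.

```java
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;

// Hedged sketch: datanodeReportDead() stands in for the heavy NameNode RPC.
public class DeadListRefreshSketch {
    private final Set<String> deadNodes = ConcurrentHashMap.newKeySet();

    // Stub for the NameNode's dead-node report; hardcoded for illustration.
    Set<String> datanodeReportDead() {
        return Set.of("dn3");
    }

    /** Replace the locally guessed dead list with the NameNode's view. */
    void refreshFromNameNode() {
        Set<String> fromNameNode = datanodeReportDead();
        deadNodes.retainAll(fromNameNode); // drop nodes the NameNode says are alive
        deadNodes.addAll(fromNameNode);    // add nodes the NameNode says are dead
    }

    void markDeadLocally(String datanode) {
        deadNodes.add(datanode);
    }

    boolean isDead(String datanode) {
        return deadNodes.contains(datanode);
    }
}
```

Since the report call is heavy, as noted in the comment above, the refresh interval would matter; refetching block locations on failure may be the cheaper path for clients that only touch a few blocks.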
--
This message was sent by Atlassian Jira
(v8.3.4#803005)
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]