[
https://issues.apache.org/jira/browse/HDFS-14651?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16978253#comment-16978253
]
Lisheng Sun edited comment on HDFS-14651 at 11/20/19 10:10 AM:
---------------------------------------------------------------
[~linyiqun]
When a client read from a datanode fails, there are generally two possible
causes:
1. There is a problem with the datanode itself.
2. There is a problem with the replica on the datanode, but the datanode itself
is healthy.
The client cannot distinguish between the two cases.
In the second case, we should not add the datanode to the dead list. The
failure needs to be confirmed by re-probing, and that re-probe should be
processed with higher priority, so we add the node to be re-probed to the
suspicious list. In the meantime, a datanode in the suspicious list can still
be accessed by other DFSInputStream instances.
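
To illustrate the flow described above, here is a minimal sketch. It is not
the code from the attached patches: the class name SuspectNodeTracker, its
methods, and the probe predicate are all hypothetical, and a real probe would
contact the datanode rather than consult a fixed rule.

```java
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.Set;
import java.util.function.Predicate;

/** Hypothetical sketch of the suspicious-list idea; not the patch code. */
class SuspectNodeTracker {
    private final Set<String> suspectNodes = new LinkedHashSet<>();
    private final Set<String> deadNodes = new LinkedHashSet<>();
    // Assumed probe: returns true when the datanode itself responds.
    private final Predicate<String> probe;

    SuspectNodeTracker(Predicate<String> probe) {
        this.probe = probe;
    }

    /** A read failure alone only marks the node as suspicious, not dead. */
    synchronized void onReadFailure(String datanode) {
        if (!deadNodes.contains(datanode)) {
            suspectNodes.add(datanode);
        }
    }

    /** Re-probe every suspect; only a failed probe moves it to the dead list. */
    synchronized void probeSuspects() {
        // Iterate over a copy so the suspect set can be modified in the loop.
        for (String datanode : new ArrayList<>(suspectNodes)) {
            suspectNodes.remove(datanode);
            if (!probe.test(datanode)) {
                deadNodes.add(datanode);
            }
        }
    }

    synchronized boolean isSuspect(String datanode) {
        return suspectNodes.contains(datanode);
    }

    synchronized boolean isDead(String datanode) {
        return deadNodes.contains(datanode);
    }

    public static void main(String[] args) {
        // Pretend that only "dn1" fails the probe (case 1: bad datanode).
        SuspectNodeTracker tracker =
            new SuspectNodeTracker(node -> !node.equals("dn1"));
        tracker.onReadFailure("dn1");
        tracker.onReadFailure("dn2"); // case 2: bad replica, healthy node
        tracker.probeSuspects();
        System.out.println("dn1 dead: " + tracker.isDead("dn1"));
        System.out.println("dn2 dead: " + tracker.isDead("dn2"));
    }
}
```

In this sketch a node in the suspicious set is not blocked for other readers;
only nodes whose probe fails end up in the dead set.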
> DeadNodeDetector checks dead node periodically
> ----------------------------------------------
>
> Key: HDFS-14651
> URL: https://issues.apache.org/jira/browse/HDFS-14651
> Project: Hadoop HDFS
> Issue Type: Sub-task
> Reporter: Lisheng Sun
> Assignee: Lisheng Sun
> Priority: Major
> Attachments: HDFS-14651.001.patch, HDFS-14651.002.patch,
> HDFS-14651.003.patch, HDFS-14651.004.patch, HDFS-14651.005.patch
>
>