[ 
https://issues.apache.org/jira/browse/HDFS-9103?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14999004#comment-14999004
 ] 

Haohui Mai commented on HDFS-9103:
----------------------------------

bq. The thinking here was it could be possible to have a handful of datanodes 
in the cluster that wouldn't be touched often, so the check in 
BadDataNodeTracker::IsBadNode would never be able to remove them from the map.

I think this is specific to the map implementation. An alternative 
approach is to keep the nodes sorted by expiration time, so that expired 
entries can be pruned from the front in one pass.

{code}
std::vector<std::pair<TimePoint, std::string>> excluded_nodes;  // kept sorted by expiration
// dummy_now is the current TimePoint; drop every entry that has already expired.
auto it = std::lower_bound(
    excluded_nodes.begin(), excluded_nodes.end(), dummy_now,
    [](const std::pair<TimePoint, std::string> &entry, const TimePoint &t) {
      return entry.first < t;
    });
excluded_nodes.erase(excluded_nodes.begin(), it);
{code}

It's okay to do a linear scan when testing for excluded nodes, as we expect 
the number of excluded nodes to be small at any given point in time.
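For reference, the linear membership test could look like the sketch below. This is not the patch's actual implementation; the {{TimePoint}} alias, the node-id type, and the name {{IsExcluded}} are assumptions based on the snippet above.

{code}
#include <algorithm>
#include <chrono>
#include <string>
#include <utility>
#include <vector>

using TimePoint = std::chrono::steady_clock::time_point;

// Linear scan over the (expected-small) excluded list.
// O(n), but n stays tiny, so this beats maintaining a second index.
bool IsExcluded(
    const std::vector<std::pair<TimePoint, std::string>> &excluded_nodes,
    const std::string &dn_uuid) {
  return std::any_of(excluded_nodes.begin(), excluded_nodes.end(),
                     [&](const std::pair<TimePoint, std::string> &entry) {
                       return entry.second == dn_uuid;
                     });
}
{code}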


> Retry reads on DN failure
> -------------------------
>
>                 Key: HDFS-9103
>                 URL: https://issues.apache.org/jira/browse/HDFS-9103
>             Project: Hadoop HDFS
>          Issue Type: Sub-task
>          Components: hdfs-client
>            Reporter: Bob Hansen
>            Assignee: James Clampffer
>             Fix For: HDFS-8707
>
>         Attachments: HDFS-9103.1.patch, HDFS-9103.2.patch, 
> HDFS-9103.HDFS-8707.006.patch, HDFS-9103.HDFS-8707.007.patch, 
> HDFS-9103.HDFS-8707.008.patch, HDFS-9103.HDFS-8707.3.patch, 
> HDFS-9103.HDFS-8707.4.patch, HDFS-9103.HDFS-8707.5.patch
>
>
> When AsyncPreadSome fails, add the failed DataNode to the excluded list and 
> try again.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
