[
https://issues.apache.org/jira/browse/HDFS-9103?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14999004#comment-14999004
]
Haohui Mai commented on HDFS-9103:
----------------------------------
bq. The thinking here was it could be possible to have a handful of datanodes
in the cluster that wouldn't be touched often, so the check in
BadDataNodeTracker::IsBadNode would never be able to remove them from the map.
I think that this is specific to the map implementation. An alternative
approach is to sort the nodes based on the expiration time.
{code}
std::vector<std::pair<TimePoint, std::string>> excluded_nodes;
// Drop every entry whose expiration time is earlier than now.
auto it = std::lower_bound(
    excluded_nodes.begin(), excluded_nodes.end(), now,
    [](const std::pair<TimePoint, std::string> &lhs, const TimePoint &rhs) {
      return lhs.first < rhs.first;
    });
excluded_nodes.erase(excluded_nodes.begin(), it);
{code}
It's okay to do a linear scan when testing for excluded nodes, as we expect
the number of excluded nodes to be small at any given point in time.
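To make the idea concrete, here is a minimal sketch of such a tracker. The
class and method names ({{ExcludedNodeList}}, {{Exclude}}, {{IsExcluded}})
are hypothetical, not from the actual patch, and it assumes a fixed exclusion
duration so that appending new entries keeps the vector sorted by expiration
time:

{code}
#include <algorithm>
#include <chrono>
#include <string>
#include <utility>
#include <vector>

// Hypothetical sketch: excluded datanodes kept sorted by expiration time.
class ExcludedNodeList {
 public:
  using Clock = std::chrono::steady_clock;
  using TimePoint = Clock::time_point;

  // Assumes a fixed duration for all exclusions, so new entries always
  // expire last and appending preserves the sort order.
  void Exclude(const std::string &dn, TimePoint now,
               std::chrono::seconds duration) {
    nodes_.emplace_back(now + duration, dn);
  }

  bool IsExcluded(const std::string &dn, TimePoint now) {
    Prune(now);
    // Linear scan; the excluded set is expected to stay small.
    for (const auto &entry : nodes_) {
      if (entry.second == dn) return true;
    }
    return false;
  }

 private:
  // Erase every entry whose expiration time has already passed.
  void Prune(TimePoint now) {
    auto it = std::lower_bound(
        nodes_.begin(), nodes_.end(), now,
        [](const std::pair<TimePoint, std::string> &lhs,
           const TimePoint &rhs) { return lhs.first < rhs.first; });
    nodes_.erase(nodes_.begin(), it);
  }

  std::vector<std::pair<TimePoint, std::string>> nodes_;  // sorted by expiration
};
{code}

With this shape, expired entries are reclaimed on every lookup, so a rarely
contacted datanode can no longer pin a stale entry in the structure forever.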
> Retry reads on DN failure
> -------------------------
>
> Key: HDFS-9103
> URL: https://issues.apache.org/jira/browse/HDFS-9103
> Project: Hadoop HDFS
> Issue Type: Sub-task
> Components: hdfs-client
> Reporter: Bob Hansen
> Assignee: James Clampffer
> Fix For: HDFS-8707
>
> Attachments: HDFS-9103.1.patch, HDFS-9103.2.patch,
> HDFS-9103.HDFS-8707.006.patch, HDFS-9103.HDFS-8707.007.patch,
> HDFS-9103.HDFS-8707.008.patch, HDFS-9103.HDFS-8707.3.patch,
> HDFS-9103.HDFS-8707.4.patch, HDFS-9103.HDFS-8707.5.patch
>
>
> When AsyncPreadSome fails, add the failed DataNode to the excluded list and
> try again.
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)