[ https://issues.apache.org/jira/browse/HDFS-15119?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17024526#comment-17024526 ]
Ayush Saxena commented on HDFS-15119:
-------------------------------------
Thanx [~ahussein] for the details. Block moves usually aren't very common, and
the client would be reading from the nearest node anyway, usually the local
one; if that doesn't change, refetching the locations would be a waste.
The initial intent of the Jira was also to prevent remote reads when the local
node has been marked bad; if the local node is intact, then in most cases
refetching isn't required.
I agree that in some cases refetching can get you a better location, e.g. when
the local node was unavailable or stale at the time the client first fetched
the locations. But I think such cases should be rare, and retrying at every
interval would put unnecessary load on the namenode most of the time.
IMO, refetching while the best-known location is intact isn't required as
such. Just my opinion. :)
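For illustration only, here is a minimal sketch of the conditional refresh
argued for above: refetch only when the cached entry has expired AND the
best-known node has actually failed a read. All names (CachedLocations,
LocationFetcher, refreshIntervalNanos, etc.) are hypothetical and not taken
from the attached patches.

{code:java}
import java.util.List;
import java.util.concurrent.TimeUnit;

// Hypothetical client-side cache of a block's locations that only goes
// back to the NameNode when the entry is stale and the best node failed.
class CachedLocations {
    private List<String> nodes;        // datanodes holding the block
    private long lastRefreshNanos;     // when the cache was last filled
    private boolean bestNodeFailed;    // set when a read from the best node fails

    // Assumed expiry interval; purely illustrative.
    private final long refreshIntervalNanos = TimeUnit.MINUTES.toNanos(5);

    List<String> get(LocationFetcher fetcher) {
        boolean expired =
                System.nanoTime() - lastRefreshNanos > refreshIntervalNanos;
        // Refetch only if the entry is stale AND the best node is no longer
        // usable; an intact local node means the cached answer is still best.
        if (nodes == null || (expired && bestNodeFailed)) {
            nodes = fetcher.fetch();
            lastRefreshNanos = System.nanoTime();
            bestNodeFailed = false;
        }
        return nodes;
    }

    // Called by the read path when a read from the best-known node fails.
    void markBestNodeFailed() {
        bestNodeFailed = true;
    }
}

interface LocationFetcher {
    List<String> fetch(); // e.g. a getBlockLocations RPC to the NameNode
}
{code}

Under this scheme an intact local node never triggers an extra
getBlockLocations call, so the added NameNode load is limited to streams
that have already seen a failure.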
> Allow expiration of cached locations in DFSInputStream
> ------------------------------------------------------
>
> Key: HDFS-15119
> URL: https://issues.apache.org/jira/browse/HDFS-15119
> Project: Hadoop HDFS
> Issue Type: Improvement
> Components: dfsclient
> Reporter: Ahmed Hussein
> Assignee: Ahmed Hussein
> Priority: Minor
> Fix For: 3.3.0, 3.1.4, 3.2.2
>
> Attachments: HDFS-15119.001.patch, HDFS-15119.002.patch,
> HDFS-15119.003.patch
>
>
> Staleness and other transient conditions can affect reads for a long time,
> since the block locations may never be re-fetched. It makes sense to let
> cached locations expire.
> For example, we may not take advantage of local reads because the nodes
> were blacklisted and the cached locations have not been updated since.
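As a usage sketch: the attached patches appear to gate the refresh behind a
client-side interval property, assumed here to be
dfs.client.refresh.read-block-locations.ms with 0 meaning disabled; verify
the key name and default against your release (Fix For: 3.3.0, 3.1.4, 3.2.2).
The file path below is purely illustrative.

{code:java}
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class EnableLocationRefresh {
    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        // Refresh cached block locations every 5 minutes; 0 (the assumed
        // default) keeps the old behavior of never expiring them.
        conf.setLong("dfs.client.refresh.read-block-locations.ms", 300_000L);
        try (FileSystem fs = FileSystem.get(conf)) {
            // Long-lived streams opened from this FileSystem would now
            // re-fetch locations once the interval elapses.
            fs.open(new Path("/some/long-lived/file")).close();
        }
    }
}
{code}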