[
https://issues.apache.org/jira/browse/HDFS-8182?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14507982#comment-14507982
]
Jason Lowe commented on HDFS-8182:
----------------------------------
Quota could be problematic for some of the distributed cache use cases. Not
all files a user downloads belong to that user; public localized resources are
a good example. How does quota enter the picture when the user who owns the
file is not the user requesting it? For example, could other users accessing
my public files cause my quota usage to increase through this feature?
> Implement topology-aware CDN-style caching
> ------------------------------------------
>
> Key: HDFS-8182
> URL: https://issues.apache.org/jira/browse/HDFS-8182
> Project: Hadoop HDFS
> Issue Type: Improvement
> Components: hdfs-client, namenode
> Affects Versions: 2.6.0
> Reporter: Gera Shegalov
>
> To scale reads of hot blocks in large clusters, it would be beneficial if we
> could read a block across the ToR switches only once. Example scenarios are
> localization of binaries, MR distributed cache files for map-side joins and
> similar. There are multiple layers where this could be implemented (YARN
> service or individual apps such as MR) but I believe it is best done in HDFS
> or even common FileSystem to support as many use cases as possible.
> The life cycle could look like this, e.g. for the YARN localization scenario:
> 1. inputStream = fs.open(path, ..., CACHE_IN_RACK)
> 2. Instead of reading from a remote DN directly, the NN tells the client to
> read via the local DN1, and DN1 creates a replica of each block.
> When the next localizer on DN2 in the same rack starts, it will learn from
> the NN about the replica on DN1, and the client will read from DN1 using the
> conventional path.
> When the application ends, the AM or NMs can notify the NN in an fadvise
> DONTNEED style, so that it can start telling DNs to discard the extraneous
> replicas.
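
The life cycle quoted above can be sketched as a toy model. This is not HDFS
code: ToyNameNode, open(localDn) and release() are hypothetical names used
only for illustration, and the model tracks a single block. It only shows the
intended effect, namely that each rack pulls the block across the ToR switch
once and later readers in that rack are pointed at the in-rack replica.

```java
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;
import java.util.TreeSet;

/** Toy model of the proposed life cycle for one block. Hypothetical, not HDFS code. */
public class RackCacheSketch {

    /** Minimal stand-in for the NameNode's view of a single block. */
    static class ToyNameNode {
        private final Map<String, String> rackOf = new TreeMap<>(); // DN -> rack
        private final Set<String> regular = new TreeSet<>();        // DNs holding regular replicas
        private final Set<String> cached = new TreeSet<>();         // DNs holding extra, cache-only replicas
        int crossRackReads = 0;

        void addDataNode(String dn, String rack, boolean hasRegularReplica) {
            rackOf.put(dn, rack);
            if (hasRegularReplica) {
                regular.add(dn);
            }
        }

        /** Models open(path, ..., CACHE_IN_RACK): returns the DN the client reads from. */
        String open(String localDn) {
            String rack = rackOf.get(localDn);
            if (rack == null) {
                throw new IllegalArgumentException("unknown DataNode: " + localDn);
            }
            Set<String> holders = new TreeSet<>(regular);
            holders.addAll(cached);
            // Conventional path: a replica already exists in the client's rack.
            for (String dn : holders) {
                if (rack.equals(rackOf.get(dn))) {
                    return dn;
                }
            }
            // Otherwise the local DN fetches the block across racks once and keeps a replica.
            crossRackReads++;
            cached.add(localDn);
            return localDn;
        }

        /** Models the DONTNEED-style hint: discard extra replicas, return how many were dropped. */
        int release() {
            int discarded = cached.size();
            cached.clear();
            return discarded;
        }
    }

    public static void main(String[] args) {
        ToyNameNode nn = new ToyNameNode();
        nn.addDataNode("dn0", "rackA", true);   // the only regular replica lives in rackA
        nn.addDataNode("dn1", "rackB", false);
        nn.addDataNode("dn2", "rackB", false);

        System.out.println("localizer on dn1 reads via " + nn.open("dn1"));
        System.out.println("localizer on dn2 reads via " + nn.open("dn2"));
        System.out.println("cross-rack reads so far: " + nn.crossRackReads);
        System.out.println("replicas discarded on release: " + nn.release());
    }
}
```

The sketch deliberately says nothing about quota. Which user, if any, is
charged for the replica held in the cached set is exactly the open question
raised in the comment.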