[ https://issues.apache.org/jira/browse/HDFS-5096?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13741293#comment-13741293 ]
Andrew Wang commented on HDFS-5096:
-----------------------------------
Hi Bikas,
bq. Would automatic cache eviction be a pre-requisite for automatic cache
addition?
No, we're not planning on automatic eviction yet, unless you count deleting a
cached file or a file in a cached directory.
bq. Will this automatic caching kick-in when the file is completely written or
while writes are still in-progress...all or none property.
Our plan for v1 is whole-file caching. We cache all finalized blocks when the
caching request is made. If new finalized blocks are added over time (append,
new files), we try to cache those too.
The all-or-nothing property is more of a concern when designing an automatic
eviction algorithm, where it's a way of modeling the working set of an MR job.
Since we're just doing a static, user-driven policy at first, it doesn't come
up as much. We do our best to satisfy the user's request, and leave it up to
them to maintain their own working set and not wedge themselves with bad
caching requests.
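To make the v1 behavior concrete, here is a rough Python sketch of the policy
described above. It is only a toy model under my own assumptions, not HDFS
code: the names Block, CacheManager, add_request, on_block_finalized and
on_file_deleted are hypothetical, and the paths are made up. It shows
finalized blocks being cached when the request is made, later-finalized blocks
under the same path being picked up without a new request, and deletion as the
only way anything leaves the cache.

```python
from dataclasses import dataclass, field


@dataclass
class Block:
    block_id: int
    path: str                # file this block belongs to
    finalized: bool = False  # False while the replica is still being written


@dataclass
class CacheManager:
    """Toy model of a static, user-driven cache policy (hypothetical)."""
    cached_paths: set = field(default_factory=set)
    cached_blocks: dict = field(default_factory=dict)  # block_id -> file path

    def covers(self, path: str) -> bool:
        # True if the file is itself a cached path or lives under a cached directory.
        return any(path == p or path.startswith(p.rstrip("/") + "/")
                   for p in self.cached_paths)

    def add_request(self, path: str, existing_blocks: list) -> None:
        # Whole-file caching: take every finalized block under the path right away.
        self.cached_paths.add(path)
        for block in existing_blocks:
            if block.finalized and self.covers(block.path):
                self.cached_blocks[block.block_id] = block.path

    def on_block_finalized(self, block: Block) -> None:
        # Appends and new files show up here; no new caching request is needed.
        block.finalized = True
        if self.covers(block.path):
            self.cached_blocks[block.block_id] = block.path

    def on_file_deleted(self, path: str) -> None:
        # The only way blocks leave the cache in this model: the file goes away.
        self.cached_blocks = {b: p for b, p in self.cached_blocks.items()
                              if p != path}


mgr = CacheManager()
done = Block(1, "/warehouse/t/part=1/a", finalized=True)
open_block = Block(2, "/warehouse/t/part=1/a")  # write still in progress
mgr.add_request("/warehouse/t/part=1", [done, open_block])
print(sorted(mgr.cached_blocks))  # only the finalized block so far

mgr.on_block_finalized(open_block)                         # append completes
mgr.on_block_finalized(Block(3, "/warehouse/t/part=1/b"))  # new file in the dir
print(sorted(mgr.cached_blocks))

mgr.on_file_deleted("/warehouse/t/part=1/a")
print(sorted(mgr.cached_blocks))
```

Note there is no capacity check or eviction pass anywhere in the sketch. That
matches the point above: with a static policy, keeping the working set within
bounds is left to whoever issues the caching requests.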
> Automatically cache new data added to a cached path
> ---------------------------------------------------
>
> Key: HDFS-5096
> URL: https://issues.apache.org/jira/browse/HDFS-5096
> Project: Hadoop HDFS
> Issue Type: Sub-task
> Components: datanode, namenode
> Reporter: Andrew Wang
>
> For some applications, it's convenient to specify a path to cache, and have
> HDFS automatically cache new data added to the path without sending a new
> caching request or a manual refresh command.
> One example is new data appended to a cached file. It would be nice to
> re-cache a block at the new appended length, and cache new blocks added to
> the file.
> Another example is a cached Hive partition directory, where a user can drop
> new files directly into the partition. It would be nice if these new files
> were cached.
> In both cases, this automatic caching would happen after the file is closed,
> i.e. once the block replica is finalized.