[ 
https://issues.apache.org/jira/browse/HDFS-8401?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14563399#comment-14563399
 ] 

Colin Patrick McCabe commented on HDFS-8401:
--------------------------------------------

bq. Allow using memory features without calling HDFS-specific APIs. This also 
isolates applications from evolving APIs. Applications currently use shims and 
reflection tricks to work with different versions of HDFS.

HDFS-4949 didn't require applications to call any HDFS-specific APIs.  The 
administrator simply set a list of files and directories to be cached.  When 
applications read those files or directories, they were retrieved from the 
cache.

We could do something similar here by specifying that we want opportunistic 
caching on a certain directory subtree.  For example, we could set a 2Q 
eviction policy on that subtree and have the NameNode manage it.  
[~andrew.wang] and I discussed doing that for HDFS-4949, but we simply didn't 
have time.
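
For readers unfamiliar with 2Q, here is a minimal sketch of the policy itself, 
assuming a simplified variant that tracks only keys (for example block or path 
names). It is not code from HDFS-4949 or from any patch on this issue; the 
class name {{TwoQPolicy}} and the parameters {{capacity}}, {{kin}} and {{kout}} 
are hypothetical, and NameNode integration is not shown.

```java
import java.util.Iterator;
import java.util.LinkedHashSet;

/** Simplified 2Q admission/eviction policy over string keys (hypothetical sketch). */
class TwoQPolicy {
    private final int capacity; // max resident keys (a1in + am)
    private final int kin;      // a1in is evicted from first once it exceeds this size
    private final int kout;     // max ghost keys remembered in a1out

    private final LinkedHashSet<String> a1in = new LinkedHashSet<>();  // FIFO, resident, seen once
    private final LinkedHashSet<String> a1out = new LinkedHashSet<>(); // FIFO, ghost keys only
    private final LinkedHashSet<String> am = new LinkedHashSet<>();    // LRU, resident, seen again

    TwoQPolicy(int capacity, int kin, int kout) {
        if (capacity < 1 || kin < 0 || kout < 0) {
            throw new IllegalArgumentException("capacity >= 1, kin >= 0, kout >= 0 required");
        }
        this.capacity = capacity;
        this.kin = kin;
        this.kout = kout;
    }

    /** Records an access; returns true if the key was already resident (a hit). */
    boolean access(String key) {
        if (am.remove(key)) {          // hot key: move to the MRU end
            am.add(key);
            return true;
        }
        if (a1in.contains(key)) {      // still in the probation queue: leave in place
            return true;
        }
        boolean wasGhost = a1out.remove(key);
        makeRoom();
        if (wasGhost) {
            am.add(key);               // re-referenced after eviction: promote to hot
        } else {
            a1in.add(key);             // first sighting: admit on probation
        }
        return false;
    }

    boolean isResident(String key) {
        return a1in.contains(key) || am.contains(key);
    }

    /** Evicts one resident key if the cache is full. */
    private void makeRoom() {
        if (a1in.size() + am.size() < capacity) {
            return;
        }
        if (a1in.size() > kin || am.isEmpty()) {
            a1out.add(removeOldest(a1in)); // remember the evicted key as a ghost
            if (a1out.size() > kout) {
                removeOldest(a1out);
            }
        } else {
            removeOldest(am);              // evict the least recently used hot key
        }
    }

    private static String removeOldest(LinkedHashSet<String> set) {
        Iterator<String> it = set.iterator();
        String oldest = it.next();
        it.remove();
        return oldest;
    }

    public static void main(String[] args) {
        TwoQPolicy policy = new TwoQPolicy(3, 1, 4);
        for (String key : new String[] {"a", "b", "c", "d", "a", "a", "e", "f"}) {
            System.out.println(key + " hit=" + policy.access(key));
        }
        System.out.println("a resident: " + policy.isResident("a"));
    }
}
```

In the trace used in {{main}}, the key that is read again after falling out of 
the probation queue gets promoted and stays resident while the later one-time 
keys push each other out, which is the scan resistance that makes 2Q a 
reasonable fit for opportunistic caching.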

bq. Once applications start using memfs someone could write a memfs layer over 
another HCFS e.g. Amazon S3.

That does raise the question of why this belongs in HDFS, though.  If we just 
want a generic FS caching layer in Hadoop, we could do that in hadoop-common.

> Memfs - a layered file system for in-memory storage in HDFS
> -----------------------------------------------------------
>
>                 Key: HDFS-8401
>                 URL: https://issues.apache.org/jira/browse/HDFS-8401
>             Project: Hadoop HDFS
>          Issue Type: Bug
>            Reporter: Arpit Agarwal
>            Assignee: Arpit Agarwal
>
> We propose creating a layered filesystem that can provide in-memory storage 
> using existing features within HDFS. memfs will use lazy persist writes 
> introduced by HDFS-6581. For reads, memfs can use the Centralized Cache 
> Management feature introduced in HDFS-4949 to load hot data to memory.
> Paths in memfs and hdfs will correspond 1:1 so memfs will require no 
> additional metadata and it can be implemented entirely as a client-side 
> library.
> The advantage of a layered file system is that it requires few or no 
> changes to existing applications. For example, applications can use 
> something like {{memfs://}} instead of {{hdfs://}} for files targeted to 
> memory storage. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
