[jira] [Commented] (HDFS-14740) HDFS read cache persistence support

Rakesh R (Jira) Wed, 28 Aug 2019 20:09:08 -0700


    [ 
https://issues.apache.org/jira/browse/HDFS-14740?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16918242#comment-16918242
 ]


Rakesh R commented on HDFS-14740:
---------------------------------

Thanks [~Rui Mo] for the contribution. Overall, the idea looks good. I have 
added a few comments; please take a look.

# Please remove the duplicate checks in the #restoreCache() method, as you are 
already doing the checks inside #createBlockPoolDir().
{code}
#createBlockPoolDir()

        if (!cacheDir.exists() && !cacheDir.mkdir()) {
{code}
{code}
#restoreCache()
        if (cacheDir.exists()) {
{code}
# {{pmemVolume/BlockPoolId/BlockPoolId-BlockId}}: 
{{BlockPoolId}} is duplicated in this path, so please remove it from the file 
name. This avoids the {{cachedFile.getName().split("-");}} splitting logic and 
keeps the parsing simple.
# Can you explore storing blocks hierarchically, similar to the existing 
datanode data.dir layout? This avoids accumulating a very large number of 
block files directly under one single blockPoolId directory. Assume a cache 
capacity in TBs and a large set of cached data blocks under one blockPool. 
Please refer to {{DatanodeUtil.idToBlockDir(finalizedDir, b.getBlockId());}}
# {{restoreCache()}} - How about moving the loader-specific parsing/restore 
logic into the respective MappableBlockLoaders, i.e. 
PmemMappableBlockLoader#restoreCache() and 
NativePmemMappableBlockLoader#restoreCache()?
# {{dfs.datanode.cache.persistence.enabled}} - this can be true by default, 
as that lets users get the maximum capability out of the pmem device. The pmem 
feature as a whole is disabled by default: "dfs.datanode.cache.pmem.dirs" 
defaults to empty, so the cache is DRAM based. Once the user enables pmem, they 
can use the full potential of the device, and there is no compatibility concern.
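
To illustrate comments 2 and 3 together, here is a minimal sketch of a cache file layout in which the block pool ID appears only as a directory and the file name is just the block ID, spread over two levels of subdirectories. It is only modeled loosely on the idea behind {{DatanodeUtil.idToBlockDir}}; the class name, the helper names, the "subdir" prefix and the 5-bit masks are assumptions made for illustration and are not taken from the patch or copied from Hadoop.

```java
import java.io.File;

// Hypothetical sketch, not part of the patch: lays out cached block files as
//   <pmemVolume>/<blockPoolId>/subdirX/subdirY/<blockId>
final class PmemCachePathSketch {
    // Assumed directory-name prefix for the two hierarchy levels.
    private static final String SUBDIR_PREFIX = "subdir";

    private PmemCachePathSketch() {
    }

    // Picks a two-level subdirectory from bits of the block ID so that
    // files are spread out instead of piling up in one directory.
    // The shifts and 0x1F masks (32 x 32 directories) are an assumption.
    static File idToCacheDir(File blockPoolDir, long blockId) {
        int d1 = (int) ((blockId >> 16) & 0x1F);
        int d2 = (int) ((blockId >> 8) & 0x1F);
        File level1 = new File(blockPoolDir, SUBDIR_PREFIX + d1);
        return new File(level1, SUBDIR_PREFIX + d2);
    }

    // Builds the full cache file path; the file name is the block ID only.
    static File cacheFileFor(File pmemVolume, String blockPoolId, long blockId) {
        File blockPoolDir = new File(pmemVolume, blockPoolId);
        return new File(idToCacheDir(blockPoolDir, blockId), Long.toString(blockId));
    }

    // Restore side: the block ID is the whole file name, so no split("-").
    static long blockIdOf(File cachedFile) {
        return Long.parseLong(cachedFile.getName());
    }

    // Restore side: the block pool ID is the directory three levels up
    // (file -> subdirY -> subdirX -> blockPoolId).
    static String blockPoolIdOf(File cachedFile) {
        return cachedFile.getParentFile().getParentFile().getParentFile().getName();
    }

    public static void main(String[] args) {
        File f = cacheFileFor(new File("pmem0"), "BP-1", 197889L);
        System.out.println(f.getPath());
        System.out.println(blockPoolIdOf(f) + " / " + blockIdOf(f));
    }
}
```

With this kind of layout the restore path never has to parse the block pool ID out of a file name, which also sidesteps any hyphens inside the block pool ID itself.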

> HDFS read cache persistence support
> -----------------------------------
>
>                 Key: HDFS-14740
>                 URL: https://issues.apache.org/jira/browse/HDFS-14740
>             Project: Hadoop HDFS
>          Issue Type: Improvement
>            Reporter: Feilong He
>            Assignee: Rui Mo
>            Priority: Major
>         Attachments: HDFS-14740.000.patch, HDFS-14740.001.patch, 
> HDFS-14740.002.patch
>
>
> In HDFS-13762, persistent memory is enabled in HDFS centralized cache 
> management. Even though persistent memory can persist cache data, for 
> simplifying the implementation, the previous cache data will be cleaned up 
> during DataNode restarts. We propose to improve the HDFS persistent memory 
> (PM) cache by taking advantage of PM's data persistence, i.e., recovering 
> the cache status when the DataNode restarts, so that users are spared the 
> cache warm-up time.



--
This message was sent by Atlassian Jira
(v8.3.2#803003)

---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]
