[
https://issues.apache.org/jira/browse/HDFS-14740?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16918242#comment-16918242
]
Rakesh R commented on HDFS-14740:
---------------------------------
Thanks [~Rui Mo] for the contribution. Overall the idea looks good. I have added
a few comments; please take a look.
# Please remove the duplicate check in the #restoreCache() method, as you are
already doing the same check inside #createBlockPoolDir().
{code}
#createBlockPoolDir()
if (!cacheDir.exists() && !cacheDir.mkdir()) {
{code}
{code}
#restoreCache()
if (cacheDir.exists()) {
{code}
# {{pmemVolume/BlockPoolId/BlockPoolId-BlockId}}.
{{BlockPoolId}} is duplicated; please remove it from the file name.
This avoids the {{cachedFile.getName().split("-");}} splitting logic and keeps
the parsing simple.
# Can you explore storing blocks hierarchically, similar to the existing
datanode data.dir layout? This avoids accumulating a very large number of block
files in one single blockPoolId directory. Assume a cache capacity in TBs and a
large set of data blocks cached under one blockPool. Please refer to
{{DatanodeUtil.idToBlockDir(finalizedDir, b.getBlockId());}}
# {{restoreCache()}} - How about moving the loader-specific parsing/restore
logic into the respective MappableBlockLoaders, i.e.
PmemMappableBlockLoader#restoreCache() and
NativePmemMappableBlockLoader#restoreCache()?
# {{dfs.datanode.cache.persistence.enabled}} - this can be true by default, as
that lets users get the full capabilities of the pmem device. The pmem feature
as a whole is disabled by default: the default value of
"dfs.datanode.cache.pmem.dirs" is empty, so the cache is DRAM based. So, once
the user enables pmem, they can use the full potential of the device, and there
is no compatibility concern.
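To illustrate comments 2 and 3 together, here is a rough, self-contained sketch of what the cache file layout could look like. It is not taken from the patch: the class and method names are hypothetical, and the two-level {{subdirN/subdirM}} scheme with 5-bit masks is only my assumption of how a {{DatanodeUtil.idToBlockDir}}-style layout might be mirrored for the pmem cache.

```java
import java.io.File;

/**
 * Hypothetical sketch of a hierarchical pmem cache layout:
 *   pmemVolume/BlockPoolId/subdirN/subdirM/BlockId
 * The names and the 5-bit masks are assumptions, not code from the patch.
 */
public class PmemCachePathSketch {

  /** Two-level directory derived from the block ID (assumed scheme). */
  public static File blockDirFor(File blockPoolRoot, long blockId) {
    int d1 = (int) ((blockId >> 16) & 0x1F);
    int d2 = (int) ((blockId >> 8) & 0x1F);
    return new File(blockPoolRoot,
        "subdir" + d1 + File.separator + "subdir" + d2);
  }

  /** The file name is the block ID only; the pool ID lives in the path. */
  public static File cacheFileFor(File pmemVolume, String blockPoolId,
      long blockId) {
    File blockPoolRoot = new File(pmemVolume, blockPoolId);
    return new File(blockDirFor(blockPoolRoot, blockId),
        Long.toString(blockId));
  }

  /** Restoring needs no split("-"): the file name parses directly. */
  public static long blockIdOf(File cachedFile) {
    return Long.parseLong(cachedFile.getName());
  }

  public static void main(String[] args) {
    // Example volume path and pool ID are made up for illustration.
    File f = cacheFileFor(new File("/mnt/pmem0"), "BP-example", 74565L);
    System.out.println(f.getPath() + " -> blockId " + blockIdOf(f));
  }
}
```

With a layout like this, the restore path can recover the block pool ID from the directory it is scanning and the block ID from the file name, so no string splitting is needed.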
> HDFS read cache persistence support
> -----------------------------------
>
> Key: HDFS-14740
> URL: https://issues.apache.org/jira/browse/HDFS-14740
> Project: Hadoop HDFS
> Issue Type: Improvement
> Reporter: Feilong He
> Assignee: Rui Mo
> Priority: Major
> Attachments: HDFS-14740.000.patch, HDFS-14740.001.patch,
> HDFS-14740.002.patch
>
>
> In HDFS-13762, persistent memory was enabled in HDFS centralized cache
> management. Even though persistent memory can persist cache data, to
> simplify the implementation, the previous cache data is cleaned up
> when a DataNode restarts. We propose to improve the HDFS persistent memory
> (PM) cache by taking advantage of PM's data persistence, i.e., recovering
> the cache status when a DataNode restarts, so that users are saved the
> cache warm-up time.