[ 
https://issues.apache.org/jira/browse/HDFS-14740?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16989573#comment-16989573
 ] 

Feilong He commented on HDFS-14740:
-----------------------------------

Thanks [~rakeshr] so much for your comments. Sorry for the late reply.
 # Yes, 'dfs.datanode.cache.persistence.enabled' looks a bit ambiguous to 
users. This property controls whether the cache on pmem is restored after a 
DataNode restart, to avoid unnecessarily pulling data into pmem again. I 
prefer 'dfs.datanode.cache.restore.enabled'. If you have other comments, 
please kindly let me know.
 # I have run some tests on the case you mentioned. 1) In the first test, a 
file is cached to pmem by HDFS with the above flag set to true. Then I shut 
down the cluster and set the flag to false. After restarting the cluster, I 
saw that the previous cache on pmem was dropped and the DataNode had to 
recache the block data to pmem, as expected. 2) In the second test, a file is 
cached to pmem by HDFS with the above flag set to false. Then I shut down the 
cluster and set the flag to true. While the DataNode was restarting, I could 
see that the previous cache was restored, as expected. To sum up, the behavior 
in both tests aligns with the purpose of this flag: the value in effect at 
restart decides whether the existing cache is restored. 
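
The two tests above can be modeled with a small, self-contained sketch. This 
is only an illustration of the observed behavior, not the DataNode code from 
the patch: the class PmemCacheRestartSketch and its methods are hypothetical 
names, and the two maps merely stand in for data on pmem and for the 
DataNode's in-memory cache state.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical model of the restart behavior described in the two tests.
// None of these names come from the HDFS code base.
class PmemCacheRestartSketch {
    // Stands in for block data that physically survives on pmem.
    private final Map<Long, String> blocksOnPmem = new HashMap<>();
    // Stands in for the DataNode's in-memory record of cached blocks.
    private final Map<Long, String> cachedBlocks = new HashMap<>();

    // Caching does not depend on the flag; only the restart path does.
    void cacheBlock(long blockId, String data) {
        blocksOnPmem.put(blockId, data);
        cachedBlocks.put(blockId, data);
    }

    // Simulates a DataNode restart with the flag value set at that time.
    void restart(boolean restoreEnabled) {
        cachedBlocks.clear(); // in-memory state is lost on every restart
        if (restoreEnabled) {
            // Flag true at restart: rebuild cache state from pmem.
            cachedBlocks.putAll(blocksOnPmem);
        } else {
            // Flag false at restart: drop the old cache data on pmem.
            blocksOnPmem.clear();
        }
    }

    boolean isCached(long blockId) {
        return cachedBlocks.containsKey(blockId);
    }
}
```

In this model, calling restart(false) after cacheBlock corresponds to test 1 
(the block has to be recached), and calling restart(true) corresponds to test 
2 (the block is still reported as cached).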

> Recover data blocks from persistent memory read cache during datanode restarts
> ------------------------------------------------------------------------------
>
>                 Key: HDFS-14740
>                 URL: https://issues.apache.org/jira/browse/HDFS-14740
>             Project: Hadoop HDFS
>          Issue Type: Improvement
>          Components: caching, datanode
>            Reporter: Feilong He
>            Assignee: Feilong He
>            Priority: Major
>         Attachments: HDFS-14740.000.patch, HDFS-14740.001.patch, 
> HDFS-14740.002.patch, HDFS-14740.003.patch, HDFS-14740.004.patch, 
> HDFS-14740.005.patch, HDFS-14740.006.patch, 
> HDFS_Persistent_Read-Cache_Design-v1.pdf, 
> HDFS_Persistent_Read-Cache_Test-v1.1.pdf, 
> HDFS_Persistent_Read-Cache_Test-v1.pdf, HDFS_Persistent_Read-Cache_Test-v2.pdf
>
>
> In HDFS-13762, persistent memory (PM) is enabled in HDFS centralized cache 
> management. Even though PM can persist cache data, to simplify the initial 
> implementation, the previous cache data is cleaned up during DataNode 
> restarts. Here, we propose to improve the HDFS PM cache by taking advantage 
> of PM's data persistence, i.e., recovering the status of cached data, if 
> any, when the DataNode restarts, so that cache warm-up time is saved for 
> users.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

---------------------------------------------------------------------
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org