[ https://issues.apache.org/jira/browse/HDFS-14740?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16989573#comment-16989573 ]
Feilong He commented on HDFS-14740:
-----------------------------------

Thanks [~rakeshr] so much for your comments. Sorry for the late reply.
 # Yes, 'dfs.datanode.cache.persistence.enabled' looks a bit ambiguous to users. This property controls whether the cache on pmem is restored after a DataNode restart, to avoid unnecessarily pulling data into pmem again. I prefer to use 'dfs.datanode.cache.restore.enabled'. If you have other comments, please kindly let me know.
 # I have conducted some tests on the case you mentioned.
 1) In my test, a file was cached to pmem by HDFS with the above flag set to true. Then I shut down the cluster and set the flag to false. After restarting the cluster, I noted that the previous cache was dropped from pmem and the DataNode had to recache the block data to pmem, as we expected.
 2) I also did another test. First, a file was cached to pmem by HDFS with the above flag set to false. Then I shut down the cluster and set the flag to true. During the DataNode restart, I could see that the previous cache was restored, as we expected.

To sum up, the behavior in the two tests aligns with the purpose of this flag.

> Recover data blocks from persistent memory read cache during datanode restarts
> ------------------------------------------------------------------------------
>
>                 Key: HDFS-14740
>                 URL: https://issues.apache.org/jira/browse/HDFS-14740
>             Project: Hadoop HDFS
>          Issue Type: Improvement
>          Components: caching, datanode
>            Reporter: Feilong He
>            Assignee: Feilong He
>            Priority: Major
>         Attachments: HDFS-14740.000.patch, HDFS-14740.001.patch,
> HDFS-14740.002.patch, HDFS-14740.003.patch, HDFS-14740.004.patch,
> HDFS-14740.005.patch, HDFS-14740.006.patch,
> HDFS_Persistent_Read-Cache_Design-v1.pdf,
> HDFS_Persistent_Read-Cache_Test-v1.1.pdf,
> HDFS_Persistent_Read-Cache_Test-v1.pdf, HDFS_Persistent_Read-Cache_Test-v2.pdf
>
>
> In HDFS-13762, persistent memory (PM) is enabled in HDFS centralized cache management.
> Even though PM can persist cache data, the previous cache data is cleaned up
> during DataNode restarts to simplify the initial implementation. Here, we
> propose to improve the HDFS PM cache by taking advantage of PM's data
> persistence, i.e., recovering the status of cached data, if any, when the
> DataNode restarts, so that users are spared the cache warm-up time.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
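
The two restart tests described in the comment suggest that only the flag's value at restart time decides whether cache data left on pmem is reused. A minimal sketch of that decision follows. It is not the patch's code: the class, the method, and the use of java.util.Properties in place of the Hadoop configuration are all hypothetical, the key is the name proposed in the comment (which may not be the one finally adopted), and the default of "false" is an assumption.

```java
import java.util.LinkedHashSet;
import java.util.Properties;
import java.util.Set;

// Hypothetical illustration only; none of these names come from the HDFS-14740 patch.
final class PmemCacheRestoreSketch {
    // Property name proposed in the comment; assumed here, not confirmed as final.
    static final String RESTORE_KEY = "dfs.datanode.cache.restore.enabled";

    /**
     * Simulates a DataNode restart: returns the block IDs that count as cached
     * afterwards, given the block IDs still present on pmem from the previous run.
     */
    static Set<Long> cachedAfterRestart(Properties conf, Set<Long> blocksLeftOnPmem) {
        // Assumed default: restore is off when the key is absent.
        boolean restore = Boolean.parseBoolean(conf.getProperty(RESTORE_KEY, "false"));
        Set<Long> cached = new LinkedHashSet<Long>();
        if (restore) {
            // Flag on at restart: reuse what is already on pmem.
            cached.addAll(blocksLeftOnPmem);
        }
        // Flag off at restart: previous cache is dropped, blocks must be recached.
        return cached;
    }

    public static void main(String[] args) {
        Set<Long> leftOnPmem = new LinkedHashSet<Long>();
        leftOnPmem.add(1001L);
        leftOnPmem.add(1002L);

        Properties conf = new Properties();
        conf.setProperty(RESTORE_KEY, "false");
        System.out.println("restore=false -> " + cachedAfterRestart(conf, leftOnPmem));

        conf.setProperty(RESTORE_KEY, "true");
        System.out.println("restore=true  -> " + cachedAfterRestart(conf, leftOnPmem));
    }
}
```

The flag's value when the data was originally cached never enters the method, which mirrors both test outcomes: true then false drops the cache, false then true restores it.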