[ https://issues.apache.org/jira/browse/HDFS-13762?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16827857#comment-16827857 ]

Feilong He commented on HDFS-13762:
-----------------------------------

As the subtask section shows, we divided this Jira into several subtasks, among 
which HDFS-14355 and HDFS-14356 cover the pure Java implementation and the 
PMDK-based implementation respectively. In the past weeks we focused on the 
pure Java implementation and got that patch merged upstream. In the following 
weeks we will mainly focus on the PMDK-based implementation. Most of the recent 
comments posted by reviewers in this Jira are relevant to the PMDK-based 
implementation, so I am replying to them here. Sorry for the late reply.

 

[~umamaheswararao]'s comments:
{quote}I think interface design can be cleaner. Now we have interfaces for 
MappableBlock but we really will not depend much on the interface; we actually 
directly use the class to load blocks as they are static methods. I am thinking 
of something like this: let's have a MemoryVolumeManager interface which will 
be implemented by two separate implementations: PMemVolumeManager and 
MemMappedVolumeManager. They can have init methods and getMappableBlockLoader. 
They can return the respective MappableBlockLoader classes, which are 
PMemMappedBlock or MemoryMappedBlock.
{quote}
Yes, an interface-based design is better. We have refactored the implementation 
accordingly in HDFS-14354.
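For reference, the proposed shape would look roughly like this (the names are 
taken from the comment above; the signatures are illustrative assumptions, not 
the code committed in HDFS-14354):
{code:java}
// Sketch of the proposed design; names come from the review comment,
// signatures are assumptions for illustration only.
import java.io.IOException;

// Placeholder for the loader type that a volume manager hands out.
interface MappableBlockLoader { }

interface MemoryVolumeManager {
  // Prepare the backing volumes (DRAM or pmem) before caching starts.
  void init() throws IOException;

  // Return the loader matching this volume type, e.g. a memory-mapped
  // loader or a pmem-based loader.
  MappableBlockLoader getMappableBlockLoader();
}

// Two implementations, as suggested above: PMemVolumeManager and
// MemMappedVolumeManager, each returning its own loader.
{code}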
{quote}"Fail to msync region. --> Failed to msync region
{quote}
I will fix this message in our new patch.
{quote}// Load Intel ISA-L: you may want to remove this in the pmdk file. I saw 
a couple of other places referring to ISA-L. Please remove them as it's not 
related to that.
{quote}
I will check again and remove any irrelevant comments.
{quote}If some DNs are configured with pmem volumes and some are not, then 
blocks will be cached both memory-mapped and pmem-mapped? Probably you would 
like to document your recommendation?
{quote}
For a given DataNode, a block is cached either to memory or to pmem, depending 
on whether pmem volumes are configured on that node. It is acceptable that some 
DNs are configured with pmem volumes and some are not. This will be well 
documented.
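For illustration, enabling the pmem cache is a per-DN configuration along these 
lines (the property name is taken from the current patch and should be treated 
as an assumption here, since it may still change):
{code:xml}
<!-- hdfs-site.xml on a DN equipped with pmem. DNs that omit this property
     simply keep using the DRAM cache. The property name is an assumption
     based on the current patch. -->
<property>
  <name>dfs.datanode.cache.pmem.dirs</name>
  <value>/mnt/pmem0,/mnt/pmem1</value>
</property>
{code}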
{quote}You may want to update the documentation for this feature 
[https://hadoop.apache.org/docs/current/hadoop-project-dist/hadoop-hdfs/CentralizedCacheManagement.html]
{quote}
Yes, we opened subtask HDFS-14357 for the documentation work.
{quote}Since it's persistent, how does this handle or remember cached blocks 
when the DN restarts?
{quote}
Actually, this Jira does not exploit the persistence characteristic of pmem. 
Cached blocks are not kept across a DN restart, so the pmem cache behaves just 
like the DRAM cache. We will consider supporting a persistent cache in the 
future.
{quote}Do you want to have different metrics for different cache types? I am 
not sure it's really necessary, but people may be interested to know how many 
blocks are cached in the pmem area? (Low priority)
{quote}
We opened subtask HDFS-14458 for collecting the pmem cache metrics and 
reporting them to the NameNode. The cache metrics are similar to the DRAM 
cache's. Please watch that subtask for updates in the near future.
{quote}You need to fix for Windows as well. Am I missing something?
{quote}
Currently, the PMDK-based implementation only supports the Linux environment.
{quote}From pmdk_load.h, a quick question: are these types available in a 
non-Unix env? I think you did the typedef in the Unix env only, right?
{quote}
These types are not available in a non-Unix env. As the above comment 
indicates, the PMDK-based implementation is only supported in the Linux 
environment.

 

[~Sammi]'s comments:
{quote}NativeIO.java: suggest defining the different PMDK support codes and 
their meanings using an enum, so it will be easy to map a code to its 
description.
{quote}
Good suggestion! It would be better to maintain the PMDK support codes and 
their corresponding messages in an enum. I will refine this part of the 
implementation in our new patch.
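For example, something along these lines (the enum name, code values, and 
messages below are illustrative assumptions; the actual states are determined 
by how the native PMDK library is loaded):
{code:java}
// Sketch only: name, codes, and messages are assumptions for illustration;
// the real states are defined by the native PMDK loading logic.
public enum PmdkSupportState {
  SUPPORTED(0, "The native code is built with PMDK support."),
  PMDK_LIB_NOT_FOUND(1, "PMDK (libpmem) is not found or not loadable."),
  BUILT_WITHOUT_PMDK(2, "The native code was built without PMDK support.");

  private final int code;
  private final String message;

  PmdkSupportState(int code, String message) {
    this.code = code;
    this.message = message;
  }

  public int getCode() {
    return code;
  }

  // Maps a support code back to its human-readable description.
  public String getMessage() {
    return message;
  }
}
{code}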
{quote}NativeIO.c: it would be nice to refactor the error message handling to 
avoid a potential array overflow if pmem_errormsg() returns content that is 
too long. There are several pieces of similar code.
{quote}
Not exactly: snprintf truncates any characters that do not fit in the 
destination buffer, so there should be no array overflow issue. I will still 
check our code to guard against such potential issues.
{quote}Supporting a persistent memory cache on Windows can be a follow-up task 
since Windows is not very commonly used in user environments. If building the 
patch on Windows is a goal of this JIRA, please update the "Building on 
Windows" section in BUILDING.txt with any specific steps users need to take. 
If it's not a goal of this JIRA, make sure a build on Windows gives a clear 
message for this case.
{quote}
Thanks for the clarification. Yes, the Windows platform is not in scope for 
this Jira. I will make sure a clear message is presented to the user when 
building on Windows.
{quote}CMakeLists.txt: "#Require ISA-L" is for ISA-L, better keep it.
{quote}
OK. I will check the comments one by one; only those that are irrelevant to 
the code will be removed.
{quote}make PmemVolumeManager pmemManager final
{quote}
In our new implementation, we made PmemVolumeManager a singleton, and the 
pmemVolumeManager in the cache loader is set by an initialize method, so we 
keep it non-final.
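A minimal sketch of that shape (simplified for illustration; the committed 
PmemVolumeManager carries more state, such as the volume list and usage 
accounting):
{code:java}
// Sketch only: a simplified singleton shape; the real class tracks the
// configured pmem volumes, their capacities, and current usage.
import java.io.IOException;

public final class PmemVolumeManager {
  private static PmemVolumeManager instance;

  private PmemVolumeManager() { }

  // Called once while the cache loader initializes; because the loader
  // obtains the manager here rather than at construction time, its
  // reference cannot be final.
  public static synchronized void init(String[] pmemDirs) throws IOException {
    if (instance == null) {
      instance = new PmemVolumeManager();
      // ... verify each configured pmem dir and compute its capacity ...
    }
  }

  public static synchronized PmemVolumeManager getInstance() {
    if (instance == null) {
      throw new IllegalStateException("PmemVolumeManager not initialized");
    }
    return instance;
  }
}
{code}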
{quote}This piece of code has the chance to delete user files accidentally. 
Better to provide user a choice to decide auto delete or not.
{quote}
We will consider creating a conventional directory under the user-specified 
directory, similar to the behavior for the DataNode's storage volumes. Thus, 
accidental data loss can generally be avoided.
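For example (the "cache" subdirectory name below is a hypothetical choice, 
analogous to how a DN keeps block files under a managed layout):
{code:java}
// Sketch only: "cache" is a hypothetical subdirectory name. Confining all
// cache files to a managed subdirectory means cleanup only ever touches
// files the DN created itself, never the user's own data.
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

class PmemCacheDirLayout {
  static Path getCacheDir(String userConfiguredPmemDir) throws IOException {
    Path cacheDir = Paths.get(userConfiguredPmemDir, "cache");
    Files.createDirectories(cacheDir); // no-op if it already exists
    return cacheDir;
  }
}
{code}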
{quote}make count in PmemVolumeManager final
{quote}
Yes, it would be better to make this variable final. I will update the code 
accordingly.
{quote}fillBuffer in MemoryMappedBlock and PmemMappedBlock is the same; 
refactor to keep only one piece of code.
{quote}
Good suggestion. We have refactored this part of the implementation: in the 
new design, MemoryMappableBlockLoader and PmemMappableBlockLoader are 
responsible for loading blocks, so we moved fillBuffer into their common 
abstract parent class, MappableBlockLoader. Thus, both the DRAM cache loader 
and the pmem cache loader call this shared method when mapping block data.
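Roughly, the shared method looks like this (a simplified sketch; the actual 
loader code also performs checksum verification while filling):
{code:java}
// Simplified sketch of the refactor described above; the real
// MappableBlockLoader also verifies checksums during loading.
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;

abstract class MappableBlockLoader {
  // Shared by both loaders: read from the block file channel until the
  // buffer is full or end-of-file is reached.
  protected void fillBuffer(FileChannel channel, ByteBuffer buf)
      throws IOException {
    while (buf.remaining() > 0) {
      if (channel.read(buf) < 0) {
        break; // reached end of file
      }
    }
  }
}

// MemoryMappableBlockLoader fills a DRAM-backed mapped buffer, while
// PmemMappableBlockLoader fills a buffer mapped over a file on a pmem volume.
{code}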

 

Thanks [~umamaheswararao] and [~Sammi] for your valuable comments. We are 
currently working on subtask HDFS-14356; please watch that subtask for 
updates. Other reviewers are also welcome to go to HDFS-14356 to review the 
patch and leave comments. Thanks!

> Support non-volatile storage class memory(SCM) in HDFS cache directives
> -----------------------------------------------------------------------
>
>                 Key: HDFS-13762
>                 URL: https://issues.apache.org/jira/browse/HDFS-13762
>             Project: Hadoop HDFS
>          Issue Type: Improvement
>          Components: caching, datanode
>            Reporter: Sammi Chen
>            Assignee: Feilong He
>            Priority: Major
>         Attachments: HDFS-13762.000.patch, HDFS-13762.001.patch, 
> HDFS-13762.002.patch, HDFS-13762.003.patch, HDFS-13762.004.patch, 
> HDFS-13762.005.patch, HDFS-13762.006.patch, HDFS-13762.007.patch, 
> HDFS-13762.008.patch, SCMCacheDesign-2018-11-08.pdf, 
> SCMCacheDesign-2019-3-26.pdf, SCMCacheTestPlan-2019-3-27.pdf, 
> SCMCacheTestPlan.pdf, SCM_Cache_Perf_Results-v1.pdf
>
>
> Non-volatile storage class memory is a type of memory that can keep its data 
> content after a power failure or between power cycles. A non-volatile storage 
> class memory device usually has access speed close to a memory DIMM while 
> costing less than memory, so today it is usually used as a supplement to 
> memory to hold long-term persistent data, such as data in a cache.
> Currently in HDFS, we have an OS page cache backed read-only cache and a 
> RAMDISK-based lazy-write cache. Non-volatile memory suits both of these 
> functions. This Jira aims to enable storage class memory first in the read 
> cache. Although storage class memory has non-volatile characteristics, to 
> keep the same behavior as the current read-only cache, we don't use its 
> persistence characteristics currently.


