[
https://issues.apache.org/jira/browse/HDFS-13762?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Feilong He updated HDFS-13762:
------------------------------
Release Note: Non-volatile storage class memory (SCM, also known as
persistent memory) is now supported in the HDFS cache. To enable the SCM
cache, set the property "dfs.datanode.cache.pmem.dirs" in hdfs-site.xml to one
or more SCM volumes. All HDFS cache directives remain unchanged. There are two
implementations of the HDFS SCM cache: a pure Java implementation and a native
PMDK-based implementation. The latter offers better cache write and cache read
performance. If the PMDK native libraries can be loaded, the PMDK-based
implementation is used; otherwise HDFS falls back to the pure Java
implementation. To enable the PMDK-based implementation, install the PMDK
library as described on the official site, http://pmem.io/, then build Hadoop
with PMDK support as described in the "PMDK library build options" section of
`BUILDING.txt` in the source code. If multiple SCM volumes are configured, a
round-robin policy selects an available volume for caching each block. Like
the DRAM cache, the SCM cache has no cache eviction mechanism. When a DataNode
receives a read request from a client and the corresponding block is cached on
SCM, the DataNode instantiates an InputStream from the block's location path
on SCM (pure Java implementation) or from its cache address on SCM (PMDK-based
implementation). Once the InputStream is created, the DataNode sends the
cached data to the client. Please refer to the "Centralized Cache Management"
guide for more details. (was:
Non-volatile storage class memory (SCM, also known as persistent memory) is
now supported in the HDFS cache. To enable the SCM cache, set the property
"dfs.datanode.cache.pmem.dirs" in hdfs-site.xml to one or more SCM volumes.
All HDFS cache directives remain unchanged. There are two implementations of
the HDFS SCM cache: a pure Java implementation and a native PMDK-based
implementation. The latter offers better cache write and cache read
performance. To enable the PMDK-based implementation, install the PMDK library
as described on the official site, http://pmem.io/, then build Hadoop with
PMDK support as described in the "PMDK library build options" section of
`BUILDING.txt` in the source code. If multiple SCM volumes are configured, a
round-robin policy selects an available volume for caching each block. Like
the DRAM cache, the SCM cache has no cache eviction mechanism. When a DataNode
receives a read request from a client and the corresponding block is cached on
SCM, the DataNode instantiates an InputStream from the block's location path
on SCM (pure Java implementation) or from its cache address on SCM (PMDK-based
implementation). Once the InputStream is created, the DataNode sends the
cached data to the client. Please refer to the "Centralized Cache Management"
guide for more details.)
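
The round-robin volume selection mentioned in the release note can be
illustrated with a small sketch. This is not the code from the patch: the
class name RoundRobinPmemVolumePicker, the pick method, the per-volume
free-space bookkeeping, and the /mnt/pmem* paths are all hypothetical, and
"available" is assumed here to mean "has enough free space for the block".

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical illustration only; not a class from the Hadoop source tree.
class RoundRobinPmemVolumePicker {
    private final List<String> volumes; // configured SCM directories
    private final long[] freeBytes;     // assumed per-volume free space
    private int next = 0;               // where the next search starts

    RoundRobinPmemVolumePicker(List<String> volumes, long[] freeBytes) {
        if (volumes.isEmpty() || volumes.size() != freeBytes.length) {
            throw new IllegalArgumentException("need one capacity per volume");
        }
        this.volumes = new ArrayList<>(volumes);
        this.freeBytes = freeBytes.clone();
    }

    // Returns the volume chosen for a block of the given size, or null when
    // no volume has room. There is no eviction, so a full cache stays full.
    synchronized String pick(long blockBytes) {
        for (int i = 0; i < volumes.size(); i++) {
            int idx = (next + i) % volumes.size();
            if (freeBytes[idx] >= blockBytes) {
                freeBytes[idx] -= blockBytes;
                next = (idx + 1) % volumes.size(); // resume after this volume
                return volumes.get(idx);
            }
        }
        return null;
    }

    public static void main(String[] args) {
        RoundRobinPmemVolumePicker picker = new RoundRobinPmemVolumePicker(
            Arrays.asList("/mnt/pmem0", "/mnt/pmem1"), new long[] {256, 128});
        for (int i = 0; i < 4; i++) {
            System.out.println(picker.pick(128));
        }
    }
}
```

With the capacities used in main, the four calls select /mnt/pmem0,
/mnt/pmem1, /mnt/pmem0, and then return null once both volumes are full.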
> Support non-volatile storage class memory(SCM) in HDFS cache directives
> -----------------------------------------------------------------------
>
> Key: HDFS-13762
> URL: https://issues.apache.org/jira/browse/HDFS-13762
> Project: Hadoop HDFS
> Issue Type: Improvement
> Components: caching, datanode
> Reporter: Sammi Chen
> Assignee: Feilong He
> Priority: Major
> Attachments: HDFS-13762.000.patch, HDFS-13762.001.patch,
> HDFS-13762.002.patch, HDFS-13762.003.patch, HDFS-13762.004.patch,
> HDFS-13762.005.patch, HDFS-13762.006.patch, HDFS-13762.007.patch,
> HDFS-13762.008.patch, SCMCacheDesign-2018-11-08.pdf,
> SCMCacheDesign-2019-07-12.pdf, SCMCacheDesign-2019-07-16.pdf,
> SCMCacheDesign-2019-3-26.pdf, SCMCacheTestPlan-2019-3-27.pdf,
> SCMCacheTestPlan.pdf, SCM_Cache_Perf_Results-v1.pdf
>
>
> Non-volatile storage class memory is a type of memory that retains its
> contents after a power failure or across power cycles. Non-volatile storage
> class memory devices usually offer access speeds close to those of memory
> DIMMs at a lower cost. Today it is typically used as a supplement to memory
> to hold long-term persistent data, such as data in a cache.
> Currently HDFS has a read-only cache backed by the OS page cache and a
> RAMDISK-based lazy-write cache. Non-volatile memory suits both functions.
> This Jira aims to enable storage class memory in the read cache first.
> Although storage class memory is non-volatile, we do not currently use its
> persistence, so that the behavior stays the same as that of the current
> read-only cache.
>
>
>