[ https://issues.apache.org/jira/browse/HDFS-13762?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16762058#comment-16762058 ]
Wei-Chiu Chuang commented on HDFS-13762: ---------------------------------------- Thanks for the patch. I don't fully understand this hardware, but here's a few general comments: {code:java} if (!NativeIO.isAvailable() || !NativeIO.POSIX.isPmemAvailable()) { throw new IOException("Persistent memory storage configured, " + "but not available (" + NativeIO.POSIX.PMDK_SUPPORT_STATE + ")."); }{code} This error message is not particularly useful. It should be more explicit to indicate where the problem is: does the native Hadoop libhadoop.so fail to load? Does it fail to load the pmdk library? Or the hardware is not present? Hint: see ErasureCodeNative.java for an example. A user won't be able to understand the meaning of the magic number NativeIO.POSIX.PMDK_SUPPORT_STATE unless she reads the code. FsDatasetCache {code:java} String[] pmemVolumes = dataset.datanode.getDnConf().getPmemVolumes(); if (pmemVolumes != null && pmemVolumes.length != 0) { if (!NativeIO.isAvailable() || !NativeIO.POSIX.isPmemAvailable()) { throw new IOException("Persistent memory storage configured, " + "but not available (" + NativeIO.POSIX.PMDK_SUPPORT_STATE + ")."); } this.pmemManager = new PmemVolumeManager(); this.pmemManager.load(pmemVolumes); PmemMappedBlock.setPersistentMemoryManager(this.pmemManager); PmemMappedBlock.setDataset(dataset); } {code} Can we refactor this segment of code into its own method? This segment is self-contained and frankly most of users won't have PMEM hardware so they won't care. In TestNativeIO.java: {code:java} if (!NativeIO.POSIX.isPmemAvailable()) {{code} Use Assume.assumeTrue(NativeIO.POSIX.isPmemAvailable()) instead {code:java} private static void deletePmemMappedFile(String filePath) { try { if (filePath != null) { boolean result = Files.deleteIfExists(Paths.get(filePath)); if (!result) { LOG.error("Fail to delete mapped file " + filePath + " from persistent memory"); } } } catch (Throwable e) { LOG.error("Fail to delete mapped file " + filePath + " for " + e.getMessage() + " from persistent memory"); } } {code} If it throws a Throwable, shouldn't it just throw it and let it fail? I don't think it's a good idea to catch it. If there's an exception that you think should be logged at ERROR or WARN level, it's probably a better idea to log its stack trace as well. So instead of {code:java} LOG.error("Fail to delete mapped file " + filePath + " for " + e.getMessage() + " from persistent memory");{code} do {code:java} LOG.error("Fail to delete mapped file " + filePath + " from persistent memory", e);{code} > Support non-volatile storage class memory(SCM) in HDFS cache directives > ----------------------------------------------------------------------- > > Key: HDFS-13762 > URL: https://issues.apache.org/jira/browse/HDFS-13762 > Project: Hadoop HDFS > Issue Type: Improvement > Components: caching, datanode > Reporter: Sammi Chen > Assignee: Feilong He > Priority: Major > Attachments: HDFS-13762.000.patch, HDFS-13762.001.patch, > HDFS-13762.002.patch, HDFS-13762.003.patch, HDFS-13762.004.patch, > HDFS-13762.005.patch, HDFS-13762.006.patch, HDFS-13762.007.patch, > HDFS-13762.008.patch, SCMCacheDesign-2018-11-08.pdf, SCMCacheTestPlan.pdf > > > No-volatile storage class memory is a type of memory that can keep the data > content after power failure or between the power cycle. Non-volatile storage > class memory device usually has near access speed as memory DIMM while has > lower cost than memory. So today It is usually used as a supplement to > memory to hold long tern persistent data, such as data in cache. > Currently in HDFS, we have OS page cache backed read only cache and RAMDISK > based lazy write cache. Non-volatile memory suits for both these functions. > This Jira aims to enable storage class memory first in read cache. Although > storage class memory has non-volatile characteristics, to keep the same > behavior as current read only cache, we don't use its persistent > characteristics currently. > > > -- This message was sent by Atlassian JIRA (v7.6.3#76005) --------------------------------------------------------------------- To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org