[jira] [Updated] (HDFS-16014) Fix an issue in checking native pmdk lib by 'hadoop checknative' command
[ https://issues.apache.org/jira/browse/HDFS-16014?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Rakesh Radhakrishnan updated HDFS-16014: Fix Version/s: 3.4.0 Resolution: Fixed Status: Resolved (was: Patch Available) > Fix an issue in checking native pmdk lib by 'hadoop checknative' command > > > Key: HDFS-16014 > URL: https://issues.apache.org/jira/browse/HDFS-16014 > Project: Hadoop HDFS > Issue Type: Bug > Components: native >Affects Versions: 3.4.0 >Reporter: Feilong He >Assignee: Feilong He >Priority: Major > Labels: pull-request-available > Fix For: 3.4.0 > > Attachments: HDFS-16014-01.patch, HDFS-16014-02.patch > > Time Spent: 1h 20m > Remaining Estimate: 0h > > In HDFS-14818, we proposed a patch to support checking native pmdk lib. The > expected target is to display hint to user regarding pmdk lib loaded state. > Recently, it was found that pmdk lib was not successfully loaded actually but > the `hadoop checknative` command still tells user that it was. This issue can > be reproduced by moving libpmem.so* from specified installed path to other > place, or directly deleting these libs, after the project is built. -- This message was sent by Atlassian Jira (v8.20.1#820001) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Updated] (HDFS-15788) Correct the statement for pmem cache to reflect cache persistence support
[ https://issues.apache.org/jira/browse/HDFS-15788?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Rakesh Radhakrishnan updated HDFS-15788: Fix Version/s: 3.4.0 Resolution: Fixed Status: Resolved (was: Patch Available) > Correct the statement for pmem cache to reflect cache persistence support > - > > Key: HDFS-15788 > URL: https://issues.apache.org/jira/browse/HDFS-15788 > Project: Hadoop HDFS > Issue Type: Bug > Components: documentation >Affects Versions: 3.4.0 >Reporter: Feilong He >Assignee: Feilong He >Priority: Minor > Labels: pull-request-available > Fix For: 3.4.0 > > Attachments: HDFS-15788-01.patch, HDFS-15788-02.patch > > Time Spent: 40m > Remaining Estimate: 0h > > Correct the statement for pmem cache to reflect cache persistence support. -- This message was sent by Atlassian Jira (v8.20.1#820001) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Created] (HDFS-16362) [FSO] Refactor isFileSystemOptimized usage in OzoneManagerUtils
Rakesh Radhakrishnan created HDFS-16362: --- Summary: [FSO] Refactor isFileSystemOptimized usage in OzoneManagerUtils Key: HDFS-16362 URL: https://issues.apache.org/jira/browse/HDFS-16362 Project: Hadoop HDFS Issue Type: Bug Reporter: Rakesh Radhakrishnan This task is to refactor the om request instantiation based on #isFileSystemOptimized() check in OzoneManagerUtils class. -- This message was sent by Atlassian Jira (v8.20.1#820001) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-16014) Issue in checking native pmdk lib by 'hadoop checknative' command
[ https://issues.apache.org/jira/browse/HDFS-16014?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17451069#comment-17451069 ] Rakesh Radhakrishnan commented on HDFS-16014: - [~PhiloHe] can you please re-run the build by attaching a new patch and get latest QA report. Thanks! +1 patch looks good to me > Issue in checking native pmdk lib by 'hadoop checknative' command > - > > Key: HDFS-16014 > URL: https://issues.apache.org/jira/browse/HDFS-16014 > Project: Hadoop HDFS > Issue Type: Bug > Components: native >Affects Versions: 3.4.0 >Reporter: Feilong He >Assignee: Feilong He >Priority: Major > Attachments: HDFS-16014-01.patch > > > In HDFS-14818, we proposed a patch to support checking native pmdk lib. The > expected target is to display hint to user regarding pmdk lib loaded state. > Recently, it was found that pmdk lib was not successfully loaded actually but > the `hadoop checknative` command still tells user that it was. This issue can > be reproduced by moving libpmem.so* from specified installed path to other > place, or directly deleting these libs, after the project is built. -- This message was sent by Atlassian Jira (v8.20.1#820001) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-15788) Correct the statement for pmem cache to reflect cache persistence support
[ https://issues.apache.org/jira/browse/HDFS-15788?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17450279#comment-17450279 ] Rakesh Radhakrishnan commented on HDFS-15788: - +1 LGTM, thanks [~PhiloHe] for the contribution. > Correct the statement for pmem cache to reflect cache persistence support > - > > Key: HDFS-15788 > URL: https://issues.apache.org/jira/browse/HDFS-15788 > Project: Hadoop HDFS > Issue Type: Bug > Components: documentation >Affects Versions: 3.4.0 >Reporter: Feilong He >Assignee: Feilong He >Priority: Minor > Attachments: HDFS-15788-01.patch, HDFS-15788-02.patch > > > Correct the statement for pmem cache to reflect cache persistence support. -- This message was sent by Atlassian Jira (v8.20.1#820001) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Resolved] (HDFS-15253) Set default throttle value on dfs.image.transfer.bandwidthPerSec
[ https://issues.apache.org/jira/browse/HDFS-15253?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Rakesh Radhakrishnan resolved HDFS-15253.
    Fix Version/s: 3.4.0
    Resolution: Fixed

> Set default throttle value on dfs.image.transfer.bandwidthPerSec
>
> Key: HDFS-15253
> URL: https://issues.apache.org/jira/browse/HDFS-15253
> Project: Hadoop HDFS
> Issue Type: Bug
> Components: namenode
> Reporter: Karthik Palanisamy
> Assignee: Karthik Palanisamy
> Priority: Major
> Labels: pull-request-available
> Fix For: 3.4.0
>
> Time Spent: 40m
> Remaining Estimate: 0h
>
> The default value of dfs.image.transfer.bandwidthPerSec is 0, so fsimage transfers during checkpoint can use the maximum available bandwidth. I think we should throttle this. Many users have experienced NameNode failover when transferring a large image (e.g. >25Gb) along with fsimage replication on dfs.namenode.name.dir.
> Proposed settings:
> dfs.image.transfer.bandwidthPerSec=52428800 (50 MB/s)
> dfs.namenode.checkpoint.txns=200 (the default is 1M, which is good for avoiding frequent checkpoints; however, by default a checkpoint also runs once every 6 hours)

--
This message was sent by Atlassian Jira (v8.3.4#803005)
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
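To make the proposed throttle value concrete, here is a minimal sketch of the arithmetic behind it. It only checks that 52428800 bytes/s equals 50 MB/s (counting 1 MB as 1024 * 1024 bytes) and estimates how long a throttled transfer would take. The class name, the helper method and the 25 GiB image size are hypothetical illustrations and are not part of Hadoop.

```java
// Hypothetical helper; not a Hadoop class.
final class ImageTransferThrottle {
    /** Value proposed in the issue for dfs.image.transfer.bandwidthPerSec (50 * 1024 * 1024). */
    static final long PROPOSED_BANDWIDTH = 50L * 1024 * 1024;

    /**
     * Estimates the seconds needed to move imageBytes at bandwidthPerSec bytes/s.
     * A bandwidth of 0 (or less) means "unthrottled", so no estimate is possible
     * and -1 is returned.
     */
    static long estimateTransferSeconds(long imageBytes, long bandwidthPerSec) {
        if (bandwidthPerSec <= 0) {
            return -1;
        }
        // Integer ceiling division: a partial second still counts as a full one.
        return (imageBytes + bandwidthPerSec - 1) / bandwidthPerSec;
    }

    public static void main(String[] args) {
        long imageBytes = 25L * 1024 * 1024 * 1024; // assumed 25 GiB fsimage
        System.out.println(PROPOSED_BANDWIDTH);
        System.out.println(estimateTransferSeconds(imageBytes, PROPOSED_BANDWIDTH));
    }
}
```

Under these assumptions a 25 GiB image takes roughly 512 seconds at the proposed limit, which is the trade-off the issue accepts in exchange for not saturating the network during checkpoint.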
[jira] [Updated] (HDFS-15080) Fix the issue in reading persistent memory cached data with an offset
[ https://issues.apache.org/jira/browse/HDFS-15080?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Rakesh Radhakrishnan updated HDFS-15080: Resolution: Fixed Status: Resolved (was: Patch Available) Committed the patch to {{trunk}}, {{branch-3.2}} and {{branch-3.1}} branches. Thanks [~PhiloHe] for the contribution. > Fix the issue in reading persistent memory cached data with an offset > - > > Key: HDFS-15080 > URL: https://issues.apache.org/jira/browse/HDFS-15080 > Project: Hadoop HDFS > Issue Type: Bug > Components: caching, datanode >Reporter: Feilong He >Assignee: Feilong He >Priority: Major > Fix For: 3.3.0, 3.1.4, 3.2.2 > > Attachments: HDFS-15080-000.patch, HDFS-15080-branch-3.1-000.patch, > HDFS-15080-branch-3.2-000.patch > > > Some applications can read a segment of pmem cache with an offset specified. > The previous implementation for pmem cache read with DirectByteBuffer didn't > cover this situation. > Let me explain further. In our test, we used spark SQL to run some TPC-DS > workload to read the cache data and hits read exception. This was due to the > missed seek offset arg, which is used in spark SQL to read data packet by > packet. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
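The offset problem described in this issue can be illustrated with plain `java.nio`. The sketch below is not the committed patch. It only shows, under the assumption that the cached block is exposed as a direct `ByteBuffer`, how a read can honor a seek offset without disturbing the shared buffer. The class and method names are hypothetical.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;

// Hypothetical illustration; not the HDFS implementation.
final class OffsetReadSketch {
    /**
     * Copies up to len bytes starting at offset from the cached region.
     * Works on a duplicate so the caller's position and limit stay untouched.
     */
    static byte[] readAt(ByteBuffer cached, int offset, int len) {
        if (offset < 0 || offset > cached.limit() || len < 0) {
            throw new IllegalArgumentException("bad offset/len: " + offset + "/" + len);
        }
        ByteBuffer view = cached.duplicate(); // shares content, independent position
        view.position(offset);                // the seek that the buggy read path skipped
        byte[] out = new byte[Math.min(len, view.remaining())];
        view.get(out);
        return out;
    }

    public static void main(String[] args) {
        ByteBuffer cached = ByteBuffer.allocateDirect(16);
        cached.put("0123456789abcdef".getBytes(StandardCharsets.US_ASCII));
        cached.flip();
        // Read 4 bytes starting at offset 10.
        System.out.println(new String(readAt(cached, 10, 4), StandardCharsets.US_ASCII));
    }
}
```

A reader that fetches data packet by packet, as Spark SQL is said to do here, would call such a method repeatedly with increasing offsets. Ignoring the offset would return the start of the block every time.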
[jira] [Updated] (HDFS-15080) Fix the issue in reading persistent memory cached data with an offset
[ https://issues.apache.org/jira/browse/HDFS-15080?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Rakesh Radhakrishnan updated HDFS-15080: Summary: Fix the issue in reading persistent memory cached data with an offset (was: Fix the issue in reading persistent memory cache with an offset) > Fix the issue in reading persistent memory cached data with an offset > - > > Key: HDFS-15080 > URL: https://issues.apache.org/jira/browse/HDFS-15080 > Project: Hadoop HDFS > Issue Type: Bug > Components: caching, datanode >Reporter: Feilong He >Assignee: Feilong He >Priority: Major > Fix For: 3.3.0, 3.1.4, 3.2.2 > > Attachments: HDFS-15080-000.patch, HDFS-15080-branch-3.1-000.patch, > HDFS-15080-branch-3.2-000.patch > > > Some applications can read a segment of pmem cache with an offset specified. > The previous implementation for pmem cache read with DirectByteBuffer didn't > cover this situation. > Let me explain further. In our test, we used spark SQL to run some TPC-DS > workload to read the cache data and hits read exception. This was due to the > missed seek offset arg, which is used in spark SQL to read data packet by > packet. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-15080) Fix the issue in reading persistent memory cache with an offset
[ https://issues.apache.org/jira/browse/HDFS-15080?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17010479#comment-17010479 ]

Rakesh Radhakrishnan commented on HDFS-15080:

Thanks [~PhiloHe], it's a good finding. +1, the patch looks good to me. I will commit this to the branches shortly. Since this change is related to the Pmem native bindings, I am excluding a unit test case for it.

> Fix the issue in reading persistent memory cache with an offset
>
> Key: HDFS-15080
> URL: https://issues.apache.org/jira/browse/HDFS-15080
> Project: Hadoop HDFS
> Issue Type: Bug
> Components: caching, datanode
> Reporter: Feilong He
> Assignee: Feilong He
> Priority: Major
> Fix For: 3.3.0, 3.1.4, 3.2.2
>
> Attachments: HDFS-15080-000.patch, HDFS-15080-branch-3.1-000.patch, HDFS-15080-branch-3.2-000.patch
>
> Some applications can read a segment of pmem cache with an offset specified. The previous implementation for pmem cache read with DirectByteBuffer didn't cover this situation.
> Let me explain further. In our test, we used Spark SQL to run some TPC-DS workloads that read the cache data and hit a read exception. This was due to the missing seek offset argument, which Spark SQL uses to read data packet by packet.
[jira] [Updated] (HDFS-14740) Recover data blocks from persistent memory read cache during datanode restarts
[ https://issues.apache.org/jira/browse/HDFS-14740?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Rakesh Radhakrishnan updated HDFS-14740: Fix Version/s: 3.2.2 3.1.4 3.3.0 Resolution: Fixed Status: Resolved (was: Patch Available) I have committed latest patch to the respective branches - trunk, branch-3.2 and branch-3.1. Thanks [~PhiloHe] and [~Rui Mo] for the contribution. > Recover data blocks from persistent memory read cache during datanode restarts > -- > > Key: HDFS-14740 > URL: https://issues.apache.org/jira/browse/HDFS-14740 > Project: Hadoop HDFS > Issue Type: Improvement > Components: caching, datanode >Reporter: Feilong He >Assignee: Feilong He >Priority: Major > Fix For: 3.3.0, 3.1.4, 3.2.2 > > Attachments: HDFS-14740-branch-3.1-000.patch, > HDFS-14740-branch-3.1-001.patch, HDFS-14740-branch-3.2-000.patch, > HDFS-14740-branch-3.2-001.patch, HDFS-14740.000.patch, HDFS-14740.001.patch, > HDFS-14740.002.patch, HDFS-14740.003.patch, HDFS-14740.004.patch, > HDFS-14740.005.patch, HDFS-14740.006.patch, HDFS-14740.007.patch, > HDFS-14740.008.patch, HDFS-14740.009.patch, > HDFS_Persistent_Read-Cache_Design-v1.pdf, > HDFS_Persistent_Read-Cache_Test-v1.1.pdf, > HDFS_Persistent_Read-Cache_Test-v1.pdf, HDFS_Persistent_Read-Cache_Test-v2.pdf > > > In HDFS-13762, persistent memory (PM) is enabled in HDFS centralized cache > management. Even though PM can persist cache data, for simplifying the > initial implementation, the previous cache data will be cleaned up during > DataNode restarts. Here, we are proposing to improve HDFS PM cache by taking > advantage of PM's data persistence characteristic, i.e., recovering the > status for cached data, if any, when DataNode restarts, thus, cache warm up > time can be saved for user. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-14740) Recover data blocks from persistent memory read cache during datanode restarts
[ https://issues.apache.org/jira/browse/HDFS-14740?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17006607#comment-17006607 ] Rakesh Radhakrishnan commented on HDFS-14740: - Thanks [~PhiloHe] for the patch. +1 looks good to me. I will commit it shortly. > Recover data blocks from persistent memory read cache during datanode restarts > -- > > Key: HDFS-14740 > URL: https://issues.apache.org/jira/browse/HDFS-14740 > Project: Hadoop HDFS > Issue Type: Improvement > Components: caching, datanode >Reporter: Feilong He >Assignee: Feilong He >Priority: Major > Attachments: HDFS-14740-branch-3.1-000.patch, > HDFS-14740-branch-3.1-001.patch, HDFS-14740-branch-3.2-000.patch, > HDFS-14740-branch-3.2-001.patch, HDFS-14740.000.patch, HDFS-14740.001.patch, > HDFS-14740.002.patch, HDFS-14740.003.patch, HDFS-14740.004.patch, > HDFS-14740.005.patch, HDFS-14740.006.patch, HDFS-14740.007.patch, > HDFS-14740.008.patch, HDFS-14740.009.patch, > HDFS_Persistent_Read-Cache_Design-v1.pdf, > HDFS_Persistent_Read-Cache_Test-v1.1.pdf, > HDFS_Persistent_Read-Cache_Test-v1.pdf, HDFS_Persistent_Read-Cache_Test-v2.pdf > > > In HDFS-13762, persistent memory (PM) is enabled in HDFS centralized cache > management. Even though PM can persist cache data, for simplifying the > initial implementation, the previous cache data will be cleaned up during > DataNode restarts. Here, we are proposing to improve HDFS PM cache by taking > advantage of PM's data persistence characteristic, i.e., recovering the > status for cached data, if any, when DataNode restarts, thus, cache warm up > time can be saved for user. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-14740) Recover data blocks from persistent memory read cache during datanode restarts
[ https://issues.apache.org/jira/browse/HDFS-14740?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16989600#comment-16989600 ] Rakesh Radhakrishnan commented on HDFS-14740: - Thanks [~PhiloHe] for the updates. How about keeping the two pmem related configs with matching names like below : {{'dfs.datanode.pmem.cache.restore'}} and {{'dfs.datanode.pmem.cache.dirs'}} ? > Recover data blocks from persistent memory read cache during datanode restarts > -- > > Key: HDFS-14740 > URL: https://issues.apache.org/jira/browse/HDFS-14740 > Project: Hadoop HDFS > Issue Type: Improvement > Components: caching, datanode >Reporter: Feilong He >Assignee: Feilong He >Priority: Major > Attachments: HDFS-14740.000.patch, HDFS-14740.001.patch, > HDFS-14740.002.patch, HDFS-14740.003.patch, HDFS-14740.004.patch, > HDFS-14740.005.patch, HDFS-14740.006.patch, HDFS-14740.007.patch, > HDFS_Persistent_Read-Cache_Design-v1.pdf, > HDFS_Persistent_Read-Cache_Test-v1.1.pdf, > HDFS_Persistent_Read-Cache_Test-v1.pdf, HDFS_Persistent_Read-Cache_Test-v2.pdf > > > In HDFS-13762, persistent memory (PM) is enabled in HDFS centralized cache > management. Even though PM can persist cache data, for simplifying the > initial implementation, the previous cache data will be cleaned up during > DataNode restarts. Here, we are proposing to improve HDFS PM cache by taking > advantage of PM's data persistence characteristic, i.e., recovering the > status for cached data, if any, when DataNode restarts, thus, cache warm up > time can be saved for user. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-14740) Recover data blocks from persistent memory read cache during datanode restarts
[ https://issues.apache.org/jira/browse/HDFS-14740?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16976700#comment-16976700 ]

Rakesh Radhakrishnan commented on HDFS-14740:

Thanks [~Rui Mo] for the test results. I am adding a few comments; please address them. Apart from the comments below, the patch looks good to me.

+Comment-1)+ Blocks are persisted to the PMem device irrespective of whether '{{dfs.datanode.cache.persistence.enabled}}' is true or false. The purpose of this flag is to restore the block cache in the DataNode process from the physical blocks present on the PMem device. In that sense, how about renaming the config to '{{dfs.datanode.cache.pmem.block.restore}}', or some other name that better reflects the behavior?

+Comment-2)+ If not already tested, could you please add the following scenario to your test sheet?
*Scenario:* A user has cached {{file A}}. The admin then restarts the DataNode with the above flag set to {{false}}. Assume the user then submits a cache directive to cache the same {{file A}}.

> Recover data blocks from persistent memory read cache during datanode restarts
>
> Key: HDFS-14740
> URL: https://issues.apache.org/jira/browse/HDFS-14740
> Project: Hadoop HDFS
> Issue Type: Improvement
> Components: caching, datanode
> Reporter: Feilong He
> Assignee: Rui Mo
> Priority: Major
> Attachments: HDFS-14740.000.patch, HDFS-14740.001.patch, HDFS-14740.002.patch, HDFS-14740.003.patch, HDFS-14740.004.patch, HDFS-14740.005.patch, HDFS-14740.006.patch, HDFS_Persistent_Read-Cache_Design-v1.pdf, HDFS_Persistent_Read-Cache_Test-v1.1.pdf, HDFS_Persistent_Read-Cache_Test-v1.pdf, HDFS_Persistent_Read-Cache_Test-v2.pdf
>
> In HDFS-13762, persistent memory (PM) is enabled in HDFS centralized cache management. Even though PM can persist cache data, to simplify the initial implementation the previous cache data is cleaned up when a DataNode restarts. Here, we propose to improve the HDFS PM cache by taking advantage of PM's data persistence, i.e., recovering the status of cached data, if any, when a DataNode restarts, so that cache warm-up time is saved for the user.
[jira] [Updated] (HDFS-14745) Backport HDFS persistent memory read cache support to branch-3.1
[ https://issues.apache.org/jira/browse/HDFS-14745?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Rakesh Radhakrishnan updated HDFS-14745: Issue Type: New Feature (was: Improvement) > Backport HDFS persistent memory read cache support to branch-3.1 > > > Key: HDFS-14745 > URL: https://issues.apache.org/jira/browse/HDFS-14745 > Project: Hadoop HDFS > Issue Type: New Feature >Reporter: Feilong He >Assignee: Feilong He >Priority: Major > Labels: cache, datanode > Fix For: 3.1.4 > > Attachments: HDFS-14745-branch-3.1-000.patch, > HDFS-14745-branch-3.1-001.patch, HDFS-14745-branch-3.1-002.patch, > HDFS-14745-branch-3.1-003.patch > > > We are proposing to backport the patches for HDFS-13762, HDFS persistent > memory read cache support, to branch-3.1. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Updated] (HDFS-14905) Backport HDFS persistent memory read cache support to branch-3.2
[ https://issues.apache.org/jira/browse/HDFS-14905?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Rakesh Radhakrishnan updated HDFS-14905: Issue Type: New Feature (was: Improvement) > Backport HDFS persistent memory read cache support to branch-3.2 > > > Key: HDFS-14905 > URL: https://issues.apache.org/jira/browse/HDFS-14905 > Project: Hadoop HDFS > Issue Type: New Feature > Components: caching, datanode >Reporter: Feilong He >Assignee: Feilong He >Priority: Major > Fix For: 3.2.2 > > Attachments: HDFS-14905-branch-3.2-000.patch > > -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Updated] (HDFS-14905) Backport HDFS persistent memory read cache support to branch-3.2
[ https://issues.apache.org/jira/browse/HDFS-14905?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Rakesh Radhakrishnan updated HDFS-14905:
    Fix Version/s: (was: 3.3.0)
                   3.2.2
    Hadoop Flags: Reviewed
    Release Note: Non-volatile storage class memory (SCM, also known as persistent memory) is supported in HDFS cache. To enable SCM cache, the user just needs to configure an SCM volume for the property “dfs.datanode.cache.pmem.dirs” in hdfs-site.xml. All HDFS cache directives remain unchanged. There are two implementations of the HDFS SCM cache: a pure Java implementation and a native PMDK-based implementation. The latter can give better performance for cache writes and cache reads. If the PMDK native libs can be loaded, the PMDK-based implementation is used; otherwise it falls back to the Java implementation. To enable the PMDK-based implementation, install the PMDK library by referring to the official site http://pmem.io/, then build Hadoop with PMDK support by referring to the "PMDK library build options" section of `BUILDING.txt` in the source code. If multiple SCM volumes are configured, a round-robin policy is used to select an available volume for caching a block. Consistent with the DRAM cache, the SCM cache has no cache eviction mechanism. When a DataNode receives a data read request from a client and the corresponding block is cached in SCM, the DataNode instantiates an InputStream with the block location path on SCM (pure Java implementation) or the cache address on SCM (PMDK-based implementation). Once the InputStream is created, the DataNode sends the cached data to the client. Please refer to the "Centralized Cache Management" guide for more details.
    Resolution: Fixed
    Status: Resolved (was: Patch Available)

Thanks [~PhiloHe] for the consolidated patch. +1, I have cherry-picked the following 10 commits from {{trunk}} to {{branch-3.2}}:
{code}
HDFS-14354 - 15/March/2019 ba50a36a3ead628c3d44d384f7ed4d2b3a55dd07
HDFS-14393 - 29/March/2019 f3f51284d57ef2e0c7e968b6eea56eab578f7e93
HDFS-14355 - 31/March/2019 35ff31dd9462cf4fb4ebf5556ee8ae6bcd7c5c3a
HDFS-14401 - 08/May/19 9b0aace1e6c54f201784912c0b623707aa82b761
HDFS-14402 - 29/May/19 37900c5639f8ba8d41b9fedc3d41ee0fbda7d5db
HDFS-14356 - 05/Jun/19 d1aad444907e1fc5314e8e64529e57c51ed7561c
HDFS-14458 - 15/Jul/19 e98adb00b7da8fa913b86ecf2049444b1d8617d4
HDFS-14357 - 15/Jul/19 30a8f840f1572129fe7d02f8a784c47ab57ce89a
HDFS-14700 - 09/Aug/19 f6fa865d6fcb0ef0a25a00615f16f383e5032373
HDFS-14818 - 22/Sep/19 659c88801d008bb352d10a1cb3bd0e401486cc9b
{code}

> Backport HDFS persistent memory read cache support to branch-3.2
>
> Key: HDFS-14905
> URL: https://issues.apache.org/jira/browse/HDFS-14905
> Project: Hadoop HDFS
> Issue Type: Improvement
> Components: caching, datanode
> Reporter: Feilong He
> Assignee: Feilong He
> Priority: Major
> Fix For: 3.2.2
>
> Attachments: HDFS-14905-branch-3.2-000.patch
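The release note above says that a round-robin policy selects an available volume when several SCM volumes are configured. The following is a minimal sketch of that idea only, not the Hadoop implementation. The class name, the mount paths and the `hasSpace` predicate are hypothetical stand-ins for whatever availability check the DataNode actually performs.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Predicate;

// Hypothetical illustration of round-robin volume selection; not a Hadoop class.
final class RoundRobinVolumeChooser {
    private final List<String> volumes;
    private final AtomicInteger next = new AtomicInteger();

    RoundRobinVolumeChooser(List<String> volumes) {
        if (volumes.isEmpty()) {
            throw new IllegalArgumentException("no volumes configured");
        }
        this.volumes = new ArrayList<>(volumes);
    }

    /**
     * Returns the next volume in rotation that the predicate accepts,
     * or null if a full pass over all volumes finds none.
     */
    String choose(Predicate<String> hasSpace) {
        for (int i = 0; i < volumes.size(); i++) {
            // floorMod keeps the index valid even if the counter overflows.
            int idx = Math.floorMod(next.getAndIncrement(), volumes.size());
            String candidate = volumes.get(idx);
            if (hasSpace.test(candidate)) {
                return candidate;
            }
        }
        return null;
    }

    public static void main(String[] args) {
        // Assumed mount points, as they might be listed in dfs.datanode.cache.pmem.dirs.
        RoundRobinVolumeChooser chooser =
            new RoundRobinVolumeChooser(Arrays.asList("/mnt/pmem0", "/mnt/pmem1"));
        System.out.println(chooser.choose(v -> true));
        System.out.println(chooser.choose(v -> true));
        System.out.println(chooser.choose(v -> true));
    }
}
```

Skipping a volume that fails the check, rather than failing the request, is one plausible reading of "select an available volume". The release note does not say how availability is determined.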
[jira] [Updated] (HDFS-14740) Recover data blocks from persistent memory read cache during datanode restarts
[ https://issues.apache.org/jira/browse/HDFS-14740?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Rakesh Radhakrishnan updated HDFS-14740: Summary: Recover data blocks from persistent memory read cache during datanode restarts (was: HDFS read cache persistence support) > Recover data blocks from persistent memory read cache during datanode restarts > -- > > Key: HDFS-14740 > URL: https://issues.apache.org/jira/browse/HDFS-14740 > Project: Hadoop HDFS > Issue Type: Improvement > Components: caching, datanode >Reporter: Feilong He >Assignee: Rui Mo >Priority: Major > Attachments: HDFS-14740.000.patch, HDFS-14740.001.patch, > HDFS-14740.002.patch, HDFS-14740.003.patch, HDFS-14740.004.patch, > HDFS-14740.005.patch, HDFS-14740.006.patch, > HDFS_Persistent_Read-Cache_Design-v1.pdf, > HDFS_Persistent_Read-Cache_Test-v1.pdf > > > In HDFS-13762, persistent memory (PM) is enabled in HDFS centralized cache > management. Even though PM can persist cache data, for simplifying the > initial implementation, the previous cache data will be cleaned up during > DataNode restarts. Here, we are proposing to improve HDFS PM cache by taking > advantage of PM's data persistence characteristic, i.e., recovering the > status for cached data, if any, when DataNode restarts, thus, cache warm up > time can be saved for user. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Updated] (HDFS-14745) Backport HDFS persistent memory read cache support to branch-3.1
[ https://issues.apache.org/jira/browse/HDFS-14745?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Rakesh Radhakrishnan updated HDFS-14745:
    Fix Version/s: (was: 3.3.0)
                   3.1.4
    Release Note: Non-volatile storage class memory (SCM, also known as persistent memory) is supported in HDFS cache. To enable SCM cache, the user just needs to configure an SCM volume for the property “dfs.datanode.cache.pmem.dirs” in hdfs-site.xml. All HDFS cache directives remain unchanged. There are two implementations of the HDFS SCM cache: a pure Java implementation and a native PMDK-based implementation. The latter can give better performance for cache writes and cache reads. If the PMDK native libs can be loaded, the PMDK-based implementation is used; otherwise it falls back to the Java implementation. To enable the PMDK-based implementation, install the PMDK library by referring to the official site http://pmem.io/, then build Hadoop with PMDK support by referring to the "PMDK library build options" section of `BUILDING.txt` in the source code. If multiple SCM volumes are configured, a round-robin policy is used to select an available volume for caching a block. Consistent with the DRAM cache, the SCM cache has no cache eviction mechanism. When a DataNode receives a data read request from a client and the corresponding block is cached in SCM, the DataNode instantiates an InputStream with the block location path on SCM (pure Java implementation) or the cache address on SCM (PMDK-based implementation). Once the InputStream is created, the DataNode sends the cached data to the client. Please refer to the "Centralized Cache Management" guide for more details.
    Resolution: Fixed
    Status: Resolved (was: Patch Available)

> Backport HDFS persistent memory read cache support to branch-3.1
>
> Key: HDFS-14745
> URL: https://issues.apache.org/jira/browse/HDFS-14745
> Project: Hadoop HDFS
> Issue Type: Improvement
> Reporter: Feilong He
> Assignee: Feilong He
> Priority: Major
> Labels: cache, datanode
> Fix For: 3.1.4
>
> Attachments: HDFS-14745-branch-3.1-000.patch, HDFS-14745-branch-3.1-001.patch, HDFS-14745-branch-3.1-002.patch, HDFS-14745-branch-3.1-003.patch
>
> We are proposing to backport the patches for HDFS-13762, HDFS persistent memory read cache support, to branch-3.1.
[jira] [Comment Edited] (HDFS-14745) Backport HDFS persistent memory read cache support to branch-3.1
[ https://issues.apache.org/jira/browse/HDFS-14745?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16960645#comment-16960645 ] Rakesh Radhakrishnan edited comment on HDFS-14745 at 10/27/19 5:21 PM: --- Thanks [~PhiloHe] for the consolidated patch. +1, I have cherry picked following 10 commits from {{trunk}} to {{branch-3.1}} {code:java} HDFS-14354 - 15/March/2019 ba50a36a3ead628c3d44d384f7ed4d2b3a55dd07 HDFS-14393 -29/March/2019 f3f51284d57ef2e0c7e968b6eea56eab578f7e93 HDFS-14355 - 31/March/2019 35ff31dd9462cf4fb4ebf5556ee8ae6bcd7c5c3a HDFS-14401 - 08/May/19 9b0aace1e6c54f201784912c0b623707aa82b761 HDFS-14402 - 29/May/19 37900c5639f8ba8d41b9fedc3d41ee0fbda7d5db HDFS-14356 - 05/Jun/19 d1aad444907e1fc5314e8e64529e57c51ed7561c HDFS-14458 - 15/Jul/19 e98adb00b7da8fa913b86ecf2049444b1d8617d4 HDFS-14357 - 15/Jul/19 30a8f840f1572129fe7d02f8a784c47ab57ce89a HDFS-14700 - 09/Aug/19 f6fa865d6fcb0ef0a25a00615f16f383e5032373 HDFS-14818 - 22/Sep/19 659c88801d008bb352d10a1cb3bd0e401486cc9b {code} was (Author: rakeshr): Thanks [~PhiloHe] for the consolidated patch. 
I have cherry picked following 10 commits from {{trunk}} to {{branch-3.1}} {code:java} HDFS-14354 - 15/March/2019 ba50a36a3ead628c3d44d384f7ed4d2b3a55dd07 HDFS-14393 -29/March/2019 f3f51284d57ef2e0c7e968b6eea56eab578f7e93 HDFS-14355 - 31/March/2019 35ff31dd9462cf4fb4ebf5556ee8ae6bcd7c5c3a HDFS-14401 - 08/May/19 9b0aace1e6c54f201784912c0b623707aa82b761 HDFS-14402 - 29/May/19 37900c5639f8ba8d41b9fedc3d41ee0fbda7d5db HDFS-14356 - 05/Jun/19 d1aad444907e1fc5314e8e64529e57c51ed7561c HDFS-14458 - 15/Jul/19 e98adb00b7da8fa913b86ecf2049444b1d8617d4 HDFS-14357 - 15/Jul/19 30a8f840f1572129fe7d02f8a784c47ab57ce89a HDFS-14700 - 09/Aug/19 f6fa865d6fcb0ef0a25a00615f16f383e5032373 HDFS-14818 - 22/Sep/19 659c88801d008bb352d10a1cb3bd0e401486cc9b {code} > Backport HDFS persistent memory read cache support to branch-3.1 > > > Key: HDFS-14745 > URL: https://issues.apache.org/jira/browse/HDFS-14745 > Project: Hadoop HDFS > Issue Type: Improvement >Reporter: Feilong He >Assignee: Feilong He >Priority: Major > Labels: cache, datanode > Fix For: 3.3.0 > > Attachments: HDFS-14745-branch-3.1-000.patch, > HDFS-14745-branch-3.1-001.patch, HDFS-14745-branch-3.1-002.patch, > HDFS-14745-branch-3.1-003.patch > > > We are proposing to backport the patches for HDFS-13762, HDFS persistent > memory read cache support, to branch-3.1. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-14745) Backport HDFS persistent memory read cache support to branch-3.1
[ https://issues.apache.org/jira/browse/HDFS-14745?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16960645#comment-16960645 ] Rakesh Radhakrishnan commented on HDFS-14745: - Thanks [~PhiloHe] for the consolidated patch. I have cherry picked following 10 commits from {{trunk}} to {{branch-3.1}} {code:java} HDFS-14354 - 15/March/2019 ba50a36a3ead628c3d44d384f7ed4d2b3a55dd07 HDFS-14393 -29/March/2019 f3f51284d57ef2e0c7e968b6eea56eab578f7e93 HDFS-14355 - 31/March/2019 35ff31dd9462cf4fb4ebf5556ee8ae6bcd7c5c3a HDFS-14401 - 08/May/19 9b0aace1e6c54f201784912c0b623707aa82b761 HDFS-14402 - 29/May/19 37900c5639f8ba8d41b9fedc3d41ee0fbda7d5db HDFS-14356 - 05/Jun/19 d1aad444907e1fc5314e8e64529e57c51ed7561c HDFS-14458 - 15/Jul/19 e98adb00b7da8fa913b86ecf2049444b1d8617d4 HDFS-14357 - 15/Jul/19 30a8f840f1572129fe7d02f8a784c47ab57ce89a HDFS-14700 - 09/Aug/19 f6fa865d6fcb0ef0a25a00615f16f383e5032373 HDFS-14818 - 22/Sep/19 659c88801d008bb352d10a1cb3bd0e401486cc9b {code} > Backport HDFS persistent memory read cache support to branch-3.1 > > > Key: HDFS-14745 > URL: https://issues.apache.org/jira/browse/HDFS-14745 > Project: Hadoop HDFS > Issue Type: Improvement >Reporter: Feilong He >Assignee: Feilong He >Priority: Major > Labels: cache, datanode > Fix For: 3.3.0 > > Attachments: HDFS-14745-branch-3.1-000.patch, > HDFS-14745-branch-3.1-001.patch, HDFS-14745-branch-3.1-002.patch, > HDFS-14745-branch-3.1-003.patch > > > We are proposing to backport the patches for HDFS-13762, HDFS persistent > memory read cache support, to branch-3.1. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org