[jira] [Updated] (HDFS-16014) Fix an issue in checking native pmdk lib by 'hadoop checknative' command

2021-12-14 Thread Rakesh Radhakrishnan (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-16014?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Rakesh Radhakrishnan updated HDFS-16014:

Fix Version/s: 3.4.0
   Resolution: Fixed
   Status: Resolved  (was: Patch Available)

> Fix an issue in checking native pmdk lib by 'hadoop checknative' command
> 
>
> Key: HDFS-16014
> URL: https://issues.apache.org/jira/browse/HDFS-16014
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: native
>Affects Versions: 3.4.0
>Reporter: Feilong He
>Assignee: Feilong He
>Priority: Major
>  Labels: pull-request-available
> Fix For: 3.4.0
>
> Attachments: HDFS-16014-01.patch, HDFS-16014-02.patch
>
>  Time Spent: 1h 20m
>  Remaining Estimate: 0h
>
> In HDFS-14818, we proposed a patch to support checking the native pmdk lib. 
> The intent was to show the user whether the pmdk lib has been loaded. 
> Recently, it was found that the `hadoop checknative` command still reports 
> the pmdk lib as loaded even when loading actually failed. This issue can be 
> reproduced by moving libpmem.so* from the installed path to another location, 
> or by deleting these libs, after the project is built.



--
This message was sent by Atlassian Jira
(v8.20.1#820001)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-15788) Correct the statement for pmem cache to reflect cache persistence support

2021-12-08 Thread Rakesh Radhakrishnan (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-15788?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Rakesh Radhakrishnan updated HDFS-15788:

Fix Version/s: 3.4.0
   Resolution: Fixed
   Status: Resolved  (was: Patch Available)

> Correct the statement for pmem cache to reflect cache persistence support
> -
>
> Key: HDFS-15788
> URL: https://issues.apache.org/jira/browse/HDFS-15788
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: documentation
>Affects Versions: 3.4.0
>Reporter: Feilong He
>Assignee: Feilong He
>Priority: Minor
>  Labels: pull-request-available
> Fix For: 3.4.0
>
> Attachments: HDFS-15788-01.patch, HDFS-15788-02.patch
>
>  Time Spent: 40m
>  Remaining Estimate: 0h
>
> Correct the statement for pmem cache to reflect cache persistence support.






[jira] [Created] (HDFS-16362) [FSO] Refactor isFileSystemOptimized usage in OzoneManagerUtils

2021-11-30 Thread Rakesh Radhakrishnan (Jira)
Rakesh Radhakrishnan created HDFS-16362:
---

 Summary: [FSO] Refactor isFileSystemOptimized usage in 
OzoneManagerUtils
 Key: HDFS-16362
 URL: https://issues.apache.org/jira/browse/HDFS-16362
 Project: Hadoop HDFS
  Issue Type: Bug
Reporter: Rakesh Radhakrishnan


This task is to refactor the OM request instantiation based on the 
#isFileSystemOptimized() check in the OzoneManagerUtils class.






[jira] [Commented] (HDFS-16014) Issue in checking native pmdk lib by 'hadoop checknative' command

2021-11-30 Thread Rakesh Radhakrishnan (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-16014?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17451069#comment-17451069
 ] 

Rakesh Radhakrishnan commented on HDFS-16014:
-

[~PhiloHe] could you please attach a new patch to re-run the build and get the 
latest QA report? Thanks! 

+1 patch looks good to me 

> Issue in checking native pmdk lib by 'hadoop checknative' command
> -
>
> Key: HDFS-16014
> URL: https://issues.apache.org/jira/browse/HDFS-16014
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: native
>Affects Versions: 3.4.0
>Reporter: Feilong He
>Assignee: Feilong He
>Priority: Major
> Attachments: HDFS-16014-01.patch
>
>
> In HDFS-14818, we proposed a patch to support checking the native pmdk lib. 
> The intent was to show the user whether the pmdk lib has been loaded. 
> Recently, it was found that the `hadoop checknative` command still reports 
> the pmdk lib as loaded even when loading actually failed. This issue can be 
> reproduced by moving libpmem.so* from the installed path to another location, 
> or by deleting these libs, after the project is built.






[jira] [Commented] (HDFS-15788) Correct the statement for pmem cache to reflect cache persistence support

2021-11-29 Thread Rakesh Radhakrishnan (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-15788?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17450279#comment-17450279
 ] 

Rakesh Radhakrishnan commented on HDFS-15788:
-

+1 LGTM, thanks [~PhiloHe] for the contribution.

> Correct the statement for pmem cache to reflect cache persistence support
> -
>
> Key: HDFS-15788
> URL: https://issues.apache.org/jira/browse/HDFS-15788
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: documentation
>Affects Versions: 3.4.0
>Reporter: Feilong He
>Assignee: Feilong He
>Priority: Minor
> Attachments: HDFS-15788-01.patch, HDFS-15788-02.patch
>
>
> Correct the statement for pmem cache to reflect cache persistence support.






[jira] [Resolved] (HDFS-15253) Set default throttle value on dfs.image.transfer.bandwidthPerSec

2020-10-07 Thread Rakesh Radhakrishnan (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-15253?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Rakesh Radhakrishnan resolved HDFS-15253.
-
Fix Version/s: 3.4.0
   Resolution: Fixed

> Set default throttle value on dfs.image.transfer.bandwidthPerSec
> 
>
> Key: HDFS-15253
> URL: https://issues.apache.org/jira/browse/HDFS-15253
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: namenode
>Reporter: Karthik Palanisamy
>Assignee: Karthik Palanisamy
>Priority: Major
>  Labels: pull-request-available
> Fix For: 3.4.0
>
>  Time Spent: 40m
>  Remaining Estimate: 0h
>
> The default value of dfs.image.transfer.bandwidthPerSec is 0, so fsimage 
> transfers during checkpoint can use the maximum available bandwidth. I think 
> we should throttle this. Many users have experienced namenode failover when 
> transferring a large image (e.g. >25 GB) along with fsimage replication on 
> dfs.namenode.name.dir.
> Suggested settings:
> dfs.image.transfer.bandwidthPerSec=52428800 (50 MB/s)
> dfs.namenode.checkpoint.txns=200 (Default is 1M; good to avoid frequent 
> checkpoints. However, by default a checkpoint runs once every 6 hours)
>  
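The arithmetic behind the proposed throttle can be checked with a short sketch. It assumes binary units (1 MB = 1024 * 1024 bytes), which is what makes 50 MB/s equal 52428800, and a 25 GB image for illustration. The class name is hypothetical.

```java
// Hypothetical helper for illustration; not part of Hadoop.
class ImageTransferThrottle {
    /** Converts MB/s to bytes/s, assuming 1 MB = 1024 * 1024 bytes. */
    static long mbPerSecToBytes(long mbPerSec) {
        return mbPerSec * 1024L * 1024L;
    }

    /** Seconds needed to move imageBytes at bytesPerSec, rounded up. */
    static long transferSeconds(long imageBytes, long bytesPerSec) {
        return (imageBytes + bytesPerSec - 1) / bytesPerSec;
    }

    public static void main(String[] args) {
        long rate = mbPerSecToBytes(50);            // proposed throttle
        long image = 25L * 1024 * 1024 * 1024;      // assumed 25 GB fsimage
        System.out.println("dfs.image.transfer.bandwidthPerSec=" + rate);
        System.out.println("transfer time (s) = " + transferSeconds(image, rate));
    }
}
```

Under these assumptions a 25 GB image takes 512 seconds, roughly eight and a half minutes, at the throttled rate.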



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-15080) Fix the issue in reading persistent memory cached data with an offset

2020-01-08 Thread Rakesh Radhakrishnan (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-15080?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Rakesh Radhakrishnan updated HDFS-15080:

Resolution: Fixed
Status: Resolved  (was: Patch Available)

Committed the patch to {{trunk}}, {{branch-3.2}} and {{branch-3.1}} branches.

Thanks [~PhiloHe] for the contribution.

> Fix the issue in reading persistent memory cached data with an offset
> -
>
> Key: HDFS-15080
> URL: https://issues.apache.org/jira/browse/HDFS-15080
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: caching, datanode
>Reporter: Feilong He
>Assignee: Feilong He
>Priority: Major
> Fix For: 3.3.0, 3.1.4, 3.2.2
>
> Attachments: HDFS-15080-000.patch, HDFS-15080-branch-3.1-000.patch, 
> HDFS-15080-branch-3.2-000.patch
>
>
> Some applications read a segment of the pmem cache starting at a specified 
> offset. The previous implementation of pmem cache read with DirectByteBuffer 
> didn't cover this situation.
> Let me explain further. In our test, we used Spark SQL to run some TPC-DS 
> workloads that read the cached data, and hit a read exception. This was due 
> to the missing seek offset argument, which Spark SQL uses to read data packet 
> by packet.
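Reading from a direct buffer at a caller-supplied offset can be sketched as follows. This is a minimal illustration using only `java.nio.ByteBuffer`; the `OffsetRead` class and `readAt` method are hypothetical and do not reproduce the actual patch.

```java
import java.nio.ByteBuffer;

// Hypothetical helper for illustration; not the code from the patch.
class OffsetRead {
    /**
     * Copies up to dst.length bytes from cache, starting at offset.
     * Returns the number of bytes copied, or -1 if offset is at or past the end.
     */
    static int readAt(ByteBuffer cache, long offset, byte[] dst) {
        if (offset < 0) {
            throw new IllegalArgumentException("negative offset");
        }
        if (offset >= cache.limit()) {
            return -1;
        }
        // duplicate() shares the content but has its own position and limit,
        // so concurrent readers do not disturb each other.
        ByteBuffer view = cache.duplicate();
        view.position((int) offset);
        int n = Math.min(dst.length, view.remaining());
        view.get(dst, 0, n);
        return n;
    }

    public static void main(String[] args) {
        ByteBuffer cache = ByteBuffer.allocateDirect(8);
        for (int i = 0; i < 8; i++) {
            cache.put((byte) i);
        }
        cache.flip();
        byte[] packet = new byte[4];
        int n = readAt(cache, 5, packet);
        System.out.println("read " + n + " bytes, first = " + packet[0]);
    }
}
```

Ignoring the offset and always reading from position 0 would return the wrong bytes to a reader that consumes the block packet by packet.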






[jira] [Updated] (HDFS-15080) Fix the issue in reading persistent memory cached data with an offset

2020-01-08 Thread Rakesh Radhakrishnan (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-15080?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Rakesh Radhakrishnan updated HDFS-15080:

Summary: Fix the issue in reading persistent memory cached data with an 
offset  (was: Fix the issue in reading persistent memory cache with an offset)

> Fix the issue in reading persistent memory cached data with an offset
> -
>
> Key: HDFS-15080
> URL: https://issues.apache.org/jira/browse/HDFS-15080
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: caching, datanode
>Reporter: Feilong He
>Assignee: Feilong He
>Priority: Major
> Fix For: 3.3.0, 3.1.4, 3.2.2
>
> Attachments: HDFS-15080-000.patch, HDFS-15080-branch-3.1-000.patch, 
> HDFS-15080-branch-3.2-000.patch
>
>
> Some applications read a segment of the pmem cache starting at a specified 
> offset. The previous implementation of pmem cache read with DirectByteBuffer 
> didn't cover this situation.
> Let me explain further. In our test, we used Spark SQL to run some TPC-DS 
> workloads that read the cached data, and hit a read exception. This was due 
> to the missing seek offset argument, which Spark SQL uses to read data packet 
> by packet.






[jira] [Commented] (HDFS-15080) Fix the issue in reading persistent memory cache with an offset

2020-01-08 Thread Rakesh Radhakrishnan (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-15080?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17010479#comment-17010479
 ] 

Rakesh Radhakrishnan commented on HDFS-15080:
-

Thanks [~PhiloHe], it's a good finding. +1 patch looks good to me. I will commit 
this shortly to the branches.
Since this is related to the Pmem native bindings, a unit test case is excluded 
for this change.

> Fix the issue in reading persistent memory cache with an offset
> ---
>
> Key: HDFS-15080
> URL: https://issues.apache.org/jira/browse/HDFS-15080
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: caching, datanode
>Reporter: Feilong He
>Assignee: Feilong He
>Priority: Major
> Fix For: 3.3.0, 3.1.4, 3.2.2
>
> Attachments: HDFS-15080-000.patch, HDFS-15080-branch-3.1-000.patch, 
> HDFS-15080-branch-3.2-000.patch
>
>
> Some applications read a segment of the pmem cache starting at a specified 
> offset. The previous implementation of pmem cache read with DirectByteBuffer 
> didn't cover this situation.
> Let me explain further. In our test, we used Spark SQL to run some TPC-DS 
> workloads that read the cached data, and hit a read exception. This was due 
> to the missing seek offset argument, which Spark SQL uses to read data packet 
> by packet.






[jira] [Updated] (HDFS-14740) Recover data blocks from persistent memory read cache during datanode restarts

2020-01-01 Thread Rakesh Radhakrishnan (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-14740?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Rakesh Radhakrishnan updated HDFS-14740:

Fix Version/s: 3.2.2
   3.1.4
   3.3.0
   Resolution: Fixed
   Status: Resolved  (was: Patch Available)

I have committed the latest patch to the respective branches: trunk, branch-3.2 
and branch-3.1.

Thanks [~PhiloHe] and [~Rui Mo] for the contribution.



> Recover data blocks from persistent memory read cache during datanode restarts
> --
>
> Key: HDFS-14740
> URL: https://issues.apache.org/jira/browse/HDFS-14740
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: caching, datanode
>Reporter: Feilong He
>Assignee: Feilong He
>Priority: Major
> Fix For: 3.3.0, 3.1.4, 3.2.2
>
> Attachments: HDFS-14740-branch-3.1-000.patch, 
> HDFS-14740-branch-3.1-001.patch, HDFS-14740-branch-3.2-000.patch, 
> HDFS-14740-branch-3.2-001.patch, HDFS-14740.000.patch, HDFS-14740.001.patch, 
> HDFS-14740.002.patch, HDFS-14740.003.patch, HDFS-14740.004.patch, 
> HDFS-14740.005.patch, HDFS-14740.006.patch, HDFS-14740.007.patch, 
> HDFS-14740.008.patch, HDFS-14740.009.patch, 
> HDFS_Persistent_Read-Cache_Design-v1.pdf, 
> HDFS_Persistent_Read-Cache_Test-v1.1.pdf, 
> HDFS_Persistent_Read-Cache_Test-v1.pdf, HDFS_Persistent_Read-Cache_Test-v2.pdf
>
>
> In HDFS-13762, persistent memory (PM) is enabled in HDFS centralized cache 
> management. Even though PM can persist cache data, the previous cache data 
> is cleaned up during DataNode restarts to simplify the initial 
> implementation. Here, we propose to improve the HDFS PM cache by taking 
> advantage of PM's data persistence, i.e., recovering the status of cached 
> data, if any, when the DataNode restarts, which saves the user cache warm-up 
> time.






[jira] [Commented] (HDFS-14740) Recover data blocks from persistent memory read cache during datanode restarts

2020-01-01 Thread Rakesh Radhakrishnan (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-14740?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17006607#comment-17006607
 ] 

Rakesh Radhakrishnan commented on HDFS-14740:
-

Thanks [~PhiloHe] for the patch. +1 looks good to me. I will commit it shortly.

> Recover data blocks from persistent memory read cache during datanode restarts
> --
>
> Key: HDFS-14740
> URL: https://issues.apache.org/jira/browse/HDFS-14740
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: caching, datanode
>Reporter: Feilong He
>Assignee: Feilong He
>Priority: Major
> Attachments: HDFS-14740-branch-3.1-000.patch, 
> HDFS-14740-branch-3.1-001.patch, HDFS-14740-branch-3.2-000.patch, 
> HDFS-14740-branch-3.2-001.patch, HDFS-14740.000.patch, HDFS-14740.001.patch, 
> HDFS-14740.002.patch, HDFS-14740.003.patch, HDFS-14740.004.patch, 
> HDFS-14740.005.patch, HDFS-14740.006.patch, HDFS-14740.007.patch, 
> HDFS-14740.008.patch, HDFS-14740.009.patch, 
> HDFS_Persistent_Read-Cache_Design-v1.pdf, 
> HDFS_Persistent_Read-Cache_Test-v1.1.pdf, 
> HDFS_Persistent_Read-Cache_Test-v1.pdf, HDFS_Persistent_Read-Cache_Test-v2.pdf
>
>
> In HDFS-13762, persistent memory (PM) is enabled in HDFS centralized cache 
> management. Even though PM can persist cache data, the previous cache data 
> is cleaned up during DataNode restarts to simplify the initial 
> implementation. Here, we propose to improve the HDFS PM cache by taking 
> advantage of PM's data persistence, i.e., recovering the status of cached 
> data, if any, when the DataNode restarts, which saves the user cache warm-up 
> time.






[jira] [Commented] (HDFS-14740) Recover data blocks from persistent memory read cache during datanode restarts

2019-12-06 Thread Rakesh Radhakrishnan (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-14740?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16989600#comment-16989600
 ] 

Rakesh Radhakrishnan commented on HDFS-14740:
-

Thanks [~PhiloHe] for the updates.

How about keeping the two pmem related configs with matching names, like below:

{{'dfs.datanode.pmem.cache.restore'}} and {{'dfs.datanode.pmem.cache.dirs'}} ?

> Recover data blocks from persistent memory read cache during datanode restarts
> --
>
> Key: HDFS-14740
> URL: https://issues.apache.org/jira/browse/HDFS-14740
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: caching, datanode
>Reporter: Feilong He
>Assignee: Feilong He
>Priority: Major
> Attachments: HDFS-14740.000.patch, HDFS-14740.001.patch, 
> HDFS-14740.002.patch, HDFS-14740.003.patch, HDFS-14740.004.patch, 
> HDFS-14740.005.patch, HDFS-14740.006.patch, HDFS-14740.007.patch, 
> HDFS_Persistent_Read-Cache_Design-v1.pdf, 
> HDFS_Persistent_Read-Cache_Test-v1.1.pdf, 
> HDFS_Persistent_Read-Cache_Test-v1.pdf, HDFS_Persistent_Read-Cache_Test-v2.pdf
>
>
> In HDFS-13762, persistent memory (PM) is enabled in HDFS centralized cache 
> management. Even though PM can persist cache data, the previous cache data 
> is cleaned up during DataNode restarts to simplify the initial 
> implementation. Here, we propose to improve the HDFS PM cache by taking 
> advantage of PM's data persistence, i.e., recovering the status of cached 
> data, if any, when the DataNode restarts, which saves the user cache warm-up 
> time.






[jira] [Commented] (HDFS-14740) Recover data blocks from persistent memory read cache during datanode restarts

2019-11-18 Thread Rakesh Radhakrishnan (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-14740?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16976700#comment-16976700
 ] 

Rakesh Radhakrishnan commented on HDFS-14740:
-

Thanks [~Rui Mo] for the test result. Adding a few comments; please address 
them. Apart from the comments below, the patch looks good to me.

+Comment-1)+ Blocks will be persisted into the PMem device irrespective of the 
config '{{dfs.datanode.cache.persistence.enabled}}' value, either true or 
false. The purpose of this flag is to restore the block cache in the 
datanode process using the physical blocks present in the PMem device. In that 
sense, how about renaming the config to 
'{{dfs.datanode.cache.pmem.block.restore}}' or some other better name 
reflecting the behavior?

+Comment-2)+ If not yet tested, could you please capture the following 
scenario in your test sheet?
 *scenario:* User has cached {{file A}}. Now, the admin has restarted the 
datanode with the above flag set to {{false}}. Assume the user then submits a 
cache directive command to cache the same {{file A}}.

 

> Recover data blocks from persistent memory read cache during datanode restarts
> --
>
> Key: HDFS-14740
> URL: https://issues.apache.org/jira/browse/HDFS-14740
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: caching, datanode
>Reporter: Feilong He
>Assignee: Rui Mo
>Priority: Major
> Attachments: HDFS-14740.000.patch, HDFS-14740.001.patch, 
> HDFS-14740.002.patch, HDFS-14740.003.patch, HDFS-14740.004.patch, 
> HDFS-14740.005.patch, HDFS-14740.006.patch, 
> HDFS_Persistent_Read-Cache_Design-v1.pdf, 
> HDFS_Persistent_Read-Cache_Test-v1.1.pdf, 
> HDFS_Persistent_Read-Cache_Test-v1.pdf, HDFS_Persistent_Read-Cache_Test-v2.pdf
>
>
> In HDFS-13762, persistent memory (PM) is enabled in HDFS centralized cache 
> management. Even though PM can persist cache data, the previous cache data 
> is cleaned up during DataNode restarts to simplify the initial 
> implementation. Here, we propose to improve the HDFS PM cache by taking 
> advantage of PM's data persistence, i.e., recovering the status of cached 
> data, if any, when the DataNode restarts, which saves the user cache warm-up 
> time.






[jira] [Updated] (HDFS-14745) Backport HDFS persistent memory read cache support to branch-3.1

2019-10-27 Thread Rakesh Radhakrishnan (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-14745?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Rakesh Radhakrishnan updated HDFS-14745:

Issue Type: New Feature  (was: Improvement)

> Backport HDFS persistent memory read cache support to branch-3.1
> 
>
> Key: HDFS-14745
> URL: https://issues.apache.org/jira/browse/HDFS-14745
> Project: Hadoop HDFS
>  Issue Type: New Feature
>Reporter: Feilong He
>Assignee: Feilong He
>Priority: Major
>  Labels: cache, datanode
> Fix For: 3.1.4
>
> Attachments: HDFS-14745-branch-3.1-000.patch, 
> HDFS-14745-branch-3.1-001.patch, HDFS-14745-branch-3.1-002.patch, 
> HDFS-14745-branch-3.1-003.patch
>
>
> We are proposing to backport the patches for HDFS-13762, HDFS persistent 
> memory read cache support, to branch-3.1.






[jira] [Updated] (HDFS-14905) Backport HDFS persistent memory read cache support to branch-3.2

2019-10-27 Thread Rakesh Radhakrishnan (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-14905?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Rakesh Radhakrishnan updated HDFS-14905:

Issue Type: New Feature  (was: Improvement)

> Backport HDFS persistent memory read cache support to branch-3.2
> 
>
> Key: HDFS-14905
> URL: https://issues.apache.org/jira/browse/HDFS-14905
> Project: Hadoop HDFS
>  Issue Type: New Feature
>  Components: caching, datanode
>Reporter: Feilong He
>Assignee: Feilong He
>Priority: Major
> Fix For: 3.2.2
>
> Attachments: HDFS-14905-branch-3.2-000.patch
>
>







[jira] [Updated] (HDFS-14905) Backport HDFS persistent memory read cache support to branch-3.2

2019-10-27 Thread Rakesh Radhakrishnan (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-14905?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Rakesh Radhakrishnan updated HDFS-14905:

Fix Version/s: (was: 3.3.0)
   3.2.2
 Hadoop Flags: Reviewed
 Release Note: Non-volatile storage class memory (SCM, also known as 
persistent memory) is supported in HDFS cache. To enable SCM cache, the user 
just needs to configure the SCM volume for the property 
“dfs.datanode.cache.pmem.dirs” in hdfs-site.xml. All HDFS cache directives 
remain unchanged. There are two implementations of HDFS SCM cache: one is a 
pure Java implementation and the other is a native PMDK based implementation. 
The latter can give the user better performance in cache write and cache 
read. If the PMDK native libs can be loaded, the PMDK based implementation is 
used; otherwise it falls back to the Java implementation. To enable the PMDK 
based implementation, the user should install the PMDK library by referring 
to the official site http://pmem.io/, then build Hadoop with PMDK support by 
referring to the "PMDK library build options" section in `BUILDING.txt` in the 
source code. If multiple SCM volumes are configured, a round-robin policy is 
used to select an available volume for caching a block. Consistent with DRAM 
cache, SCM cache also has no cache eviction mechanism. When the DataNode 
receives a data read request from a client, if the corresponding block is 
cached in SCM, the DataNode will instantiate an InputStream with the block 
location path on SCM (pure Java implementation) or the cache address on SCM 
(PMDK based implementation). Once the InputStream is created, the DataNode 
will send the cached data to the client. Please refer to the "Centralized 
Cache Management" guide for more details.
   Resolution: Fixed
   Status: Resolved  (was: Patch Available)
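
The round-robin volume selection mentioned in the release note can be sketched as below. This is only an illustration under stated assumptions: the `RoundRobinVolumes` class, the availability predicate, and the `/mnt/pmem*` paths are hypothetical and are not taken from the HDFS source.

```java
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Predicate;

// Hypothetical sketch of a round-robin policy; not the HDFS implementation.
class RoundRobinVolumes {
    private final List<String> volumes;
    private final AtomicInteger next = new AtomicInteger();

    RoundRobinVolumes(List<String> volumes) {
        if (volumes.isEmpty()) {
            throw new IllegalArgumentException("no volumes configured");
        }
        this.volumes = volumes;
    }

    /** Returns the next volume accepted by isAvailable, or null if none is. */
    String select(Predicate<String> isAvailable) {
        // Try each volume at most once, continuing from the last position.
        for (int i = 0; i < volumes.size(); i++) {
            int idx = Math.floorMod(next.getAndIncrement(), volumes.size());
            String volume = volumes.get(idx);
            if (isAvailable.test(volume)) {
                return volume;
            }
        }
        return null;
    }

    public static void main(String[] args) {
        RoundRobinVolumes rr =
                new RoundRobinVolumes(Arrays.asList("/mnt/pmem0", "/mnt/pmem1"));
        for (int i = 0; i < 3; i++) {
            System.out.println(rr.select(v -> true));
        }
    }
}
```

In a real DataNode the predicate would presumably check remaining capacity on the volume for the block being cached; here it is left to the caller.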

Thanks [~PhiloHe] for the consolidated patch. +1, I have cherry-picked the 
following 10 commits from {{trunk}} to {{branch-3.2}}:

{code}
HDFS-14354 - 15/March/2019
ba50a36a3ead628c3d44d384f7ed4d2b3a55dd07

HDFS-14393 - 29/March/2019
f3f51284d57ef2e0c7e968b6eea56eab578f7e93

HDFS-14355 - 31/March/2019
35ff31dd9462cf4fb4ebf5556ee8ae6bcd7c5c3a

HDFS-14401 - 08/May/19
9b0aace1e6c54f201784912c0b623707aa82b761

HDFS-14402 - 29/May/19
37900c5639f8ba8d41b9fedc3d41ee0fbda7d5db

HDFS-14356 - 05/Jun/19
d1aad444907e1fc5314e8e64529e57c51ed7561c

HDFS-14458 - 15/Jul/19
e98adb00b7da8fa913b86ecf2049444b1d8617d4

HDFS-14357 - 15/Jul/19
30a8f840f1572129fe7d02f8a784c47ab57ce89a

HDFS-14700 - 09/Aug/19
f6fa865d6fcb0ef0a25a00615f16f383e5032373

HDFS-14818 - 22/Sep/19
659c88801d008bb352d10a1cb3bd0e401486cc9b
{code}

> Backport HDFS persistent memory read cache support to branch-3.2
> 
>
> Key: HDFS-14905
> URL: https://issues.apache.org/jira/browse/HDFS-14905
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: caching, datanode
>Reporter: Feilong He
>Assignee: Feilong He
>Priority: Major
> Fix For: 3.2.2
>
> Attachments: HDFS-14905-branch-3.2-000.patch
>
>







[jira] [Updated] (HDFS-14740) Recover data blocks from persistent memory read cache during datanode restarts

2019-10-27 Thread Rakesh Radhakrishnan (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-14740?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Rakesh Radhakrishnan updated HDFS-14740:

Summary: Recover data blocks from persistent memory read cache during 
datanode restarts  (was: HDFS read cache persistence support)

> Recover data blocks from persistent memory read cache during datanode restarts
> --
>
> Key: HDFS-14740
> URL: https://issues.apache.org/jira/browse/HDFS-14740
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: caching, datanode
>Reporter: Feilong He
>Assignee: Rui Mo
>Priority: Major
> Attachments: HDFS-14740.000.patch, HDFS-14740.001.patch, 
> HDFS-14740.002.patch, HDFS-14740.003.patch, HDFS-14740.004.patch, 
> HDFS-14740.005.patch, HDFS-14740.006.patch, 
> HDFS_Persistent_Read-Cache_Design-v1.pdf, 
> HDFS_Persistent_Read-Cache_Test-v1.pdf
>
>
> In HDFS-13762, persistent memory (PM) is enabled in HDFS centralized cache 
> management. Even though PM can persist cache data, the previous cache data 
> is cleaned up during DataNode restarts to simplify the initial 
> implementation. Here, we propose to improve the HDFS PM cache by taking 
> advantage of PM's data persistence, i.e., recovering the status of cached 
> data, if any, when the DataNode restarts, which saves the user cache warm-up 
> time.






[jira] [Updated] (HDFS-14745) Backport HDFS persistent memory read cache support to branch-3.1

2019-10-27 Thread Rakesh Radhakrishnan (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-14745?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Rakesh Radhakrishnan updated HDFS-14745:

Fix Version/s: (was: 3.3.0)
   3.1.4
 Release Note: Non-volatile storage class memory (SCM, also known as 
persistent memory) is supported in HDFS cache. To enable SCM cache, the user 
just needs to configure the SCM volume for the property 
“dfs.datanode.cache.pmem.dirs” in hdfs-site.xml. All HDFS cache directives 
remain unchanged. There are two implementations of HDFS SCM cache: one is a 
pure Java implementation and the other is a native PMDK based implementation. 
The latter can give the user better performance in cache write and cache 
read. If the PMDK native libs can be loaded, the PMDK based implementation is 
used; otherwise it falls back to the Java implementation. To enable the PMDK 
based implementation, the user should install the PMDK library by referring 
to the official site http://pmem.io/, then build Hadoop with PMDK support by 
referring to the "PMDK library build options" section in `BUILDING.txt` in the 
source code. If multiple SCM volumes are configured, a round-robin policy is 
used to select an available volume for caching a block. Consistent with DRAM 
cache, SCM cache also has no cache eviction mechanism. When the DataNode 
receives a data read request from a client, if the corresponding block is 
cached in SCM, the DataNode will instantiate an InputStream with the block 
location path on SCM (pure Java implementation) or the cache address on SCM 
(PMDK based implementation). Once the InputStream is created, the DataNode 
will send the cached data to the client. Please refer to the "Centralized 
Cache Management" guide for more details.
   Resolution: Fixed
   Status: Resolved  (was: Patch Available)

> Backport HDFS persistent memory read cache support to branch-3.1
> 
>
> Key: HDFS-14745
> URL: https://issues.apache.org/jira/browse/HDFS-14745
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Reporter: Feilong He
>Assignee: Feilong He
>Priority: Major
>  Labels: cache, datanode
> Fix For: 3.1.4
>
> Attachments: HDFS-14745-branch-3.1-000.patch, 
> HDFS-14745-branch-3.1-001.patch, HDFS-14745-branch-3.1-002.patch, 
> HDFS-14745-branch-3.1-003.patch
>
>
> We are proposing to backport the patches for HDFS-13762, HDFS persistent 
> memory read cache support, to branch-3.1.






[jira] [Comment Edited] (HDFS-14745) Backport HDFS persistent memory read cache support to branch-3.1

2019-10-27 Thread Rakesh Radhakrishnan (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-14745?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16960645#comment-16960645
 ] 

Rakesh Radhakrishnan edited comment on HDFS-14745 at 10/27/19 5:21 PM:
---

Thanks [~PhiloHe] for the consolidated patch. +1, I have cherry-picked the 
following 10 commits from {{trunk}} to {{branch-3.1}}:
{code:java}
HDFS-14354 - 15/March/2019
ba50a36a3ead628c3d44d384f7ed4d2b3a55dd07

HDFS-14393 - 29/March/2019
f3f51284d57ef2e0c7e968b6eea56eab578f7e93

HDFS-14355 - 31/March/2019
35ff31dd9462cf4fb4ebf5556ee8ae6bcd7c5c3a

HDFS-14401 - 08/May/19
9b0aace1e6c54f201784912c0b623707aa82b761

HDFS-14402 - 29/May/19
37900c5639f8ba8d41b9fedc3d41ee0fbda7d5db

HDFS-14356 - 05/Jun/19
d1aad444907e1fc5314e8e64529e57c51ed7561c

HDFS-14458 - 15/Jul/19
e98adb00b7da8fa913b86ecf2049444b1d8617d4

HDFS-14357 - 15/Jul/19
30a8f840f1572129fe7d02f8a784c47ab57ce89a

HDFS-14700 - 09/Aug/19
f6fa865d6fcb0ef0a25a00615f16f383e5032373

HDFS-14818 - 22/Sep/19
659c88801d008bb352d10a1cb3bd0e401486cc9b
{code}


was (Author: rakeshr):
Thanks [~PhiloHe] for the consolidated patch. I have cherry-picked the following 
10 commits from {{trunk}} to {{branch-3.1}}:
{code:java}
HDFS-14354 - 15/March/2019
ba50a36a3ead628c3d44d384f7ed4d2b3a55dd07

HDFS-14393 - 29/March/2019
f3f51284d57ef2e0c7e968b6eea56eab578f7e93

HDFS-14355 - 31/March/2019
35ff31dd9462cf4fb4ebf5556ee8ae6bcd7c5c3a

HDFS-14401 - 08/May/19
9b0aace1e6c54f201784912c0b623707aa82b761

HDFS-14402 - 29/May/19
37900c5639f8ba8d41b9fedc3d41ee0fbda7d5db

HDFS-14356 - 05/Jun/19
d1aad444907e1fc5314e8e64529e57c51ed7561c

HDFS-14458 - 15/Jul/19
e98adb00b7da8fa913b86ecf2049444b1d8617d4

HDFS-14357 - 15/Jul/19
30a8f840f1572129fe7d02f8a784c47ab57ce89a

HDFS-14700 - 09/Aug/19
f6fa865d6fcb0ef0a25a00615f16f383e5032373

HDFS-14818 - 22/Sep/19
659c88801d008bb352d10a1cb3bd0e401486cc9b
{code}

> Backport HDFS persistent memory read cache support to branch-3.1
> 
>
> Key: HDFS-14745
> URL: https://issues.apache.org/jira/browse/HDFS-14745
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Reporter: Feilong He
>Assignee: Feilong He
>Priority: Major
>  Labels: cache, datanode
> Fix For: 3.3.0
>
> Attachments: HDFS-14745-branch-3.1-000.patch, 
> HDFS-14745-branch-3.1-001.patch, HDFS-14745-branch-3.1-002.patch, 
> HDFS-14745-branch-3.1-003.patch
>
>
> We are proposing to backport the patches for HDFS-13762, HDFS persistent 
> memory read cache support, to branch-3.1.






[jira] [Commented] (HDFS-14745) Backport HDFS persistent memory read cache support to branch-3.1

2019-10-27 Thread Rakesh Radhakrishnan (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-14745?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16960645#comment-16960645
 ] 

Rakesh Radhakrishnan commented on HDFS-14745:
-

Thanks [~PhiloHe] for the consolidated patch. I have cherry-picked the following 
10 commits from {{trunk}} to {{branch-3.1}}:
{code:java}
HDFS-14354 - 15/March/2019
ba50a36a3ead628c3d44d384f7ed4d2b3a55dd07

HDFS-14393 - 29/March/2019
f3f51284d57ef2e0c7e968b6eea56eab578f7e93

HDFS-14355 - 31/March/2019
35ff31dd9462cf4fb4ebf5556ee8ae6bcd7c5c3a

HDFS-14401 - 08/May/19
9b0aace1e6c54f201784912c0b623707aa82b761

HDFS-14402 - 29/May/19
37900c5639f8ba8d41b9fedc3d41ee0fbda7d5db

HDFS-14356 - 05/Jun/19
d1aad444907e1fc5314e8e64529e57c51ed7561c

HDFS-14458 - 15/Jul/19
e98adb00b7da8fa913b86ecf2049444b1d8617d4

HDFS-14357 - 15/Jul/19
30a8f840f1572129fe7d02f8a784c47ab57ce89a

HDFS-14700 - 09/Aug/19
f6fa865d6fcb0ef0a25a00615f16f383e5032373

HDFS-14818 - 22/Sep/19
659c88801d008bb352d10a1cb3bd0e401486cc9b
{code}

> Backport HDFS persistent memory read cache support to branch-3.1
> 
>
> Key: HDFS-14745
> URL: https://issues.apache.org/jira/browse/HDFS-14745
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Reporter: Feilong He
>Assignee: Feilong He
>Priority: Major
>  Labels: cache, datanode
> Fix For: 3.3.0
>
> Attachments: HDFS-14745-branch-3.1-000.patch, 
> HDFS-14745-branch-3.1-001.patch, HDFS-14745-branch-3.1-002.patch, 
> HDFS-14745-branch-3.1-003.patch
>
>
> We are proposing to backport the patches for HDFS-13762, HDFS persistent 
> memory read cache support, to branch-3.1.


