[ https://issues.apache.org/jira/browse/HDFS-5202?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Chris Nauroth updated HDFS-5202:
--------------------------------
Attachment: HDFS-5202.1.patch
The attached patch gets DataNode caching working on Windows. This mixes
changes in Common and HDFS. I can spin off a separate HADOOP jira for the
Common changes after this gets reviewed and approved.
This is actually pretty simple stuff. We just need to swap out the POSIX
syscalls for their Windows equivalents. The relevant Windows API calls are:
* [VirtualLock|http://msdn.microsoft.com/en-us/library/windows/desktop/aa366895(v=vs.85).aspx]
* [GetCurrentProcess|http://msdn.microsoft.com/en-us/library/windows/desktop/ms683179(v=vs.85).aspx]
* [GetProcessWorkingSetSize|http://msdn.microsoft.com/en-us/library/windows/desktop/ms683226(v=vs.85).aspx]
* [SetProcessWorkingSetSizeEx|http://msdn.microsoft.com/en-us/library/windows/desktop/ms686237(v=vs.85).aspx]
Summary of changes:
# {{NativeIO}}: I added {{extendWorkingSetSize}}, which is a new Windows-only
JNI method that extends the minimum and maximum working set size of a Windows
process. Ultimately, this is what governs how much memory a Windows process is
allowed to lock. Full details are in the MSDN links above. I also implemented
{{mlock_1native}} to call {{VirtualLock}} on Windows.
# {{hdfs.cmd}}: I added cacheadmin to the supported commands on Windows.
# {{DataNode}}: Windows does not have a direct equivalent of {{ulimit -l}}.
Instead of looking for a ulimit and enforcing that our configuration doesn't
exceed it, we attempt to extend the working set size when running on Windows.
# {{CentralizedCacheManagement.apt.vm}}: I updated the documentation with a few
clarifications about how it works on Windows.
# {{TestFsDatasetCache}}: We no longer need to skip this test suite on Windows.
The tests had a few file descriptor leaks that caused test failures on
Windows, so I fixed those.
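To illustrate the {{DataNode}} item above, here is a minimal sketch of the platform branch as I read it from the summary. It is not the code in the patch: the {{LockedMemorySetup}} class, the {{NativeMemory}} interface, its method signatures, and the idea of passing the configured byte count as the amount to extend by are all assumptions made for illustration.

```java
import java.io.IOException;

public class LockedMemorySetup {
    /** Hypothetical stand-in for the native layer; not the actual NativeIO API. */
    public interface NativeMemory {
        /** POSIX: the locked-memory limit in bytes (what ulimit -l reports). */
        long getMemlockLimit();

        /** Windows: grow the process working set by the given number of bytes. */
        void extendWorkingSetSize(long delta) throws IOException;
    }

    /**
     * Prepares the process for locking up to maxLockedBytes of cached block data.
     * On Windows there is no ulimit to check, so we try to extend the working set.
     * Elsewhere we verify that the configured value does not exceed the ulimit.
     */
    public static void prepare(boolean isWindows, long maxLockedBytes,
                               NativeMemory nativeMemory) throws IOException {
        if (maxLockedBytes <= 0) {
            return; // caching is disabled, nothing to reserve or validate
        }
        if (isWindows) {
            // Assumption: extend by the full configured amount.
            nativeMemory.extendWorkingSetSize(maxLockedBytes);
        } else {
            long limit = nativeMemory.getMemlockLimit();
            if (maxLockedBytes > limit) {
                throw new IOException("Configured locked memory of " + maxLockedBytes
                    + " bytes exceeds the ulimit of " + limit + " bytes");
            }
        }
    }
}
```

Hiding the native calls behind an interface keeps the branch testable on any platform; the real implementation would delegate to the JNI methods described in the {{NativeIO}} item.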
In addition to running the JUnit tests, I ran manual tests. I used
Sysinternals VMMap to confirm that the block files were getting memory-mapped
and locked into the virtual address space of the DataNode JVM process.
http://technet.microsoft.com/en-us/sysinternals/dd535533.aspx
> Support Centralized Cache Management on Windows.
> ------------------------------------------------
>
> Key: HDFS-5202
> URL: https://issues.apache.org/jira/browse/HDFS-5202
> Project: Hadoop HDFS
> Issue Type: Improvement
> Components: datanode
> Affects Versions: 3.0.0, 2.5.0
> Reporter: Colin Patrick McCabe
> Assignee: Chris Nauroth
> Attachments: HDFS-5202.1.patch
>
>
> HDFS caching currently is implemented using POSIX syscalls for checking
> ulimit and locking pages of memory into the process's address space. These
> POSIX syscalls do not exist on Windows. This issue will implement equivalent
> functionality so that Windows deployments can use Centralized Cache
> Management.