[
https://issues.apache.org/jira/browse/HDFS-6699?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14065208#comment-14065208
]
Arpit Agarwal commented on HDFS-6699:
-------------------------------------
Hi Remus, an alternative approach you may have already considered is named file
mapping objects via {{CreateFileMapping}}. By scoping the security descriptor
the DataNode can theoretically restrict access to just the client user.
The advantage is that NameNode and any other processes don't need to act as
brokers.
> Secure Windows DFS read when client co-located on nodes with data
> (short-circuit reads)
> ---------------------------------------------------------------------------------------
>
> Key: HDFS-6699
> URL: https://issues.apache.org/jira/browse/HDFS-6699
> Project: Hadoop HDFS
> Issue Type: Improvement
> Components: datanode, hdfs-client, performance, security
> Reporter: Remus Rusanu
> Labels: windows
>
> HDFS-347 introduced secure short-circuit HDFS reads based on Linux domain
> sockets. A similar capability can be introduced in a secure Windows
> environment using the
> [DuplicateHandle](http://msdn.microsoft.com/en-us/library/windows/desktop/ms724251(v=vs.85).aspx)
> Win32 API. When short-circuit is allowed the datanode would open the block
> file and then duplicate the handle into the hdfs client process and return to
> the process the handle value. The hdfs client can then open a Java stream on
> this handle and read the file. This is a secure mechanism: the HDFS ACLs are
> validated by the namenode, and the client process does not get direct access
> to the file, only access in a controlled manner (e.g. read-only). The hdfs
> client process does not need OS-level access privileges to the block file.
> A complication arises from the requirement to duplicate the handle in the
> hdfs client process. Ordinary processes (which is how we want the datanode to
> run) do not have the required privilege (SeDebugPrivilege). But with the
> introduction of an elevated service helper for the nodemanager Windows Secure
> Container Executor (YARN-2198) we have at our disposal an elevated executor
> that can do the job of duplicating the handle. The datanode would communicate
> with this process using the same mechanism as the nodemanager, i.e. LRPC.
> With my proposed implementation the sequence of actions is as follows:
> - the hdfs client requests Windows secure short-circuit of a block in the
> data transfer protocol. It passes the block, the token and its own process ID.
> - datanode approves short-circuit. It opens the block file and obtains the
> handle.
> - datanode invokes the elevated privilege service to duplicate the handle
> into the hdfs client process. datanode invokes the service LRPC interface
> over JNI (LRPC being the Windows de-facto standard for interoperating with a
> service). It passes the handle value, its own process id and the hdfs client
> process id.
> - The elevated service duplicates the handle from the datanode process into
> the hdfs client process. It returns the duplicate handle value to the datanode
> as the output value of the LRPC call.
> - the same open-and-duplicate steps are repeated for the CRC file
> - the datanode responds to the short-circuit data transfer protocol request
> with a message that contains the duplicate handle values (two of them: one for
> the block file and one for the CRC file)
> - the hdfs-client creates a Java stream that wraps the handles and reads the
> block from this stream (ditto for CRC)
> datanode needs to exercise care not to duplicate the same handle to different
> clients (including the CRC handles) because a handle also abstracts the file
> position, and clients would inadvertently move each other's file pointer,
> with chaotic results.
> TBD: a mitigation for process ID reuse (the hdfs client can be terminated
> immediately after the block request and a new process could reuse the same
> ID). In theory an attacker could use this as a mechanism to obtain a handle
> to a block by killing the hdfs-client at the right moment and spawning new
> processes until it gets one with the desired ID. I'm not sure this is a
> realistic threat because the attacker must already have the privilege to kill
> the hdfs client process, and with such privilege he could obtain the handle by
> other means (e.g. debug/inspect the hdfs client process).
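
For readers following along, the sequence of actions in the quoted description could be sketched roughly as below. This is only an in-memory simulation under stated assumptions: `ElevatedHandleService`, `FakeElevatedService`, `DataNodeSketch`, `ShortCircuitResponse` and every method name are hypothetical and are not part of Hadoop or of any patch on this issue. Handles are plain `long` values, the token check is a placeholder, and no Win32, JNI or LRPC call is made.

```java
import java.util.ArrayList;
import java.util.List;

public class ShortCircuitFlowSketch {

    /** Hypothetical stand-in for the elevated service the datanode would reach over LRPC. */
    interface ElevatedHandleService {
        long duplicateHandle(long sourceHandle, int sourcePid, int targetPid);
    }

    /** Hypothetical reply: handle values that are only meaningful inside the client process. */
    static final class ShortCircuitResponse {
        final long blockHandle;
        final long crcHandle;

        ShortCircuitResponse(long blockHandle, long crcHandle) {
            this.blockHandle = blockHandle;
            this.crcHandle = crcHandle;
        }
    }

    /** In-memory fake: records each call and hands out made-up handle values. */
    static final class FakeElevatedService implements ElevatedHandleService {
        final List<String> calls = new ArrayList<>();
        private long nextTargetHandle = 0x100;

        @Override
        public long duplicateHandle(long sourceHandle, int sourcePid, int targetPid) {
            calls.add(sourcePid + "->" + targetPid);
            nextTargetHandle += 4;
            return nextTargetHandle;
        }
    }

    /** Hypothetical datanode side of the exchange. */
    static final class DataNodeSketch {
        private final int ownPid;
        private final ElevatedHandleService service;
        private long nextLocalHandle = 0x10;

        DataNodeSketch(int ownPid, ElevatedHandleService service) {
            this.ownPid = ownPid;
            this.service = service;
        }

        /** Placeholder for opening a file: every call yields a fresh fake handle. */
        private long openFile(String name) {
            nextLocalHandle += 4;
            return nextLocalHandle;
        }

        ShortCircuitResponse requestShortCircuit(String blockId, String token, int clientPid) {
            // Placeholder approval step; real token validation is out of scope here.
            if (token == null || token.isEmpty()) {
                throw new SecurityException("missing block token");
            }
            // Open the block and CRC files anew for this request, so no handle is shared.
            long block = openFile("block:" + blockId);
            long crc = openFile("crc:" + blockId);
            // Ask the elevated service to duplicate each handle into the client process.
            long dupBlock = service.duplicateHandle(block, ownPid, clientPid);
            long dupCrc = service.duplicateHandle(crc, ownPid, clientPid);
            return new ShortCircuitResponse(dupBlock, dupCrc);
        }
    }

    public static void main(String[] args) {
        FakeElevatedService service = new FakeElevatedService();
        DataNodeSketch datanode = new DataNodeSketch(100, service);
        ShortCircuitResponse r = datanode.requestShortCircuit("blk_1", "token", 200);
        System.out.println("block handle = " + r.blockHandle + ", crc handle = " + r.crcHandle);
        System.out.println("service calls = " + service.calls);
    }
}
```

The fake service only records who asked for a duplication into which process. In the proposal that role belongs to the elevated helper from YARN-2198, and the client would wrap the returned values in Java streams.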
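
The warning in the quoted description about not duplicating the same handle to different clients can be illustrated with a small analogy in plain Java. This is a sketch, not the proposed implementation: one `RandomAccessFile` stands in for a single handle given to two clients, and two separately opened `RandomAccessFile` objects stand in for per-client opens. The class name, the helper names and the 8-byte sample file are made up for the demo.

```java
import java.io.IOException;
import java.io.RandomAccessFile;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

public class SharedFilePointerDemo {

    /** Reads n bytes from the file's current position and returns them as text. */
    static String readChunk(RandomAccessFile file, int n) throws IOException {
        byte[] buf = new byte[n];
        file.readFully(buf);
        return new String(buf, StandardCharsets.US_ASCII);
    }

    /** Two "clients" read through ONE open file, so they share its position. */
    static String[] readShared(Path block) throws IOException {
        try (RandomAccessFile shared = new RandomAccessFile(block.toFile(), "r")) {
            String clientA = readChunk(shared, 4);
            // Client B expects to start at offset 0, but A already moved the pointer.
            String clientB = readChunk(shared, 4);
            return new String[] {clientA, clientB};
        }
    }

    /** Each "client" gets its own open file, so positions are independent. */
    static String[] readSeparate(Path block) throws IOException {
        try (RandomAccessFile a = new RandomAccessFile(block.toFile(), "r");
             RandomAccessFile b = new RandomAccessFile(block.toFile(), "r")) {
            return new String[] {readChunk(a, 4), readChunk(b, 4)};
        }
    }

    /** Writes a tiny stand-in for a block file: "AAAA" followed by "BBBB". */
    static Path writeSampleBlock() throws IOException {
        Path block = Files.createTempFile("blk_demo", ".bin");
        Files.write(block, "AAAABBBB".getBytes(StandardCharsets.US_ASCII));
        block.toFile().deleteOnExit();
        return block;
    }

    public static void main(String[] args) throws IOException {
        Path block = writeSampleBlock();
        String[] shared = readShared(block);
        String[] separate = readSeparate(block);
        System.out.println("shared:   A=" + shared[0] + " B=" + shared[1]);   // A=AAAA B=BBBB
        System.out.println("separate: A=" + separate[0] + " B=" + separate[1]); // A=AAAA B=AAAA
    }
}
```

With the shared file, the second reader silently starts where the first one stopped. This is the kind of interference the description warns about, and it is why the datanode would open the block and CRC files afresh for each client request.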