[
https://issues.apache.org/jira/browse/HDFS-2856?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Chris Nauroth updated HDFS-2856:
--------------------------------
Attachment: HDFS-2856.1.patch
I'm uploading a patch that implements the ideas described in the past several
comments. I'm still working on more tests and several TODOs, but any
feedback at this point is welcome. Pinging [~owen.omalley], [~lmccay], [~jnp]
and [~atm] for potential feedback.
It's a big patch. I did a lot of refactoring to avoid code duplication between
the general-purpose SASL flow and our existing specialized encrypted SASL flow.
If this is too cumbersome to review at once, then I can split some of the
refactorings into separate patches on request.
Summary of changes:
* {{DataTransferEncryptor}}: I deleted this class. The code has been
refactored into various new classes in a new
{{org.apache.hadoop.hdfs.protocol.datatransfer.sasl}} sub-package. The
presence of the word "encrypt" in this class name would have been potentially
misleading, because we're now allowing DataTransferProtocol to support a
quality of protection different from auth-conf.
* {{SaslDataTransferClient}}: This class now implements the client side of SASL
negotiation, whether using the general-purpose SASL handshake or our existing
specialized encrypted handshake. This class is called by the HDFS client and
also by the DataNode when acting as a client to another DataNode. The logic
for deciding whether or not to do a SASL handshake, and if so which kind of
handshake, has become somewhat complex. By encapsulating it behind this class,
we avoid repeating that logic at multiple points in the rest of the code.
* {{SaslDataTransferServer}}: This class now implements the server side of SASL
negotiation. This is only called by the DataNode when receiving new
connections. Similar to the above, this is a single point for encapsulating
the logic of deciding which SASL handshake to use.
* {{DataTransferSaslUtil}}: This contains various helper functions needed by
the SASL classes.
* Various classes of the HDFS client and the DataNode have mechanical changes
to wire in the new SASL classes and call them.
* {{DataNode#checkSecureConfig}}: This is a new method for checking whether the
DataNode is starting in an acceptable secure configuration, either via
privileged ports or configuring SASL.
* hdfs-default.xml: I added documentation of the new properties for configuring
SASL on DataTransferProtocol.
* {{TestSaslDataTransfer}}: This is a new test that runs an embedded KDC,
starts a secured cluster and demonstrates that a client can request any of the
3 QOPs.
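To make the QOP selection concrete, here is a rough sketch of the mapping I have in mind between a configured protection level and the standard SASL QOP token. This is illustrative only, not code from the patch; the class and method names are hypothetical, and I'm assuming a property along the lines of the new {{dfs.data.transfer.protection}} setting taking the values shown:

```java
import java.util.Locale;

// Hypothetical sketch (not the actual patch code): mapping the proposed
// protection-level values to the standard SASL QOP tokens. "privacy"
// (auth-conf) is the only level the old encrypted handshake supported.
public class QopSketch {
    public static String toQop(String protection) {
        switch (protection.toLowerCase(Locale.ROOT)) {
            case "authentication": return "auth";       // authentication only
            case "integrity":      return "auth-int";   // plus integrity check
            case "privacy":        return "auth-conf";  // plus encryption
            default:
                throw new IllegalArgumentException(
                    "Unknown protection level: " + protection);
        }
    }
}
```

The resulting token is what would be passed as {{javax.security.sasl.qop}} ({{Sasl.QOP}}) when creating the SASL client or server.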
Here are a few discussion points I'd like to bring up:
* Our discussion up to this point has focused on the privileged port for
DataTransferProtocol. There is also the HTTP port to consider. My thinking on
this is that use of the new SASL configuration on a non-privileged port is only
acceptable if the configuration also uses SPNEGO for HTTP authentication. If
it were using token-based auth, then we'd be back to the same problem of sending
secret block access tokens to an unauthenticated process. (See TODO comment in
{{DataNode#checkSecureConfig}}.) My understanding is that SPNEGO establishes
mutual authentication, so checking for this ought to work fine. I'd love it if
someone could confirm that independently.
* Previously, I mentioned renegotiating SASL between multiple block operations.
On further reflection, I no longer think this is necessary. The initial SASL
handshake establishes authentication of the server. For subsequent operations
on the same connection/underlying socket, I expect authentication of the remote
process wouldn't change. The privileged port check was intended to protect
against an attacker binding to the data transfer port after a DataNode process
stops. For an existing previously authenticated socket, we know that it's
still connected to the same process, so I don't think we need to renegotiate
SASL. Thoughts?
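To summarize the acceptable combinations from the first discussion point, the decision in {{DataNode#checkSecureConfig}} would look roughly like the following. This is a hypothetical sketch of the predicate only, with made-up parameter names, not the actual method from the patch:

```java
// Hypothetical sketch of the DataNode#checkSecureConfig decision: a secured
// DataNode must either bind privileged ports (the classic root/jsvc setup),
// or enable SASL on DataTransferProtocol while also protecting the HTTP port
// with SPNEGO, so that block access tokens are never sent to an
// unauthenticated process.
public class SecureConfigSketch {
    public static boolean isAcceptableSecureConfig(boolean privilegedPorts,
                                                   boolean saslConfigured,
                                                   boolean spnegoOnHttp) {
        // Privileged ports alone are sufficient.
        if (privilegedPorts) {
            return true;
        }
        // Otherwise SASL must be configured, and the HTTP port must use
        // mutual authentication (SPNEGO) rather than token-based auth.
        return saslConfigured && spnegoOnHttp;
    }
}
```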
> Fix block protocol so that Datanodes don't require root or jsvc
> ---------------------------------------------------------------
>
> Key: HDFS-2856
> URL: https://issues.apache.org/jira/browse/HDFS-2856
> Project: Hadoop HDFS
> Issue Type: Improvement
> Components: datanode, security
> Reporter: Owen O'Malley
> Assignee: Chris Nauroth
> Attachments: Datanode-Security-Design.pdf,
> Datanode-Security-Design.pdf, Datanode-Security-Design.pdf,
> HDFS-2856.1.patch, HDFS-2856.prototype.patch
>
>
> Since we send the block tokens unencrypted to the datanode, we currently
> start the datanode as root using jsvc and get a secure (< 1024) port.
> If we have the datanode generate a nonce and send it on the connection, and
> the client sends an hmac of the nonce back instead of the block token, it
> won't reveal any secrets. Thus, we wouldn't require a secure port and would
> not require root or jsvc.
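The challenge-response exchange quoted above could be sketched as follows. This is a hypothetical illustration only: the HmacSHA1 algorithm choice and the key standing in for the shared block-token secret are my assumptions, not part of the proposal:

```java
import java.nio.charset.StandardCharsets;
import java.security.SecureRandom;
import java.util.Arrays;
import javax.crypto.Mac;
import javax.crypto.spec.SecretKeySpec;

// Hypothetical sketch of the nonce/HMAC exchange: the client proves
// knowledge of the shared secret without ever sending it on the wire.
public class NonceHmacSketch {
    public static byte[] hmac(byte[] key, byte[] nonce) throws Exception {
        Mac mac = Mac.getInstance("HmacSHA1");          // assumed algorithm
        mac.init(new SecretKeySpec(key, "HmacSHA1"));
        return mac.doFinal(nonce);
    }

    public static void main(String[] args) throws Exception {
        byte[] sharedSecret =
            "block-token-secret".getBytes(StandardCharsets.UTF_8);
        // DataNode generates a fresh nonce and sends it to the client.
        byte[] nonce = new byte[16];
        new SecureRandom().nextBytes(nonce);
        // Client responds with an HMAC of the nonce instead of the token.
        byte[] clientProof = hmac(sharedSecret, nonce);
        // DataNode recomputes the HMAC and compares.
        System.out.println(
            Arrays.equals(clientProof, hmac(sharedSecret, nonce)));
    }
}
```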
--
This message was sent by Atlassian JIRA
(v6.2#6252)