[ 
https://issues.apache.org/jira/browse/HDFS-2856?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chris Nauroth updated HDFS-2856:
--------------------------------

    Attachment: HDFS-2856.1.patch

I'm uploading a patch that implements the ideas described in the past several 
comments.  I'm still working on more tests and several TODOs, but any feedback 
at this point is welcome.  Pinging [~owen.omalley], [~lmccay], [~jnp] and 
[~atm] for potential feedback.

It's a big patch.  I did a lot of refactoring to avoid code duplication between 
the general-purpose SASL flow and our existing specialized encrypted SASL flow. 
 If this is too cumbersome to review at once, then I can split some of the 
refactorings into separate patches on request.

Summary of changes:
* {{DataTransferEncryptor}}: I deleted this class.  The code has been 
refactored into various new classes in a new 
{{org.apache.hadoop.hdfs.protocol.datatransfer.sasl}} sub-package.  The 
presence of the word "encrypt" in this class name would have been potentially 
misleading, because we're now allowing DataTransferProtocol to support a 
quality of protection different from auth-conf.
* {{SaslDataTransferClient}}: This class now implements the client side of SASL 
negotiation, whether using the general-purpose SASL handshake or our existing 
specialized encrypted handshake.  This class is called by the HDFS client and 
also by the DataNode when acting as a client to another DataNode.  The logic 
for deciding whether or not to do a SASL handshake, and if so which kind of 
handshake, has become somewhat complex.  By encapsulating it behind this class, 
we avoid repeating that logic at multiple points in the rest of the code.
* {{SaslDataTransferServer}}: This class now implements the server side of SASL 
negotiation.  This is only called by the DataNode when receiving new 
connections.  Similar to the above, this is a single point for encapsulating 
the logic of deciding which SASL handshake to use.
* {{DataTransferSaslUtil}}: This contains various helper functions needed by 
the SASL classes.
* Various classes of the HDFS client and the DataNode have mechanical changes 
to wire in the new SASL classes and call them.
* {{DataNode#checkSecureConfig}}: This is a new method for checking whether 
the DataNode is starting in an acceptable secure configuration, either via 
privileged ports or via SASL.
* hdfs-default.xml: I added documentation of the new properties for 
configuring SASL on DataTransferProtocol.  (See the client sketch after this 
list for how they'd be used.)
* {{TestSaslDataTransfer}}: This is a new test that runs an embedded KDC, 
starts a secured cluster, and demonstrates that a client can request any of 
the three QOPs: auth, auth-int, or auth-conf.
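
To make this concrete, here's a minimal sketch of what requesting a QOP looks 
like from a client.  The property name and values follow the patch's 
hdfs-default.xml additions but could still change, and the file path is just 
a placeholder:

{code:java}
// Minimal sketch: a client requesting the auth-conf QOP for
// DataTransferProtocol.  The property name/values reflect the patch as
// uploaded and may change; the path argument is a placeholder.
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class QopClientSketch {
  public static void main(String[] args) throws Exception {
    Configuration conf = new Configuration();
    // "authentication" = auth, "integrity" = auth-int, "privacy" = auth-conf
    conf.set("dfs.data.transfer.protection", "privacy");
    try (FileSystem fs = FileSystem.get(conf)) {
      // Any block read/write now triggers the general-purpose SASL
      // handshake instead of relying on a privileged port.
      fs.open(new Path(args[0])).close();
    }
  }
}
{code}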

Here are a few discussion points I'd like to bring up:
* Our discussion up to this point has focused on the privileged port for 
DataTransferProtocol, but there is also the HTTP port to consider.  My 
thinking is that using the new SASL configuration on a non-privileged port is 
only acceptable if the configuration also uses SPNEGO for HTTP 
authentication.  If it were using token-based auth, then we'd be back to the 
same problem of sending secret block access tokens to an unauthenticated 
process.  (See the TODO comment in {{DataNode#checkSecureConfig}}, sketched 
after this list.)  My understanding is that SPNEGO establishes mutual 
authentication, so checking for this ought to work.  I'd love it if someone 
could confirm that independently.
* Previously, I mentioned renegotiating SASL between multiple block operations. 
 On further reflection, I no longer think this is necessary.  The initial SASL 
handshake establishes authentication of the server.  For subsequent operations 
on the same connection/underlying socket, I expect authentication of the remote 
process wouldn't change.  The privileged port check was intended to protect 
against an attacker binding to the data transfer port after a DataNode process 
stops.  For an existing previously authenticated socket, we know that it's 
still connected to the same process, so I don't think we need to renegotiate 
SASL.  Thoughts?
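
To clarify the startup check discussed above, here's a rough sketch of the 
logic I have in mind for {{DataNode#checkSecureConfig}}.  The boolean 
parameters are stand-ins for values the real method reads from the 
configuration and secure resources:

{code:java}
// Rough sketch of the intended checkSecureConfig logic.  Parameter names
// are stand-ins, not the actual patch code.
import java.io.IOException;

public class SecureConfigCheckSketch {
  static void checkSecureConfig(boolean securityEnabled, int streamingPort,
      int httpPort, boolean saslConfigured, boolean spnegoEnabled)
      throws IOException {
    if (!securityEnabled) {
      return; // insecure cluster: nothing to verify
    }
    // Traditional deployment: root/jsvc bound privileged ports.
    if (streamingPort < 1024 && httpPort < 1024) {
      return;
    }
    // New deployment: SASL protects DataTransferProtocol, and the HTTP
    // port must authenticate via SPNEGO so that secret block access
    // tokens are never sent to an unauthenticated process.
    if (saslConfigured && spnegoEnabled) {
      return;
    }
    throw new IOException("Cannot start secure DataNode without configuring"
        + " either privileged resources or SASL on DataTransferProtocol.");
  }
}
{code}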


> Fix block protocol so that Datanodes don't require root or jsvc
> ---------------------------------------------------------------
>
>                 Key: HDFS-2856
>                 URL: https://issues.apache.org/jira/browse/HDFS-2856
>             Project: Hadoop HDFS
>          Issue Type: Improvement
>          Components: datanode, security
>            Reporter: Owen O'Malley
>            Assignee: Chris Nauroth
>         Attachments: Datanode-Security-Design.pdf, 
> Datanode-Security-Design.pdf, Datanode-Security-Design.pdf, 
> HDFS-2856.1.patch, HDFS-2856.prototype.patch
>
>
> Since we send the block tokens unencrypted to the datanode, we currently 
> start the datanode as root using jsvc and get a secure (< 1024) port.
> If we have the datanode generate a nonce and send it on the connection, and 
> the client sends an HMAC of the nonce back instead of the block token, it 
> won't reveal any secrets.  Thus, we wouldn't require a secure port and 
> would not require root or jsvc.
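
For reference, the challenge/response idea in the description looks roughly 
like this.  This is a sketch only, with hypothetical names; the attached 
patch realizes the same goal through a SASL handshake rather than a 
hand-rolled exchange:

{code:java}
// Sketch of the nonce/HMAC idea: the client proves knowledge of the block
// token secret without ever sending the secret itself.  All names are
// hypothetical.
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.SecureRandom;
import javax.crypto.Mac;
import javax.crypto.spec.SecretKeySpec;

public class NonceHmacSketch {
  static byte[] hmac(byte[] secret, byte[] nonce) throws Exception {
    Mac mac = Mac.getInstance("HmacSHA1");
    mac.init(new SecretKeySpec(secret, "HmacSHA1"));
    return mac.doFinal(nonce);
  }

  public static void main(String[] args) throws Exception {
    byte[] sharedSecret =
        "block-token-secret".getBytes(StandardCharsets.UTF_8);

    // DataNode: generate a fresh nonce for this connection.
    byte[] nonce = new byte[16];
    new SecureRandom().nextBytes(nonce);

    // Client: respond with an HMAC of the nonce under the shared secret.
    byte[] response = hmac(sharedSecret, nonce);

    // DataNode: recompute and compare in constant time.
    boolean authenticated =
        MessageDigest.isEqual(response, hmac(sharedSecret, nonce));
    System.out.println("authenticated = " + authenticated);
  }
}
{code}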



--
This message was sent by Atlassian JIRA
(v6.2#6252)
