[ https://issues.apache.org/jira/browse/HDFS-2856?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13695627#comment-13695627 ]

Chris Nauroth commented on HDFS-2856:
-------------------------------------

Thanks for the comments, everyone.  Let's discuss the SASL point first, because 
it could shift the design and make the specific questions about the proposed 
protocol change irrelevant.

{quote}
Did you consider at all scrapping our custom authentication protocol and 
instead switching to using straight SASL DIGEST-MD5 for the 
DataTransferProtocol?
{quote}

Thanks for pointing out HDFS-3637.  After further review of that patch, I see 
how we can iterate on it.  I think it also has some benefits over the proposal 
that I posted: 1) consistency with authentication in the rest of the codebase, 
and 2) enabling encryption would defeat a man-in-the-middle attack without the 
harm to intermediate proxy deployments that source address validation might 
cause.  I'd like to explore the SASL solution further.

The only potential downside I see is that if we ever pipeline multiple 
operations over a single connection, then we'd need to renegotiate SASL per 
operation, because the authorization decision may differ per block.  This 
doesn't seem like an insurmountable problem, though.
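
To make the renegotiation cost concrete, below is a minimal sketch of a 
DIGEST-MD5 handshake using the JDK's {{javax.security.sasl}} API.  The 
protocol name, server name, and shared secret are illustrative stand-ins (a 
real implementation would presumably derive the secret from the block token), 
so this only shows the shape of the exchange that would have to repeat for 
each pipelined operation; it is not the HDFS-3637 code.

{code:java}
import java.util.HashMap;
import java.util.Map;
import javax.security.auth.callback.Callback;
import javax.security.auth.callback.CallbackHandler;
import javax.security.auth.callback.NameCallback;
import javax.security.auth.callback.PasswordCallback;
import javax.security.sasl.AuthorizeCallback;
import javax.security.sasl.RealmCallback;
import javax.security.sasl.Sasl;
import javax.security.sasl.SaslClient;
import javax.security.sasl.SaslServer;

public class DigestMd5HandshakeSketch {

  // Hypothetical identity and secret; a real implementation would derive
  // these from the block token rather than hard-coding them.
  private static final String USER = "hdfs-client";
  private static final char[] SECRET = "block-token-secret".toCharArray();

  public static void main(String[] args) throws Exception {
    Map<String, String> props = new HashMap<>();
    props.put(Sasl.QOP, "auth");  // "auth-conf" would also negotiate encryption

    CallbackHandler clientHandler = callbacks -> {
      for (Callback cb : callbacks) {
        if (cb instanceof NameCallback) {
          ((NameCallback) cb).setName(USER);
        } else if (cb instanceof PasswordCallback) {
          ((PasswordCallback) cb).setPassword(SECRET);
        } else if (cb instanceof RealmCallback) {
          ((RealmCallback) cb).setText(((RealmCallback) cb).getDefaultText());
        }
      }
    };

    CallbackHandler serverHandler = callbacks -> {
      for (Callback cb : callbacks) {
        if (cb instanceof PasswordCallback) {
          // Look up the secret for the presented name; shared here for brevity.
          ((PasswordCallback) cb).setPassword(SECRET);
        } else if (cb instanceof RealmCallback) {
          ((RealmCallback) cb).setText(((RealmCallback) cb).getDefaultText());
        } else if (cb instanceof AuthorizeCallback) {
          AuthorizeCallback ac = (AuthorizeCallback) cb;
          ac.setAuthorized(ac.getAuthenticationID().equals(ac.getAuthorizationID()));
        }
      }
    };

    SaslServer server = Sasl.createSaslServer("DIGEST-MD5", "hdfs",
        "datanode.example.com", props, serverHandler);
    SaslClient client = Sasl.createSaslClient(new String[] {"DIGEST-MD5"},
        null, "hdfs", "datanode.example.com", props, clientHandler);

    // DIGEST-MD5 is server-first: evaluating an empty response yields the
    // initial challenge.  In a real deployment these bytes travel over the
    // DataTransferProtocol socket.
    byte[] challenge = server.evaluateResponse(new byte[0]);
    byte[] response = client.evaluateChallenge(challenge);
    while (!server.isComplete()) {
      challenge = server.evaluateResponse(response);
      if (challenge != null && !client.isComplete()) {
        response = client.evaluateChallenge(challenge);
      }
    }
    System.out.println("negotiated QOP = " + server.getNegotiatedProperty(Sasl.QOP));
  }
}
{code}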

I have a question about the compatibility impact of HDFS-3637.  I see that an 
upgraded client can talk to an old cluster, and an old client can talk to an 
upgraded cluster if encryption is off.  It looks like if it's an upgraded 
cluster and encryption is on, then {{DataXceiver}} will not run operations sent 
over unencrypted client connections, including connections initiated from an 
old client.  This implies that all clients must be upgraded before it's safe to 
turn on encryption in the cluster.  Do I understand correctly?  If so, can we 
relax this logic a bit to allow an old client to connect to an upgraded cluster 
with SASL on?  The design doc proposed checking whether the datanode port is < 
1024, and if so, allowing the old connection.  The thinking here is that anyone 
still running on a port < 1024 must have a component that hasn't upgraded, so 
it needs to support the old connection.  Once the datanode has been 
reconfigured to run on a port >= 1024, all non-encrypted connections can be 
rejected.
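
To make that proposal concrete, here is a rough sketch of the check; the 
class and method names are hypothetical, not taken from any patch.

{code:java}
/**
 * Hypothetical sketch of the design doc's compatibility rule; the names
 * here are illustrative and do not come from the HDFS-3637 patch.
 */
public class LegacyConnectionPolicy {

  private static final int PRIVILEGED_PORT_LIMIT = 1024;

  /**
   * Decides whether a connection that did not complete a SASL handshake
   * may still proceed.
   *
   * @param dataPort port the datanode's data transfer server is bound to
   * @param saslCompleted true if the client completed SASL negotiation
   */
  public static boolean allowConnection(int dataPort, boolean saslCompleted) {
    if (saslCompleted) {
      return true;  // upgraded client, always acceptable
    }
    // A privileged port implies the deployment still runs the old
    // root/jsvc model, so old clients must keep working.  Moving the
    // datanode to a port >= 1024 signals that every component has been
    // upgraded, and legacy connections can then be rejected.
    return dataPort < PRIVILEGED_PORT_LIMIT;
  }
}
{code}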

Also, I wasn't sure how the HDFS-3637 patch affects compatibility for 
inter-datanode connections.  Is it possible to run a mix of old and upgraded 
datanodes, some with encryption on and some with encryption off, or does it 
require a coordinated push to turn on encryption across the whole cluster?

We wanted to be conscious of backwards compatibility with this change, 
particularly for a rolling upgrade scenario.
                
> Fix block protocol so that Datanodes don't require root or jsvc
> ---------------------------------------------------------------
>
>                 Key: HDFS-2856
>                 URL: https://issues.apache.org/jira/browse/HDFS-2856
>             Project: Hadoop HDFS
>          Issue Type: Improvement
>          Components: datanode, security
>            Reporter: Owen O'Malley
>            Assignee: Chris Nauroth
>         Attachments: Datanode-Security-Design.pdf, 
> Datanode-Security-Design.pdf, Datanode-Security-Design.pdf
>
>
> Since we send the block tokens unencrypted to the datanode, we currently 
> start the datanode as root using jsvc and get a secure (< 1024) port.
> If we have the datanode generate a nonce and send it on the connection, and 
> the client sends an HMAC of the nonce back instead of the block token, it 
> won't reveal any secrets.  Thus, we wouldn't require a secure port and would 
> not require root or jsvc.
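
A minimal sketch of the nonce/HMAC exchange described above, using the JDK's 
{{javax.crypto.Mac}}; the shared secret and the choice of HmacSHA1 are 
illustrative assumptions, not details specified by this issue.

{code:java}
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.SecureRandom;
import javax.crypto.Mac;
import javax.crypto.spec.SecretKeySpec;

public class NonceHmacSketch {
  public static void main(String[] args) throws Exception {
    // Shared secret standing in for the block token's secret key (hypothetical).
    byte[] secret = "block-token-secret".getBytes(StandardCharsets.UTF_8);

    // Datanode side: generate a fresh random nonce for this connection.
    byte[] nonce = new byte[16];
    new SecureRandom().nextBytes(nonce);

    // Client side: prove knowledge of the secret without ever sending it.
    Mac mac = Mac.getInstance("HmacSHA1");
    mac.init(new SecretKeySpec(secret, "HmacSHA1"));
    byte[] proof = mac.doFinal(nonce);

    // Datanode side: recompute the HMAC and compare in constant time.
    mac.init(new SecretKeySpec(secret, "HmacSHA1"));
    byte[] expected = mac.doFinal(nonce);
    System.out.println("authenticated = " + MessageDigest.isEqual(expected, proof));
  }
}
{code}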

