[ 
https://issues.apache.org/jira/browse/HDFS-2856?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14046317#comment-14046317
 ] 

Chris Nauroth commented on HDFS-2856:
-------------------------------------

Jitendra, thank you for taking a look at this patch.

bq. ...it seems the encrypted key is obtained from namenode via rpc for every 
block...

Actually, we cache the encryption key so that we don't need to keep repeating 
that RPC.  (This is true on current trunk and with my patch too.)  The key 
retrieval is now wrapped behind the {{DataEncryptionKeyFactory}} interface.  
There are 2 implementors of this: the {{DFSClient}} itself and the 
{{NameNodeConnector}} used by the balancer.  In both of those classes, if you 
look at the {{newDataEncryptionKey}} method, you'll see that they lazily fetch 
a key and cache it for as long as the key expiry.

bq. getEncryptedStreams doesn't use access token. IMO the user and the password 
should be derived from the accesstoken rather than the key.

Thanks for catching that.  This is a private method, so I can easily remove 
access token from the signature.  We can't change the user/password calculation 
for the encrypted case now without breaking compatibility.

bq. It might make sense to define the defaults for the new configuration 
variables in hdfs-default and/or as constants. It helps in code reading at 
times.

The patch documents the new properties dfs.data.transfer.protection and 
dfs.data.transfer.saslproperties.resolver.class in hdfs-default.xml.  The 
default values are set to empty/undefined.  I think this is what we want, 
because it's an opt-in feature.  Let me know if you had any other configuration 
properties in mind.

bq. Log.debug should be wrapped inside if (Log.isDebugEnabled()) condition.

The new classes use slf4j.  (There was some discussion on mailing lists a few 
months ago about starting to use this library in new classes.)  With slf4j, 
it's no longer necessary to check {{isDebugEnabled}}.  slf4j accepts string 
substitution variables using varargs, and it checks the log level internally 
first before doing any string concatenation.  Explicitly checking 
{{isDebugEnabled}} wouldn't provide any performance benefit.

bq. checkTrustAndSend obtains new encryption key, irrespective of the qop 
needed. I believe the encryption key is needed only for specialized encryption 
case.

The 2 implementations of {{DataEncryptionKeyFactory}} mentioned above only 
retrieve an encryption key if encryption is enabled (NameNode is configured 
with dfs.encrypt.data.transfer=true).  For a deployment configured with SASL on 
DataTransferProtocol, this will be false, so it won't actually get a key.  I'll 
put a comment in {{SaslDataTransferClient}} to clarify this.

bq. SaslDataTransferClient object in NameNodeConnector.java seems out of place, 
the NameNodeConnector is supposed to encapsulate only namenode connections. Can 
we avoid the saslClient in this class?

Yeah, what was I thinking there?  :-)  This is needed by the balancer for its 
DataNode communication when it needs to move blocks.  Let me see if I can move 
it right into the {{Balancer}} class.

bq. RemotePeerFactory.java: Javadoc needs update.

Will do.  Thanks for the catch.

bq. Minor nit: checkTrustAndSend returns null for skipping handshake which has 
to be checked in the caller. It could just return the same stream pair, which 
otherwise every caller has to do.

I actually need to use null as a sentinel value.  In {{peerSend}}, I need to 
know whether or not a SASL handshake was performed, and if so, wrap the peer in 
an instance of {{EncryptedPeer}} (which would be better named {{SaslPeer}} at 
this point, but we can refactor that later).  If I returned a non-null 
{{IOStreamPair}} always, then I wouldn't be able to do this check.

I'll get to work on a new revision that incorporates your feedback.  Thanks 
again!

> Fix block protocol so that Datanodes don't require root or jsvc
> ---------------------------------------------------------------
>
>                 Key: HDFS-2856
>                 URL: https://issues.apache.org/jira/browse/HDFS-2856
>             Project: Hadoop HDFS
>          Issue Type: Improvement
>          Components: datanode, security
>    Affects Versions: 3.0.0, 2.4.0
>            Reporter: Owen O'Malley
>            Assignee: Chris Nauroth
>         Attachments: Datanode-Security-Design.pdf, 
> Datanode-Security-Design.pdf, Datanode-Security-Design.pdf, 
> HDFS-2856-Test-Plan-1.pdf, HDFS-2856.1.patch, HDFS-2856.2.patch, 
> HDFS-2856.3.patch, HDFS-2856.4.patch, HDFS-2856.5.patch, 
> HDFS-2856.prototype.patch
>
>
> Since we send the block tokens unencrypted to the datanode, we currently 
> start the datanode as root using jsvc and get a secure (< 1024) port.
> If we have the datanode generate a nonce and send it on the connection and 
> the sends an hmac of the nonce back instead of the block token it won't 
> reveal any secrets. Thus, we wouldn't require a secure port and would not 
> require root or jsvc.



--
This message was sent by Atlassian JIRA
(v6.2#6252)

Reply via email to