[ 
https://issues.apache.org/jira/browse/HDFS-6606?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14125246#comment-14125246
 ] 

Yi Liu commented on HDFS-6606:
------------------------------

Thanks [~cnauroth] for the review. You are right, this JIRA is fully compatible 
with the work in HDFS-2856.

{quote}
If I understand correctly, the DFSClient is still going to contact the NameNode 
to obtain an encryption key via ClientProtocol#getDataEncryptionKey when 
dfs.encrypt.data.transfer is true, but then the result wouldn't actually be 
used if a cipher is negotiated. It's a shame to keep around that extraneous 
RPC, but it's very small, and I don't see an easy way to change the code to 
avoid it. Maybe we could queue this up for future consideration.
{quote}
Right, the DFSClient still contacts the NN to obtain a key via 
{{ClientProtocol#getDataEncryptionKey}}. But the obtained key is still used 
even when a cipher option is negotiated: we use it to encrypt the negotiated 
cipher key (using SASL wrap/unwrap). So the key obtained via 
{{getDataEncryptionKey}} is used only to encrypt the cipher key, and the data 
is now encrypted with the cipher key.
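
A minimal Java sketch of this two-level key usage, for illustration only. It is not the patch code: the class and method names are hypothetical, JDK {{AESWrap}} stands in for the SASL wrap/unwrap step, a plain AES key stands in for the key returned by {{getDataEncryptionKey}}, and {{AES/CTR/NoPadding}} is an assumed choice for the data cipher.

```java
import java.nio.charset.StandardCharsets;
import java.security.GeneralSecurityException;
import java.security.SecureRandom;
import java.util.Arrays;
import javax.crypto.Cipher;
import javax.crypto.KeyGenerator;
import javax.crypto.SecretKey;
import javax.crypto.spec.IvParameterSpec;

// Illustrative only: hypothetical names, not HDFS classes.
class CipherKeyEnvelopeSketch {

    // Generate a random 128-bit AES key.
    static SecretKey newAesKey() {
        try {
            KeyGenerator kg = KeyGenerator.getInstance("AES");
            kg.init(128);
            return kg.generateKey();
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException(e);
        }
    }

    // Protect the negotiated cipher key with the key obtained from the NN.
    // Assumption: AESWrap replaces the SASL wrap step described in the comment.
    static byte[] wrapCipherKey(SecretKey nnKey, SecretKey cipherKey) {
        try {
            Cipher c = Cipher.getInstance("AESWrap");
            c.init(Cipher.WRAP_MODE, nnKey);
            return c.wrap(cipherKey);
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException(e);
        }
    }

    // Recover the cipher key on the peer, which holds the same NN-issued key.
    static SecretKey unwrapCipherKey(SecretKey nnKey, byte[] wrapped) {
        try {
            Cipher c = Cipher.getInstance("AESWrap");
            c.init(Cipher.UNWRAP_MODE, nnKey);
            return (SecretKey) c.unwrap(wrapped, "AES", Cipher.SECRET_KEY);
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException(e);
        }
    }

    // Encrypt or decrypt the data itself with the cipher key (assumed AES/CTR).
    static byte[] aesCtr(int mode, SecretKey key, byte[] iv, byte[] data) {
        try {
            Cipher c = Cipher.getInstance("AES/CTR/NoPadding");
            c.init(mode, key, new IvParameterSpec(iv));
            return c.doFinal(data);
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        SecretKey nnKey = newAesKey();      // stand-in for the NN-issued key
        SecretKey cipherKey = newAesKey();  // negotiated for this connection
        byte[] iv = new byte[16];
        new SecureRandom().nextBytes(iv);

        byte[] wrapped = wrapCipherKey(nnKey, cipherKey);     // sent to the peer
        SecretKey peerKey = unwrapCipherKey(nnKey, wrapped);  // recovered by the peer

        byte[] block = "block data".getBytes(StandardCharsets.UTF_8);
        byte[] onWire = aesCtr(Cipher.ENCRYPT_MODE, cipherKey, iv, block);
        byte[] received = aesCtr(Cipher.DECRYPT_MODE, peerKey, iv, onWire);
        System.out.println("round trip ok: " + Arrays.equals(block, received));
    }
}
```

The point of the split is that the NN-issued key only ever protects a short key exchange, while the bulk data goes through the faster negotiated cipher.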

The proposed approach doesn't add an extra RPC and works with the original 
configuration: if {{dfs.encrypt.data.transfer}} is true or 
{{dfs.data.transfer.protection}} (HDFS-2856 style) is *privacy*, the DFS client 
and DataNode will negotiate a cipher for encryption.
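
The condition above can be sketched as a small predicate. This is only an illustration: the class and method names are hypothetical, and a plain {{java.util.Map}} stands in for the Hadoop configuration object so the snippet has no Hadoop dependency. It treats {{dfs.data.transfer.protection}} as a single value, exactly as described above.

```java
import java.util.HashMap;
import java.util.Map;

// Illustrative only: hypothetical names, not HDFS classes.
class CipherNegotiationConfigSketch {
    static final String ENCRYPT_KEY = "dfs.encrypt.data.transfer";
    static final String PROTECTION_KEY = "dfs.data.transfer.protection";

    // True when either the original setting or the HDFS-2856 style setting
    // asks for encrypted data transfer.
    static boolean shouldNegotiateCipher(Map<String, String> conf) {
        boolean encrypt = Boolean.parseBoolean(conf.getOrDefault(ENCRYPT_KEY, "false"));
        String protection = conf.getOrDefault(PROTECTION_KEY, "").trim();
        return encrypt || "privacy".equalsIgnoreCase(protection);
    }

    public static void main(String[] args) {
        Map<String, String> conf = new HashMap<>();
        conf.put(PROTECTION_KEY, "privacy");
        System.out.println("negotiate cipher: " + shouldNegotiateCipher(conf));
    }
}
```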

The tests you suggest are good. For the current implementation, these tests 
may only need to check that the DFS client and DataNode actually negotiate a 
cipher option, since:
* {{TestSaslDataTransfer}} and {{TestBalancerWithSaslDataTransfer}} already use 
{{dfs.data.transfer.protection}} and cover the encryption case (_privacy_).
* {{TestBalancerWithEncryptedTransfer}} already includes end-to-end tests with 
the balancer, and the proposed approach works with the original configuration.

I will update the patch to address your comments later.

> Optimize HDFS Encrypted Transport performance
> ---------------------------------------------
>
>                 Key: HDFS-6606
>                 URL: https://issues.apache.org/jira/browse/HDFS-6606
>             Project: Hadoop HDFS
>          Issue Type: Improvement
>          Components: datanode, hdfs-client, security
>            Reporter: Yi Liu
>            Assignee: Yi Liu
>         Attachments: HDFS-6606.001.patch, HDFS-6606.002.patch, 
> HDFS-6606.003.patch, OptimizeHdfsEncryptedTransportperformance.pdf
>
>
> In HDFS-3637, [~atm] added support for encrypting the DataTransferProtocol; 
> it was great work.
> It uses the SASL {{Digest-MD5}} mechanism (with QOP auth-conf) and supports 
> three security strengths:
> * high: 3des or rc4 (128 bits)
> * medium: des or rc4 (56 bits)
> * low: rc4 (40 bits)
> 3des and rc4 are slow, reaching only *tens of MB/s*:
> http://www.javamex.com/tutorials/cryptography/ciphers.shtml
> http://www.cs.wustl.edu/~jain/cse567-06/ftp/encryption_perf/
> I will give more detailed performance data in the future. This is clearly a 
> bottleneck and will significantly affect end-to-end performance. 
> AES (Advanced Encryption Standard) is recommended as a replacement for DES 
> and is more secure. With AES-NI support, throughput can reach nearly 
> *2GB/s*, so it is no longer the bottleneck. AES and CryptoCodec work is 
> supported in HADOOP-10150, HADOOP-10603 and HADOOP-10693 (we may need to add 
> support for a new AES mode). 
> This JIRA will use AES with AES-NI support as the encryption algorithm for 
> DataTransferProtocol.
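
The throughput gap described in the issue can be checked roughly on a local JVM with a sketch like the one below. It is a crude single-threaded timing loop using the JDK's built-in providers, not the Hadoop CryptoCodec, and the class name, method name, and chosen modes ({{DESede/CBC/NoPadding}} and {{AES/CTR/NoPadding}}) are assumptions for illustration. Numbers depend on hardware, JVM version, and JIT warm-up, so treat them as indicative only.

```java
import java.security.GeneralSecurityException;
import javax.crypto.Cipher;
import javax.crypto.KeyGenerator;
import javax.crypto.spec.IvParameterSpec;

// Illustrative only: a rough timing loop, not a rigorous benchmark.
class CipherThroughputSketch {

    // Encrypts `megabytes` buffers of 1 MiB each and returns MiB per second.
    static double throughputMBps(String transformation, String keyAlg,
                                 int keyBits, int ivLen, int megabytes) {
        try {
            KeyGenerator kg = KeyGenerator.getInstance(keyAlg);
            kg.init(keyBits);
            Cipher cipher = Cipher.getInstance(transformation);
            // A fixed all-zero IV is acceptable only because this is a timing test.
            cipher.init(Cipher.ENCRYPT_MODE, kg.generateKey(),
                        new IvParameterSpec(new byte[ivLen]));

            byte[] in = new byte[1 << 20];
            byte[] out = new byte[cipher.getOutputSize(in.length)];

            long start = System.nanoTime();
            for (int i = 0; i < megabytes; i++) {
                cipher.update(in, 0, in.length, out);
            }
            double seconds = (System.nanoTime() - start) / 1e9;
            return megabytes / seconds;
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        int mb = 64;
        System.out.printf("3DES: %.1f MiB/s%n",
            throughputMBps("DESede/CBC/NoPadding", "DESede", 168, 8, mb));
        System.out.printf("AES:  %.1f MiB/s%n",
            throughputMBps("AES/CTR/NoPadding", "AES", 128, 16, mb));
    }
}
```

For numbers worth quoting, a proper harness such as JMH with warm-up iterations would be a better choice than this loop.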



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
