[ https://issues.apache.org/jira/browse/HDFS-6606?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14125246#comment-14125246 ]
Yi Liu commented on HDFS-6606:
------------------------------

Thanks [~cnauroth] for the review. You are right, this JIRA is fully compatible with the work in HDFS-2856.

{quote}
If I understand correctly, the DFSClient is still going to contact the NameNode to obtain an encryption key via ClientProtocol#getDataEncryptionKey when dfs.encrypt.data.transfer is true, but then the result wouldn't actually be used if a cipher is negotiated. It's a shame to keep around that extraneous RPC, but it's very small, and I don't see an easy way to change the code to avoid it. Maybe we could queue this up for future consideration.
{quote}

Right, the DFSClient still contacts the NN to obtain a key via {{ClientProtocol#getDataEncryptionKey}}. But the obtained key is still used even if a cipher option is negotiated: we use it to encrypt the negotiated cipher key (using SASL wrap/unwrap). So the key obtained via {{getDataEncryptionKey}} is only used to encrypt the cipher key, and the data is now encrypted with the cipher key.

The proposed approach doesn't add an extra RPC and works with the original configuration: if {{dfs.encrypt.data.transfer}} is true or {{dfs.data.transfer.protection}} (HDFS-2856 style) is *privacy*, the DFS client and datanode will negotiate a cipher for encryption.

The tests you suggest are pretty good. For the current implementation, these tests may only need to check that the DFS client and datanode do negotiate a cipher option, since:
* {{TestSaslDataTransfer}} and {{TestBalancerWithSaslDataTransfer}} already use {{dfs.data.transfer.protection}} and cover the encryption case (_privacy_), and {{TestBalancerWithEncryptedTransfer}} already includes end-to-end tests with the balancer, and the proposed approach works with the original configuration.

I will update the patch to address your comments later.
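
To make the two-level key usage described above concrete, here is a minimal, self-contained Java sketch. It is not code from the patch. The class and method names ({{TwoLevelKeySketch}}, {{wrapCipherKey}}, and so on) are hypothetical, a locally generated AES key stands in for the key returned by {{getDataEncryptionKey}}, and the JDK's {{AESWrap}} transformation is only a stand-in for the SASL wrap/unwrap step. The bulk data is encrypted with AES in CTR mode under the cipher key, which is also an assumption about the mode.

```java
import java.nio.charset.StandardCharsets;
import java.security.GeneralSecurityException;
import java.security.SecureRandom;
import java.util.Arrays;
import javax.crypto.Cipher;
import javax.crypto.KeyGenerator;
import javax.crypto.SecretKey;
import javax.crypto.spec.IvParameterSpec;

// Illustrative only: all names here are hypothetical, not Hadoop APIs.
final class TwoLevelKeySketch {

    // Generates a fresh 128-bit AES key.
    static SecretKey newAesKey() throws GeneralSecurityException {
        KeyGenerator kg = KeyGenerator.getInstance("AES");
        kg.init(128);
        return kg.generateKey();
    }

    // Stand-in for SASL wrap: protects the cipher key with the transfer key.
    static byte[] wrapCipherKey(SecretKey transferKey, SecretKey cipherKey)
            throws GeneralSecurityException {
        Cipher c = Cipher.getInstance("AESWrap");
        c.init(Cipher.WRAP_MODE, transferKey);
        return c.wrap(cipherKey);
    }

    // Stand-in for SASL unwrap: recovers the cipher key on the receiving side.
    static SecretKey unwrapCipherKey(SecretKey transferKey, byte[] wrapped)
            throws GeneralSecurityException {
        Cipher c = Cipher.getInstance("AESWrap");
        c.init(Cipher.UNWRAP_MODE, transferKey);
        return (SecretKey) c.unwrap(wrapped, "AES", Cipher.SECRET_KEY);
    }

    // Encrypts or decrypts bulk data with AES/CTR under the cipher key.
    static byte[] aesCtr(int mode, SecretKey key, byte[] iv, byte[] data)
            throws GeneralSecurityException {
        Cipher c = Cipher.getInstance("AES/CTR/NoPadding");
        c.init(mode, key, new IvParameterSpec(iv));
        return c.doFinal(data);
    }

    // Runs both sides of the exchange and returns the receiver's plaintext.
    static byte[] roundTrip(byte[] plaintext) {
        try {
            // Assumption: stands in for the key obtained via getDataEncryptionKey.
            SecretKey transferKey = newAesKey();
            // The cipher key that encrypts the actual data.
            SecretKey cipherKey = newAesKey();
            byte[] iv = new byte[16];
            new SecureRandom().nextBytes(iv);

            // Sender: protect the cipher key, then encrypt data with it.
            byte[] wrapped = wrapCipherKey(transferKey, cipherKey);
            byte[] ciphertext = aesCtr(Cipher.ENCRYPT_MODE, cipherKey, iv, plaintext);

            // Receiver: recover the cipher key, then decrypt the data.
            SecretKey recovered = unwrapCipherKey(transferKey, wrapped);
            return aesCtr(Cipher.DECRYPT_MODE, recovered, iv, ciphertext);
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        byte[] msg = "block data".getBytes(StandardCharsets.UTF_8);
        System.out.println(Arrays.equals(msg, roundTrip(msg)));
    }
}
```

The point of the sketch is only the division of labor: the first key never touches the bulk data, so its (slow) protection mechanism is paid once per connection, while the per-byte cost is that of AES.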

> Optimize HDFS Encrypted Transport performance
> ---------------------------------------------
>
>                 Key: HDFS-6606
>                 URL: https://issues.apache.org/jira/browse/HDFS-6606
>             Project: Hadoop HDFS
>          Issue Type: Improvement
>          Components: datanode, hdfs-client, security
>            Reporter: Yi Liu
>            Assignee: Yi Liu
>         Attachments: HDFS-6606.001.patch, HDFS-6606.002.patch,
> HDFS-6606.003.patch, OptimizeHdfsEncryptedTransportperformance.pdf
>
>
> In HDFS-3637, [~atm] added support for encrypting the DataTransferProtocol,
> it was a great work.
> It utilizes SASL {{Digest-MD5}} mechanism (use Qop: auth-conf), it supports
> three security strength:
> * high 3des or rc4 (128bits)
> * medium des or rc4(56bits)
> * low rc4(40bits)
> 3des and rc4 are slow, only *tens of MB/s*,
> http://www.javamex.com/tutorials/cryptography/ciphers.shtml
> http://www.cs.wustl.edu/~jain/cse567-06/ftp/encryption_perf/
> I will give more detailed performance data in future. Absolutely it’s
> bottleneck and will vastly affect the end to end performance.
> AES(Advanced Encryption Standard) is recommended as a replacement of DES,
> it’s more secure; with AES-NI support, the throughput can reach nearly
> *2GB/s*, it won’t be the bottleneck any more, AES and CryptoCodec work is
> supported in HADOOP-10150, HADOOP-10603 and HADOOP-10693 (We may need to add
> a new mode support for AES).
> This JIRA will use AES with AES-NI support as encryption algorithm for
> DataTransferProtocol.

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)