[
https://issues.apache.org/jira/browse/HBASE-16414?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15434340#comment-15434340
]
Colin Ma commented on HBASE-16414:
----------------------------------
Hi, [~apurtell], thanks for your review and comments.
This change targets the performance of SASL token authentication when
encryption is enabled. For example, when a MapReduce job accesses a secure
(Kerberos-enabled) HBase cluster, a delegation token is obtained using the
Kerberos credentials, and the tasks running on the worker nodes then
authenticate with that token. With token authentication, if encryption is
enabled, performance is poor because DES, 3DES, or RC4 is used for encryption.
Replacing these with AES accelerated by AES-NI improves performance
considerably, especially for scan operations. On my 3-node cluster, YCSB shows
a 2.5x improvement for Scan. (The result is based on 100 KB of transferred
data; the improvement should be larger with more transferred data.)
As for your question of why we do not use HBase's "AES": we looked at the
current abstraction of the "AES" Cipher in HBase. RPC data encryption needs
packet-based encryption and decryption (usually on a byte array), and each
packet must be encrypted or decrypted in full. So a Cipher with an update
interface is needed. The current "AES" Cipher interface in HBase focuses on a
stream-based (InputStream and OutputStream) API, which is not a good fit for
packet-based RPC encryption. Since Apache Commons Crypto supports both JCE and
OpenSSL with AES-NI optimization, we chose Apache Commons Crypto at the Cipher
layer.
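To illustrate the packet-based requirement described above, here is a minimal
sketch of a cipher that transforms one whole byte-array packet per call through
an update-style method. It uses the JDK's javax.crypto.Cipher as a stand-in; it
is not the Apache Commons Crypto API and not the code in the patch. The class
name PacketCipherSketch, the method name process, and the choice of
AES/CTR/NoPadding are assumptions made for this example.

```java
import java.nio.charset.StandardCharsets;
import java.security.GeneralSecurityException;
import java.security.SecureRandom;
import java.util.Arrays;
import javax.crypto.Cipher;
import javax.crypto.spec.IvParameterSpec;
import javax.crypto.spec.SecretKeySpec;

// Hypothetical helper: one instance per direction (encrypt or decrypt).
final class PacketCipherSketch {
    private final Cipher cipher;

    PacketCipherSketch(int mode, byte[] key, byte[] iv) {
        try {
            // CTR mode keeps the output the same length as the input packet.
            cipher = Cipher.getInstance("AES/CTR/NoPadding");
            cipher.init(mode, new SecretKeySpec(key, "AES"), new IvParameterSpec(iv));
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException(e);
        }
    }

    // Transforms one packet in full. The counter state carries over to the
    // next call, so both sides must process packets in the same order.
    byte[] process(byte[] packet, int offset, int length) {
        byte[] out = cipher.update(packet, offset, length);
        return out == null ? new byte[0] : out; // update returns null for empty input
    }

    public static void main(String[] args) {
        SecureRandom random = new SecureRandom();
        byte[] key = new byte[16];
        byte[] iv = new byte[16];
        random.nextBytes(key);
        random.nextBytes(iv);

        PacketCipherSketch sender = new PacketCipherSketch(Cipher.ENCRYPT_MODE, key, iv);
        PacketCipherSketch receiver = new PacketCipherSketch(Cipher.DECRYPT_MODE, key, iv);

        for (String text : new String[] {"first packet", "second packet"}) {
            byte[] plain = text.getBytes(StandardCharsets.UTF_8);
            byte[] encrypted = sender.process(plain, 0, plain.length);
            byte[] decrypted = receiver.process(encrypted, 0, encrypted.length);
            System.out.println(text + " round trip ok: " + Arrays.equals(plain, decrypted));
        }
    }
}
```

A stream-based API would instead force each packet through an InputStream or
OutputStream wrapper, which is the overhead the byte-array interface avoids.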
For the implementation that replaces SASL encryption, the Cipher negotiation
of the AES key and IV takes place after the standard SASL negotiation
completes. The basic idea is to exchange the key using SASL wrap and unwrap
(as the SASL session has already been established) and then use the new key to
wrap and unwrap further application data. So it doesn't break the standard
SASL negotiation. Of course, we can add a flag to turn this optimization on or
off. If it is turned off, HBase can still encrypt data packets with the
standard SASL wrap/unwrap.
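The flow described above can be sketched roughly as follows. This is only an
illustration under stated assumptions, not the code in the patch: the
SaslChannel interface is a hypothetical stand-in for the wrap/unwrap methods of
an already established SaslClient or SaslServer, the [keyLen][key][ivLen][iv]
message layout is invented for the example, and AES/CTR/NoPadding from the JDK
is used in place of Apache Commons Crypto.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;
import java.security.GeneralSecurityException;
import java.security.SecureRandom;
import java.util.Arrays;
import javax.crypto.Cipher;
import javax.crypto.spec.IvParameterSpec;
import javax.crypto.spec.SecretKeySpec;

final class AesNegotiationSketch {
    // Hypothetical stand-in for wrap/unwrap on an established SASL session.
    interface SaslChannel {
        byte[] wrap(byte[] data);
        byte[] unwrap(byte[] data);
    }

    // Identity channel so the sketch runs without a real SASL session.
    static SaslChannel identityChannel() {
        return new SaslChannel() {
            public byte[] wrap(byte[] data) { return data.clone(); }
            public byte[] unwrap(byte[] data) { return data.clone(); }
        };
    }

    // Packs key and IV as [keyLen][key][ivLen][iv]; layout invented for this sketch.
    static byte[] encode(byte[] key, byte[] iv) {
        return ByteBuffer.allocate(8 + key.length + iv.length)
                .putInt(key.length).put(key)
                .putInt(iv.length).put(iv)
                .array();
    }

    // Reverses encode(): returns {key, iv}.
    static byte[][] decode(byte[] message) {
        ByteBuffer buf = ByteBuffer.wrap(message);
        byte[] key = new byte[buf.getInt()];
        buf.get(key);
        byte[] iv = new byte[buf.getInt()];
        buf.get(iv);
        return new byte[][] {key, iv};
    }

    // Encrypts or decrypts a single packet with the negotiated key and IV.
    static byte[] aesCtr(int mode, byte[] key, byte[] iv, byte[] data) {
        try {
            Cipher cipher = Cipher.getInstance("AES/CTR/NoPadding");
            cipher.init(mode, new SecretKeySpec(key, "AES"), new IvParameterSpec(iv));
            return cipher.doFinal(data);
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        SaslChannel sasl = identityChannel();

        // Step 1: one side generates the AES key and IV after SASL completes.
        SecureRandom random = new SecureRandom();
        byte[] key = new byte[16];
        byte[] iv = new byte[16];
        random.nextBytes(key);
        random.nextBytes(iv);

        // Step 2: the key material travels protected by SASL wrap/unwrap.
        byte[] onTheWire = sasl.wrap(encode(key, iv));
        byte[][] received = decode(sasl.unwrap(onTheWire));

        // Step 3: application data now uses AES instead of SASL wrap/unwrap.
        byte[] packet = "scan result".getBytes(StandardCharsets.UTF_8);
        byte[] encrypted = aesCtr(Cipher.ENCRYPT_MODE, key, iv, packet);
        byte[] decrypted = aesCtr(Cipher.DECRYPT_MODE, received[0], received[1], encrypted);
        System.out.println("round trip ok: " + Arrays.equals(packet, decrypted));
    }
}
```

The sketch encrypts a single packet only. With CTR mode, a real implementation
would have to keep the counter state across packets (or otherwise never reuse a
key and IV pair), since reusing them for two packets weakens the encryption.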
Thank you again for your comments.
> Improve performance for RPC encryption with Apache Common Crypto
> ----------------------------------------------------------------
>
> Key: HBASE-16414
> URL: https://issues.apache.org/jira/browse/HBASE-16414
> Project: HBase
> Issue Type: Improvement
> Components: IPC/RPC
> Affects Versions: 2.0.0
> Reporter: Colin Ma
> Assignee: Colin Ma
> Attachments: HBASE-16414.001.patch, HbaseRpcEncryptionWithCrypoto.docx
>
>
> HBase RPC encryption is enabled by setting "hbase.rpc.protection" to
> "privacy". With token authentication, it uses the DIGEST-MD5 mechanism
> for secure authentication and data protection. DIGEST-MD5 uses DES, 3DES,
> or RC4 for encryption, which is very slow, especially for Scan. This
> becomes the bottleneck of RPC throughput.
> Apache Commons Crypto is a cryptographic library optimized with AES-NI. It
> provides a Java API at both the cipher level and the Java stream level.
> Developers can use it to implement high-performance AES
> encryption/decryption with minimal code and effort. Compared with the
> current implementation of org.apache.hadoop.hbase.io.crypto.aes.AES, Crypto
> supports both the JCE Cipher and the OpenSSL Cipher, which performs better
> than the JCE Cipher. Users can configure the cipher type; the default is the
> JCE Cipher.