[
https://issues.apache.org/jira/browse/HADOOP-12725?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15179231#comment-15179231
]
Jerry Chen commented on HADOOP-12725:
-------------------------------------
We need to be aware that there are several aspects to Hadoop RPC encryption
optimization. The discussion so far has focused on the GSSAPI used by the SASL
Kerberos mechanism, and on optimizing GSSAPI internally.
For a Hadoop client, Kerberos is usually only the first step of
authentication, used to gain access to the system; different use cases follow
different patterns in the subsequent steps. For example, a MapReduce job
performs Kerberos authentication only at job submission, and all of its tasks
then authenticate with DIGEST-MD5 using delegation tokens. HBase or other
services may follow different patterns. So for the MapReduce case it is at
least as important to also optimize the DIGEST-MD5 auth-conf implementation.
From our experiments with Spark RPC encryption, DIGEST-MD5 does not support
the AES algorithm, and taking 3DES, one of the ciphers it does support, as an
example: 3DES is very slow, with a throughput of possibly only 10 - 20 Mb/s.
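To make the 3DES point concrete, below is a minimal JCE micro-benchmark
sketch (illustrative only, not the tool we used for the Spark measurements;
absolute numbers vary with JVM, JCE provider and CPU) comparing 3DES
(DESede/CBC) against AES/CTR, which can use AES-NI on modern CPUs:
{code:java}
import javax.crypto.Cipher;
import javax.crypto.KeyGenerator;
import javax.crypto.SecretKey;
import javax.crypto.spec.IvParameterSpec;

// Rough throughput check: encrypt ~256 MB with each cipher and report MB/s.
public class CipherThroughput {
  public static void main(String[] args) throws Exception {
    byte[] data = new byte[64 * 1024];   // 64 KB buffer, block-aligned
    int iterations = 4096;               // ~256 MB total per cipher
    run("DESede/CBC/NoPadding", "DESede", 8, data, iterations);  // 3DES
    run("AES/CTR/NoPadding", "AES", 16, data, iterations);
  }

  static void run(String transform, String alg, int ivLen,
                  byte[] data, int iterations) throws Exception {
    SecretKey key = KeyGenerator.getInstance(alg).generateKey();
    Cipher c = Cipher.getInstance(transform);
    c.init(Cipher.ENCRYPT_MODE, key, new IvParameterSpec(new byte[ivLen]));
    long start = System.nanoTime();
    for (int i = 0; i < iterations; i++) {
      c.update(data);                    // stream-encrypt, discard output
    }
    double secs = (System.nanoTime() - start) / 1e9;
    double mb = (double) data.length * iterations / (1024 * 1024);
    System.out.printf("%-22s %8.1f MB/s%n", transform, mb / secs);
  }
}
{code}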
So RPC encryption optimization should be considered as a whole from the
beginning.
As far as I see now, there might be two approaches:
1. Optimize each mechanism individually (GSSAPI, DIGEST-MD5, ...). The
discussion above fits here, but covers only the Kerberos-related GSSAPI.
2. Optimize on top of the individual mechanisms and build our own auth-conf
layer with AES-NI optimization.
In HADOOP-10768, Andrew Purtell mentioned this approach: "One could wrap the
initial payloads with whatever encryption was negotiated during connection
initiation until completing additional key exchange and negotiation steps, then
switch to an alternate means of applying a symmetric cipher to RPC payloads."
HDFS-6606 also took this approach to optimize data transfer encryption.
Option #2 has the advantage that the Hadoop RPC implementation controls all
of the optimizations and does not depend on optimizations in the underlying
mechanisms.
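To illustrate what the #2 layer could look like, here is a minimal,
hypothetical sketch (class and method names are illustrative, not an existing
Hadoop API; the key/IV exchange is elided and would happen inside the
already-established SASL-protected channel, similar to what HDFS-6606 does):
{code:java}
import java.security.SecureRandom;
import javax.crypto.Cipher;
import javax.crypto.spec.IvParameterSpec;
import javax.crypto.spec.SecretKeySpec;

// Hypothetical replacement for SaslClient#wrap/#unwrap in the auth-conf case,
// applied after authentication completes. AES/CTR is a stream mode, so it is
// AES-NI friendly and adds no padding. Note: CTR alone gives confidentiality
// but no integrity; a real design would add a MAC or use an AEAD mode (GCM).
public class AesRpcWrapper {
  private final Cipher encryptor;
  private final Cipher decryptor;

  AesRpcWrapper(byte[] key, byte[] iv) throws Exception {
    SecretKeySpec k = new SecretKeySpec(key, "AES");
    encryptor = Cipher.getInstance("AES/CTR/NoPadding");
    encryptor.init(Cipher.ENCRYPT_MODE, k, new IvParameterSpec(iv));
    decryptor = Cipher.getInstance("AES/CTR/NoPadding");
    decryptor.init(Cipher.DECRYPT_MODE, k, new IvParameterSpec(iv));
  }

  // Both endpoints must process payloads in the same order so the CTR
  // keystreams stay in sync.
  byte[] wrap(byte[] rpcPayload) {
    return encryptor.update(rpcPayload);
  }

  byte[] unwrap(byte[] wire) {
    return decryptor.update(wire);
  }

  public static void main(String[] args) throws Exception {
    byte[] key = new byte[16];
    byte[] iv = new byte[16];
    new SecureRandom().nextBytes(key);
    new SecureRandom().nextBytes(iv);   // in practice: negotiated, not random per side
    AesRpcWrapper w = new AesRpcWrapper(key, iv);
    byte[] roundTrip = w.unwrap(w.wrap("hello rpc".getBytes("UTF-8")));
    System.out.println(new String(roundTrip, "UTF-8"));
  }
}
{code}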
> RPC encryption benchmark and optimization prototypes
> ----------------------------------------------------
>
> Key: HADOOP-12725
> URL: https://issues.apache.org/jira/browse/HADOOP-12725
> Project: Hadoop Common
> Issue Type: Improvement
> Reporter: Kai Zheng
> Assignee: Wei Zhou
>
> This would implement a benchmark tool to measure and compare the performance
> of Hadoop IPC/RPC calls when security is enabled and different SASL
> QOP (Quality of Protection) levels are enforced. Given the data collected by
> this benchmark, it would then be possible to know whether there is any
> performance concern when enforcing the privacy, integrity, or authentication
> protection levels, and to optimize accordingly.
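For reference, the QOP levels above are enforced in Hadoop through the real
hadoop.rpc.protection setting (authentication, integrity, privacy, mapping to
SASL QOP auth, auth-int, auth-conf), so a benchmark harness could sweep them
roughly like this (sketch; the RPC workload itself is elided):
{code:java}
import org.apache.hadoop.conf.Configuration;

// Sketch of a QOP sweep; each level is applied via hadoop.rpc.protection.
public class QopSweep {
  public static void main(String[] args) {
    for (String level : new String[] {"authentication", "integrity", "privacy"}) {
      Configuration conf = new Configuration();
      conf.set("hadoop.rpc.protection", level);
      // ... stand up an RPC client/server with this conf and time the calls ...
      System.out.println("would benchmark with hadoop.rpc.protection=" + level);
    }
  }
}
{code}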
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)