[
https://issues.apache.org/jira/browse/HADOOP-16119?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16785731#comment-16785731
]
Wei-Chiu Chuang commented on HADOOP-16119:
------------------------------------------
Hi [~hexiaoqiao], I really appreciate your insights!
Regarding delegation tokens – delegation tokens are stored in ZooKeeper, and
after HADOOP-14445, delegation tokens are shared among KMS instances.
Key store consistency – I am not sure how others use KMS. But within CDH, we
have a plugin that directs the requests to a backend server "Cloudera
KeyTrustee Server". Essentially, KMS serves as a proxy for the backend.
Therefore consistency is guaranteed.
Cloudera KeyTrustee Server is currently a proprietary component. But it sounds
like Cloudera will eventually become "100% open source", so that's an option
for you. I think your proposal makes sense. I am just not sure how much work
it will require. At Cloudera there is a team dedicated to Cloudera KeyTrustee
Server development, so I imagine it's a non-trivial amount of work.
IMHO, I am looking forward to a good persistent+consistent key store too, if we
can come up with a good design. In fact, I am concerned about CKTS performance
under the said load.
[~anu] [~xyao] does the Sentry KMS provide a persistent+consistent key store by
any chance?
Adding/removing a KMS instance requires client side change, that is correct.
Currently that requires a cluster-wide rolling restart. I imagine we could use
NameNode's FsServerDefaults to update that dynamically.
I am not clear about the HA argument. In the current design, a KMS connection
is not "sticky", meaning that regardless of the KMS status, _each KMS request_
has an equal probability of attempting to reach a dead KMS. Is that what
you meant?
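To make the "not sticky" point concrete, here is a small simulation. It is only a sketch under stated assumptions, not the actual KMS client code: the function names are hypothetical, and plain round-robin rotation stands in for whatever selection policy the client really uses. It counts how many requests make their first attempt against a dead instance, with and without stickiness.

```python
def non_sticky_first_attempts(num_instances, dead, num_requests):
    """Count requests whose first attempt lands on a dead instance when the
    starting instance rotates on every request (no memory of past failures).
    Round-robin rotation is an assumption made for this illustration."""
    hits = 0
    for request_id in range(num_requests):
        first = request_id % num_instances  # starting point for this request
        if first in dead:
            hits += 1  # this request must fail over to another instance
    return hits


def sticky_first_attempts(num_instances, dead, num_requests, start=0):
    """Same count for a hypothetical sticky client that keeps using the last
    instance that answered successfully."""
    if len(dead) >= num_instances:
        raise ValueError("at least one instance must be alive")
    hits = 0
    current = start
    for _ in range(num_requests):
        if current in dead:
            hits += 1
            # Fail over once, then remember the live instance that was found.
            while current in dead:
                current = (current + 1) % num_instances
    return hits


if __name__ == "__main__":
    # Three instances, instance 1 is down, 300 requests.
    print(non_sticky_first_attempts(3, {1}, 300))       # one third of requests
    print(sticky_first_attempts(3, {1}, 300, start=1))  # only the first request
```

Under these assumptions, every third request in the non-sticky case pays the cost of a failed first attempt for as long as the instance stays down, while the sticky variant pays it once.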
> KMS on Hadoop RPC Engine
> ------------------------
>
> Key: HADOOP-16119
> URL: https://issues.apache.org/jira/browse/HADOOP-16119
> Project: Hadoop Common
> Issue Type: New Feature
> Reporter: Jonathan Eagles
> Assignee: Wei-Chiu Chuang
> Priority: Major
> Attachments: Design doc_ KMS v2.pdf
>
>
> Per discussion on common-dev and text copied here for ease of reference.
> https://lists.apache.org/thread.html/0e2eeaf07b013f17fad6d362393f53d52041828feec53dcddff04808@%3Ccommon-dev.hadoop.apache.org%3E
> {noformat}
> Thanks all for the inputs,
> To offer additional information (while Daryn is working on his stuff),
> optimizing RPC encryption opens up another possibility: migrating KMS
> service to use Hadoop RPC.
> Today's KMS uses HTTPS + REST API, much like webhdfs. It has very
> undesirable performance (a few thousand ops per second) compared to
> NameNode. Unfortunately for each NameNode namespace operation you also need
> to access KMS too.
> Migrating KMS to Hadoop RPC greatly improves its performance (if
> implemented correctly), and RPC encryption would be a prerequisite. So
> please keep that in mind when discussing the Hadoop RPC encryption
> improvements. Cloudera is very interested to help with the Hadoop RPC
> encryption project because a lot of our customers are using at-rest
> encryption, and some of them are starting to hit KMS performance limit.
> This whole "migrating KMS to Hadoop RPC" was Daryn's idea. I heard this
> idea in the meetup and I am very thrilled to see this happening because it
> is a real issue bothering some of our customers, and I suspect it is the
> right solution to address this tech debt.
> {noformat}