[ https://issues.apache.org/jira/browse/HADOOP-10433?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13984101#comment-13984101 ]
Andrew Wang commented on HADOOP-10433: -------------------------------------- Thanks for the rev here Tucu, I can tell this wasn't a small effort. Overall it looks real nice, the new REST API is fine by me. I have some more review comments though, but expect to +1 after this round (pending agreement by reviewers on the REST API details). Running the KMS: * Overall a more pleasant experience, tarball builds, audit logging has newlines, I didn't see stacktraces when running commands. * I know I asked for less spew in kms.sh, but defaulting KMS_SILENT to true is perhaps too little. Without it, running just "./sbin/kms.sh" shows nothing, not even help text, and if start/stop hit an error, we again see nothing (have to know where the logs are and hope it has the error there). This isn't a biggie, but maybe we can improve it a little more. JSON: * I see KMSMetadata and KMSKeyVersion classes, but they don't look like they do much. I'm guessing this is from an aborted attempt to do POJO<->JSON? That'd would be awesome, but I won't get too excited. * If you stick with the JSON helper methods, could we pull them all out into a separate helper class and make them all static, along with the field constants? Feels weird to have the Server digging into ClientProvider for field names, when theoretically a different client implementation (REST!) could be interfacing with the KMS. KMSServer and KMSClientProvider: * Noticed that KMSServer#getKeyMetadata includes the name, but it's unused on the client side, nor part of KeyProvider#Metadata. * I think we still miss audit logging if an exception is thrown. e.g. in #rolloverKey, getPrincipal, checkNotEmpty, provider.rollNewVersion can all throw something. This is why a try/finally might be easiest. * Can make many methods static: validateResponse, call, all the JSON functions, getPrincipal, assertAccess, removeKeyMaterial, getKeyURI * On the NoSuchAlgorithmException, #validateResponse casts everything to an IOException, so I don't think anything else comes out of this function. Not sure this cast is kosher. I think the catch for Exception there also needs to wrap the created String into an exception and throw it. * Given the discussion about UUIDs as versions, should we be exposing KMSClientProvider#buildVersionName publicly? This seems like an implementation detail of the backing KeyProvider. Misc things: * I think a rogue regex changed GET to GET_OP across the entire hadoop repo, which is why we're picking up all these test changes, and quite possibly why we have some weird test failures. * KMSACLs, the new comment about in #run probably should go in #loadACLs, or be removed. * KeyNotFoundException serialVersionUID, this is something Eclipse is warning about, so it'd be nice to just add it. * In the cache provider, #getCurrentKey and getKeyVersion, could set the cause once in a temp var to avoid having to call getCause() repeatedly. * KMSServer semantically should still be renamed to KMServer or just KMS > Key Management Server based on KeyProvider API > ---------------------------------------------- > > Key: HADOOP-10433 > URL: https://issues.apache.org/jira/browse/HADOOP-10433 > Project: Hadoop Common > Issue Type: Improvement > Components: security > Affects Versions: 3.0.0 > Reporter: Alejandro Abdelnur > Assignee: Alejandro Abdelnur > Attachments: HADOOP-10433.patch, HADOOP-10433.patch, > HADOOP-10433.patch, HADOOP-10433.patch, HADOOP-10433.patch, > HADOOP-10433.patch, HADOOP-10433.patch, HADOOP-10433.patch, > HadoopKMSDocsv2.pdf, KMS-doc.pdf > > > (from HDFS-6134 proposal) > Hadoop KMS is the gateway, for Hadoop and Hadoop clients, to the underlying > KMS. It provides an interface that works with existing Hadoop security > components (authenticatication, confidentiality). > Hadoop KMS will be implemented leveraging the work being done in HADOOP-10141 > and HADOOP-10177. > Hadoop KMS will provide an additional implementation of the Hadoop > KeyProvider class. This implementation will be a client-server implementation. > The client-server protocol will be secure: > * Kerberos HTTP SPNEGO (authentication) > * HTTPS for transport (confidentiality and integrity) > * Hadoop ACLs (authorization) > The Hadoop KMS implementation will not provide additional ACL to access > encrypted files. For sophisticated access control requirements, HDFS ACLs > (HDFS-4685) should be used. > Basic key administration will be supported by the Hadoop KMS via the, already > available, Hadoop KeyShell command line tool > There are minor changes that must be done in Hadoop KeyProvider functionality: > The KeyProvider contract, and the existing implementations, must be > thread-safe > KeyProvider API should have an API to generate the key material internally > JavaKeyStoreProvider should use, if present, a password provided via > configuration > KeyProvider Option and Metadata should include a label (for easier > cross-referencing) > To avoid overloading the underlying KeyProvider implementation, the Hadoop > KMS will cache keys using a TTL policy. > Scalability and High Availability of the Hadoop KMS can achieved by running > multiple instances behind a VIP/Load-Balancer. For High Availability, the > underlying KeyProvider implementation used by the Hadoop KMS must be High > Available. -- This message was sent by Atlassian JIRA (v6.2#6252)