[
https://issues.apache.org/jira/browse/HADOOP-14104?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15903942#comment-15903942
]
Andrew Wang commented on HADOOP-14104:
--------------------------------------
Here's my review on v2, along with responses to some of Yongjun's questions.
Thanks for working on this Rushabh!
bq. in createWrappedOutputStream(), where conf is the configuration of the local
cluster. There is a possibility that the local configuration is different from the
remote cluster's. So it's possible to fail here.
The conf is only used to configure the CryptoCodec, which does not need a KMS
URI. I think this is okay; all the CryptoCodec parameters are included in the {{feInfo}}.
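For reference, a minimal sketch of what I mean (not the exact patch code, just the shape of it; the class and method names here are placeholders): the codec is driven entirely by the {{feInfo}}'s cipher suite and IV, and {{conf}} only selects/configures the CryptoCodec implementation, so no KMS URI is involved at this point.
{code:java}
import java.io.IOException;
import java.io.OutputStream;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.crypto.CryptoCodec;
import org.apache.hadoop.crypto.CryptoOutputStream;
import org.apache.hadoop.crypto.key.KeyProvider.KeyVersion;
import org.apache.hadoop.fs.FileEncryptionInfo;

// Sketch only: everything the codec needs is carried by the FileEncryptionInfo;
// conf just picks the CryptoCodec implementation.
public class WrapSketch {
  static OutputStream wrap(Configuration conf, OutputStream out,
      FileEncryptionInfo feInfo, KeyVersion decryptedKey) throws IOException {
    CryptoCodec codec = CryptoCodec.getInstance(conf, feInfo.getCipherSuite());
    return new CryptoOutputStream(out, codec,
        decryptedKey.getMaterial(), feInfo.getIV());
  }
}
{code}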
bq. @awang would you please confirm if it's ok to do so since this class is
public), and use this constant at the multiple places that currently use ""
Yeah, it's fine. I'd like to see this improved in this patch if possible, since it
removes some redundancy.
bq. 3. Notice that "dfs.encryption.key.provider.uri" is deprecated and replaced
with hadoop.security.key.provider.path (see HDFS-10489). So I suggest replacing the
variable name keyProviderUri with keyProviderPath
I think we can ignore this for now, as Rushabh said; we can handle it in another
JIRA if necessary.
bq. Seems we need a similar change in WebHdfsFileSystem when calling
addDelegationTokens
The DN does the encryption/decryption in WebHDFS, so the client doesn't need to
do any KMS communication.
It does bring up a question about the DN-side DFSClient, though. It looks like
WebHdfsHandler creates a new DFSClient for each request, which means we won't
benefit from getServerDefaults caching.
Is the fix to prefer the config over getServerDefaults there?
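To spell out the caching concern (a rough fragment of the pattern; the field and constant names below are illustrative, not DFSClient's actual ones): the cached defaults live on the client instance, so a handler that constructs a new DFSClient per request pays the getServerDefaults() RPC every time.
{code:java}
// Illustrative fragment of per-instance caching inside a client like DFSClient;
// names are made up for this sketch.
private FsServerDefaults serverDefaults;        // lives on the client instance
private long serverDefaultsFetchTime;
private static final long DEFAULTS_VALIDITY_MS = 60 * 60 * 1000L;

public FsServerDefaults getServerDefaults() throws IOException {
  long now = Time.monotonicNow();
  if (serverDefaults == null
      || now - serverDefaultsFetchTime > DEFAULTS_VALIDITY_MS) {
    serverDefaults = namenode.getServerDefaults();   // RPC to the NameNode
    serverDefaultsFetchTime = now;
  }
  return serverDefaults;   // a freshly constructed client never starts warm
}
{code}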
bq. <StandbyException>
Does this exception actually make it to clients? The HA RPC proxy normally
catches the StandbyException and fails over to the other NN. We can write a
unit test to verify this if we're unsure.
My additional review comments:
* typo nameonde
* ServerDefaults#getKeyProviderUri needs javadoc explaining how to interpret
null and empty (IIUC null means not set, empty means set but not enabled); see the
javadoc sketch after this list
* In docs, "DFSClients" is an internal name. Please rename it to "HDFS
clients" or similar. Same for "dfs clients" in core-default.xml
* There are a lot of whitespace errors; please take a look at what checkstyle
flags. I'd recommend using IDE autoformatting in the future.
* An actual mini-cluster test that mimics an MR job submitter and task's call
pattern would also be good.
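For the {{getKeyProviderUri}} javadoc point above, something along these lines (exact wording up to you; this just captures the null/empty semantics as I understand them):
{code:java}
/**
 * @return the key provider URI for this cluster; {@code null} if the
 *         NameNode did not set one, or the empty string if it is set but
 *         encryption is not enabled on the cluster.
 */
public String getKeyProviderUri() {
  return keyProviderUri;
}
{code}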
TestEncryptionZones:
* Would like to see the test reversed so it covers the fallback behavior
(precedence sketched after this list), i.e.
* set client config with kp1, check that it returns kp1
* mock getServerDefaults() with kp2, check it returns kp2
* set Credentials with kp3, check it returns kp3
* typo originalNameodeUri
* {{String lookUpKey = DFSClient.DFS_KMS_PREFIX +
originalNameodeUri.toString();}} should this be a {{getKey}} helper method in
DFSClient rather than having the test code also construct the key?
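Something like this is what I had in mind for the helper (the name is a placeholder, pick whatever fits):
{code:java}
// Fragment that would live in DFSClient; keeps the
// "DFS_KMS_PREFIX + <namenode URI>" key format in one place so tests don't
// have to rebuild it by hand.
public static String getKeyProviderMapKey(URI namenodeUri) {
  return DFS_KMS_PREFIX + namenodeUri.toString();
}
{code}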
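And re the reversed fallback test above, this is roughly the precedence I'd expect it to exercise (a standalone illustration only; the method and key names are mine, not the patch's):
{code:java}
import java.nio.charset.StandardCharsets;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FsServerDefaults;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.security.Credentials;

// Illustrative only: Credentials (kp3) beat getServerDefaults() (kp2),
// which beats the client config (kp1).
public class KeyProviderUriPrecedence {
  static String resolve(Credentials creds, Text credsKey,
      FsServerDefaults serverDefaults, Configuration conf) {
    byte[] fromCreds = creds.getSecretKey(credsKey);
    if (fromCreds != null && fromCreds.length > 0) {
      return new String(fromCreds, StandardCharsets.UTF_8);      // kp3
    }
    String fromNn = serverDefaults == null
        ? null : serverDefaults.getKeyProviderUri();
    if (fromNn != null && !fromNn.isEmpty()) {
      return fromNn;                                              // kp2
    }
    return conf.getTrimmed("dfs.encryption.key.provider.uri");   // kp1
  }
}
{code}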
> Client should always ask namenode for kms provider path.
> --------------------------------------------------------
>
> Key: HADOOP-14104
> URL: https://issues.apache.org/jira/browse/HADOOP-14104
> Project: Hadoop Common
> Issue Type: Improvement
> Components: kms
> Reporter: Rushabh S Shah
> Assignee: Rushabh S Shah
> Attachments: HADOOP-14104-trunk.patch, HADOOP-14104-trunk-v1.patch,
> HADOOP-14104-trunk-v2.patch, HADOOP-14104-trunk-v3.patch
>
>
> According to the current implementation of the kms provider in the client conf,
> there can only be one kms.
> In a multi-cluster environment, if a client is reading encrypted data from
> multiple clusters, it will only get the kms token for the local cluster.
> Not sure whether the target version is correct.