[
https://issues.apache.org/jira/browse/HADOOP-14808?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
John Zhuge updated HADOOP-14808:
--------------------------------
Attachment: HADOOP-14808.001.patch
Patch 001
* Common code to load keychain from KeychainLoader services
* Add CredentialProviderKeychainLoader to load credentials from credential
providers. The JCEKS stores can be encrypted.
* Add AzureCLIKeychainLoader to load credentials from Azure CLI token cache
file. This should probably be committed in a separate JIRA.
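
For readers who have not opened the patch, the loader idea in the list above might look roughly like the sketch below. This is only an illustration under stated assumptions, not the code in HADOOP-14808.001.patch: the {{KeychainSketch}} class, the {{KeychainLoader}} interface shape, the {{loadKeychain}} helper, and the "first loader wins" rule for duplicate aliases are all hypothetical, and the secret values are placeholders. Mapping the Azure CLI refresh token to the {{fs.adl.oauth2.refresh.token}} alias is likewise an assumption.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class KeychainSketch {

    /** Hypothetical loader contract; the interface in the patch may differ. */
    interface KeychainLoader {
        /** Returns alias -> secret pairs found by this loader. */
        Map<String, char[]> load();
    }

    /**
     * Merges the secrets from every loader into one map.
     * Loaders are consulted in list order and the first value for an alias wins
     * (an assumed precedence rule, chosen only for this sketch).
     */
    static Map<String, char[]> loadKeychain(List<KeychainLoader> loaders) {
        Map<String, char[]> keychain = new LinkedHashMap<>();
        for (KeychainLoader loader : loaders) {
            for (Map.Entry<String, char[]> entry : loader.load().entrySet()) {
                keychain.putIfAbsent(entry.getKey(), entry.getValue());
            }
        }
        return keychain;
    }

    public static void main(String[] args) {
        // Stand-in for a loader backed by a credential provider (placeholder values).
        KeychainLoader credentialStore = () -> Map.of(
            "fs.s3a.access.key", "example-access-key".toCharArray(),
            "fs.s3a.secret.key", "example-secret-key".toCharArray());
        // Stand-in for a loader backed by the Azure CLI token cache (assumed alias).
        KeychainLoader azureCli = () -> Map.of(
            "fs.adl.oauth2.refresh.token", "example-refresh-token".toCharArray());

        Map<String, char[]> keychain = loadKeychain(List.of(credentialStore, azureCli));
        // Print aliases only; secret values should never be logged.
        keychain.keySet().forEach(System.out::println);
    }
}
```

Keeping each source behind a small interface is what would let the Azure CLI loader move to a separate JIRA without touching the common loading code.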
Testing Done
* Preparation
** Add S3A credentials to an encrypted JCEKS store:
/Users/jzhuge/.config/hadoop/keychain.jceks
** Run Azure CLI to log in as an end user. For details on Azure CLI 2.0, refer
to
https://docs.microsoft.com/en-us/cli/azure/install-azure-cli?view=azure-cli-latest.
* List an S3 bucket
* List an ADLS file system
* DistCp from S3 bucket to ADLS
{noformat}
$ hadoop credential list -provider localjceks://file/Users/jzhuge/.config/hadoop/keychain.jceks
Listing aliases for CredentialProvider: localjceks://file/Users/jzhuge/.config/hadoop/keychain.jceks
fs.s3a.secret.key
fs.s3a.access.key
$ az login -u [email protected]
Password:
$ hadoop fs -Dfs.adl.oauth2.access.token.provider.type=RefreshToken -ls adl://store.azuredatalakestore.net/
2017-09-04 21:41:41,355 INFO security.CredentialProviderKeychainLoader: loading credentials from localjceks://file/Users/jzhuge/.config/hadoop/keychain.jceks
2017-09-04 21:41:41,568 INFO security.CredentialProviderKeychainLoader: loading secret fs.s3a.secret.key
2017-09-04 21:41:41,573 INFO security.CredentialProviderKeychainLoader: loading secret fs.s3a.access.key
2017-09-04 21:41:41,576 INFO adl.AzureCLIKeychainLoader: loading the credential from Azure CLI /Users/jzhuge/.azure/accessTokens.json
2017-09-04 21:41:41,749 INFO adl.AzureCLIKeychainLoader: loading refresh token for [email protected]
Found 12 items
drwxr-xr-x+ - a8005e54-3276-4b9f-b500-0cf2272a0634 6c4d58c1-4e75-40e1-b7a2-e97ff15c6f31 0 2017-07-26 23:47 adl://store.azuredatalakestore.net/dict
drwxr-xr-x+ - a8005e54-3276-4b9f-b500-0cf2272a0634 6c4d58c1-4e75-40e1-b7a2-e97ff15c6f31 0 2017-08-03 00:43 adl://store.azuredatalakestore.net/keychain_test
drwxr-xr-x+ - a0c43012-fd2a-42a3-90e9-0649584176c0 6c4d58c1-4e75-40e1-b7a2-e97ff15c6f31 0 2017-08-13 00:35 adl://store.azuredatalakestore.net/test
-rw-r--r--+ 1 a0c43012-fd2a-42a3-90e9-0649584176c0 6c4d58c1-4e75-40e1-b7a2-e97ff15c6f31 0 2017-08-04 23:28 adl://store.azuredatalakestore.net/testRmRootRecursive
$ hadoop fs -ls s3a://bucket/
2017-09-04 21:41:52,947 INFO security.CredentialProviderKeychainLoader: loading credentials from localjceks://file/Users/jzhuge/.config/hadoop/keychain.jceks
2017-09-04 21:41:53,200 INFO security.CredentialProviderKeychainLoader: loading secret fs.s3a.secret.key
2017-09-04 21:41:53,204 INFO security.CredentialProviderKeychainLoader: loading secret fs.s3a.access.key
2017-09-04 21:41:53,207 INFO adl.AzureCLIKeychainLoader: loading the credential from Azure CLI /Users/jzhuge/.azure/accessTokens.json
2017-09-04 21:41:53,388 INFO adl.AzureCLIKeychainLoader: loading refresh token for [email protected]
2017-09-04 21:41:55,033 INFO Configuration.deprecation: fs.s3a.server-side-encryption-key is deprecated. Instead, use fs.s3a.server-side-encryption.key
Found 9 items
drwxrwxrwx - jzhuge jzhuge 0 2017-09-04 21:41 s3a://bucket/Users
drwxrwxrwx - jzhuge jzhuge 0 2017-09-04 21:41 s3a://bucket/keychain_test
drwxrwxrwx - jzhuge jzhuge 0 2017-09-04 21:41 s3a://bucket/test
drwxrwxrwx - jzhuge jzhuge 0 2017-09-04 21:41 s3a://bucket/tg
$ hadoop distcp -Dfs.adl.oauth2.access.token.provider.type=RefreshToken s3a://bucket/tg adl://store.azuredatalakestore.net/tg.cp
2017-09-04 21:53:35,082 INFO security.CredentialProviderKeychainLoader: loading credentials from localjceks://file/Users/jzhuge/.config/hadoop/keychain.jceks
2017-09-04 21:53:35,210 INFO security.CredentialProviderKeychainLoader: loading secret fs.s3a.secret.key
2017-09-04 21:53:35,214 INFO security.CredentialProviderKeychainLoader: loading secret fs.s3a.access.key
2017-09-04 21:53:35,216 INFO adl.AzureCLIKeychainLoader: loading the credential from Azure CLI /Users/jzhuge/.azure/accessTokens.json
2017-09-04 21:53:35,380 INFO adl.AzureCLIKeychainLoader: loading refresh token for [email protected]
2017-09-04 21:53:36,541 INFO tools.DistCp: Input Options: DistCpOptions{atomicCommit=false, syncFolder=false, deleteMissing=false, ignoreFailures=false, overwrite=false, append=false, useDiff=false, useRdiff=false, fromSnapshot=null, toSnapshot=null, skipCRC=false, blocking=true, numListstatusThreads=0, maxMaps=20, mapBandwidth=0.0, copyStrategy='uniformsize', preserveStatus=[BLOCKSIZE], atomicWorkPath=null, logPath=null, sourceFileListing=null, sourcePaths=[s3a://bucket/tg], targetPath=adl://store.azuredatalakestore.net/tg.cp, filtersFile='null', blocksPerChunk=0, copyBufferSize=8192}, sourcePaths=[s3a://bucket/tg], targetPathExists=false, preserveRawXattrsfalse
...
{noformat}
> Hadoop keychain
> ---------------
>
> Key: HADOOP-14808
> URL: https://issues.apache.org/jira/browse/HADOOP-14808
> Project: Hadoop Common
> Issue Type: New Feature
> Components: security
> Affects Versions: 2.7.0
> Reporter: John Zhuge
> Assignee: John Zhuge
> Attachments: HADOOP-14808.001.patch
>
>
> Extend the idea from HADOOP-6520 "UGI should load tokens from the
> environment" to a generic lightweight "keychain" design. Load keys (secrets)
> into a keychain in UGI (secret map) at startup. YARN will distribute them
> securely into each container. The Hadoop code running in the container can
> then retrieve the credentials from UGI.
> The use case is Bring Your Own Key (BYOK) credentials for cloud connectors
> (adl, wasb, s3a, etc.), while Hadoop authentication is still Kerberos. No
> configuration change, no admin involved. It will support YARN applications
> initially, e.g., DistCp, Tera Suite, Spark-on-Yarn, etc.
> Implementation is surprisingly simple because almost all pieces are in place:
> * Retrieve secrets from UGI using {{conf.getPassword}} backed by the existing
> Credential Provider class {{UserProvider}}
> * Reuse the Credential Provider classes and interface to define local permanent
> or transient credential stores, e.g., {{LocalJavaKeyStoreProvider}}
> * New: create a new transient Credential Provider that logs into AAD with
> username/password or device code, and then puts the Client ID and Refresh
> Token into the keychain
> * New: create a new permanent Credential Provider based on Hadoop
> configuration XML, for dev/testing purposes.
> Links
> * HADOOP-11766 Generic token authentication support for Hadoop
> * HADOOP-11744 Support OAuth2 in Hadoop
> * HADOOP-10959 A Kerberos based token authentication approach
> * HADOOP-9392 Token based authentication and Single Sign On