[ 
https://issues.apache.org/jira/browse/HADOOP-14808?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

John Zhuge updated HADOOP-14808:
--------------------------------
    Attachment: HADOOP-14808.001.patch

Patch 001
* Add common code to load the keychain from KeychainLoader services
* Add CredentialProviderKeychainLoader to load credentials from credential 
providers. The JCEKS stores can be encrypted.
* Add AzureCLIKeychainLoader to load credentials from the Azure CLI token cache 
file. This should probably be committed in a separate JIRA.
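
To illustrate the first bullet, here is a minimal stand-alone sketch of how a keychain might be assembled from several loaders. It is not code from the patch: the shape of the {{KeychainLoader}} interface, its {{load()}} method, the {{buildKeychain}} helper, the "first loader wins" rule, and the alias used for the Azure refresh token are all assumptions made for illustration. The loaders here return placeholder values instead of reading a JCEKS store or the Azure CLI token cache.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class KeychainSketch {

    /** Hypothetical loader contract: each loader contributes alias -> secret pairs. */
    interface KeychainLoader {
        Map<String, char[]> load();
    }

    /**
     * Merge the secrets from all loaders into one keychain (secret map).
     * Assumption: when two loaders supply the same alias, the earlier one wins.
     */
    static Map<String, char[]> buildKeychain(List<KeychainLoader> loaders) {
        Map<String, char[]> keychain = new LinkedHashMap<>();
        for (KeychainLoader loader : loaders) {
            for (Map.Entry<String, char[]> entry : loader.load().entrySet()) {
                keychain.putIfAbsent(entry.getKey(), entry.getValue());
            }
        }
        return keychain;
    }

    public static void main(String[] args) {
        // Stand-in for a loader backed by a credential provider store.
        KeychainLoader storeLoader = () -> {
            Map<String, char[]> secrets = new LinkedHashMap<>();
            secrets.put("fs.s3a.access.key", "placeholder-access".toCharArray());
            secrets.put("fs.s3a.secret.key", "placeholder-secret".toCharArray());
            return secrets;
        };
        // Stand-in for a loader backed by the Azure CLI token cache (alias is assumed).
        KeychainLoader cliLoader = () -> Collections.singletonMap(
                "fs.adl.oauth2.refresh.token", "placeholder-token".toCharArray());

        Map<String, char[]> keychain = buildKeychain(Arrays.asList(storeLoader, cliLoader));
        // Print the aliases only, never the secret values.
        System.out.println(keychain.keySet());
    }
}
```

Secrets are held as {{char[]}} rather than {{String}} so that a caller can clear them after use, which matches the return type of {{conf.getPassword}}.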

Testing Done
* Preparation
** Add S3A credentials to an encrypted JCEKS store: 
/Users/jzhuge/.config/hadoop/keychain.jceks
** Run Azure CLI to log in as an end user. For details on Azure CLI 2.0, refer 
to 
https://docs.microsoft.com/en-us/cli/azure/install-azure-cli?view=azure-cli-latest.
* List an S3 bucket
* List an ADLS file system
* DistCp from S3 bucket to ADLS

{noformat}
$ hadoop credential list -provider localjceks://file/Users/jzhuge/.config/hadoop/keychain.jceks
Listing aliases for CredentialProvider: localjceks://file/Users/jzhuge/.config/hadoop/keychain.jceks
fs.s3a.secret.key
fs.s3a.access.key

$ az login -u [email protected]
Password: 

$ hadoop fs -Dfs.adl.oauth2.access.token.provider.type=RefreshToken -ls adl://store.azuredatalakestore.net/
2017-09-04 21:41:41,355 INFO security.CredentialProviderKeychainLoader: loading 
credentials from localjceks://file/Users/jzhuge/.config/hadoop/keychain.jceks
2017-09-04 21:41:41,568 INFO security.CredentialProviderKeychainLoader: loading 
secret fs.s3a.secret.key
2017-09-04 21:41:41,573 INFO security.CredentialProviderKeychainLoader: loading 
secret fs.s3a.access.key
2017-09-04 21:41:41,576 INFO adl.AzureCLIKeychainLoader: loading the credential 
from Azure CLI /Users/jzhuge/.azure/accessTokens.json
2017-09-04 21:41:41,749 INFO adl.AzureCLIKeychainLoader: loading refresh token 
for [email protected]
Found 12 items
drwxr-xr-x+  - a8005e54-3276-4b9f-b500-0cf2272a0634 
6c4d58c1-4e75-40e1-b7a2-e97ff15c6f31          0 2017-07-26 23:47 
adl://store.azuredatalakestore.net/dict
drwxr-xr-x+  - a8005e54-3276-4b9f-b500-0cf2272a0634 
6c4d58c1-4e75-40e1-b7a2-e97ff15c6f31          0 2017-08-03 00:43 
adl://store.azuredatalakestore.net/keychain_test
drwxr-xr-x+  - a0c43012-fd2a-42a3-90e9-0649584176c0 
6c4d58c1-4e75-40e1-b7a2-e97ff15c6f31          0 2017-08-13 00:35 
adl://store.azuredatalakestore.net/test
-rw-r--r--+  1 a0c43012-fd2a-42a3-90e9-0649584176c0 
6c4d58c1-4e75-40e1-b7a2-e97ff15c6f31          0 2017-08-04 23:28 
adl://store.azuredatalakestore.net/testRmRootRecursive

$ hadoop fs -ls s3a://bucket/
2017-09-04 21:41:52,947 INFO security.CredentialProviderKeychainLoader: loading 
credentials from localjceks://file/Users/jzhuge/.config/hadoop/keychain.jceks
2017-09-04 21:41:53,200 INFO security.CredentialProviderKeychainLoader: loading 
secret fs.s3a.secret.key
2017-09-04 21:41:53,204 INFO security.CredentialProviderKeychainLoader: loading 
secret fs.s3a.access.key
2017-09-04 21:41:53,207 INFO adl.AzureCLIKeychainLoader: loading the credential 
from Azure CLI /Users/jzhuge/.azure/accessTokens.json
2017-09-04 21:41:53,388 INFO adl.AzureCLIKeychainLoader: loading refresh token 
for [email protected]
2017-09-04 21:41:55,033 INFO Configuration.deprecation: 
fs.s3a.server-side-encryption-key is deprecated. Instead, use 
fs.s3a.server-side-encryption.key
Found 9 items
drwxrwxrwx   - jzhuge jzhuge          0 2017-09-04 21:41 s3a://bucket/Users
drwxrwxrwx   - jzhuge jzhuge          0 2017-09-04 21:41 
s3a://bucket/keychain_test
drwxrwxrwx   - jzhuge jzhuge          0 2017-09-04 21:41 s3a://bucket/test
drwxrwxrwx   - jzhuge jzhuge          0 2017-09-04 21:41 s3a://bucket/tg

$ hadoop distcp -Dfs.adl.oauth2.access.token.provider.type=RefreshToken s3a://bucket/tg adl://store.azuredatalakestore.net/tg.cp
2017-09-04 21:53:35,082 INFO security.CredentialProviderKeychainLoader: loading 
credentials from localjceks://file/Users/jzhuge/.config/hadoop/keychain.jceks
2017-09-04 21:53:35,210 INFO security.CredentialProviderKeychainLoader: loading 
secret fs.s3a.secret.key
2017-09-04 21:53:35,214 INFO security.CredentialProviderKeychainLoader: loading 
secret fs.s3a.access.key
2017-09-04 21:53:35,216 INFO adl.AzureCLIKeychainLoader: loading the credential 
from Azure CLI /Users/jzhuge/.azure/accessTokens.json
2017-09-04 21:53:35,380 INFO adl.AzureCLIKeychainLoader: loading refresh token 
for [email protected]
2017-09-04 21:53:36,541 INFO tools.DistCp: Input Options: 
DistCpOptions{atomicCommit=false, syncFolder=false, deleteMissing=false, 
ignoreFailures=false, overwrite=false, append=false, useDiff=false, 
useRdiff=false, fromSnapshot=null, toSnapshot=null, skipCRC=false, 
blocking=true, numListstatusThreads=0, maxMaps=20, mapBandwidth=0.0, 
copyStrategy='uniformsize', preserveStatus=[BLOCKSIZE], atomicWorkPath=null, 
logPath=null, sourceFileListing=null, sourcePaths=[s3a://bucket/tg], 
targetPath=adl://store.azuredatalakestore.net/tg.cp, filtersFile='null', 
blocksPerChunk=0, copyBufferSize=8192}, sourcePaths=[s3a://bucket/tg], 
targetPathExists=false, preserveRawXattrsfalse
...
{noformat}


> Hadoop keychain
> ---------------
>
>                 Key: HADOOP-14808
>                 URL: https://issues.apache.org/jira/browse/HADOOP-14808
>             Project: Hadoop Common
>          Issue Type: New Feature
>          Components: security
>    Affects Versions: 2.7.0
>            Reporter: John Zhuge
>            Assignee: John Zhuge
>         Attachments: HADOOP-14808.001.patch
>
>
> Extend the idea from HADOOP-6520 "UGI should load tokens from the 
> environment" to a generic lightweight "keychain" design. Load keys (secrets) 
> into a keychain in UGI (secret map) at startup. YARN will distribute them 
> securely into each container. The Hadoop code running in the container can 
> then retrieve the credentials from UGI.
> The use case is Bring Your Own Key (BYOK) credentials for cloud connectors 
> (adl, wasb, s3a, etc.), while Hadoop authentication is still Kerberos. No 
> configuration change, no admin involved. It will support YARN applications 
> initially, e.g., DistCp, Tera Suite, Spark-on-Yarn, etc.
> Implementation is surprisingly simple because almost all pieces are in place:
> * Retrieve secrets from UGI using {{conf.getPassword}} backed by the existing 
> Credential Provider class {{UserProvider}}
> * Reuse Credential Provider classes and interfaces to define a local permanent 
> or transient credential store, e.g., {{LocalJavaKeyStoreProvider}}
> * New: create a new transient Credential Provider that logs into AAD with 
> username/password or device code, and then puts the Client ID and Refresh 
> Token into the keychain
> * New: create a new permanent Credential Provider based on Hadoop 
> configuration XML, for dev/testing purposes.
> Links
> * HADOOP-11766 Generic token authentication support for Hadoop
> * HADOOP-11744 Support OAuth2 in Hadoop
> * HADOOP-10959 A Kerberos based token authentication approach
> * HADOOP-9392 Token based authentication and Single Sign On
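
The retrieval step in the description above (secrets read through {{conf.getPassword}}, backed by the keychain in UGI) can be pictured with the following stand-alone sketch. It models the lookup order with plain maps and is not Hadoop's implementation: the class name, the method signature, and the choice to fall back to a clear-text configuration value are assumptions for illustration.

```java
import java.util.HashMap;
import java.util.Map;

public class SecretLookupSketch {

    /**
     * Resolve a secret by alias: consult the keychain first, then fall back
     * to the plain configuration value. Returns null when neither has it.
     */
    static char[] getPassword(String alias,
                              Map<String, char[]> keychain,
                              Map<String, String> conf) {
        char[] secret = keychain.get(alias);
        if (secret != null) {
            return secret;
        }
        String clearText = conf.get(alias);
        return clearText == null ? null : clearText.toCharArray();
    }

    public static void main(String[] args) {
        Map<String, char[]> keychain = new HashMap<>();
        keychain.put("fs.s3a.secret.key", "from-keychain".toCharArray());

        Map<String, String> conf = new HashMap<>();
        conf.put("fs.s3a.secret.key", "from-config");
        conf.put("fs.s3a.access.key", "from-config");

        // The keychain entry shadows the configuration value for the same alias.
        System.out.println(new String(getPassword("fs.s3a.secret.key", keychain, conf)));
        // An alias missing from the keychain falls back to configuration.
        System.out.println(new String(getPassword("fs.s3a.access.key", keychain, conf)));
    }
}
```

Because the connector code only ever asks for an alias, it does not need to know whether the value came from the keychain or from configuration, which is why no connector changes are needed in this model.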



