[ 
https://issues.apache.org/jira/browse/HADOOP-18610?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17809571#comment-17809571
 ] 

ASF GitHub Bot commented on HADOOP-18610:
-----------------------------------------

creste commented on PR #5953:
URL: https://github.com/apache/hadoop/pull/5953#issuecomment-1904433062

   @sugibuchi - Thank you for the additional comments.
   
   >About the descriptions of the four properties, I think we can simply 
copy-paste the descriptions provided by the AAD Workload Identity documentation.
   > -    fs.azure.account.oauth2.msi.tenant: The tenant ID of the registered 
AAD application or user-assigned managed identity.
   > -    fs.azure.account.oauth2.client.id: The client ID of the AAD 
application or user-assigned managed identity.
   > -    fs.azure.account.oauth2.token.file: The path of the projected service 
account token file.
   
   
   The current descriptions of the properties were copied from other parts of 
the README. For example, see the property descriptions for 
[MSITokenProvider](https://github.com/apache/hadoop/blob/329e3c900bc2668651aa812fa075501c494652df/hadoop-tools/hadoop-azure/src/site/markdown/abfs.md?plain=1#L544).
  @steveloughran or @anmolanmol1234 - what descriptions should the README use 
for those properties?
   
   > 
   > About the description of the auth method:
   > 
   >>    OAuth 2.0 tokens are written to a file that is only accessible from 
the executing pod (`/var/run/secrets/azure/tokens/azure-identity-token`). The 
issued credentials can be used to authenticate.
   > 
   > This is not precise. The token files injected by the AAD workload identity 
webhook are files of "projected service account tokens" issued by Kubernetes 
clusters. They are not OAuth2 access tokens for accessing Azure resources.
   >
   > I propose to update the description of this auth method like:
   >>    Using the projected service account token injected by the Azure Workload 
Identity webhook, send a request to the Azure Active Directory endpoint to 
retrieve access tokens.
   >>    The required properties for this authentication method are 
automatically injected into the executing pod as environment variables by the 
AAD Workload Identity webhook.
   
   
   I have no preference, but since this text was also based on other 
descriptions in the README I would appreciate input from a maintainer before 
making the change.  @steveloughran  or @anmolanmol1234 - any thoughts on this?
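
The three properties quoted above appear to line up with the environment variables listed in the issue description below (`AZURE_TENANT_ID`, `AZURE_CLIENT_ID`, `AZURE_FEDERATED_TOKEN_FILE`). A minimal sketch of that correspondence follows. The mapping itself and the helper name `abfs_properties_from_env` are assumptions made for illustration; the PR may wire these values up differently.

```python
import os

# Assumed mapping from the variables injected by the workload identity
# webhook to the ABFS properties quoted in this thread (illustrative only).
ENV_TO_PROPERTY = {
    "AZURE_TENANT_ID": "fs.azure.account.oauth2.msi.tenant",
    "AZURE_CLIENT_ID": "fs.azure.account.oauth2.client.id",
    "AZURE_FEDERATED_TOKEN_FILE": "fs.azure.account.oauth2.token.file",
}


def abfs_properties_from_env(env=None):
    """Return a dict of ABFS property names to values read from `env`.

    `env` defaults to the process environment. A KeyError naming the
    missing variables is raised if any of them is unset or empty.
    """
    env = os.environ if env is None else env
    missing = [name for name in ENV_TO_PROPERTY if not env.get(name)]
    if missing:
        raise KeyError("missing environment variables: " + ", ".join(missing))
    return {prop: env[name] for name, prop in ENV_TO_PROPERTY.items()}
```

Passing an explicit dictionary instead of relying on `os.environ` keeps the helper easy to exercise outside a pod.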




> ABFS OAuth2 Token Provider to support Azure Workload Identity for AKS
> ---------------------------------------------------------------------
>
>                 Key: HADOOP-18610
>                 URL: https://issues.apache.org/jira/browse/HADOOP-18610
>             Project: Hadoop Common
>          Issue Type: Improvement
>          Components: tools
>    Affects Versions: 3.3.4
>            Reporter: Haifeng Chen
>            Priority: Critical
>              Labels: pull-request-available
>         Attachments: HADOOP-18610-preview.patch
>
>   Original Estimate: 168h
>  Remaining Estimate: 168h
>
> In Jan 2023, Microsoft Azure AKS replaced its original pod-managed identity 
> with [Azure Active Directory (Azure AD) workload 
> identities|https://learn.microsoft.com/en-us/azure/active-directory/develop/workload-identities-overview]
>  (preview), which integrate with the Kubernetes native capabilities to 
> federate with any external identity providers. This approach is simpler to 
> use and deploy.
> Refer to 
> [https://learn.microsoft.com/en-us/azure/aks/workload-identity-overview]
>  and [https://azure.github.io/azure-workload-identity/docs/introduction.html] 
> for more details.
> The basic use scenario is to access Azure cloud resources (such as cloud 
> storage) from a Kubernetes (such as AKS) workload using an Azure managed 
> identity federated with a Kubernetes service account. The credential 
> environment variables projected into the pod by Azure AD workload identity 
> are as follows:
> AZURE_AUTHORITY_HOST: (Injected by the webhook, 
> [https://login.microsoftonline.com/])
> AZURE_CLIENT_ID: (Injected by the webhook)
> AZURE_TENANT_ID: (Injected by the webhook)
> AZURE_FEDERATED_TOKEN_FILE: (Injected by the webhook, 
> /var/run/secrets/azure/tokens/azure-identity-token)
> The token in the file pointed to by AZURE_FEDERATED_TOKEN_FILE is a JWT 
> (JSON Web Token) client assertion token, which we can send to 
> AZURE_AUTHORITY_HOST (the URL is AZURE_AUTHORITY_HOST + tenantId + 
> "/oauth2/v2.0/token") to request an AD token that can be used to directly 
> access the Azure cloud resources.
> This approach is very common and similar among cloud providers such as AWS 
> and GCP. Hadoop AWS integration has WebIdentityTokenCredentialProvider to 
> handle the same case.
> The existing MsiTokenProvider can only handle the managed identity associated 
> with an Azure VM instance. We need to implement a 
> WorkloadIdentityTokenProvider that handles the Azure Workload Identity case. 
> For this, we need to add one method (getTokenUsingJWTAssertion) in 
> AzureADAuthenticator, which will be used by WorkloadIdentityTokenProvider.
>  
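
The token exchange described in the issue above (read the projected token, then POST it to AZURE_AUTHORITY_HOST + tenantId + "/oauth2/v2.0/token") can be sketched roughly as follows. This is not the code from the PR. The helper name `build_token_request` is hypothetical, and the form fields and the default storage scope are assumptions based on the standard OAuth 2.0 client-credentials flow with a JWT client assertion, not on anything stated in this thread.

```python
from urllib.parse import urlencode
from urllib.request import Request

# Standard assertion type for JWT bearer client assertions (RFC 7523).
JWT_BEARER = "urn:ietf:params:oauth:client-assertion-type:jwt-bearer"


def build_token_request(authority_host, tenant_id, client_id, token_file,
                        scope="https://storage.azure.com/.default"):
    """Build (but do not send) the POST request that exchanges the
    projected service account token for an access token."""
    # The file holds the JWT issued by the Kubernetes cluster.
    with open(token_file, encoding="utf-8") as f:
        assertion = f.read().strip()

    # URL layout as given in the issue description.
    url = authority_host.rstrip("/") + "/" + tenant_id + "/oauth2/v2.0/token"

    # Assumed form fields for the client-credentials grant.
    form = {
        "grant_type": "client_credentials",
        "client_id": client_id,
        "client_assertion_type": JWT_BEARER,
        "client_assertion": assertion,
        "scope": scope,
    }
    return Request(url, data=urlencode(form).encode("ascii"), method="POST")
```

In a pod, the arguments would presumably come from the four injected environment variables, and the resulting request could be sent with `urllib.request.urlopen`. The token file is re-read on every call because projected tokens are rotated by the cluster.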



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
