[
https://issues.apache.org/jira/browse/HADOOP-18610?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17805483#comment-17805483
]
ASF GitHub Bot commented on HADOOP-18610:
-----------------------------------------
saxenapranav commented on code in PR #5953:
URL: https://github.com/apache/hadoop/pull/5953#discussion_r1448576233
##########
hadoop-tools/hadoop-azure/src/main/java/org/apache/hadoop/fs/azurebfs/oauth2/AzureADAuthenticator.java:
##########
@@ -103,14 +105,55 @@ public static AzureADToken getTokenUsingClientCreds(String authEndpoint,
} else {
qp.add("resource", RESOURCE_NAME);
}
- qp.add("grant_type", "client_credentials");
+ qp.add("grant_type", CLIENT_CREDENTIALS);
qp.add("client_id", clientId);
qp.add("client_secret", clientSecret);
LOG.debug("AADToken: starting to fetch token using client creds for client ID " + clientId);
return getTokenCall(authEndpoint, qp.serialize(), null, null);
}
+ /**
+ * Gets Azure Active Directory token using the user ID and a JWT assertion
+ * generated by a federated authentication process.
+ *
+ * The federation process uses a feature from Azure Active Directory
+ * called workload identity. A workload identity is an identity used
+ * by a software workload (such as an application, service, script,
+ * or container) to authenticate and access other services and resources.
+ *
+ *
+ * @param authEndpoint the OAuth 2.0 token endpoint associated
+ * with the user's directory (obtain from
+ * Active Directory configuration)
+ * @param clientId the client ID (GUID) of the client web app
+ * obtained from Azure Active Directory configuration
+ * @param clientAssertion the JWT assertion token
+ * @return {@link AzureADToken} obtained using the creds
+ * @throws IOException throws IOException if there is a failure in connecting to Azure AD
+ */
+ public static AzureADToken getTokenUsingJWTAssertion(String authEndpoint,
+ String clientId, String clientAssertion) throws IOException {
+ Preconditions.checkNotNull(authEndpoint, "authEndpoint");
+ Preconditions.checkNotNull(clientId, "clientId");
+ Preconditions.checkNotNull(clientAssertion, "clientAssertion");
+ boolean isVersion2AuthenticationEndpoint = authEndpoint.contains("/oauth2/v2.0/");
+
+ QueryParams qp = new QueryParams();
+ if (isVersion2AuthenticationEndpoint) {
+ qp.add("scope", SCOPE);
+ } else {
+ qp.add("resource", RESOURCE_NAME);
+ }
+ qp.add("grant_type", CLIENT_CREDENTIALS);
+ qp.add("client_id", clientId);
+ qp.add("client_assertion", clientAssertion);
+ qp.add("client_assertion_type", JWT_BEARER_ASSERTION);
+ LOG.debug("AADToken: starting to fetch token using client assertion for client ID " + clientId);
+
+ return getTokenCall(authEndpoint, qp.serialize(), null, null);
Review Comment:
Is this a GET or a POST call? I ask because, per
https://github.com/apache/hadoop/blob/f609460bda0c2bd87dd3580158e549e2f34f14d5/hadoop-tools/hadoop-azure/src/main/java/org/apache/hadoop/fs/azurebfs/oauth2/AzureADAuthenticator.java#L354-L356,
the query params are sent differently depending on the HTTP method, and the
method has to match what the API expects.
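
To make the question concrete, here is a minimal sketch of the two usual ways a serialized parameter string can travel: appended to the URL for GET, or written to the request body for POST. The names `TokenRequestSketch`, `Plan`, `serialize` and `plan` are hypothetical and are not the ones used in `AzureADAuthenticator`. For context, RFC 6749 requires POST for access token requests, so the POST branch is the one a token endpoint would normally expect.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;
import java.util.Map;

/** Hypothetical helper for illustration; not part of hadoop-azure. */
final class TokenRequestSketch {

  /** Where the serialized parameters end up for a given HTTP method. */
  static final class Plan {
    final String url;
    final String body; // null when nothing is written to the request body

    Plan(String url, String body) {
      this.url = url;
      this.body = body;
    }
  }

  /** Form-encodes the parameters as key=value pairs joined by '&'. */
  static String serialize(Map<String, String> params) {
    StringBuilder sb = new StringBuilder();
    try {
      for (Map.Entry<String, String> e : params.entrySet()) {
        if (sb.length() > 0) {
          sb.append('&');
        }
        sb.append(URLEncoder.encode(e.getKey(), "UTF-8"))
            .append('=')
            .append(URLEncoder.encode(e.getValue(), "UTF-8"));
      }
    } catch (UnsupportedEncodingException ex) {
      throw new IllegalStateException("UTF-8 is always supported", ex);
    }
    return sb.toString();
  }

  /** GET carries the parameters in the URL; any other method carries them in the body. */
  static Plan plan(String httpMethod, String endpoint, Map<String, String> params) {
    String serialized = serialize(params);
    if ("GET".equals(httpMethod)) {
      String separator = endpoint.contains("?") ? "&" : "?";
      return new Plan(endpoint + separator + serialized, null);
    }
    return new Plan(endpoint, serialized);
  }
}
```

With this shape, passing the wrong method does not fail loudly on the client side; it only changes where the parameters are placed, which is why the method needs to be checked against the API.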
##########
hadoop-tools/hadoop-azure/src/main/java/org/apache/hadoop/fs/azurebfs/oauth2/WorkloadIdentityTokenProvider.java:
##########
@@ -0,0 +1,108 @@
+/**
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements. See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership. The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License. You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.hadoop.fs.azurebfs.oauth2;
+
+import java.io.File;
+import java.io.IOException;
+
+import org.apache.hadoop.thirdparty.com.google.common.base.Strings;
+import org.apache.hadoop.util.Preconditions;
+import org.slf4j.Logger;
+import org.slf4j.LoggerFactory;
+
+import org.apache.commons.io.FileUtils;
+
+
+/**
+ * Provides tokens based on Azure AD Workload Identity.
+ */
+public class WorkloadIdentityTokenProvider extends AccessTokenProvider {
+
+ private static final String OAUTH2_TOKEN_PATH = "/oauth2/v2.0/token";
+ private final String authEndpoint;
+
+ private final String clientId;
+
+ private final String tokenFile;
+
+ private long tokenFetchTime = -1;
+
+ private static final long ONE_HOUR = 3600 * 1000;
+
+ private static final Logger LOG = LoggerFactory.getLogger(AccessTokenProvider.class);
+
+ public WorkloadIdentityTokenProvider(final String authority, final String tenantId,
+ final String clientId, final String tokenFile) {
+ Preconditions.checkNotNull(authority, "authority");
+ Preconditions.checkNotNull(tenantId, "tenantId");
+ Preconditions.checkNotNull(clientId, "clientId");
+ Preconditions.checkNotNull(tokenFile, "tokenFile");
+
+ this.authEndpoint = authority + tenantId + OAUTH2_TOKEN_PATH;
+ this.clientId = clientId;
+ this.tokenFile = tokenFile;
+ }
+
+ @Override
+ protected AzureADToken refreshToken() throws IOException {
+ LOG.debug("AADToken: refreshing token from JWT Assertion");
+ String clientAssertion = getClientAssertion(tokenFile);
+ AzureADToken token = AzureADAuthenticator
+ .getTokenUsingJWTAssertion(authEndpoint, clientId, clientAssertion);
+ tokenFetchTime = System.currentTimeMillis();
+ return token;
+ }
+
+ /**
+ * Checks if the token is about to expire as per base expiry logic.
+ * Otherwise try to expire every 1 hour.
+ *
+ * @return true if the token is expiring in next 1 hour or if a token has
+ * never been fetched
+ */
+ @Override
+ protected boolean isTokenAboutToExpire() {
+ if (tokenFetchTime == -1 || super.isTokenAboutToExpire()) {
+ return true;
+ }
+
+ boolean expiring = false;
+ long elapsedTimeSinceLastTokenRefreshInMillis =
+ System.currentTimeMillis() - tokenFetchTime;
+ // In case token is not refreshed for 1 hr or any clock skew issues,
+ // refresh token.
+ expiring = elapsedTimeSinceLastTokenRefreshInMillis >= ONE_HOUR
+ || elapsedTimeSinceLastTokenRefreshInMillis < 0;
+ if (expiring) {
+ LOG.debug("JWTToken: token renewing. Time elapsed since last token fetch:"
+ + " {} milliseconds", elapsedTimeSinceLastTokenRefreshInMillis);
+ }
+
+ return expiring;
+ }
+
+ private static String getClientAssertion(String tokenFile)
Review Comment:
Any reason for making this method static?
##########
hadoop-tools/hadoop-azure/src/main/java/org/apache/hadoop/fs/azurebfs/oauth2/WorkloadIdentityTokenProvider.java:
##########
@@ -0,0 +1,108 @@
+ private static String getClientAssertion(String tokenFile)
+ throws IOException {
+ File file = new File(tokenFile);
+ String clientAssertion = FileUtils.readFileToString(file, "UTF-8");
+ if (Strings.isNullOrEmpty(clientAssertion))
+ throw new IOException("Empty token file.");
Review Comment:
Checkstyle: the if statement needs braces.
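
For reference, a braced version of the check might look like the following sketch. It swaps `FileUtils.readFileToString` from commons-io for `java.nio.file.Files` so that the example is self-contained, and the class name `ClientAssertionReader` is hypothetical. Only the braces around the `if` body are what the Checkstyle comment asks for.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Paths;

/** Hypothetical stand-alone holder for the method under review. */
final class ClientAssertionReader {

  /** Reads the federated token file and rejects an empty assertion. */
  static String getClientAssertion(String tokenFile) throws IOException {
    // Assumption: java.nio replaces commons-io here to avoid an external dependency.
    byte[] bytes = Files.readAllBytes(Paths.get(tokenFile));
    String clientAssertion = new String(bytes, StandardCharsets.UTF_8);
    if (clientAssertion.isEmpty()) {
      throw new IOException("Empty token file.");
    }
    return clientAssertion;
  }
}
```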
> ABFS OAuth2 Token Provider to support Azure Workload Identity for AKS
> ---------------------------------------------------------------------
>
> Key: HADOOP-18610
> URL: https://issues.apache.org/jira/browse/HADOOP-18610
> Project: Hadoop Common
> Issue Type: Improvement
> Components: tools
> Affects Versions: 3.3.4
> Reporter: Haifeng Chen
> Priority: Critical
> Labels: pull-request-available
> Attachments: HADOOP-18610-preview.patch
>
> Original Estimate: 168h
> Remaining Estimate: 168h
>
> In Jan 2023, Microsoft Azure AKS replaced its original pod-managed identity
> with [Azure Active Directory (Azure AD) workload
> identities|https://learn.microsoft.com/en-us/azure/active-directory/develop/workload-identities-overview]
> (preview), which integrate with Kubernetes' native capabilities to
> federate with any external identity provider. This approach is simpler to
> use and deploy.
> Refer to
> [https://learn.microsoft.com/en-us/azure/aks/workload-identity-overview]
> and [https://azure.github.io/azure-workload-identity/docs/introduction.html]
> for more details.
> The basic use case is to access Azure cloud resources (such as cloud
> storage) from a Kubernetes (such as AKS) workload using an Azure managed
> identity federated with a Kubernetes service account. The credential
> environment variables that Azure AD workload identity projects into the pod
> are as follows:
> AZURE_AUTHORITY_HOST: (Injected by the webhook,
> [https://login.microsoftonline.com/])
> AZURE_CLIENT_ID: (Injected by the webhook)
> AZURE_TENANT_ID: (Injected by the webhook)
> AZURE_FEDERATED_TOKEN_FILE: (Injected by the webhook,
> /var/run/secrets/azure/tokens/azure-identity-token)
> The token in the file pointed to by AZURE_FEDERATED_TOKEN_FILE is a JWT (JSON
> Web Token) client assertion, which we can send to
> AZURE_AUTHORITY_HOST (the URL is AZURE_AUTHORITY_HOST + tenantId +
> "/oauth2/v2.0/token") to obtain an Azure AD token that can be used to
> directly access Azure cloud resources.
> This approach is very common and similar among cloud providers such as AWS
> and GCP. Hadoop AWS integration has WebIdentityTokenCredentialProvider to
> handle the same case.
> The existing MsiTokenProvider can only handle the managed identity associated
> with an Azure VM instance. We need to implement a WorkloadIdentityTokenProvider
> that handles the Azure Workload Identity case. For this, we need to add one
> method (getTokenUsingJWTAssertion) to AzureADAuthenticator, which will be used
> by WorkloadIdentityTokenProvider.
>
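
The flow described in the issue can be sketched as follows: derive the token endpoint from the injected environment variables, then assemble the form fields for a client-credentials request authenticated by the JWT assertion. This is an illustration under stated assumptions, not the class proposed in the patch. `WorkloadIdentityRequestSketch` and its methods are hypothetical, the scope value is left to the caller, and the trailing-slash handling is an added assumption. The assertion type URN is the one defined by RFC 7523.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical sketch for illustration; not the class proposed in the patch. */
final class WorkloadIdentityRequestSketch {

  static final String OAUTH2_TOKEN_PATH = "/oauth2/v2.0/token";
  // Assertion type URN defined by RFC 7523 for JWT client authentication.
  static final String JWT_BEARER_ASSERTION =
      "urn:ietf:params:oauth:client-assertion-type:jwt-bearer";

  /** Builds AZURE_AUTHORITY_HOST + AZURE_TENANT_ID + "/oauth2/v2.0/token". */
  static String tokenEndpoint(Map<String, String> env) {
    String authority = require(env, "AZURE_AUTHORITY_HOST");
    String tenantId = require(env, "AZURE_TENANT_ID");
    // Assumption: tolerate an authority host configured without a trailing slash.
    if (!authority.endsWith("/")) {
      authority = authority + "/";
    }
    return authority + tenantId + OAUTH2_TOKEN_PATH;
  }

  /** Form fields for a client-credentials request that uses a JWT assertion. */
  static Map<String, String> formFields(Map<String, String> env,
      String scope, String clientAssertion) {
    Map<String, String> fields = new LinkedHashMap<>();
    fields.put("scope", scope);
    fields.put("grant_type", "client_credentials");
    fields.put("client_id", require(env, "AZURE_CLIENT_ID"));
    fields.put("client_assertion", clientAssertion);
    fields.put("client_assertion_type", JWT_BEARER_ASSERTION);
    return fields;
  }

  /** Fails fast when an expected environment variable is missing or empty. */
  private static String require(Map<String, String> env, String name) {
    String value = env.get(name);
    if (value == null || value.isEmpty()) {
      throw new IllegalArgumentException(name + " is not set");
    }
    return value;
  }
}
```

Inside a pod, `System.getenv()` could supply the map, and the assertion would be the contents of the file named by AZURE_FEDERATED_TOKEN_FILE.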
--
This message was sent by Atlassian Jira
(v8.20.10#820010)