[
https://issues.apache.org/jira/browse/HADOOP-19247?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Steve Loughran updated HADOOP-19247:
------------------------------------
Component/s: fs/azure
> Authentification failed in Azure Kubernetes with HTTP1.1 and Chunked transfer
> encoding
> --------------------------------------------------------------------------------------
>
> Key: HADOOP-19247
> URL: https://issues.apache.org/jira/browse/HADOOP-19247
> Project: Hadoop Common
> Issue Type: Bug
> Components: auth, fs/azure
> Affects Versions: 3.4.0, 3.3.4, 3.3.6, 3.5.0
> Environment: Azure Kubernetes Services
> Azure Entra ID
> Azure Metadata Service
> Spark 3.3
> Reporter: Emeric
> Priority: Major
> Attachments: CodeResponse.png, TokenKO.png, TokenOK.png
>
>
>
> The problem is related to Azure authentication on Kubernetes.
> When I run my Spark program, I get this error when I try to authenticate the
> pod:
>
> {code:java}
> java.lang.NullPointerException
>   at org.apache.hadoop.fs.azurebfs.oauth2.AzureADAuthenticator.consumeInputStream(AzureADAuthenticator.java:340)
>   at org.apache.hadoop.fs.azurebfs.oauth2.AzureADAuthenticator.getTokenSingleCall(AzureADAuthenticator.java:270)
>   at org.apache.hadoop.fs.azurebfs.oauth2.AzureADAuthenticator.getTokenCall(AzureADAuthenticator.java:211)
>   at org.apache.hadoop.fs.azurebfs.oauth2.AzureADAuthenticator.getTokenFromMsi(AzureADAuthenticator.java:137)
>   at org.apache.hadoop.fs.azurebfs.oauth2.MsiTokenProvider.refreshToken(MsiTokenProvider.java:45)
>   at org.apache.hadoop.fs.azurebfs.oauth2.AccessTokenProvider.getToken(AccessTokenProvider.java:50)
>   at org.apache.hadoop.fs.azurebfs.services.AbfsClient.getAccessToken(AbfsClient.java:554)
>   at org.apache.hadoop.fs.azurebfs.services.AbfsRestOperation.executeHttpOperation(AbfsRestOperation.java:151)
>   at org.apache.hadoop.fs.azurebfs.services.AbfsRestOperation.execute(AbfsRestOperation.java:125)
>   at org.apache.hadoop.fs.azurebfs.services.AbfsClient.listPath(AbfsClient.java:181)
>   at org.apache.hadoop.fs.azurebfs.AzureBlobFileSystemStore.listStatus(AzureBlobFileSystemStore.java:569)
>   at org.apache.hadoop.fs.azurebfs.AzureBlobFileSystemStore.listStatus(AzureBlobFileSystemStore.java:536)
>   at org.apache.hadoop.fs.azurebfs.AzureBlobFileSystem.listStatus(AzureBlobFileSystem.java:359)
> {code}
>
>
> My setup is a Spark driver deployed on Azure Kubernetes with a managed
> identity.
> I used [this
> method|https://medium.com/datamindedbe/running-spark-3-on-aks-with-azure-ad-integration-c1fc0032c550]
> with aad-pod-identity.
>
> There are two different scenarios we can observe when authenticating from
> Kubernetes against the Azure Instance Metadata Service:
> * The returned token is short (fewer than 2048 characters). The response
> carries all the usual headers, including an explicit "Content-Length" header.
> !TokenOK.png!
> * The returned token is long (more than 2048 characters). The response uses
> HTTP/1.1 chunked transfer encoding and, because of that mechanism, has no
> "Content-Length" header.
> !TokenKO.png!
>
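To illustrate the second scenario, here is a minimal sketch of how a chunked body is framed on the wire: each chunk announces its own size in hexadecimal, and a zero-size chunk ends the body, so no Content-Length header is needed. The class ChunkedBodyDecoder and its sample input are hypothetical and not part of hadoop-azure; HttpURLConnection already performs this decoding internally, so this is only an illustration of the format, and it ignores trailer headers.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;

/** Hypothetical illustration of HTTP/1.1 chunked framing; not Hadoop code. */
final class ChunkedBodyDecoder {

  private ChunkedBodyDecoder() {
  }

  /** Decodes a chunked body: repeated "hex-size CRLF data CRLF", ended by size 0. */
  static String decode(InputStream in) throws IOException {
    ByteArrayOutputStream body = new ByteArrayOutputStream();
    while (true) {
      String sizeLine = readLine(in);
      // Drop optional chunk extensions that may follow a ';'.
      int semi = sizeLine.indexOf(';');
      String hex = (semi >= 0 ? sizeLine.substring(0, semi) : sizeLine).trim();
      int size = Integer.parseInt(hex, 16);
      if (size == 0) {
        break; // last chunk; trailers are not handled in this sketch
      }
      byte[] chunk = new byte[size];
      int off = 0;
      while (off < size) {
        int n = in.read(chunk, off, size - off);
        if (n == -1) {
          throw new IOException("truncated chunk");
        }
        off += n;
      }
      body.write(chunk, 0, size);
      readLine(in); // consume the CRLF that terminates the chunk data
    }
    return new String(body.toByteArray(), StandardCharsets.UTF_8);
  }

  /** Reads bytes up to the next CRLF and returns them without the CRLF. */
  private static String readLine(InputStream in) throws IOException {
    StringBuilder sb = new StringBuilder();
    int c;
    while ((c = in.read()) != -1) {
      if (c == '\r') {
        if (in.read() != '\n') {
          throw new IOException("expected LF after CR");
        }
        return sb.toString();
      }
      sb.append((char) c);
    }
    throw new IOException("unexpected end of stream");
  }

  public static void main(String[] args) throws IOException {
    // Made-up body split into two chunks of 4 and 5 bytes.
    byte[] wire = "4\r\nWiki\r\n5\r\npedia\r\n0\r\n\r\n".getBytes(StandardCharsets.US_ASCII);
    System.out.println(decode(new ByteArrayInputStream(wire))); // prints Wikipedia
  }
}
```

The total length is only known once the zero-size chunk arrives, which is why a client that insists on Content-Length cannot handle this kind of response.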
> NB: I ran a curl command in the pod to generate these screenshots, following
> the [Azure
> Documentation|https://learn.microsoft.com/en-us/azure/virtual-machines/instance-metadata-service?tabs=linux]
> In a GitHub repository I found the source of "AzureADAuthenticator.java" and
> this piece of code:
> !CodeResponse.png!
> The code requires the "Content-Length" header when the returned HTTP code is
> 200, which is not compatible with HTTP/1.1 chunked transfer encoding.
> Is it possible to update this authentication code to support this mechanism,
> implemented by Microsoft on Kubernetes (and maybe on virtual machines)?
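One possible direction, sketched under the assumption that the fix is simply to stop sizing the read from Content-Length and instead read the response body until end of stream. The class ChunkSafeBodyReader, the method readBody and the maxBytes cap are hypothetical names for illustration; this is not the actual hadoop-azure code or its eventual patch.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

/** Hypothetical helper for illustration; not part of hadoop-azure. */
final class ChunkSafeBodyReader {

  private ChunkSafeBodyReader() {
  }

  /**
   * Reads the stream until EOF, so the result does not depend on a
   * Content-Length header being present. Fails instead of throwing a
   * NullPointerException when there is no stream, and caps the body size.
   */
  static String readBody(InputStream in, int maxBytes) throws IOException {
    if (in == null) {
      throw new IOException("response has no body stream");
    }
    ByteArrayOutputStream out = new ByteArrayOutputStream();
    byte[] buffer = new byte[1024];
    int n;
    while ((n = in.read(buffer)) != -1) {
      if (out.size() + n > maxBytes) {
        throw new IOException("response body larger than " + maxBytes + " bytes");
      }
      out.write(buffer, 0, n);
    }
    return new String(out.toByteArray(), StandardCharsets.UTF_8);
  }

  public static void main(String[] args) throws IOException {
    // Stand-in for a long token body; a real caller would pass the
    // stream returned by HttpURLConnection.getInputStream().
    char[] chars = new char[3000];
    Arrays.fill(chars, 'a');
    byte[] bytes = new String(chars).getBytes(StandardCharsets.UTF_8);
    String body = readBody(new ByteArrayInputStream(bytes), 8192);
    System.out.println(body.length()); // prints 3000
  }
}
```

With this approach a 200 response would be judged by whether a non-empty body could be read, not by whether the Content-Length header is present, so both the short-token and long-token responses described above would go through the same path.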
--
This message was sent by Atlassian Jira
(v8.20.10#820010)