[jira] [Updated] (HADOOP-19247) Authentification failed in Azure Kubernetes with HTTP1.1 and Chunked transfer encoding

Emeric (Jira) Fri, 02 Aug 2024 06:19:42 -0700


     [ https://issues.apache.org/jira/browse/HADOOP-19247?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]


Emeric updated HADOOP-19247:
----------------------------
    Description: 
 

The problem is related to Azure authentication on Kubernetes.

When I run my Spark program, the following error is raised while the pod 
authenticates:

 
{code:java}
java.lang.NullPointerException
    at org.apache.hadoop.fs.azurebfs.oauth2.AzureADAuthenticator.consumeInputStream(AzureADAuthenticator.java:340)
    at org.apache.hadoop.fs.azurebfs.oauth2.AzureADAuthenticator.getTokenSingleCall(AzureADAuthenticator.java:270)
    at org.apache.hadoop.fs.azurebfs.oauth2.AzureADAuthenticator.getTokenCall(AzureADAuthenticator.java:211)
    at org.apache.hadoop.fs.azurebfs.oauth2.AzureADAuthenticator.getTokenFromMsi(AzureADAuthenticator.java:137)
    at org.apache.hadoop.fs.azurebfs.oauth2.MsiTokenProvider.refreshToken(MsiTokenProvider.java:45)
    at org.apache.hadoop.fs.azurebfs.oauth2.AccessTokenProvider.getToken(AccessTokenProvider.java:50)
    at org.apache.hadoop.fs.azurebfs.services.AbfsClient.getAccessToken(AbfsClient.java:554)
    at org.apache.hadoop.fs.azurebfs.services.AbfsRestOperation.executeHttpOperation(AbfsRestOperation.java:151)
    at org.apache.hadoop.fs.azurebfs.services.AbfsRestOperation.execute(AbfsRestOperation.java:125)
    at org.apache.hadoop.fs.azurebfs.services.AbfsClient.listPath(AbfsClient.java:181)
    at org.apache.hadoop.fs.azurebfs.AzureBlobFileSystemStore.listStatus(AzureBlobFileSystemStore.java:569)
    at org.apache.hadoop.fs.azurebfs.AzureBlobFileSystemStore.listStatus(AzureBlobFileSystemStore.java:536)
    at org.apache.hadoop.fs.azurebfs.AzureBlobFileSystem.listStatus(AzureBlobFileSystem.java:359)
 {code}
 

 

My setup is a Spark driver deployed on Azure Kubernetes Service (AKS) with a 
managed identity.

I followed [this method|https://medium.com/datamindedbe/running-spark-3-on-aks-with-azure-ad-integration-c1fc0032c550] 
with aad-pod-identity.

 

When a pod on Kubernetes requests a token from the Azure Instance Metadata 
Service, we can observe two different scenarios:
 * The returned token is short (fewer than 2048 characters). The response 
carries all the usual headers, including an explicit "Content-Length" header.

!TokenOK.png!
 * The returned token is long (more than 2048 characters). The response uses 
[HTTP/1.1 chunked transfer encoding|https://en.wikipedia.org/wiki/Chunked_transfer_encoding] 
and therefore has no "Content-Length" header.

!TokenKO.png!
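
The two scenarios can be reproduced locally with a minimal sketch. It assumes the JDK's built-in com.sun.net.httpserver server as a stand-in for the metadata service; the class name, method names and paths are hypothetical and are not taken from Hadoop or Azure. The "/fixed" endpoint sends a "Content-Length" header, while "/chunked" uses chunked transfer encoding, for which HttpURLConnection reports a content length of -1.

```java
import com.sun.net.httpserver.HttpServer;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.net.HttpURLConnection;
import java.net.InetSocketAddress;
import java.net.URI;
import java.net.URL;
import java.nio.charset.StandardCharsets;

// Hypothetical demo class: a local stand-in for the token endpoint.
class ChunkedHeaderDemo {

    /** Starts a throwaway server on a free local port. The body must be non-empty. */
    static HttpServer startServer(byte[] body) throws IOException {
        HttpServer server = HttpServer.create(new InetSocketAddress("127.0.0.1", 0), 0);
        server.createContext("/fixed", exchange -> {
            // A positive length makes the server send a Content-Length header.
            exchange.sendResponseHeaders(200, body.length);
            try (OutputStream out = exchange.getResponseBody()) {
                out.write(body);
            }
        });
        server.createContext("/chunked", exchange -> {
            // A length of 0 makes the server use chunked transfer encoding.
            exchange.sendResponseHeaders(200, 0);
            try (OutputStream out = exchange.getResponseBody()) {
                out.write(body);
            }
        });
        server.start();
        return server;
    }

    /** Returns the content length the client sees, or -1 when the header is absent. */
    static long reportedLength(int port, String path) throws IOException {
        URL url = URI.create("http://127.0.0.1:" + port + path).toURL();
        HttpURLConnection conn = (HttpURLConnection) url.openConnection();
        try {
            long length = conn.getContentLengthLong(); // sends the request
            try (InputStream in = conn.getInputStream()) {
                byte[] buf = new byte[1024];
                while (in.read(buf) != -1) {
                    // drain the body so the exchange completes cleanly
                }
            }
            return length;
        } finally {
            conn.disconnect();
        }
    }

    public static void main(String[] args) throws IOException {
        byte[] body = "example-token".getBytes(StandardCharsets.UTF_8);
        HttpServer server = startServer(body);
        try {
            int port = server.getAddress().getPort();
            System.out.println("fixed:   " + reportedLength(port, "/fixed"));
            System.out.println("chunked: " + reportedLength(port, "/chunked"));
        } finally {
            server.stop(0);
        }
    }
}
```

Any client logic that treats a missing or non-positive content length as a failure will reject the "/chunked" response even though the status is 200 and the body is complete.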

 

NB: I ran a curl command inside the pod to produce these screenshots, following 
the [Azure documentation|https://learn.microsoft.com/en-us/azure/virtual-machines/instance-metadata-service?tabs=linux].

In a GitHub repository I found the source of "AzureADAuthenticator.java", which 
contains this piece of code:

!CodeResponse.png!

The code requires the "Content-Length" header to be present when the returned 
HTTP status code is 200, which is incompatible with HTTP/1.1 chunked transfer 
encoding.

Is it possible to update this authentication code to support this mechanism, 
which Microsoft uses on Kubernetes (and maybe on virtual machines too)?
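
One possible direction, shown as a sketch only: read the response body until end-of-stream instead of relying on "Content-Length". This is not the Hadoop implementation; TokenResponseReader, readBody and the maxBytes limit are hypothetical names introduced for illustration.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;

// Hypothetical helper, not part of Hadoop: reads a response body without
// needing a Content-Length header, so chunked responses work too.
final class TokenResponseReader {

    private TokenResponseReader() {
    }

    /**
     * Reads the stream to end-of-stream and decodes it as UTF-8.
     * Fails with IOException on a null stream or when more than maxBytes arrive.
     */
    static String readBody(InputStream in, int maxBytes) throws IOException {
        if (in == null) {
            // Fail with a clear message instead of a NullPointerException.
            throw new IOException("no response stream available");
        }
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        byte[] chunk = new byte[4096];
        int n;
        while ((n = in.read(chunk)) != -1) {
            if (buffer.size() + n > maxBytes) {
                throw new IOException("response body larger than " + maxBytes + " bytes");
            }
            buffer.write(chunk, 0, n);
        }
        return new String(buffer.toByteArray(), StandardCharsets.UTF_8);
    }
}
```

With a helper of this kind, the success path could be selected by the HTTP status code alone, and the size cap keeps an unexpectedly large response from exhausting memory.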

  was:
 

The problem is related to Azure authentication on Kubernetes.

When I run my Spark program, the following error is raised while the pod 
authenticates:

 
{code:java}
java.lang.NullPointerException
    at org.apache.hadoop.fs.azurebfs.oauth2.AzureADAuthenticator.consumeInputStream(AzureADAuthenticator.java:340)
    at org.apache.hadoop.fs.azurebfs.oauth2.AzureADAuthenticator.getTokenSingleCall(AzureADAuthenticator.java:270)
    at org.apache.hadoop.fs.azurebfs.oauth2.AzureADAuthenticator.getTokenCall(AzureADAuthenticator.java:211)
    at org.apache.hadoop.fs.azurebfs.oauth2.AzureADAuthenticator.getTokenFromMsi(AzureADAuthenticator.java:137)
    at org.apache.hadoop.fs.azurebfs.oauth2.MsiTokenProvider.refreshToken(MsiTokenProvider.java:45)
    at org.apache.hadoop.fs.azurebfs.oauth2.AccessTokenProvider.getToken(AccessTokenProvider.java:50)
    at org.apache.hadoop.fs.azurebfs.services.AbfsClient.getAccessToken(AbfsClient.java:554)
    at org.apache.hadoop.fs.azurebfs.services.AbfsRestOperation.executeHttpOperation(AbfsRestOperation.java:151)
    at org.apache.hadoop.fs.azurebfs.services.AbfsRestOperation.execute(AbfsRestOperation.java:125)
    at org.apache.hadoop.fs.azurebfs.services.AbfsClient.listPath(AbfsClient.java:181)
    at org.apache.hadoop.fs.azurebfs.AzureBlobFileSystemStore.listStatus(AzureBlobFileSystemStore.java:569)
    at org.apache.hadoop.fs.azurebfs.AzureBlobFileSystemStore.listStatus(AzureBlobFileSystemStore.java:536)
    at org.apache.hadoop.fs.azurebfs.AzureBlobFileSystem.listStatus(AzureBlobFileSystem.java:359)
 {code}
 

 

My setup is a Spark driver deployed on Azure Kubernetes Service (AKS) with a 
managed identity.

I followed [this method|https://medium.com/datamindedbe/running-spark-3-on-aks-with-azure-ad-integration-c1fc0032c550] 
with aad-pod-identity.

 


When a pod on Kubernetes requests a token from the Azure Instance Metadata 
Service, we can observe two different scenarios:
 * The returned token is short (fewer than 2048 characters). The response 
carries all the usual headers, including an explicit "Content-Length" header.

!TokenOK.png!
 * The returned token is long (more than 2048 characters). The response uses 
HTTP/1.1 chunked transfer encoding and therefore has no "Content-Length" header.

!TokenKO.png!

 

NB: I ran a curl command inside the pod to produce these screenshots, following 
the [Azure documentation|https://learn.microsoft.com/en-us/azure/virtual-machines/instance-metadata-service?tabs=linux].

In a GitHub repository I found the source of "AzureADAuthenticator.java", which 
contains this piece of code:

!CodeResponse.png!

The code requires the "Content-Length" header to be present when the returned 
HTTP status code is 200, which is incompatible with HTTP/1.1 chunked transfer 
encoding.

Is it possible to update this authentication code to support this mechanism, 
which Microsoft uses on Kubernetes (and maybe on virtual machines too)?


> Authentification failed in Azure Kubernetes with HTTP1.1 and Chunked transfer 
> encoding
> --------------------------------------------------------------------------------------
>
>                 Key: HADOOP-19247
>                 URL: https://issues.apache.org/jira/browse/HADOOP-19247
>             Project: Hadoop Common
>          Issue Type: Bug
>          Components: auth, fs/azure
>    Affects Versions: 3.4.0, 3.3.4, 3.3.6, 3.5.0
>         Environment: Azure Kubernetes Services
> Azure Entra ID
> Azure Metadata Service
> Spark 3.3
>            Reporter: Emeric
>            Priority: Major
>         Attachments: CodeResponse.png, TokenKO.png, TokenOK.png
>



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
