snjypl opened a new issue, #23266:
URL: https://github.com/apache/airflow/issues/23266

   ### Apache Airflow Provider(s)
   
   microsoft-azure
   
   ### Versions of Apache Airflow Providers
   
   apache-airflow-providers-microsoft-azure==3.8.0
   
   ### Apache Airflow version
   
   2.2.4
   
   ### Operating System
   
   Ubuntu 20.04.2 LTS
   
   ### Deployment
   
   Official Apache Airflow Helm Chart
   
   ### Deployment details
   
   Airflow is deployed with the official Helm chart on an AKS cluster.
   
   ### What happened
   
   I have deployed Apache Airflow using the official Helm chart on an AKS cluster.
   The pod has multiple user-assigned managed identities assigned to it.
   I have set the AZURE_CLIENT_ID environment variable to the client ID of the identity that I want to use for authentication.
   
   _Airflow connection:_
   
   wasb_default = '{"login":"storageaccountname"}'
   
   **Env**
   AZURE_CLIENT_ID="user-managed-identity-client-id"
   
   _**code**_
   ```
   # Suppress azure.core logs
   import logging

   from airflow.providers.microsoft.azure.hooks.wasb import WasbHook

   logger = logging.getLogger("azure.core")
   logger.setLevel(logging.ERROR)

   conn_id = 'wasb-default'
   hook = WasbHook(conn_id)
   for blob_name in hook.get_blobs_list("testcontainer"):
       print(blob_name)
   ```
   **error**
   ```
   azure.core.exceptions.ClientAuthenticationError: Unexpected content type 
"text/plain; charset=utf-8"
   Content: failed to get service principal token, error: adal: Refresh request 
failed. Status Code = '400'. Response body: 
{"error":"invalid_request","error_description":"Multiple user assigned 
identities exist, please specify the clientId / resourceId of the identity in 
the token request"} Endpoint 
http://169.254.169.254/metadata/identity/oauth2/token?api-version=2018-02-01&resource=https%3A%2F%2Fstorage.azure.com
   
   ```
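The endpoint in the error above carries only `api-version` and `resource`, so the metadata service has no way to choose among several identities. As a rough illustration, the sketch below builds the same token URL with and without an identity selector. It assumes the endpoint accepts an optional `client_id` query parameter, as Azure documents for managed identity token requests; `imds_token_url` is a hypothetical helper, not part of any Azure library, and it performs no network call.

```python
from urllib.parse import urlencode

# Endpoint taken from the error message above.
IMDS_TOKEN_ENDPOINT = "http://169.254.169.254/metadata/identity/oauth2/token"


def imds_token_url(resource, client_id=None, api_version="2018-02-01"):
    """Build a managed identity token URL (hypothetical helper, illustration only)."""
    params = {"api-version": api_version, "resource": resource}
    if client_id:
        # Assumption: client_id selects one user-assigned identity.
        params["client_id"] = client_id
    return f"{IMDS_TOKEN_ENDPOINT}?{urlencode(params)}"


# Request shape seen in the error: no identity is named.
ambiguous_url = imds_token_url("https://storage.azure.com")

# Request shape that names the identity to use.
explicit_url = imds_token_url(
    "https://storage.azure.com",
    client_id="user-managed-identity-client-id",
)

print(ambiguous_url)
print(explicit_url)
```

The first URL matches the one in the error message; the second differs only by the trailing `client_id` parameter.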
   
   
   **trace**
   ```
   
   [2022-04-26 16:37:23,446] {environment.py:103} WARNING - Incomplete 
environment configuration. These variables are set: AZURE_CLIENT_ID
   [2022-04-26 16:37:23,446] {managed_identity.py:89} INFO - 
ManagedIdentityCredential will use IMDS
   [2022-04-26 16:37:23,605] {chained.py:84} INFO - DefaultAzureCredential 
acquired a token from ManagedIdentityCredential
   
   # Note: the Azure Key Vault secrets backend (azure.secrets.key_vault.AzureKeyVaultBackend) uses DefaultAzureCredential to fetch the connection.
   
   [2022-04-26 16:37:23,687] {base.py:68} INFO - Using connection ID 
'wasb-default' for task execution.
   [2022-04-26 16:37:23,687] {managed_identity.py:89} INFO - 
ManagedIdentityCredential will use IMDS
   [2022-04-26 16:37:23,688] {wasb.py:155} INFO - Using managed identity as 
credential
   Traceback (most recent call last):
     File 
"/home/airflow/.local/lib/python3.10/site-packages/azure/core/pipeline/policies/_universal.py",
 line 561, in deserialize_from_text
       return json.loads(data_as_str)
     File "/usr/local/lib/python3.10/json/__init__.py", line 346, in loads
       return _default_decoder.decode(s)
     File "/usr/local/lib/python3.10/json/decoder.py", line 337, in decode
       obj, end = self.raw_decode(s, idx=_w(s, 0).end())
     File "/usr/local/lib/python3.10/json/decoder.py", line 355, in raw_decode
       raise JSONDecodeError("Expecting value", s, err.value) from None
   json.decoder.JSONDecodeError: Expecting value: line 1 column 1 (char 0)
   
   During handling of the above exception, another exception occurred:
   
   Traceback (most recent call last):
     File 
"/home/airflow/.local/lib/python3.10/site-packages/azure/identity/_internal/managed_identity_client.py",
 line 51, in _process_response
       content = ContentDecodePolicy.deserialize_from_text(
     File 
"/home/airflow/.local/lib/python3.10/site-packages/azure/core/pipeline/policies/_universal.py",
 line 563, in deserialize_from_text
       raise DecodeError(message="JSON is invalid: {}".format(err), 
response=response, error=err)
   azure.core.exceptions.DecodeError: JSON is invalid: Expecting value: line 1 
column 1 (char 0)
   
   The above exception was the direct cause of the following exception:
   
   Traceback (most recent call last):
     File 
"/home/airflow/.local/lib/python3.10/site-packages/azure/identity/_credentials/imds.py",
 line 97, in _request_token
       token = self._client.request_token(*scopes, headers={"Metadata": "true"})
     File 
"/home/airflow/.local/lib/python3.10/site-packages/azure/identity/_internal/managed_identity_client.py",
 line 126, in request_token
       token = self._process_response(response, request_time)
     File 
"/home/airflow/.local/lib/python3.10/site-packages/azure/identity/_internal/managed_identity_client.py",
 line 59, in _process_response
       six.raise_from(ClientAuthenticationError(message=message, 
response=response.http_response), ex)
     File "<string>", line 3, in raise_from
   azure.core.exceptions.ClientAuthenticationError: Unexpected content type 
"text/plain; charset=utf-8"
   Content: failed to get service principal token, error: adal: Refresh request 
failed. Status Code = '400'. Response body: 
{"error":"invalid_request","error_description":"Multiple user assigned 
identities exist, please specify the clientId / resourceId of the identity in 
the token request"} Endpoint 
http://169.254.169.254/metadata/identity/oauth2/token?api-version=2018-02-01&resource=https%3A%2F%2Fstorage.azure.com
   
   
   The above exception was the direct cause of the following exception:
   
   Traceback (most recent call last):
     File "/tmp/test.py", line 7, in <module>
       for blob_name in hook.get_blobs_list("test_container"):
     File 
"/home/airflow/.local/lib/python3.10/site-packages/airflow/providers/microsoft/azure/hooks/wasb.py",
 line 231, in get_blobs_list
       for blob in blobs:
     File 
"/home/airflow/.local/lib/python3.10/site-packages/azure/core/paging.py", line 
129, in __next__
       return next(self._page_iterator)
     File 
"/home/airflow/.local/lib/python3.10/site-packages/azure/core/paging.py", line 
76, in __next__
       self._response = self._get_next(self.continuation_token)
     File 
"/home/airflow/.local/lib/python3.10/site-packages/azure/storage/blob/_list_blobs_helper.py",
 line 79, in _get_next_cb
       process_storage_error(error)
     File 
"/home/airflow/.local/lib/python3.10/site-packages/azure/storage/blob/_shared/response_handlers.py",
 line 89, in process_storage_error
       raise storage_error
     File 
"/home/airflow/.local/lib/python3.10/site-packages/azure/storage/blob/_list_blobs_helper.py",
 line 72, in _get_next_cb
       return self._command(
     File 
"/home/airflow/.local/lib/python3.10/site-packages/azure/storage/blob/_generated/operations/_container_operations.py",
 line 1572, in list_blob_hierarchy_segment
       pipeline_response = self._client._pipeline.run(request, stream=False, 
**kwargs)
     File 
"/home/airflow/.local/lib/python3.10/site-packages/azure/core/pipeline/_base.py",
 line 211, in run
       return first_node.send(pipeline_request)  # type: ignore
     File 
"/home/airflow/.local/lib/python3.10/site-packages/azure/core/pipeline/_base.py",
 line 71, in send
       response = self.next.send(request)
     File 
"/home/airflow/.local/lib/python3.10/site-packages/azure/core/pipeline/_base.py",
 line 71, in send
       response = self.next.send(request)
     File 
"/home/airflow/.local/lib/python3.10/site-packages/azure/core/pipeline/_base.py",
 line 71, in send
       response = self.next.send(request)
     [Previous line repeated 2 more times]
     File 
"/home/airflow/.local/lib/python3.10/site-packages/azure/core/pipeline/policies/_redirect.py",
 line 158, in send
       response = self.next.send(request)
     File 
"/home/airflow/.local/lib/python3.10/site-packages/azure/core/pipeline/_base.py",
 line 71, in send
       response = self.next.send(request)
     File 
"/home/airflow/.local/lib/python3.10/site-packages/azure/storage/blob/_shared/policies.py",
 line 515, in send
       raise err
     File 
"/home/airflow/.local/lib/python3.10/site-packages/azure/storage/blob/_shared/policies.py",
 line 489, in send
       response = self.next.send(request)
     File 
"/home/airflow/.local/lib/python3.10/site-packages/azure/core/pipeline/_base.py",
 line 71, in send
       response = self.next.send(request)
     File 
"/home/airflow/.local/lib/python3.10/site-packages/azure/core/pipeline/_base.py",
 line 71, in send
       response = self.next.send(request)
     File 
"/home/airflow/.local/lib/python3.10/site-packages/azure/core/pipeline/policies/_authentication.py",
 line 117, in send
       self.on_request(request)
     File 
"/home/airflow/.local/lib/python3.10/site-packages/azure/core/pipeline/policies/_authentication.py",
 line 94, in on_request
       self._token = self._credential.get_token(*self._scopes)
     File 
"/home/airflow/.local/lib/python3.10/site-packages/azure/identity/_internal/decorators.py",
 line 32, in wrapper
       token = fn(*args, **kwargs)
     File 
"/home/airflow/.local/lib/python3.10/site-packages/azure/identity/_credentials/managed_identity.py",
 line 123, in get_token
       return self._credential.get_token(*scopes, **kwargs)
     File 
"/home/airflow/.local/lib/python3.10/site-packages/azure/identity/_internal/get_token_mixin.py",
 line 76, in get_token
       token = self._request_token(*scopes, **kwargs)
     File 
"/home/airflow/.local/lib/python3.10/site-packages/azure/identity/_credentials/imds.py",
 line 111, in _request_token
       six.raise_from(ClientAuthenticationError(message=ex.message, 
response=ex.response), ex)
     File "<string>", line 3, in raise_from
   azure.core.exceptions.ClientAuthenticationError: Unexpected content type 
"text/plain; charset=utf-8"
   Content: failed to get service principal token, error: adal: Refresh request 
failed. Status Code = '400'. Response body: 
{"error":"invalid_request","error_description":"Multiple user assigned 
identities exist, please specify the clientId / resourceId of the identity in 
the token request"} Endpoint 
http://169.254.169.254/metadata/identity/oauth2/token?api-version=2018-02-01&resource=https%3A%2F%2Fstorage.azure.com
   ```
   
   
   ### What you think should happen instead
   
   The WASB hook should authenticate using the user-assigned identity specified in AZURE_CLIENT_ID and list the blobs.
   
   ### How to reproduce
   
   In an environment with multiple user-assigned identities, run:
   
   ```
   import logging

   from airflow.providers.microsoft.azure.hooks.wasb import WasbHook

   logger = logging.getLogger("azure.core")
   logger.setLevel(logging.ERROR)

   conn_id = 'wasb-default'
   hook = WasbHook(conn_id)
   for blob_name in hook.get_blobs_list("testcontainer"):
       print(blob_name)
   ```
   
   
   ### Anything else
   
   The issue occurs because `client_id` is not passed to `ManagedIdentityCredential` in [azure.hooks.wasb.WasbHook](https://github.com/apache/airflow/blob/1d875a45994540adef23ad6f638d78c9945ef873/airflow/providers/microsoft/azure/hooks/wasb.py#L153-L160):
   ```
   if not credential:
       credential = ManagedIdentityCredential()
       self.log.info("Using managed identity as credential")
   return BlobServiceClient(
       account_url=f"https://{conn.login}.blob.core.windows.net/",
       credential=credential,
       **extra,
   )
   ```
   
   Solution 1:
   Use [azure.identity.DefaultAzureCredential](https://github.com/Azure/azure-sdk-for-python/blob/aa35d07aebf062393f14d147da54f0342e6b94a8/sdk/identity/azure-identity/azure/identity/_credentials/default.py#L32) instead of ManagedIdentityCredential.
   
   Solution 2:
   Pass the client ID from the environment, [as DefaultAzureCredential does](https://github.com/Azure/azure-sdk-for-python/blob/aa35d07aebf062393f14d147da54f0342e6b94a8/sdk/identity/azure-identity/azure/identity/_credentials/default.py#L104-L106):
   
   `ManagedIdentityCredential(client_id=os.environ.get("AZURE_CLIENT_ID"))`
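A minimal sketch of how solution 2 might look, assuming `azure-identity` is installed wherever the credential is actually built. The helper names `managed_identity_kwargs` and `build_credential` are hypothetical and are not part of the provider. `client_id` is only passed when AZURE_CLIENT_ID is set, so pods with a single identity would keep the current behaviour.

```python
import os


def managed_identity_kwargs(environ=None):
    """Return keyword arguments for ManagedIdentityCredential (hypothetical helper)."""
    environ = os.environ if environ is None else environ
    client_id = environ.get("AZURE_CLIENT_ID")
    # Omit client_id entirely when the variable is unset or empty.
    return {"client_id": client_id} if client_id else {}


def build_credential(environ=None):
    """Create the credential, naming the identity when AZURE_CLIENT_ID is set."""
    # Imported lazily so the helper above works without azure-identity installed.
    from azure.identity import ManagedIdentityCredential

    return ManagedIdentityCredential(**managed_identity_kwargs(environ))


print(managed_identity_kwargs({"AZURE_CLIENT_ID": "user-managed-identity-client-id"}))
print(managed_identity_kwargs({}))
```

In the hook, the bare `ManagedIdentityCredential()` call would become something like `build_credential()`. Solution 1 needs no helper, since `DefaultAzureCredential` already reads AZURE_CLIENT_ID for its managed identity step, but it also tries other credential sources first, which changes behaviour more broadly.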
   
   
   ### Are you willing to submit PR?
   
   - [X] Yes I am willing to submit a PR!
   
   ### Code of Conduct
   
   - [X] I agree to follow this project's [Code of 
Conduct](https://github.com/apache/airflow/blob/main/CODE_OF_CONDUCT.md)
   

