jtv8 opened a new issue, #38762:
URL: https://github.com/apache/airflow/issues/38762
### Apache Airflow version
Other Airflow 2 version (please specify below)
### If "Other Airflow 2 version" selected, which one?
2.6.3
### What happened?
When authenticating with an Azure managed identity, if more than one
managed identity exists on the virtual machine (this is always true when using
Azure Managed Airflow, and common when using Azure Kubernetes Service), the
token request fails with the following error:
```
Response: {"error":"invalid_request","error_description":"Multiple user
assigned identities exist, please specify the clientId / resourceId of the
identity in the token request"}, Status Code: 400
```
### What you think should happen instead?
The solution to this problem is to allow the user to supply values to be
passed to the Azure Instance Metadata Service token endpoint as the
`object_id`, `client_id` and `msi_res_id` parameters, as documented here:
https://learn.microsoft.com/en-us/entra/identity/managed-identities-azure-resources/how-to-use-vm-token#get-a-token-using-http
Here's an example implementation showing how
[airflow/providers/databricks/hooks/databricks_base.py](https://github.com/apache/airflow/blob/0e8f108313d4af0b450581661aeb8ed35e82a8e6/airflow/providers/databricks/hooks/databricks_base.py#L305C1-L315C26)
could be changed to support this:
Before:
```
if self.databricks_conn.extra_dejson.get("use_azure_managed_identity", False):
    params = {
        "api-version": "2018-02-01",
        "resource": resource,
    }
    resp = requests.get(
        AZURE_METADATA_SERVICE_TOKEN_URL,
        params=params,
        headers={**self.user_agent_header, "Metadata": "true"},
        timeout=self.token_timeout_seconds,
    )
```
After:
```
if self.databricks_conn.extra_dejson.get("use_azure_managed_identity", False):
    extra = self.databricks_conn.extra_dejson
    params = {
        "api-version": "2018-02-01",
        "resource": resource,
        # Optional identity selectors; requests omits params whose value is None.
        "object_id": extra.get("azure_managed_identity_object_id"),
        "client_id": extra.get("azure_managed_identity_client_id"),
        "msi_res_id": extra.get("azure_managed_identity_msi_res_id"),
    }
    resp = requests.get(
        AZURE_METADATA_SERVICE_TOKEN_URL,
        params=params,
        headers={**self.user_agent_header, "Metadata": "true"},
        timeout=self.token_timeout_seconds,
    )
```
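For illustration, the parameter handling could also be pulled into a small standalone helper. This is only a sketch, not the provider's actual code: the function name `build_msi_token_params` and the `IDENTITY_EXTRA_KEYS` mapping are hypothetical, the `azure_managed_identity_*` extra keys are the ones proposed above, and rejecting more than one selector is my own assumption rather than documented behaviour of the hook.

```python
from __future__ import annotations

# Hypothetical mapping: proposed connection-extra key -> IMDS query parameter.
IDENTITY_EXTRA_KEYS = {
    "azure_managed_identity_object_id": "object_id",
    "azure_managed_identity_client_id": "client_id",
    "azure_managed_identity_msi_res_id": "msi_res_id",
}


def build_msi_token_params(resource: str, extra: dict) -> dict:
    """Build query parameters for the IMDS token request.

    Only identity selectors that are actually set in ``extra`` are included.
    """
    params = {"api-version": "2018-02-01", "resource": resource}
    selected = {
        imds_name: extra[extra_key]
        for extra_key, imds_name in IDENTITY_EXTRA_KEYS.items()
        if extra.get(extra_key)
    }
    # Assumption: one selector is enough to pick an identity, so setting
    # several is treated here as a configuration mistake.
    if len(selected) > 1:
        raise ValueError(
            "Specify only one of: " + ", ".join(IDENTITY_EXTRA_KEYS)
        )
    params.update(selected)
    return params
```

With no selector configured the helper returns the same two parameters as the current code, so existing single-identity connections would behave as before; the returned dict can be passed as `params=` to `requests.get`.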
### How to reproduce
* Create an Azure Managed Airflow instance, or an Azure virtual machine or
Kubernetes service with multiple managed identities
* In the Airflow UI, create a Databricks connection with
`use_azure_managed_identity` set to `true`
* Test the connection
### Operating System
n/a
### Versions of Apache Airflow Providers
_No response_
### Deployment
Microsoft ADF Managed Airflow
### Deployment details
_No response_
### Anything else?
_No response_
### Are you willing to submit PR?
- [ ] Yes I am willing to submit a PR!
### Code of Conduct
- [X] I agree to follow this project's [Code of
Conduct](https://github.com/apache/airflow/blob/main/CODE_OF_CONDUCT.md)