scnerd opened a new issue, #35805:
URL: https://github.com/apache/airflow/issues/35805
### Apache Airflow version
2.7.3
### What happened
When RedshiftSQLHook attempts to auto-fetch credentials when `iam=True`, it
uses a cluster-specific approach to obtaining credentials, which fails for
Redshift Serverless.
```
Traceback (most recent call last):
File
"/usr/local/lib/python3.11/site-packages/airflow/providers/common/sql/operators/sql.py",
line 280, in execute
output = hook.run(
^^^^^^^^^
File
"/usr/local/lib/python3.11/site-packages/airflow/providers/common/sql/hooks/sql.py",
line 385, in run
with closing(self.get_conn()) as conn:
^^^^^^^^^^^^^^^
File
"/usr/local/lib/python3.11/site-packages/airflow/providers/amazon/aws/hooks/redshift_sql.py",
line 173, in get_conn
conn_params = self._get_conn_params()
^^^^^^^^^^^^^^^^^^^^^^^
File
"/usr/local/lib/python3.11/site-packages/airflow/providers/amazon/aws/hooks/redshift_sql.py",
line 84, in _get_conn_params
conn.login, conn.password, conn.port = self.get_iam_token(conn)
^^^^^^^^^^^^^^^^^^^^^^^^
File
"/usr/local/lib/python3.11/site-packages/airflow/providers/amazon/aws/hooks/redshift_sql.py",
line 115, in get_iam_token
cluster_creds = redshift_client.get_cluster_credentials(
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/usr/local/lib/python3.11/site-packages/botocore/client.py", line
535, in _api_call
return self._make_api_call(operation_name, kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/usr/local/lib/python3.11/site-packages/botocore/client.py", line
980, in _make_api_call
raise error_class(parsed_response, operation_name)
botocore.errorfactory.ClusterNotFoundFault: An error occurred
(ClusterNotFound) when calling the GetClusterCredentials operation: Cluster ***
not found.
```
### What you think should happen instead
The operator should establish a connection to the serverless workgroup using
IAM-obtained credentials using `redshift_connector`.
### How to reproduce
Create a direct SQL connection to Redshift using IAM authentication,
something like:
```
{"conn_type":"redshift","extra":"{\"db_user\":\"USER\",\"iam\":true,\"user\":\"USER\"}","host":"WORKGROUP_NAME.ACCOUNT.REGION.redshift-serverless.amazonaws.com","login":"USER","port":5439,"schema":"DATABASE"}
```
Then use this connection for any `SQLExecuteQueryOperator`. The crash should
occur when establishing the connection.
### Operating System
Docker, `amazonlinux:2023` base
### Versions of Apache Airflow Providers
This report applies to apache-airflow-providers-amazon==8.7.1, and the
relevant code appears unchange in the master branch. The code I'm using worked
for Airflow 2.5.2 and version 7.1.0 of the provider.
### Deployment
Amazon (AWS) MWAA
### Deployment details
Local MWAA runner
### Anything else
The break seems to occur because the RedshiftSQLHook integrates the IAM ->
credential conversion, which used to occur inside `redshift_connector.connect`.
The logic is not as robust and assumes that the connection refers to a Redshift
cluster rather than a serverless workgroup. It's not clear to me why this logic
was pulled up and out of `redshift_connector`, but it seems like the easiest
solution is just to let `redshift_connector` handle IAM authentication and not
attempt to duplicate that logic in the airflow provider.
### Are you willing to submit PR?
- [ ] Yes I am willing to submit a PR!
### Code of Conduct
- [X] I agree to follow this project's [Code of
Conduct](https://github.com/apache/airflow/blob/main/CODE_OF_CONDUCT.md)
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]