vmtuan12 opened a new issue, #52697:
URL: https://github.com/apache/airflow/issues/52697
### Apache Airflow Provider(s)
amazon
### Versions of Apache Airflow Providers
apache-airflow-providers-amazon==9.2.0
### Apache Airflow version
2.10.5
### Operating System
Debian GNU/Linux 12 (bookworm)
### Deployment
Official Apache Airflow Helm Chart
### Deployment details
_No response_
### What happened
When using `S3KeySensor` from `airflow.providers.amazon.aws.sensors.s3`
```py
S3KeySensor(
task_id="task1",
bucket_key="s3://...",
wildcard_match=True,
aws_conn_id=conn_id,
mode="reschedule",
timeout=3 * 60 * 60,
dag=dag
)
```
After upgrading from Airflow 2.10.2 to 2.10.5, we unexpectedly hit an
incident with the following error:
```py
...
  File "/home/airflow/.local/lib/python3.11/site-packages/botocore/paginate.py", line 357, in _make_request
    return self._method(**current_kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/airflow/.local/lib/python3.11/site-packages/botocore/client.py", line 569, in _api_call
    return self._make_api_call(operation_name, kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/airflow/.local/lib/python3.11/site-packages/botocore/client.py", line 1023, in _make_api_call
    raise error_class(parsed_response, operation_name)
botocore.exceptions.ClientError: An error occurred (InvalidAccessKeyId) when calling the ListObjectsV2 operation: The AWS Access Key Id you provided does not exist in our records.
```
This made the root cause very hard to find.
### What you think should happen instead
After hours of investigating, we finally found that the `Extra` field of the
connection was
```
{
"host": "http://s3.sample.com:1234"
}
```
because the connection had been created under
`apache-airflow-providers-amazon==8.28.0` (Airflow 2.10.2). After adding
`"endpoint_url": "http://s3.sample.com:1234"` to the JSON, the problem was
fixed.
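For reference, the working `Extra` after the change looked like this (assuming the original `host` key is kept alongside the new one):
```
{
  "host": "http://s3.sample.com:1234",
  "endpoint_url": "http://s3.sample.com:1234"
}
```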
In `providers/amazon/aws/utils/connection_wrapper.py`, in the `__post_init__`
method of the `AwsConnectionWrapper` class, the line
```py
self.endpoint_url = extra.get("endpoint_url")
```
silently falls back to `None` when the key is absent. Instead, it should
check whether `endpoint_url` exists in `extra` and, if not, raise an
exception with a clear message, such as "Missing endpoint_url in extra of
connection."
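A minimal sketch of the proposed check (a hypothetical helper to illustrate the behavior, not the actual provider code):

```py
def resolve_endpoint_url(extra: dict) -> str:
    # Hypothetical illustration of the proposal: fail loudly when the key
    # is absent, instead of silently returning None (which makes boto3 fall
    # back to the default AWS endpoint and surface the confusing
    # InvalidAccessKeyId error seen above).
    if "endpoint_url" not in extra:
        raise ValueError("Missing endpoint_url in extra of connection.")
    return extra["endpoint_url"]
```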
### How to reproduce
Use `S3KeySensor` with a connection whose `Extra` field does not set
`endpoint_url`:
```py
from airflow.providers.amazon.aws.sensors.s3 import S3KeySensor
S3KeySensor(
task_id="task1",
bucket_key="s3://...",
wildcard_match=True,
aws_conn_id=conn_id,
mode="reschedule",
timeout=3 * 60 * 60,
dag=dag
)
```
### Anything else
In my view, the release notes should also mention this change.
### Are you willing to submit PR?
- [x] Yes I am willing to submit a PR!
### Code of Conduct
- [x] I agree to follow this project's [Code of
Conduct](https://github.com/apache/airflow/blob/main/CODE_OF_CONDUCT.md)