vmtuan12 opened a new issue, #52697:
URL: https://github.com/apache/airflow/issues/52697

   ### Apache Airflow Provider(s)
   
   amazon
   
   ### Versions of Apache Airflow Providers
   
   apache-airflow-providers-amazon==9.2.0
   
   ### Apache Airflow version
   
   2.10.5
   
   ### Operating System
   
   Debian GNU/Linux 12 (bookworm)
   
   ### Deployment
   
   Official Apache Airflow Helm Chart
   
   ### Deployment details
   
   _No response_
   
   ### What happened
   
   When using `S3KeySensor` from `airflow.providers.amazon.aws.sensors.s3`
   
   ```py
   S3KeySensor(
       task_id="task1",
       bucket_key=f"s3://...",
       wildcard_match=True,
       aws_conn_id=conn_id,
       mode="reschedule",
       timeout=3 * 60 * 60,
       dag=dag
   )
   ```
   
   After upgrading from Airflow 2.10.2 to 2.10.5, we unexpectedly ran into an incident with the following error message:
   
   ```py
   ...
   File "/home/airflow/.local/lib/python3.11/site-packages/botocore/paginate.py", line 357, in _make_request
     return self._method(**current_kwargs)
            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
   File "/home/airflow/.local/lib/python3.11/site-packages/botocore/client.py", line 569, in _api_call
     return self._make_api_call(operation_name, kwargs)
            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
   File "/home/airflow/.local/lib/python3.11/site-packages/botocore/client.py", line 1023, in _make_api_call
     raise error_class(parsed_response, operation_name)
   botocore.exceptions.ClientError: An error occurred (InvalidAccessKeyId) when calling the ListObjectsV2 operation: The AWS Access Key Id you provided does not exist in our records.
   ```
   
   This made it very hard to find the root cause.
   
   ### What you think should happen instead
   
   After hours of investigating, we finally found out that the `Extra` field of the connection was:
   
   ```
   {
     "host": "http://s3.sample.com:1234"
   }
   ```
   This format came from `apache-airflow-providers-amazon==8.28.0`, which we used with Airflow 2.10.2. After adding `"endpoint_url": "http://s3.sample.com:1234"` to the JSON, the problem was fixed.
   
   In `providers/amazon/aws/utils/connection_wrapper.py`, in the `__post_init__` method of the `AwsConnectionWrapper` class, the line
   
   ```py
   self.endpoint_url = extra.get("endpoint_url")
   ```
   
   should not be handled this way. The code should check whether `endpoint_url` exists in `extra` and, if it does not, raise an exception with a clear message such as "Missing endpoint_url in extra of connection." A sketch of the suggested check follows.
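
   For illustration only, here is a minimal sketch of the validation I have in mind (the exception type and the exact message are my assumptions, not existing provider code):

   ```py
   from airflow.exceptions import AirflowException

   # Hypothetical replacement for the line above, inside AwsConnectionWrapper.__post_init__:
   # fail fast when the connection's extra does not define endpoint_url,
   # instead of silently falling back to None.
   if "endpoint_url" not in extra:
       raise AirflowException("Missing endpoint_url in extra of connection.")
   self.endpoint_url = extra["endpoint_url"]
   ```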
   
   ### How to reproduce
   
   Use `S3KeySensor` with a connection whose `Extra` field does not set `endpoint_url`:
   
   ```py
   from airflow.providers.amazon.aws.sensors.s3 import S3KeySensor
   
   S3KeySensor(
       task_id="task1",
       bucket_key=f"s3://...",
       wildcard_match=True,
       aws_conn_id=conn_id,
       mode="reschedule",
       timeout=3 * 60 * 60,
       dag=dag
   )
   ```
   
   ### Anything else
   
   In my view, the release notes should also be updated to document this change.
   
   ### Are you willing to submit PR?
   
   - [x] Yes I am willing to submit a PR!
   
   ### Code of Conduct
   
   - [x] I agree to follow this project's [Code of 
Conduct](https://github.com/apache/airflow/blob/main/CODE_OF_CONDUCT.md)
   

