nrobinson-intelycare opened a new issue, #45622: URL: https://github.com/apache/airflow/issues/45622
### Apache Airflow Provider(s)

amazon

### Versions of Apache Airflow Providers

- apache-airflow-providers-amazon==8.29.0
- apache-airflow-providers-common-compat==1.3.0
- apache-airflow-providers-common-io==1.5.0
- apache-airflow-providers-common-sql==1.21.0
- apache-airflow-providers-fab==1.5.2
- apache-airflow-providers-ftp==3.12.0
- apache-airflow-providers-http==5.0.0
- apache-airflow-providers-imap==3.8.0
- apache-airflow-providers-postgres==5.14.0
- apache-airflow-providers-sendgrid==3.6.0
- apache-airflow-providers-smtp==1.9.0
- apache-airflow-providers-snowflake==5.8.1
- apache-airflow-providers-sqlite==4.0.0

### Apache Airflow version

2.10.4

### Operating System

Amazon Linux 2023.6.20241212

### Deployment

Virtualenv installation

### Deployment details

Custom CDK stack with:

- EC2 instance running Airflow, managed by systemd
- IAM role granting permissions to AWS services
- RDS instance running Postgres

The Airflow virtualenv is managed by uv.

### What happened

When running a DAG with a deferrable BatchOperator that uses the boto3 credential strategy (`{base_aws.py:180} INFO - No connection ID provided. Fallback on boto3 credential strategy (region_name='us-east-1')`), the task's trigger can fail immediately after the batch job is submitted.

Although the trigger fails immediately, the batch job has launched successfully and runs until it exits successfully, unbeknownst to Airflow. Because of how the DAG is scheduled, the failed task's batch job has not yet overlapped with a subsequent task run, but overlapping runs would be undesirable.

This error happens about once a week. I believe it has something to do with amazon-ssm-agent not rotating the credentials quickly enough.

### What you think should happen instead

`async_wait()` should catch the `NoCredentialsError` and continue to the next waiter attempt.
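A minimal sketch of the behavior requested above, not the provider's actual code. `wait_tolerating`, `FlakyWaiter`, and `FakeNoCredentialsError` are hypothetical names introduced only for illustration; in a real patch the caught exception would be `botocore.exceptions.NoCredentialsError`, and the stand-in is used here only so the sketch runs without AWS libraries. The idea is that a credentials error consumes one attempt instead of failing the trigger outright:

```python
import asyncio


class FakeNoCredentialsError(Exception):
    """Hypothetical stand-in for botocore.exceptions.NoCredentialsError."""


class FlakyWaiter:
    """Hypothetical waiter whose first `failures` calls raise a credentials error."""

    def __init__(self, failures):
        self.failures = failures
        self.calls = 0

    async def wait(self, **kwargs):
        self.calls += 1
        if self.calls <= self.failures:
            raise FakeNoCredentialsError("Unable to locate credentials")
        return "done"


async def wait_tolerating(waiter, args, max_attempts, delay, retry_on):
    """Poll `waiter` once per attempt, counting `retry_on` errors as failed attempts.

    Returns the 1-based attempt number on which the waiter succeeded.
    """
    last_error = None
    for attempt in range(1, max_attempts + 1):
        try:
            # Same call shape as in the traceback: one waiter attempt per loop pass.
            await waiter.wait(**args, WaiterConfig={"MaxAttempts": 1})
            return attempt
        except retry_on as error:
            # Assumption: credentials may reappear shortly, so sleep and retry.
            last_error = error
            await asyncio.sleep(delay)
    raise RuntimeError("Waiter error: max attempts reached") from last_error


if __name__ == "__main__":
    waiter = FlakyWaiter(failures=2)
    used = asyncio.run(
        wait_tolerating(waiter, {}, max_attempts=5, delay=0, retry_on=(FakeNoCredentialsError,))
    )
    print(f"succeeded on attempt {used}")
```

Bounding the retries by `max_attempts` keeps a genuinely missing credential from hanging the trigger forever; whether a credentials error should count against the waiter's normal attempt budget or get a separate one is a design choice this sketch does not settle.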
https://github.com/apache/airflow/blob/main/providers/src/airflow/providers/amazon/aws/utils/waiter_with_logging.py#L133

### How to reproduce

Hard to reproduce, but invalidating AWS credentials right before the trigger initializes would likely produce a similar traceback.

### Anything else

Traceback from task log:

```
[2025-01-10, 20:00:19 EST] {baseoperator.py:1806} ERROR - Trigger failed:
Traceback (most recent call last):
  File "/opt/airflow/git/.venv/lib/python3.12/site-packages/airflow/jobs/triggerer_job_runner.py", line 558, in cleanup_finished_triggers
    result = details["task"].result()
             ^^^^^^^^^^^^^^^^^^^^^^^^
  File "/opt/airflow/git/.venv/lib/python3.12/site-packages/airflow/jobs/triggerer_job_runner.py", line 630, in run_trigger
    async for event in trigger.run():
  File "/opt/airflow/git/.venv/lib/python3.12/site-packages/airflow/providers/amazon/aws/triggers/base.py", line 143, in run
    await async_wait(
  File "/opt/airflow/git/.venv/lib/python3.12/site-packages/airflow/providers/amazon/aws/utils/waiter_with_logging.py", line 133, in async_wait
    await waiter.wait(**args, WaiterConfig={"MaxAttempts": 1})
  File "/opt/airflow/git/.venv/lib/python3.12/site-packages/aiobotocore/waiter.py", line 49, in wait
    return await AIOWaiter.wait(self, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/opt/airflow/git/.venv/lib/python3.12/site-packages/aiobotocore/waiter.py", line 95, in wait
    response = await self._operation_method(**kwargs)
               ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/opt/airflow/git/.venv/lib/python3.12/site-packages/aiobotocore/waiter.py", line 78, in __call__
    return await self._client_method(**kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/opt/airflow/git/.venv/lib/python3.12/site-packages/aiobotocore/client.py", line 394, in _make_api_call
    http, parsed_response = await self._make_request(
                            ^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/opt/airflow/git/.venv/lib/python3.12/site-packages/aiobotocore/client.py", line 420, in _make_request
    return await self._endpoint.make_request(
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/opt/airflow/git/.venv/lib/python3.12/site-packages/aiobotocore/endpoint.py", line 96, in _send_request
    request = await self.create_request(request_dict, operation_model)
              ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/opt/airflow/git/.venv/lib/python3.12/site-packages/aiobotocore/endpoint.py", line 84, in create_request
    await self._event_emitter.emit(
  File "/opt/airflow/git/.venv/lib/python3.12/site-packages/aiobotocore/hooks.py", line 68, in _emit
    response = await resolve_awaitable(handler(**kwargs))
               ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/opt/airflow/git/.venv/lib/python3.12/site-packages/aiobotocore/_helpers.py", line 6, in resolve_awaitable
    return await obj
           ^^^^^^^^^
  File "/opt/airflow/git/.venv/lib/python3.12/site-packages/aiobotocore/signers.py", line 24, in handler
    return await self.sign(operation_name, request)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/opt/airflow/git/.venv/lib/python3.12/site-packages/aiobotocore/signers.py", line 90, in sign
    auth.add_auth(request)
  File "/opt/airflow/git/.venv/lib/python3.12/site-packages/botocore/auth.py", line 423, in add_auth
    raise NoCredentialsError()
botocore.exceptions.NoCredentialsError: Unable to locate credentials
[2025-01-10, 20:00:19 EST] {taskinstance.py:3311} ERROR - Task failed with exception
Traceback (most recent call last):
  File "/opt/airflow/git/.venv/lib/python3.12/site-packages/airflow/models/taskinstance.py", line 767, in _execute_task
    result = _execute_callable(context=context, **execute_callable_kwargs)
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/opt/airflow/git/.venv/lib/python3.12/site-packages/airflow/models/taskinstance.py", line 733, in _execute_callable
    return ExecutionCallableRunner(
           ^^^^^^^^^^^^^^^^^^^^^^^^
  File "/opt/airflow/git/.venv/lib/python3.12/site-packages/airflow/utils/operator_helpers.py", line 252, in run
    return self.func(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/opt/airflow/git/.venv/lib/python3.12/site-packages/airflow/models/baseoperator.py", line 1807, in resume_execution
    raise TaskDeferralError(next_kwargs.get("error", "Unknown"))
airflow.exceptions.TaskDeferralError: Trigger failure
```

### Are you willing to submit PR?

- [x] Yes I am willing to submit a PR!

### Code of Conduct

- [x] I agree to follow this project's [Code of Conduct](https://github.com/apache/airflow/blob/main/CODE_OF_CONDUCT.md)

--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@airflow.apache.org.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org