SameerMesiah97 opened a new pull request, #61051:
URL: https://github.com/apache/airflow/pull/61051
**Description**
Added best-effort cleanup to `EcsRunTaskOperator` to ensure ECS tasks are
stopped when failures occur after a task has been successfully started.
Previously, the operator could successfully start an ECS task via `RunTask`
and then fail during post-start steps (for example, when waiting for task
completion with `wait_for_completion=True` and missing `ecs:DescribeTasks`
permissions). In these cases, the Airflow task failed while the ECS task
continued running in AWS.
The operator now attempts to stop any ECS task that was started by the
current task instance if an exception is raised after task start. Cleanup is
performed opportunistically and does not mask or replace the original exception
if stopping the task fails.
**Rationale**
`EcsRunTaskOperator` manages the lifecycle of an external resource whose
execution can outlive the Airflow task that started it. If task start
succeeds but subsequent execution steps fail, Airflow can no longer reliably
observe or manage the running ECS task, potentially leaving resources running
unexpectedly.
Failures after task start can occur for multiple reasons, including IAM
permission errors (for example, missing `ecs:DescribeTasks`) or loss of access
to systems used during task execution. Attempting best-effort cleanup in these
scenarios avoids leaving unmanaged ECS tasks running while preserving existing
failure semantics.
Cleanup is only attempted when the operator can confidently determine that
the ECS task was started by the current execution. This is achieved by tracking
whether the task was started during the current run and using the task ARN
returned by `RunTask`. This avoids interfering with pre-existing tasks in
reattach scenarios while still preventing resource leaks on post-start failures.
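
The ownership tracking described above can be sketched roughly as follows. This is a minimal illustration, not the operator's actual code: `FakeEcsClient`, `run_with_cleanup`, the `reattach_arn` parameter, and the example ARN are all hypothetical names introduced here. The stub only mirrors the shape of the boto3 ECS client methods `run_task`, `describe_tasks`, and `stop_task`, so the sketch runs without AWS access.

```python
import logging

log = logging.getLogger(__name__)


class FakeEcsClient:
    """Hypothetical stand-in for a boto3 ECS client; records calls only."""

    def __init__(self, fail_describe=False, fail_stop=False):
        self.fail_describe = fail_describe
        self.fail_stop = fail_stop
        self.stopped = []

    def run_task(self, cluster):
        return {"tasks": [{"taskArn": "arn:aws:ecs:example:task/demo"}]}

    def describe_tasks(self, cluster, tasks):
        if self.fail_describe:
            raise PermissionError("missing ecs:DescribeTasks")
        return {"tasks": [{"taskArn": tasks[0], "lastStatus": "STOPPED"}]}

    def stop_task(self, cluster, task, reason):
        if self.fail_stop:
            raise RuntimeError("stop failed")
        self.stopped.append(task)


def run_with_cleanup(client, cluster, reattach_arn=None):
    """Start (or reattach to) a task, then wait; stop it only if we started it."""
    started_here = False
    arn = reattach_arn
    if arn is None:
        arn = client.run_task(cluster=cluster)["tasks"][0]["taskArn"]
        started_here = True  # ownership is known only from this point on
    try:
        client.describe_tasks(cluster=cluster, tasks=[arn])
    except Exception:
        if started_here:
            try:
                client.stop_task(
                    cluster=cluster, task=arn, reason="Cleanup after failure"
                )
            except Exception:
                # Best effort: log and fall through to the original error.
                log.warning("Could not stop ECS task %s", arn, exc_info=True)
        raise  # re-raises the original exception, not the cleanup error
    return arn
```

In this sketch a reattached task (one passed in via `reattach_arn`) is never stopped, because `started_here` stays `False`; only a task whose ARN came back from `run_task` in the same call is eligible for cleanup.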
**Tests**
* Added a unit test verifying that an ECS task is stopped when a failure
occurs after task start.
* Added a unit test ensuring that failures during cleanup do not mask or
override the original exception.
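
For illustration, the two behaviours listed above could be checked along these lines. This is a hedged sketch built on `unittest.mock`, not the tests added in this PR: `start_and_wait`, `make_client`, and both test names are hypothetical, and the mocked client stands in for the real ECS hook.

```python
from unittest import mock


def start_and_wait(client):
    """Hypothetical stand-in for the operator's start-then-wait flow."""
    arn = client.run_task()["tasks"][0]["taskArn"]
    try:
        client.describe_tasks(tasks=[arn])
    except Exception:
        try:
            client.stop_task(task=arn, reason="cleanup after failure")
        except Exception:
            pass  # best effort: a failed stop must not hide the original error
        raise


def make_client():
    """Mock client whose post-start step fails with a permission error."""
    client = mock.MagicMock()
    client.run_task.return_value = {"tasks": [{"taskArn": "arn:example"}]}
    client.describe_tasks.side_effect = PermissionError(
        "missing ecs:DescribeTasks"
    )
    return client


def test_task_stopped_on_post_start_failure():
    client = make_client()
    try:
        start_and_wait(client)
    except PermissionError:
        pass
    else:
        raise AssertionError("expected PermissionError")
    client.stop_task.assert_called_once_with(
        task="arn:example", reason="cleanup after failure"
    )


def test_cleanup_failure_does_not_mask_original():
    client = make_client()
    client.stop_task.side_effect = RuntimeError("stop failed")
    try:
        start_and_wait(client)
    except PermissionError as err:
        # The caller sees the original failure, not the RuntimeError.
        assert "DescribeTasks" in str(err)
    else:
        raise AssertionError("expected PermissionError")
    client.stop_task.assert_called_once()
```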
**Backwards Compatibility**
No changes to the public API or operator parameters.
Closes: #61050