kaxil commented on issue #60517: URL: https://github.com/apache/airflow/issues/60517#issuecomment-3749956419
This conflates two different timeout concepts:

1. **Execution timeout** (`execution_timeout` on a task) - limits how long a task process can run. When exceeded, `AirflowTaskTimeout` is raised and `on_kill()` is called because the task process is actively running.
2. **Deferral/Sensor timeout** (passed to `defer()`) - limits how long a trigger waits for a condition. When exceeded, `TaskDeferralTimeout` is raised. The task process isn't running at this point; it's deferred. The triggerer fires an event and the task resumes just to raise the exception.

`TaskDeferralTimeout` was introduced in #33718 specifically to give deferrable sensors the same behavior as non-deferrable sensors w.r.t. sensor timeouts, i.e., the task fails without retry (via `AirflowSensorTimeout`). It was never intended to trigger `on_kill()`.

If you need to cancel the EMR job when a deferral timeout occurs, that's operator-specific cleanup logic. The EMRContainers operator should handle this in its callback method or by overriding `resume_execution` to perform cleanup before re-raising. This isn't something the framework should do generically via `on_kill()`: `on_kill()` is for when a running task process is being terminated.

The fix here would be to update the EMR operator to handle cleanup on deferral timeout, not to change the task runner.
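
To illustrate the `resume_execution` option, here is a rough, self-contained sketch of the pattern. It does not import Airflow: `TaskDeferralTimeout` and `StubBaseOperator` are local stand-ins that only model the timeout path, and the `"__fail__"` marker, `JobOperatorSketch`, and `cancel_job()` are assumptions made for illustration, not the actual EMR operator code. A real fix would subclass the provider's operator and call its own cancellation logic.

```python
class TaskDeferralTimeout(Exception):
    """Local stand-in for the Airflow exception of the same name."""


class StubBaseOperator:
    """Local stand-in for the operator base class; models only the resume path."""

    def resume_execution(self, next_method, next_kwargs, context):
        # Assumption: a timed-out deferral resumes the task with a special
        # "fail" marker, and the base class turns that into an exception.
        if next_method == "__fail__":
            raise TaskDeferralTimeout((next_kwargs or {}).get("error", "Unknown"))
        # Otherwise call the method named when the task was deferred.
        return getattr(self, next_method)(context, **(next_kwargs or {}))


class JobOperatorSketch(StubBaseOperator):
    """Hypothetical operator that submits a remote job and then defers."""

    def __init__(self, job_id):
        self.job_id = job_id
        self.cancelled = False

    def cancel_job(self):
        # Hypothetical cleanup hook; a real operator would call its
        # service API here to cancel the remote job.
        self.cancelled = True

    def execute_complete(self, context, event=None):
        # Normal resume path: the trigger fired before the timeout.
        return event

    def resume_execution(self, next_method, next_kwargs, context):
        try:
            return super().resume_execution(next_method, next_kwargs, context)
        except TaskDeferralTimeout:
            # Operator-specific cleanup, then re-raise so the task still fails.
            self.cancel_job()
            raise
```

With this shape, cleanup lives in the operator and the timeout exception still propagates unchanged, so the task runner and `on_kill()` semantics are untouched.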
