seanghaeli opened a new pull request, #68647: URL: https://github.com/apache/airflow/pull/68647
### Problem

`DeadlineReference.AVERAGE_RUNTIME` builds a deadline from the average duration of past DAG runs. The duration query (`SerializedReferenceModels.AverageRuntimeDeadline._evaluate_with`) filtered only on `dag_id` and "has both start and end date". There was **no `DagRun.state` filter**, so **failed runs were folded into the average**.

A failed run's duration is not representative of a normal runtime:

- a history of fast failures makes the average too short → healthy runs miss the deadline constantly (alert fatigue),
- a run that hung and then failed makes it too long → real slowness never trips the deadline.

### Fix

Filter the duration query to `DagRun.state == DagRunState.SUCCESS`. The existing `min_runs`/`max_runs` semantics are otherwise unchanged; the "not enough runs" skip now counts only successful runs.

### Tests

- `test_average_runtime_excludes_non_successful_runs`: a mix of fast successful runs and slow failed runs averages to the successful duration only (fails on the old code, which averaged in the failures).
- `test_average_runtime_skips_when_too_few_successful_runs`: a history of only failed runs yields no deadline (fewer than `min_runs` successful runs).

---

##### Was generative AI tooling used to co-author this PR?

- [X] Yes: Claude Code (Opus 4.8)

Generated-by: Claude Code (Opus 4.8) following [the guidelines](https://github.com/apache/airflow/blob/main/contributing-docs/05_pull_requests.rst#gen-ai-assisted-contributions)
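To illustrate the effect of the state filter described above, here is a minimal plain-Python sketch. It does not use Airflow's models or the actual query: `Run`, `average_runtime`, and the `success_only` flag are hypothetical names introduced only for this example, and `max_runs` handling is left out for brevity.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass
class Run:
    # Hypothetical stand-in for a DAG run row; this is not Airflow's DagRun model.
    state: str
    start: datetime
    end: datetime


def average_runtime(runs, min_runs, success_only=True):
    """Return the mean duration in seconds, or None if too few runs qualify."""
    durations = [
        (r.end - r.start).total_seconds()
        for r in runs
        # With success_only=False every finished run is averaged (old behaviour).
        if not success_only or r.state == "success"
    ]
    if len(durations) < min_runs:
        # Mirrors the "not enough runs" skip: no deadline is produced.
        return None
    return sum(durations) / len(durations)


t0 = datetime(2024, 1, 1)
history = [
    Run("success", t0, t0 + timedelta(minutes=10)),
    Run("success", t0, t0 + timedelta(minutes=10)),
    Run("failed", t0, t0 + timedelta(minutes=100)),  # hung, then failed
]

old_average = average_runtime(history, min_runs=2, success_only=False)
new_average = average_runtime(history, min_runs=2)
only_failed = average_runtime(history[2:], min_runs=2)
```

In this made-up history the unfiltered average is inflated by the single slow failure, while the filtered average reflects only the two successful runs. A history containing only failed runs falls below `min_runs` and yields no value, which corresponds to the two test cases listed in the description.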
