potiuk opened a new pull request, #68652: URL: https://github.com/apache/airflow/pull/68652
The scheduled canary flaked on `test_ack_mode_queue_full_during_fanout_does_not_break_iteration` with a `TimeoutError` (e.g. the LowestDeps ARM run). Root cause is a race in the **test**, not the production code. The test pre-fills trigger 1's subscriber queue with a sentinel so the background poll task hits `QueueFull` and overflow-fails it at fan-out. But it then started draining `s_full` immediately, so depending on event-loop scheduling the drain could consume the sentinel *first* — leaving the queue no longer full when the burst was fanned out. The `put_nowait` then succeeded, the overflow under test never happened, and the iteration blocked forever (surfacing as the `TimeoutError`). It passed most of the time only because the poll task usually won the race. Fix: synchronise on the overflow — wait until trigger 1 is recorded in `_failed_subscribers` before draining `s_full`. The test is now deterministic (passes 6/6 locally in ~1s instead of hanging), and the production overflow path is unchanged. --- ##### Was generative AI tooling used to co-author this PR? - [X] Yes — Claude Code (Opus 4.8) Generated-by: Claude Code (Opus 4.8) following [the guidelines](https://github.com/apache/airflow/blob/main/contributing-docs/05_pull_requests.rst#gen-ai-assisted-contributions) -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: [email protected] For queries about this service, please contact Infrastructure at: [email protected]
