potiuk opened a new pull request, #68652:
URL: https://github.com/apache/airflow/pull/68652

   The scheduled canary flaked on
   `test_ack_mode_queue_full_during_fanout_does_not_break_iteration` with a
   `TimeoutError` (e.g. the LowestDeps ARM run).
   
   Root cause is a race in the **test**, not the production code. The test
   pre-fills trigger 1's subscriber queue with a sentinel so the background poll
   task hits `QueueFull` and overflow-fails it at fan-out. But it then started
   draining `s_full` immediately, so depending on event-loop scheduling the 
drain
   could consume the sentinel *first* — leaving the queue no longer full when 
the
   burst was fanned out. The `put_nowait` then succeeded, the overflow under 
test
   never happened, and the iteration blocked forever (surfacing as the
   `TimeoutError`). It passed most of the time only because the poll task 
usually
   won the race.
   
   Fix: synchronise on the overflow — wait until trigger 1 is recorded in
   `_failed_subscribers` before draining `s_full`. The test is now deterministic
   (passes 6/6 locally in ~1s instead of hanging), and the production overflow
   path is unchanged.
   
   ---
   
   ##### Was generative AI tooling used to co-author this PR?
   
   - [X] Yes — Claude Code (Opus 4.8)
   
   Generated-by: Claude Code (Opus 4.8) following [the 
guidelines](https://github.com/apache/airflow/blob/main/contributing-docs/05_pull_requests.rst#gen-ai-assisted-contributions)


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

Reply via email to