ephraimbuddy opened a new pull request, #54666:
URL: https://github.com/apache/airflow/pull/54666
This commit adds an asyncio stall monitor, used in the triggerer job to surface
poorly behaved triggers that block the event loop. The monitor runs in a
separate thread, so it can detect a blocking trigger while the block is still
in progress, unlike the block_watchdog loop, which cannot report a blocked
asyncio loop until the blocking ends.
Changes:
- StallSample and StallIncident to record stack samples and aggregate a
stall’s lifecycle.
- A background watchdog thread checks wall-clock gaps; when the gap
exceeds a threshold, it treats it as a stall.
- On stall start and on updates, captures bounded loop-thread stacks and logs
them.
- On stall end, provides a summary and a pointer to the task that caused the
stall.
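
For readers unfamiliar with the technique, the list above can be sketched roughly as follows. This is a minimal illustration and not the code in this PR: the names `StallWatchdog`, `heartbeat`, `threshold`, `poll_interval` and `max_frames` are hypothetical, the heartbeat timestamp is only one assumed way of measuring the wall-clock gap, and identifying the offending task is left out. It relies on `sys._current_frames()`, which is CPython-specific.

```python
import asyncio
import sys
import threading
import time
import traceback


class StallWatchdog:
    """Hypothetical sketch of a thread-based event loop stall monitor."""

    def __init__(self, threshold=0.2, poll_interval=0.05, max_frames=10):
        self.threshold = threshold          # gap (seconds) treated as a stall
        self.poll_interval = poll_interval  # how often both sides wake up
        self.max_frames = max_frames        # bound on captured stack depth
        self.incidents = []                 # one dict per finished stall
        self._last_beat = time.monotonic()
        self._loop_thread_id = None
        self._stop = threading.Event()
        self._thread = threading.Thread(target=self._watch, daemon=True)

    async def heartbeat(self):
        # Runs on the event loop, so the timestamp stops advancing
        # whenever something blocks the loop.
        self._loop_thread_id = threading.get_ident()
        while not self._stop.is_set():
            self._last_beat = time.monotonic()
            await asyncio.sleep(self.poll_interval)

    def _sample_stack(self):
        # Capture at most max_frames of the loop thread's current stack.
        frame = sys._current_frames().get(self._loop_thread_id)
        if frame is None:
            return []
        return traceback.format_stack(frame, limit=self.max_frames)

    def _watch(self):
        # Runs in a separate thread, so it keeps working while the loop is stuck.
        current = None
        while not self._stop.wait(self.poll_interval):
            gap = time.monotonic() - self._last_beat
            if gap > self.threshold:
                if current is None:  # stall start
                    current = {"started": self._last_beat, "samples": []}
                current["samples"].append(self._sample_stack())
            elif current is not None:  # stall end: record a summary
                current["duration"] = self._last_beat - current["started"]
                self.incidents.append(current)
                current = None

    def start(self):
        self._thread.start()

    def stop(self):
        self._stop.set()
        self._thread.join()


async def main():
    watchdog = StallWatchdog(threshold=0.2)
    beat = asyncio.create_task(watchdog.heartbeat())
    await asyncio.sleep(0)    # let the heartbeat record the loop thread id
    watchdog.start()
    await asyncio.sleep(0.1)
    time.sleep(0.6)           # deliberately block the event loop
    await asyncio.sleep(0.2)  # give the watchdog time to see the recovery
    watchdog.stop()
    await beat
    return watchdog.incidents


incidents = asyncio.run(main())
for incident in incidents:
    print(f"stall lasted {incident['duration']:.2f}s, "
          f"{len(incident['samples'])} stack samples")
```

Because the samples are taken while the loop is still blocked, the captured stacks point at the blocking call itself (here the `time.sleep` call inside `main`), which a check running on the loop could only observe after the fact.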
Testing:
- Added unit tests and tested manually.
<img width="803" height="679" alt="Screenshot 2025-08-19 at 14 52 05"
src="https://github.com/user-attachments/assets/978e56e0-f61d-4662-b684-9b083e04e276"
/>
<img width="803" height="679" alt="Screenshot 2025-08-19 at 14 52 19"
src="https://github.com/user-attachments/assets/73679baf-a929-4ff5-8084-79c5076bb27b"
/>