kaxil opened a new pull request, #60919:
URL: https://github.com/apache/airflow/pull/60919

   Add UvicornMonitor for rolling worker restarts (port of Airflow 2's 
GunicornMonitor).
   
   ## Why
   
   API server workers accumulate memory over time. Airflow 2's 
`GunicornMonitor` handled this with rolling restarts: spawn new workers, 
health-check them, then kill the old ones. That capability was lost when we 
switched to uvicorn.
   
   Uvicorn's built-in `limit_max_requests` doesn't help here - it kills old 
workers *before* spawning new ones, causing downtime.
   
   ## What
   
   New `UvicornMonitor` class that does rolling restarts:
   
   ```
   [n workers] → spawn batch → [n + batch workers] → health check → kill old 
batch → [n workers]
   ```
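
A minimal sketch of one refresh cycle from the flow above, not the actual `UvicornMonitor` code. The `spawn`, `is_healthy` and `kill` callables are hypothetical stand-ins for real process management, and the rollback when a new worker fails its health check is an assumption this description does not confirm.

```python
from typing import Callable, List


def rolling_refresh(
    workers: List[int],
    batch_size: int,
    spawn: Callable[[], int],
    is_healthy: Callable[[int], bool],
    kill: Callable[[int], None],
) -> List[int]:
    """Run one refresh cycle; `workers` holds PIDs ordered oldest first."""
    batch = min(batch_size, len(workers))

    # [n workers] -> spawn batch -> [n + batch workers]
    new = [spawn() for _ in range(batch)]

    # Health check. Assumption: on failure, drop the new workers and keep
    # the old set so serving capacity never goes below n.
    if not all(is_healthy(pid) for pid in new):
        for pid in new:
            kill(pid)
        return workers

    # Kill the old batch only after its replacements are healthy.
    old, kept = workers[:batch], workers[batch:]
    for pid in old:
        kill(pid)

    # Back to [n workers], with the fresh ones at the end of the list.
    return kept + new
```

Because the old batch is removed last, the pool holds `n + batch` workers for a short time and never fewer than `n`.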
   
   Enabled via config:
   ```ini
   [api]
   worker_refresh_interval = 1800  # seconds, 0 = disabled (default)
   worker_refresh_batch_size = 1
   ```
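
To illustrate how these two options might be interpreted, here is a small sketch that parses the snippet with the standard library's `configparser`. Airflow reads settings through its own configuration layer, so `load_refresh_settings` is a hypothetical helper, and the fallback of `1` for the batch size is an assumption (only the interval default of `0` is stated above).

```python
import configparser
from typing import Tuple

SAMPLE = """
[api]
worker_refresh_interval = 1800  # seconds, 0 = disabled (default)
worker_refresh_batch_size = 1
"""


def load_refresh_settings(text: str) -> Tuple[int, int]:
    """Return (interval_seconds, batch_size) from an ini-style string."""
    # Inline comments are off by default in configparser, so enable "#".
    parser = configparser.ConfigParser(inline_comment_prefixes=("#",))
    parser.read_string(text)
    interval = parser.getint("api", "worker_refresh_interval", fallback=0)
    # Assumption: a batch size of 1 when the option is absent.
    batch = parser.getint("api", "worker_refresh_batch_size", fallback=1)
    return interval, batch


interval, batch_size = load_refresh_settings(SAMPLE)
# An interval of 0 means rolling restarts are disabled.
refresh_enabled = interval > 0
```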
   
   ## Gotchas
   
   **Single-worker mode**: uvicorn only runs in multiprocess mode (and handles 
SIGTTIN/SIGTTOU signals) when `workers >= 2`. With `--workers 1`, we start with 
2 workers and scale down to 1 after startup. This works, but a refresh briefly 
runs 2 workers.
   
   **Uvicorn kills newest worker, not oldest**: Unlike gunicorn, which kills 
the oldest worker on SIGTTOU, uvicorn kills the newest (uvicorn's 
`processes.pop()` vs gunicorn's `workers.pop(0)` on a list sorted by age). 
Using SIGTTOU would therefore kill our fresh, healthy workers instead of the 
old ones. Fixed by sending SIGTERM directly to the specific old PIDs instead of 
using SIGTTOU.
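
The difference can be sketched as follows. This is an illustration under assumed data structures, not the PR's implementation: `started_at` (PID to start time), `pick_oldest` and `terminate` are hypothetical names, and `send` defaults to `os.kill` but can be swapped for a fake when experimenting.

```python
import os
import signal
from typing import Callable, Dict, List


def pick_oldest(started_at: Dict[int, float], count: int) -> List[int]:
    """Return the `count` PIDs with the earliest start times."""
    return sorted(started_at, key=started_at.get)[:count]


def terminate(pids: List[int], send: Callable[[int, int], None] = os.kill) -> None:
    """Send SIGTERM to each given PID instead of relying on SIGTTOU."""
    for pid in pids:
        send(pid, signal.SIGTERM)


# Workers in spawn order: 101 is the oldest, 103 was just started.
started_at = {101: 10.0, 102: 20.0, 103: 30.0}

# Popping from the end of a spawn-ordered list picks the newest worker.
newest = list(started_at).pop()

# Selecting by age picks the worker we actually want to retire.
oldest = pick_oldest(started_at, 1)
```

Here `newest` is the freshly spawned worker and `oldest` contains the one due for retirement, which is why targeting PIDs explicitly matters.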

