kaxil opened a new pull request, #60940:
URL: https://github.com/apache/airflow/pull/60940

   This PR adds an optional `gunicorn` server type for the API server, providing:
   - **Memory sharing**: Gunicorn uses preload + fork, so workers share memory via copy-on-write (unlike uvicorn's multiprocess mode, where each worker loads everything independently)
   - **Rolling worker restarts**: GunicornMonitor performs zero-downtime worker recycling to prevent memory accumulation
   - **Proper signal handling**: SIGTTOU kills the oldest worker first (FIFO), enabling true rolling restarts
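
A minimal sketch of the preload + fork idea, using only the standard library rather than gunicorn itself. The function name `preload_and_fork` and the `app_state` dictionary are hypothetical stand-ins for the preloaded application; the sketch is Unix-only because it relies on `os.fork`. Note that CPython's reference counting writes to object headers, so in practice only part of the preloaded memory stays shared.

```python
import os


def preload_and_fork(num_workers: int) -> list[int]:
    """Build state once in the parent, then fork children that reuse it."""
    # "Preload": the expensive state is created before any fork happens.
    app_state = {"routes": [f"/route/{i}" for i in range(1000)]}

    seen_by_children = []
    for _ in range(num_workers):
        read_fd, write_fd = os.pipe()
        pid = os.fork()
        if pid == 0:
            # Child: app_state is inherited via fork, nothing is reloaded.
            os.close(read_fd)
            os.write(write_fd, str(len(app_state["routes"])).encode())
            os._exit(0)
        # Parent: collect what the child saw, then reap it.
        os.close(write_fd)
        seen_by_children.append(int(os.read(read_fd, 64)))
        os.close(read_fd)
        os.waitpid(pid, 0)
    return seen_by_children
```

Each child reports the size of the state it inherited without rebuilding it, which is the property the memory-sharing bullet relies on.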
   
   ## Usage
   
   ```bash
   # Enable gunicorn mode
   export AIRFLOW__API__SERVER_TYPE=gunicorn
   export AIRFLOW__API__WORKER_REFRESH_INTERVAL=43200  # 12 hours
   
   airflow api-server
   ```
   
   ## Configuration
   
   New `[api]` configuration options:
   - `server_type`: `uvicorn` (default) or `gunicorn`
   - `worker_refresh_interval`: Seconds between worker refresh cycles (0 = disabled)
   - `worker_refresh_batch_size`: Workers to refresh per cycle (default: 1)
   - `reload_on_plugin_change`: Reload on plugin file changes (default: False)
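
As an illustration of how these options map onto the `AIRFLOW__API__*` environment variables shown under Usage, here is a rough sketch. `ApiServerSettings` and `load_api_settings` are hypothetical names, not Airflow's configuration API, and the default of `0` for `worker_refresh_interval` is an assumption (the list above only says that 0 disables refresh). Boolean parsing is also only an approximation.

```python
import os
from dataclasses import dataclass
from typing import Mapping, Optional

_TRUE_VALUES = {"1", "t", "true", "yes", "on"}


@dataclass(frozen=True)
class ApiServerSettings:
    server_type: str = "uvicorn"
    worker_refresh_interval: int = 0  # assumed default; 0 means disabled
    worker_refresh_batch_size: int = 1
    reload_on_plugin_change: bool = False


def load_api_settings(env: Optional[Mapping[str, str]] = None) -> ApiServerSettings:
    """Resolve the [api] options from AIRFLOW__API__* variables (sketch only)."""
    env = os.environ if env is None else env
    defaults = ApiServerSettings()

    def get(key: str, default: object) -> str:
        return env.get(f"AIRFLOW__API__{key.upper()}", str(default))

    server_type = get("server_type", defaults.server_type).strip().lower()
    if server_type not in ("uvicorn", "gunicorn"):
        raise ValueError(f"unsupported server_type: {server_type!r}")

    interval = int(get("worker_refresh_interval", defaults.worker_refresh_interval))
    batch_size = int(get("worker_refresh_batch_size", defaults.worker_refresh_batch_size))
    if interval < 0 or batch_size < 1:
        raise ValueError("interval must be >= 0 and batch size must be >= 1")

    reload_raw = get("reload_on_plugin_change", defaults.reload_on_plugin_change)
    reload_flag = reload_raw.strip().lower() in _TRUE_VALUES

    return ApiServerSettings(server_type, interval, batch_size, reload_flag)
```

Passing a plain dictionary instead of `os.environ` makes the resolution easy to check in isolation.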
   
   ## Architecture
   
   ```
   ┌─────────────────────────────────────────────────────────┐
   │  airflow api-server (main process)                      │
   │  ├── subprocess: gunicorn master (PID management)       │
   │  │   ├── worker 1 (UvicornWorker)                       │
   │  │   ├── worker 2 (UvicornWorker)                       │
   │  │   └── worker N (UvicornWorker)                       │
   │  └── thread: GunicornMonitor (rolling restarts)         │
   └─────────────────────────────────────────────────────────┘
   ```
   
   ## Rolling Restart Flow
   
   1. Spawn `batch_size` new workers (SIGTTIN)
   2. Wait for new workers to be ready (process title check)
   3. HTTP health check (`/api/v2/monitor/health`)
   4. Kill `batch_size` old workers (SIGTTOU - kills oldest)
   5. Repeat until all original workers replaced
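
The steps above can be sketched roughly as the loop below. This is not the GunicornMonitor implementation: `rolling_restart`, `count_ready_workers`, and `is_healthy` are hypothetical names, with the readiness and health checks injected as callables (standing in for the process title check and the `/api/v2/monitor/health` request). Only the SIGTTIN/SIGTTOU signalling of the gunicorn master is taken from the flow described here.

```python
import os
import signal
import time
from typing import Callable


def rolling_restart(
    master_pid: int,
    num_workers: int,
    batch_size: int,
    count_ready_workers: Callable[[], int],
    is_healthy: Callable[[], bool],
    send_signal: Callable[[int, int], None] = os.kill,
    poll_interval: float = 1.0,
    timeout: float = 120.0,
) -> int:
    """Replace num_workers workers in batches; return how many were replaced."""
    if batch_size < 1:
        raise ValueError("batch_size must be >= 1")

    def wait_for(predicate: Callable[[], bool]) -> None:
        deadline = time.monotonic() + timeout
        while not predicate():
            if time.monotonic() > deadline:
                raise TimeoutError("condition not met during rolling restart")
            time.sleep(poll_interval)

    replaced = 0
    while replaced < num_workers:
        batch = min(batch_size, num_workers - replaced)
        # 1. Ask the master to spawn `batch` extra workers.
        for _ in range(batch):
            send_signal(master_pid, signal.SIGTTIN)
        # 2. Wait until the new workers report ready.
        wait_for(lambda: count_ready_workers() >= num_workers + batch)
        # 3. Confirm the server still answers health checks.
        wait_for(is_healthy)
        # 4. Ask the master to retire `batch` workers (oldest first).
        for _ in range(batch):
            send_signal(master_pid, signal.SIGTTOU)
        wait_for(lambda: count_ready_workers() <= num_workers)
        # 5. Repeat until every original worker has been replaced.
        replaced += batch
    return replaced
```

Injecting `send_signal` keeps the loop testable without a running gunicorn master; in a real deployment it would default to `os.kill` against the master PID.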
   
   ## Gotchas vs Uvicorn mode
   
   | Aspect | Gunicorn | Uvicorn |
   |--------|----------|---------|
   | Memory sharing | Yes (preload + fork COW) | No (independent workers) |
   | Rolling restarts | Yes (GunicornMonitor) | No |
   | Worker management | Master/worker architecture | Direct multiprocess |
   | macOS support | Limited (setproctitle issues) | Full |
   
   ## Why Gunicorn is Optional
   
   Gunicorn is an optional extra (`apache-airflow-core[gunicorn]`) rather than 
a required dependency because:
   
   1. **Windows incompatibility**: Gunicorn is Unix-only and doesn't work on 
Windows
   2. **Most users don't need it**: Default uvicorn mode is sufficient for:
      - Development environments
      - Single-worker deployments
      - Short-lived containers (K8s pods that get recycled anyway)
   3. **Dependency minimization**: Users who don't need rolling restarts or 
memory sharing shouldn't pay the dependency cost
   
   **When to use gunicorn:**
   - Long-running API server processes where memory accumulation is a concern
   - Multi-worker deployments where memory sharing matters
   - Production environments requiring zero-downtime worker recycling
   
   ## Known Limitations / Follow-up Items
   
   ### Log Format Inconsistency
   
   The gunicorn subprocess uses its own logging format, which differs from Airflow's structlog format:
   
   ```
   # Gunicorn native logs:
   [2026-01-22 14:14:03 +0000] [433] [INFO] Handling signal: ttin
   
   # Airflow structlog logs:
   2026-01-22T14:14:03.115354Z [info     ] Rolling restart: spawning...  [airflow.cli.commands.gunicorn_monitor]
   ```
   
   As a result, API server logs have mixed formats in gunicorn mode. A follow-up PR should pass `--log-config` (gunicorn's `logconfig` setting) to gunicorn so its output matches Airflow's format for consistent log parsing/export.
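
Until the formats are unified, a log consumer would have to handle both shapes. The sketch below is one way to do that, written only against the two sample lines shown above; `parse_log_line` and the regular expressions are hypothetical and may not cover every line either logger can emit.

```python
import re
from datetime import datetime, timezone
from typing import Optional

# Shape of the gunicorn sample: [timestamp +zone] [pid] [LEVEL] message
_GUNICORN = re.compile(
    r"^\[(?P<ts>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2} [+-]\d{4})\] "
    r"\[(?P<pid>\d+)\] \[(?P<level>[A-Z]+)\] (?P<msg>.*)$"
)
# Shape of the structlog sample: timestampZ [level   ] message  [logger]
_STRUCTLOG = re.compile(
    r"^(?P<ts>\S+Z) \[(?P<level>\w+)\s*\] (?P<msg>.*?)\s+\[(?P<logger>[\w.]+)\]$"
)


def parse_log_line(line: str) -> Optional[dict]:
    """Normalize either sample format into one dict, or return None."""
    line = line.strip()
    match = _GUNICORN.match(line)
    if match:
        return {
            "timestamp": datetime.strptime(match["ts"], "%Y-%m-%d %H:%M:%S %z"),
            "level": match["level"].lower(),
            "message": match["msg"],
            "source": f"gunicorn[{match['pid']}]",
        }
    match = _STRUCTLOG.match(line)
    if match:
        ts = datetime.strptime(match["ts"], "%Y-%m-%dT%H:%M:%S.%fZ")
        return {
            "timestamp": ts.replace(tzinfo=timezone.utc),
            "level": match["level"].lower(),
            "message": match["msg"],
            "source": match["logger"],
        }
    return None
```

Both branches return timezone-aware timestamps and lower-case levels, so downstream tooling can sort and filter the mixed stream uniformly.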
   
   
   

