arjav1528 opened a new pull request, #60596: URL: https://github.com/apache/airflow/pull/60596
**Closes:** #60261 ## Summary This PR fixes the intermittent `405 Method Not Allowed` error that edge workers encounter during startup in high-concurrency environments (e.g., 8 API servers × 64 worker processes). ## Root Cause The edge executor plugin was acquiring a global database lock (`DBLocks.MIGRATIONS`) **at class definition time** (plugin import) to create tables. When many API server processes started simultaneously, they all competed for the same lock, causing: 1. `psycopg2.errors.LockNotAvailable` errors on lock timeout 2. Plugin import failures → edge worker endpoints not registered 3. Edge workers receiving `405 Method Not Allowed` for `/edge_worker/v1/worker/` ## Solution - Moved table creation from plugin import time to a **FastAPI startup event** - Plugin import no longer blocks on database operations - Added retry mechanism (5 attempts) for lock acquisition - Exponential backoff with jitter to avoid thundering herd - Proper logging for debugging - Check if tables exist **before** attempting to acquire lock - Skip lock entirely during normal operation (tables already exist) - Double-check pattern after lock acquisition to handle race conditions -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: [email protected] For queries about this service, please contact Infrastructure at: [email protected]
