AmosG commented on PR #56044:
URL: https://github.com/apache/airflow/pull/56044#issuecomment-3678604421

   @wjddn279 @potiuk 
   
   
   This PR seems critical for anyone running Airflow 3 on MySQL. We have 
also been tracking systemic `OperationalError` (Lost connection) and 
`PendingRollbackError` (500) issues in production on **3.1.5**, and this fix 
addresses the primary 'trigger' for those failures.
   
   **The Idea:**
   Based on:
   https://github.com/apache/airflow/pull/57815
   https://github.com/apache/airflow/issues/57065 
   and
   https://github.com/apache/airflow/issues/57981
   
   In Airflow 3, it seems to me that the FastAPI server and the legacy FAB UI 
share the same process and thread-local `scoped_session`. If so, the `COM_QUIT` 
sent by child processes during fork/GC (which this PR fixes) doesn't just cause 
a one-off error. Instead, it **poisons the entire worker thread**. 
   
   Because FastAPI **currently lacks a global session teardown** (like Flask's 
`@app.teardown_appcontext`), once a connection is lost via the mechanism you've 
identified here, that thread's `settings.Session` remains in an invalid state. 
This leads to persistent 500 errors on completely unrelated requests (like 
`/login/` or `/api/v2/version`) until the process is restarted.
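
   The failure mode described above can be illustrated with a toy model. This is a minimal sketch using only the standard library: `ToySession` and `ToyScopedSession` are hypothetical stand-ins that mimic how a thread-local registry hands the same session object back to every request on a thread. They are not SQLAlchemy's or Airflow's actual classes.

```python
import threading


class ToySession:
    """Hypothetical stand-in for an ORM session bound to one DB connection."""

    def __init__(self):
        self.broken = False

    def execute(self, query):
        if self.broken:
            # Mimics a session that cannot be used until it is rolled back or discarded.
            raise RuntimeError("session is in an invalid state")
        return f"ok: {query}"


class ToyScopedSession:
    """Hypothetical thread-local registry: one session per thread until removed."""

    def __init__(self):
        self._local = threading.local()

    def __call__(self):
        if not hasattr(self._local, "session"):
            self._local.session = ToySession()
        return self._local.session

    def remove(self):
        # Discard this thread's session so the next call creates a fresh one.
        if hasattr(self._local, "session"):
            del self._local.session


Session = ToyScopedSession()


def handle_request(query):
    """Simulated request handler with no teardown step."""
    try:
        return Session().execute(query)
    except RuntimeError as exc:
        return f"500: {exc}"


first = handle_request("SELECT 1")
Session().broken = True                    # simulate the connection being lost
second = handle_request("SELECT version")  # unrelated request, same thread
Session.remove()                           # what a per-request teardown would do
third = handle_request("SELECT version")
```

   In this model the second, unrelated request fails only because it reuses the thread's broken session, and it recovers as soon as the registry entry is removed.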
   
   **Impact of this PR:**
   By preventing the premature `COM_QUIT` on child process disposal, this PR 
removes the most frequent source of session poisoning for MySQL users. 
   
   **Recommendation:**
   We should definitely merge this to stop the 'bleeding' for MySQL 
deployments. 
   
   **BUT** 
   Additionally, we recommend a systemic fix: adding a global 
`SessionMiddleware` to FastAPI that calls `settings.Session.remove()` after 
every request. This would provide a second layer of defense, ensuring that even 
if a connection is lost for other reasons (network hiccup, timeout), the worker 
thread can 'self-heal' instead of entering a 500 loop.
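
   As a rough illustration of the proposed teardown, here is a minimal sketch of a pure ASGI middleware that calls `remove()` on a session registry after every HTTP request, whether the request succeeded or raised. The names `SessionCleanupMiddleware`, `CountingRegistry`, and `failing_app` are hypothetical and exist only for this example; the registry is assumed to expose a `remove()` method the way a `scoped_session` does. This is not the actual Airflow implementation.

```python
import asyncio


class SessionCleanupMiddleware:
    """Pure ASGI middleware that calls `registry.remove()` after each HTTP request."""

    def __init__(self, app, registry):
        self.app = app
        self.registry = registry  # assumed to expose .remove(), like a scoped_session

    async def __call__(self, scope, receive, send):
        if scope["type"] != "http":
            # Leave lifespan and websocket traffic untouched.
            await self.app(scope, receive, send)
            return
        try:
            await self.app(scope, receive, send)
        finally:
            # Runs on success and on unhandled exceptions alike.
            self.registry.remove()


class CountingRegistry:
    """Hypothetical stand-in for `settings.Session`, used only for this demo."""

    def __init__(self):
        self.removed = 0

    def remove(self):
        self.removed += 1


async def failing_app(scope, receive, send):
    """Simulated downstream app whose request handler fails."""
    raise RuntimeError("simulated lost connection")


async def demo():
    registry = CountingRegistry()
    wrapped = SessionCleanupMiddleware(failing_app, registry)
    try:
        await wrapped({"type": "http"}, None, None)
    except RuntimeError:
        pass  # the error still propagates; cleanup has already run
    return registry.removed


removed_count = asyncio.run(demo())
```

   In a FastAPI app, a class like this could presumably be registered with `app.add_middleware(SessionCleanupMiddleware, registry=settings.Session)`. One caveat: with a thread-local registry, `remove()` only clears the session of the thread that calls it, so a real fix would need to make sure the cleanup runs on the same thread that used the session.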
   
   I can provide an initial PR to help explain the approach.
   Please ACK.
   
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]
