ecodina commented on issue #65945:
URL: https://github.com/apache/airflow/issues/65945#issuecomment-4515356412

   I was just reading the Airflow 3.2.2 changelog. Could this issue have been 
fixed by the following entry?
   
   `Fix triggerer race condition and deadlock that caused deferred tasks to 
stall indefinitely
   
   Triggers that call synchronous SDK methods (e.g. get_task_states used by 
safe_to_cancel in several Google provider operators) could crash the 
triggerer's internal subprocess. The triggerer would then continue to heartbeat 
normally — appearing healthy to the scheduler — while silently processing zero 
triggers, causing every deferred task to time out. This was first reported in 
issue #64620; a partial fix shipped in Airflow 3.2.1 (#64882) but introduced a 
new deadlock with the same visible symptom under load.
   
   Both issues are fixed by replacing the lock-based serialization with 
response multiplexing: each request now carries a unique ID and the response is 
routed back to the correct caller, so concurrent requests from trigger threads 
no longer contend or deadlock regardless of how many triggers are running or 
what SDK methods they call. `
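
For readers unfamiliar with the term, the response multiplexing described in the changelog entry can be sketched roughly as below. This is a minimal illustration under my own assumptions, not Airflow's actual implementation: `MultiplexedChannel`, `request`, `on_response`, and `demo` are hypothetical names, and the `backend` thread merely stands in for whatever process answers the triggers' requests.

```python
import itertools
import queue
import threading
from concurrent.futures import Future


class MultiplexedChannel:
    """Hypothetical request/response channel shared by many threads."""

    def __init__(self, send):
        self._send = send                # callable(request_id, payload)
        self._ids = itertools.count(1)   # source of unique request IDs
        self._pending = {}               # request_id -> Future awaiting a reply
        self._lock = threading.Lock()    # guards bookkeeping only, never held while waiting

    def request(self, payload, timeout=5.0):
        """Send a request and block this caller until *its* response arrives."""
        fut = Future()
        with self._lock:
            req_id = next(self._ids)
            self._pending[req_id] = fut  # register before sending to avoid a lost reply
        self._send(req_id, payload)
        return fut.result(timeout=timeout)

    def on_response(self, req_id, result):
        """Route a response back to the caller that issued req_id."""
        with self._lock:
            fut = self._pending.pop(req_id, None)
        if fut is not None:
            fut.set_result(result)


def demo(n_callers=4):
    inbox = queue.Queue()
    channel = MultiplexedChannel(send=lambda req_id, payload: inbox.put((req_id, payload)))

    def backend():
        # Collect every request first, then answer in reverse order to show
        # that out-of-order replies still reach the right caller.
        batch = [inbox.get() for _ in range(n_callers)]
        for req_id, payload in reversed(batch):
            channel.on_response(req_id, payload.upper())

    results = {}

    def caller(name):
        results[name] = channel.request(name)

    threads = [threading.Thread(target=backend)]
    threads += [threading.Thread(target=caller, args=(f"trigger-{i}",)) for i in range(n_callers)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return results


if __name__ == "__main__":
    print(demo())
```

The point of the sketch is that no caller holds a lock while waiting for its reply, so one slow or blocked request cannot stall the others; the lock only protects the table of pending request IDs.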


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.