dirrao commented on issue #22612:
URL: https://github.com/apache/airflow/issues/22612#issuecomment-2089951478

   We are seeing this issue in Airflow 2.3.3. Judging from the latest code, I 
strongly believe it is still present in the latest Airflow version, 2.9.1, 
since I don't see any improvements in watcher performance. 
   The primary cause is that the Kubernetes pod watcher is not fast enough to 
keep up with the rate of Kubernetes events. This leads to the watcher failing 
and restarting, after which adopt_complete_pods takes over the completed pods. 
adopt_complete_pods takes a couple of minutes, which delays the scheduler 
heartbeat, then causes scheduler liveness failures, and finally a scheduler 
pod restart. 
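
   To make the failure chain concrete, here is a minimal sketch in Python. It 
is not Airflow code: the function name, the step durations, and the 30-second 
liveness threshold are all hypothetical values chosen for illustration. It 
only models a single-threaded loop that emits a heartbeat at the end of each 
iteration, so one long blocking step (such as a slow pod adoption) stretches 
the gap between heartbeats past the threshold.

```python
from dataclasses import dataclass


@dataclass
class LoopResult:
    heartbeat_gaps: list  # seconds between consecutive heartbeats
    liveness_failed: bool  # True if any gap exceeded the threshold


def simulate_scheduler_loop(step_durations, liveness_threshold):
    """Simulate a loop that heartbeats once at the end of every iteration.

    step_durations: seconds each iteration takes (hypothetical values).
    liveness_threshold: maximum tolerated gap between heartbeats, in seconds.
    Uses a simulated clock, so nothing actually sleeps.
    """
    clock = 0.0
    last_heartbeat = 0.0
    gaps = []
    for duration in step_durations:
        clock += duration  # the iteration blocks the loop for this long
        gaps.append(clock - last_heartbeat)
        last_heartbeat = clock  # heartbeat is only emitted after the work
    failed = any(gap > liveness_threshold for gap in gaps)
    return LoopResult(heartbeat_gaps=gaps, liveness_failed=failed)


if __name__ == "__main__":
    # Assumed numbers: 1-second iterations, one 120-second blocking step
    # standing in for a slow adoption pass, and a 30-second liveness limit.
    steps = [1.0] * 5 + [120.0] + [1.0] * 5
    result = simulate_scheduler_loop(steps, liveness_threshold=30.0)
    print(max(result.heartbeat_gaps), result.liveness_failed)
```

   With these assumed numbers, the single 120-second iteration is enough to 
trip the liveness check even though every other iteration is fast, which 
matches the sequence described above.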


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]
