kristoffern commented on pull request #19769:
URL: https://github.com/apache/airflow/pull/19769#issuecomment-1033461316


   @ephraimbuddy and the rest of the Airflow team monitoring this ticket.
   I'm about to test the two patches and will get back with results later today.
   
   I also need to update this ticket with some other information.
   Last week we deployed the patch for 19769 and on Friday and Saturday morning 
the system/pipeline was incredibly delayed due to the scheduler running at a 
fraction of the normal speed. We increased the CPU allocation for the scheduler 
as mentioned and the pipeline finished in better shape, but it still wasn't 
as fast as a "clean" Airflow 2.2.3 installation.
   I had also changed the settings to 1800 & 3600 seconds for clearing the 
Celery tasks as opposed to the standard 300 seconds.
   
   I left the system running with the patch in place and increased CPU 
allocation, and when I checked in on the system on Sunday morning all was 
calm... too calm, it turned out, as nothing was being scheduled by the scheduler. 
DAGs would go "green" (running), but no tasks whatsoever started. Of course 
this meant no errors, since nothing ran. 
   
   In the end I had to redeploy our system back to a clean 2.2.3 installation 
and then the scheduler started working again. A few tasks ended up in the 
"Queued" state, but compared to the other mornings it was nothing major.
   
   The next part is only speculation, but the scheduler felt like it got slower 
and slower the longer it ran. With the increased time in between runs it took 
longer for the problem to surface, but like I said, on Sunday morning the 
scheduler had ground to a halt. 
   Note also that the Airflow webserver **didn't** complain about the scheduler 
not being online; it could apparently communicate with the scheduler just fine. 
It was just that the scheduler didn't start any new tasks.
   
   Again, I'll test the two patches today in our staging environment and get 
back with the results. I just thought it was important to get the information 
from this weekend to you as well.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.