kristoffern commented on pull request #19769: URL: https://github.com/apache/airflow/pull/19769#issuecomment-1029755436
Quick update on this: after I live-patched our GCP/Kubernetes cluster, and the scheduler specifically, to use more CPU, the situation is better. We still see the warning that the scheduler does not appear to be running, just a lot less often.

Normally our airflow-scheduler pod (release 2.2.2) runs on "1.5" CPU in the Kubernetes cluster, and increasing it to "2" CPU last night didn't help much. But when I increased it to "3" this morning, the situation got a lot better and the scheduler ran a bit smoother, judging both from its log and from the DAGs/tasks in the webserver.

Good news though: no tasks stuck in "Queued", and there was a lot of ramp-up and ramp-down on workers that would normally have occurred in this case.

So this patch seems to have introduced a higher CPU load/requirement on the scheduler (in our context, 1.5 CPU => 3 CPU). Below is the CPU graph from our morning run, including the point when I increased the resources to 3 CPU. As you can see, it still hits the limit, but overall it has some more breathing room, which the system's behavior confirms.

-- 
This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: [email protected] For queries about this service, please contact Infrastructure at: [email protected]
