GitHub user michaelosthege added a comment to the discussion: High CPU and 
database load caused by Airflow 3 dag-processor and scheduler

Thanks for suggesting some debugging steps.
My memory load looks OK, but looking at the DB with pgAdmin revealed a lot of 
transaction activity.
I stopped the containers one by one and found:
* `scheduler` doing about 60 transactions/second
* `dag-processor` doing about 80 transactions/second
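
For anyone who wants to reproduce per-container numbers like these without pgAdmin, here is a rough sketch. PostgreSQL's `pg_stat_database` view exposes cumulative `xact_commit` and `xact_rollback` counters, so sampling them twice and dividing the difference by the elapsed time gives a rate. The names `XactSample` and `transactions_per_second` are made up for illustration, and the sample values are invented rather than measured.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class XactSample:
    """One reading of the cumulative counters from pg_stat_database (hypothetical helper type)."""
    taken_at: float      # wall-clock time of the reading, in seconds
    xact_commit: int     # cumulative committed transactions
    xact_rollback: int   # cumulative rolled-back transactions


def transactions_per_second(first: XactSample, second: XactSample) -> float:
    """Average transaction rate between two readings."""
    elapsed = second.taken_at - first.taken_at
    if elapsed <= 0:
        raise ValueError("second sample must be taken after the first")
    total_first = first.xact_commit + first.xact_rollback
    total_second = second.xact_commit + second.xact_rollback
    return (total_second - total_first) / elapsed


# Invented readings taken 10 seconds apart.
before = XactSample(taken_at=0.0, xact_commit=1000, xact_rollback=10)
after = XactSample(taken_at=10.0, xact_commit=1580, xact_rollback=30)
print(transactions_per_second(before, after))  # 60.0
```

Taking one pair of readings with all containers running and another pair with a single container stopped would attribute the difference to that container.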

According to `py-spy`, the DAG processor spends a lot of time in two calls 
that make DB queries:

* [_fetch_callbacks](https://github.com/apache/airflow/blob/3.0.2/airflow-core/src/airflow/dag_processing/manager.py#L450)
* [_queue_requested_files_for_parsing](https://github.com/apache/airflow/blob/3.0.2/airflow-core/src/airflow/dag_processing/manager.py#L405)

Here's the flame graph:

![dag-processor](https://github.com/user-attachments/assets/7aee538e-f559-4f55-9420-b4e5443f78d9)

Looking at [this `while True` loop in `_run_parsing_loop`](https://github.com/apache/airflow/blob/3.0.2/airflow-core/src/airflow/dag_processing/manager.py#L331-L378), I don't see any delays or sleeps.
Isn't this just looping at maximum speed?
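
To illustrate the concern, here is a minimal sketch of a polling loop that enforces a minimum interval per iteration, which caps how often the per-iteration queries can run. This is not Airflow's code. `run_throttled_loop`, `work`, and `min_interval` are hypothetical names, and the loop is bounded by `iterations` only so the example terminates.

```python
import time
from typing import Callable


def run_throttled_loop(
    work: Callable[[], None],
    iterations: int,
    min_interval: float,
    clock: Callable[[], float] = time.monotonic,
    sleep: Callable[[float], None] = time.sleep,
) -> float:
    """Call work() `iterations` times, starting at most one call per `min_interval` seconds.

    Returns the total number of seconds spent sleeping.
    """
    total_slept = 0.0
    for _ in range(iterations):
        started = clock()
        work()  # stand-in for the per-iteration DB queries
        # Sleep only for whatever is left of the interval after the work itself.
        remaining = min_interval - (clock() - started)
        if remaining > 0:
            sleep(remaining)
            total_slept += remaining
    return total_slept


if __name__ == "__main__":
    calls = []
    slept = run_throttled_loop(lambda: calls.append(1), iterations=3, min_interval=0.1)
    print(len(calls), "iterations, slept about", round(slept, 2), "seconds")
```

With `min_interval=0` the same loop runs as fast as `work()` returns, which is the behaviour the question above is asking about.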



GitHub link: 
https://github.com/apache/airflow/discussions/53177#discussioncomment-13734214
