GitHub user kenho811-b1 created a discussion: High CPU utilisation caused by 
frequent updates to dag_run table

### Apache Airflow version

Other Airflow 2 version (please specify below)

### If "Other Airflow 2 version" selected, which one?

2.9.1

### What happened?


**Setup**
We are self-hosting Airflow via Helm Chart. We use google's CloudSQL as our 
Airflow metadata store.

**Issue**

We observe that the below query has been called many times, causes database 
locks and is causing high CPU utilisation

```
UPDATE
  dag_run
SET
  last_scheduling_decision=$1::timestamptz,
  updated_at=$2::timestamptz
WHERE
  dag_run.id = $3
```

For instance, referring to the below screenshot,  in 1 day, 193,362 times has 
been called.

<img width="868" alt="image" 
src="https://github.com/user-attachments/assets/188d902f-1952-4305-a1f6-c532a0705a87";>



**Spec of the CLoudSQL instance**

The CLoudSQL instance has 4 vCPUs and 15GB ram.




### What you think should happen instead?

Updates to the dag_run on last_scheduling_decision should be reduced (or 
configurable?)

So we can test if database CPU utilisation can be reduced and Database locking 
problem can be resolved.



### How to reproduce

Not easily reproducible.

==========

At the time of the UPDATE SQLs, I observe that only 5 instances of DAG runs are 
running. 

We have a similar setup in another environment where the other CloudSQL 
instance also has only 4 vCPUs.

In the other environment, however, the number of times the UPDATE SQL is called 
is smaller. Also no huge database locks are observed in the other environment.






### Operating System

linux

### Versions of Apache Airflow Providers

_No response_

### Deployment

Official Apache Airflow Helm Chart

### Deployment details

_No response_

### Anything else?

_No response_

### Are you willing to submit PR?

- [ ] Yes I am willing to submit a PR!

### Code of Conduct

- [X] I agree to follow this project's [Code of 
Conduct](https://github.com/apache/airflow/blob/main/CODE_OF_CONDUCT.md)


GitHub link: https://github.com/apache/airflow/discussions/61003

----
This is an automatically sent email for [email protected].
To unsubscribe, please send an email to: [email protected]

Reply via email to