[ https://issues.apache.org/jira/browse/AIRFLOW-2367?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16448974#comment-16448974 ]
John Arnold commented on AIRFLOW-2367:
--------------------------------------
This insert is taking the second-most time (about 1/3 of the above):
INSERT INTO log (dttm, dag_id, task_id, event, execution_date, owner, extra)
VALUES (?::timestamptz, ?, ?, ?, ?::timestamptz, ?, ?) RETURNING log.id
I verified that there are no indexes, triggers, or unusual constraints that
would make it slow; it simply runs at a high volume.
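
For anyone who wants to repeat that kind of check, here is a rough sketch of one way to list the indexes, triggers, and constraints on a table from the PostgreSQL catalogs. It is not necessarily how the check above was done. The names inspect_table and CATALOG_QUERIES are made up for illustration, and the %s placeholders assume a psycopg2-style DB-API driver.

```python
# Sketch only: catalog queries for inspecting a PostgreSQL table.
# CATALOG_QUERIES and inspect_table are hypothetical names, not Airflow code.
# The %s parameter style assumes a psycopg2-like DB-API driver.

CATALOG_QUERIES = {
    # pg_indexes is a standard system view with one row per index.
    "indexes": "SELECT indexname, indexdef FROM pg_indexes WHERE tablename = %s",
    # information_schema.triggers lists triggers by the table they fire on.
    "triggers": (
        "SELECT trigger_name, event_manipulation, action_timing "
        "FROM information_schema.triggers WHERE event_object_table = %s"
    ),
    # pg_constraint holds primary key, foreign key, check, and unique constraints.
    "constraints": (
        "SELECT conname, contype FROM pg_constraint "
        "WHERE conrelid = %s::regclass"
    ),
}


def inspect_table(cursor, table="log"):
    """Run each catalog query with a DB-API cursor and collect the rows."""
    findings = {}
    for kind, sql in CATALOG_QUERIES.items():
        cursor.execute(sql, (table,))
        findings[kind] = cursor.fetchall()
    return findings
```

Calling inspect_table(conn.cursor()) against the Airflow metadata database would return a dict keyed by "indexes", "triggers", and "constraints". Note that a primary key, if the table has one, shows up both as an index and as a constraint.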
> High POSTGRES DB CPU utilization
> --------------------------------
>
> Key: AIRFLOW-2367
> URL: https://issues.apache.org/jira/browse/AIRFLOW-2367
> Project: Apache Airflow
> Issue Type: Bug
> Components: scheduler
> Affects Versions: Airflow 2.0, 1.9.0
> Reporter: John Arnold
> Priority: Major
> Attachments: cpu.png, postgres.png
>
>
> We are seeing steady-state 70-90% CPU utilization. It feels like a
> missing-index kind of problem: our TPS rate is really low, I'm not seeing any
> long-running queries, connection counts are reasonable (low hundreds), and
> locks also look reasonable (not many exclusive / write locks).
> We shut down the webserver and it doesn't go away, so it doesn't seem to be
> in that part of the code. My guess is either the scheduler has an inefficient
> query, or the (Celery) executor code path does.