[ 
https://issues.apache.org/jira/browse/AIRFLOW-2367?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16448974#comment-16448974
 ] 

John Arnold commented on AIRFLOW-2367:
--------------------------------------

This INSERT takes the second-most time (about 1/3 of the above):

INSERT INTO log (dttm, dag_id, task_id, event, execution_date, owner, extra) 
VALUES (?::timestamptz, ?, ?, ?, ?::timestamptz, ?, ?) RETURNING log.id

I verified that there are no indexes, triggers, or unusual constraints that 
would make it slow; it simply runs at high volume.
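
For anyone who wants to repeat that check, here is a minimal sketch of one way to list the indexes, triggers, and constraints on a table from the PostgreSQL system catalogs. It is not necessarily how the check above was done. The helper name inspect_table and the CATALOG_QUERIES dict are hypothetical, and the %s placeholder style assumes a psycopg2-like DB-API driver.

```python
# Hypothetical helper: lists indexes, triggers and constraints for one table
# by querying the PostgreSQL catalogs (pg_indexes, pg_trigger, pg_constraint).
# Assumption: the driver uses "%s" placeholders, as psycopg2 does.
CATALOG_QUERIES = {
    "indexes": "SELECT indexname, indexdef FROM pg_indexes WHERE tablename = %s",
    "triggers": (
        "SELECT tgname FROM pg_trigger "
        "WHERE tgrelid = %s::regclass AND NOT tgisinternal"
    ),
    "constraints": (
        "SELECT conname, contype FROM pg_constraint "
        "WHERE conrelid = %s::regclass"
    ),
}


def inspect_table(cursor, table_name):
    """Run each catalog query for table_name and return {kind: rows}."""
    report = {}
    for kind, sql in CATALOG_QUERIES.items():
        # The table name is passed as a bound parameter, not interpolated.
        cursor.execute(sql, (table_name,))
        report[kind] = cursor.fetchall()
    return report
```

With an open cursor on the Airflow metadata database, inspect_table(cursor, "log") would return one list of rows per category; an empty list means nothing of that kind is defined on the table.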

> High POSTGRES DB CPU utilization
> --------------------------------
>
>                 Key: AIRFLOW-2367
>                 URL: https://issues.apache.org/jira/browse/AIRFLOW-2367
>             Project: Apache Airflow
>          Issue Type: Bug
>          Components: scheduler
>    Affects Versions: Airflow 2.0, 1.9.0
>            Reporter: John Arnold
>            Priority: Major
>         Attachments: cpu.png, postgres.png
>
>
> We are seeing steady-state 70-90% CPU utilization.  It feels like a 
> missing-index kind of problem: our TPS rate is really low, I'm not seeing any 
> long-running queries, connection counts are reasonable (low hundreds), and 
> locks also look reasonable (not many exclusive / write locks).
> We shut down the webserver and the load doesn't go away, so it doesn't seem 
> to be in that part of the code. My guess is that either the scheduler or the 
> (Celery) executor code path has an inefficient query.



