[ 
https://issues.apache.org/jira/browse/AIRFLOW-2367?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16448953#comment-16448953
 ] 

John Arnold commented on AIRFLOW-2367:
--------------------------------------

[~bolke]  Any suggestions on which metrics or configuration options to look 
at?  We've been looking over the database (top 10 queries, etc.) and there are 
no surprises that I can see. The top query by far is against the task_instance 
table, and all of its conditionals are on indexed columns.  I went through 
basically every query in models.py looking for any that filter on unindexed 
columns, and didn't find any.

I've attached a screenshot of the top 10 queries.
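
For anyone who wants to repeat the top-queries check, here is a rough sketch of 
ranking statements by their share of total execution time. It assumes the rows 
have already been fetched from something like the pg_stat_statements view and 
that each row carries `query`, `calls` and `total_time` (milliseconds) fields; 
on newer PostgreSQL releases the timing column is named differently, so treat 
the field names as assumptions. The helper name and the sample numbers are 
made up for illustration and are not taken from our database.

```python
def rank_by_time_share(rows, top_n=10):
    """Rank statement stats by their share of total execution time.

    `rows` is a list of dicts with (assumed) keys: query, calls, total_time.
    """
    grand_total = sum(r["total_time"] for r in rows)
    ranked = sorted(rows, key=lambda r: r["total_time"], reverse=True)[:top_n]
    return [
        {
            "query": r["query"],
            "calls": r["calls"],
            # Mean time per call; guards against a zero call count.
            "mean_ms": r["total_time"] / r["calls"] if r["calls"] else 0.0,
            # Fraction of all recorded execution time spent in this statement.
            "share": r["total_time"] / grand_total if grand_total else 0.0,
        }
        for r in ranked
    ]


# Illustrative, made-up numbers only.
sample = [
    {"query": "SELECT ... FROM task_instance WHERE ...", "calls": 90000, "total_time": 45000.0},
    {"query": "SELECT ... FROM dag_run WHERE ...", "calls": 5000, "total_time": 4000.0},
    {"query": "UPDATE job SET ...", "calls": 1000, "total_time": 1000.0},
]

for entry in rank_by_time_share(sample):
    print(f"{entry['share']:.0%}  calls={entry['calls']}  mean={entry['mean_ms']:.2f} ms  {entry['query']}")
```

A statement with a small mean time but a very large call count and share might 
point at query frequency rather than a missing index, which would be consistent 
with not seeing any long-running queries.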

We played with our connection pool sizes, thinking that perhaps we were 
hammering the db with connections, but that didn't seem to make any difference.

> High POSTGRES DB CPU utilization
> --------------------------------
>
>                 Key: AIRFLOW-2367
>                 URL: https://issues.apache.org/jira/browse/AIRFLOW-2367
>             Project: Apache Airflow
>          Issue Type: Bug
>          Components: scheduler
>    Affects Versions: Airflow 2.0, 1.9.0
>            Reporter: John Arnold
>            Priority: Major
>         Attachments: cpu.png, postgres.png
>
>
> We are seeing steady-state 70-90% CPU utilization.  It feels like a 
> missing-index kind of problem: our TPS rate is really low, I'm not seeing any 
> long-running queries, connection counts are reasonable (low hundreds), and 
> locks also look reasonable (not many exclusive / write locks).
> We shut down the webserver and it doesn't go away, so it doesn't seem to be 
> in that part of the code. My guess is either the scheduler has an inefficient 
> query, or the (Celery) executor code path does.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
