Hi, I'm using Airflow 1.7 for some time now (~half a year in prod) and lately I stated to notice that the load time of the "recent statuses" part on the UI main page is starting to take a lot of time. Got to 10-13 seconds.
>From looking in Mysql I found that this is the query that takes the whole time: SELECT task_instance.dag_id AS task_instance_dag_id, task_instance.state AS task_instance_state, count(task_instance.task_id) AS count_1 FROM task_instance LEFT OUTER JOIN dag_run ON dag_run.dag_id = task_instance.dag_id AND dag_run.execution_date = task_instance.execution_date AND dag_run.state = 'running' LEFT OUTER JOIN (SELECT dag_run.dag_id AS dag_id, max(dag_run.execution_date) AS execution_date FROM dag_run GROUP BY dag_run.dag_id) AS last_dag_run ON last_dag_run.dag_id = task_instance.dag_id AND last_dag_run.execution_date = task_instance.execution_date WHERE task_instance.task_id IN ( <big list of all tasks> ) AND (dag_run.dag_id IS NOT NULL OR last_dag_run.dag_id IS NOT NULL) GROUP BY task_instance.dag_id, task_instance.state First, I think the second left outer join might be a bit redundant and can be replaced with a simple join. When I changed that, I got the same results in less than 2 secs. (down from 13 seconds) 1. Does that change make sense? 2. Is that something that was resolved in recent version? (Couldn't find any issue) 3. Is that something I can change by simply changing some sql resource files or is it generated by some kind of an ORM? Thanks!
