Anyone? On Wed, Oct 26, 2016 at 11:58 AM Or Sher <[email protected]> wrote:
> Hi, > > I'm using Airflow 1.7 for some time now (~half a year in prod) and lately > I stated to notice that the load time of the "recent statuses" part on the > UI main page is starting to take a lot of time. Got to 10-13 seconds. > > From looking in Mysql I found that this is the query that takes the whole > time: > > SELECT task_instance.dag_id AS task_instance_dag_id, task_instance.state > AS task_instance_state, count(task_instance.task_id) AS count_1 > FROM task_instance LEFT OUTER JOIN dag_run ON dag_run.dag_id = > task_instance.dag_id AND dag_run.execution_date = > task_instance.execution_date AND dag_run.state = 'running' LEFT OUTER JOIN > (SELECT dag_run.dag_id AS dag_id, max(dag_run.execution_date) AS > execution_date > FROM dag_run GROUP BY dag_run.dag_id) AS last_dag_run ON > last_dag_run.dag_id = task_instance.dag_id AND last_dag_run.execution_date > = task_instance.execution_date > WHERE task_instance.task_id IN ( <big list of all tasks> ) AND > (dag_run.dag_id IS NOT NULL OR last_dag_run.dag_id IS NOT NULL) GROUP BY > task_instance.dag_id, task_instance.state > > First, I think the second left outer join might be a bit redundant and can > be replaced with a simple join. When I changed that, I got the same results > in less than 2 secs. (down from 13 seconds) > > 1. Does that change make sense? > 2. Is that something that was resolved in recent version? (Couldn't find > any issue) > 3. Is that something I can change by simply changing some sql resource > files or is it generated by some kind of an ORM? > > Thanks! >
