Anyone?

On Wed, Oct 26, 2016 at 11:58 AM Or Sher <[email protected]> wrote:

> Hi,
>
> I'm using Airflow 1.7  for some time now (~half a year in prod) and lately
> I stated to notice that the load time of the "recent statuses" part on the
> UI main page is starting to take a lot of time. Got to 10-13 seconds.
>
> From looking in Mysql I found that this is the query that takes the whole
> time:
>
> SELECT task_instance.dag_id AS task_instance_dag_id, task_instance.state
> AS task_instance_state, count(task_instance.task_id) AS count_1
> FROM task_instance LEFT OUTER JOIN dag_run ON dag_run.dag_id =
> task_instance.dag_id AND dag_run.execution_date =
> task_instance.execution_date AND dag_run.state = 'running' LEFT OUTER JOIN
> (SELECT dag_run.dag_id AS dag_id, max(dag_run.execution_date) AS
> execution_date
> FROM dag_run GROUP BY dag_run.dag_id) AS last_dag_run ON
> last_dag_run.dag_id = task_instance.dag_id AND last_dag_run.execution_date
> = task_instance.execution_date
> WHERE task_instance.task_id IN ( <big list of all tasks> ) AND
> (dag_run.dag_id IS NOT NULL OR last_dag_run.dag_id IS NOT NULL) GROUP BY
> task_instance.dag_id, task_instance.state
>
> First, I think the second left outer join might be a bit redundant and can
> be replaced with a simple join. When I changed that, I got the same results
> in less than 2 secs. (down from 13 seconds)
>
> 1. Does that change make sense?
> 2. Is that something that was resolved in recent version? (Couldn't find
> any issue)
> 3. Is that something I can change by simply changing some sql resource
> files or is it generated by some kind of an ORM?
>
> Thanks!
>

Reply via email to