[
https://issues.apache.org/jira/browse/AIRFLOW-1495?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Edgar Rodriguez closed AIRFLOW-1495.
------------------------------------
Resolution: Fixed
> TaskInstanceModelView using unindexed column for order_by on default
> --------------------------------------------------------------------
>
> Key: AIRFLOW-1495
> URL: https://issues.apache.org/jira/browse/AIRFLOW-1495
> Project: Apache Airflow
> Issue Type: Bug
> Components: webserver
> Reporter: Edgar Rodriguez
> Assignee: Edgar Rodriguez
>
> Using unindexed columns for ordering query results triggers sorting in the
> database, which makes for slower queries.
> Solution is to use an indexed column or add an index on the desired columns
> if it doesn't exist, see:
> https://github.com/apache/incubator-airflow/blob/master/airflow/www/views.py#L2489
> When checking for indices on task_instance, none is found for {{job_id}}:
> {code}
> mysql> show index from task_instance;
> +---------------+------------+--------------+--------------+-----------------+-----------+-------------+----------+--------+------+------------+---------+---------------+
> | Table | Non_unique | Key_name | Seq_in_index | Column_name
> | Collation | Cardinality | Sub_part | Packed | Null | Index_type | Comment |
> Index_comment |
> +---------------+------------+--------------+--------------+-----------------+-----------+-------------+----------+--------+------+------------+---------+---------------+
> | task_instance | 0 | PRIMARY | 1 | task_id
> | A | 71 | NULL | NULL | | BTREE | |
> |
> | task_instance | 0 | PRIMARY | 2 | dag_id
> | A | 80 | NULL | NULL | | BTREE | |
> |
> | task_instance | 0 | PRIMARY | 3 | execution_date
> | A | 30056 | NULL | NULL | | BTREE | |
> |
> | task_instance | 1 | ti_dag_state | 1 | dag_id
> | A | 18 | NULL | NULL | | BTREE | |
> |
> | task_instance | 1 | ti_dag_state | 2 | state
> | A | 28 | NULL | NULL | YES | BTREE | |
> |
> | task_instance | 1 | ti_pool | 1 | pool
> | A | 1 | NULL | NULL | YES | BTREE | |
> |
> | task_instance | 1 | ti_pool | 2 | state
> | A | 3 | NULL | NULL | YES | BTREE | |
> |
> | task_instance | 1 | ti_pool | 3 | priority_weight
> | A | 16 | NULL | NULL | YES | BTREE | |
> |
> | task_instance | 1 | ti_state_lkp | 1 | dag_id
> | A | 18 | NULL | NULL | | BTREE | |
> |
> | task_instance | 1 | ti_state_lkp | 2 | task_id
> | A | 82 | NULL | NULL | | BTREE | |
> |
> | task_instance | 1 | ti_state_lkp | 3 | execution_date
> | A | 28037 | NULL | NULL | | BTREE | |
> |
> | task_instance | 1 | ti_state_lkp | 4 | state
> | A | 29354 | NULL | NULL | YES | BTREE | |
> |
> | task_instance | 1 | ti_state | 1 | state
> | A | 3 | NULL | NULL | YES | BTREE | |
> |
> +---------------+------------+--------------+--------------+-----------------+-----------+-------------+----------+--------+------+------------+---------+---------------+
> {code}
> So, when executing a query similar to the one for the view paging, the query
> explain shows the required filesort:
> {code}
> mysql> explain select * from task_instance order by job_id limit 500 offset
> 500;
> +----+-------------+---------------+------------+------+---------------+------+---------+------+-------+----------+----------------+
> | id | select_type | table | partitions | type | possible_keys | key
> | key_len | ref | rows | filtered | Extra |
> +----+-------------+---------------+------------+------+---------------+------+---------+------+-------+----------+----------------+
> | 1 | SIMPLE | task_instance | NULL | ALL | NULL | NULL
> | NULL | NULL | 30119 | 100.00 | Using filesort |
> +----+-------------+---------------+------------+------+---------------+------+---------+------+-------+----------+----------------+
> {code}
--
This message was sent by Atlassian JIRA
(v6.4.14#64029)