Neil Hanlon created AIRFLOW-2059:
------------------------------------

             Summary: taskinstance query is awful, un-indexed, and does not 
scale
                 Key: AIRFLOW-2059
                 URL: https://issues.apache.org/jira/browse/AIRFLOW-2059
             Project: Apache Airflow
          Issue Type: Bug
          Components: db, webserver
    Affects Versions: Airflow 1.8
         Environment: [nhanlon@<redacted> ~]$ nproc
4
[nhanlon@<redacted> ~]$ free -g
             total       used       free     shared    buffers     cached
Mem:             7          5          1          0          0          1
-/+ buffers/cache:          4          3 
Swap:            0          0          0 
[nhanlon@<redacted> ~]$ cat /etc/*release*
CentOS release 6.7 (Final)
CentOS release 6.7 (Final)
CentOS release 6.7 (Final)
cpe:/o:centos:linux:6:GA


[nhanlon@<redacted> ~]$ mysqld --version
mysqld  Ver 5.6.31-77.0 for Linux on x86_64 (Percona Server (GPL), Release 
77.0, Revision 5c1061c)

            Reporter: Neil Hanlon


 

The page at /admin/taskinstance/ can reach a point where it blocks loading the 
page and crushes the database. It appears this is because the 
task_instance.job_id column is unindexed. On our database, getting the results 
for this query took over four minutes, locking the table for the duration.

 

500 rows in set (4 min 8.93 sec)

 

Query:

 
{code:java}
SELECT task_instance.task_id AS task_instance_task_id, task_instance.dag_id AS 
task_instance_dag_id, task_instance.execution_date AS 
task_instance_execution_date, task_instance.start_date AS 
task_instance_start_date, task_instance.end_date AS task_instance_end_date, 
task_instance.duration AS task_instance_duration, task_instance.state AS 
task_instance_state, task_instance.try_number AS task_instance_try_number, 
task_instance.hostname AS task_instance_hostname, task_instance.unixname AS 
task_instance_unixname, task_instance.job_id AS task_instance_job_id, 
task_instance.pool AS task_instance_pool, task_instance.queue AS 
task_instance_queue, task_instance.priority_weight AS 
task_instance_priority_weight, task_instance.operator AS 
task_instance_operator, task_instance.queued_dttm AS task_instance_queued_dttm, 
task_instance.pid AS task_instance_pid 
FROM task_instance ORDER BY task_instance.job_id DESC 
LIMIT 500;
{code}
Profile, explain:

 
{code:java}
<redacted>:airflow> EXPLAIN SELECT task_instance.task_id AS 
task_instance_task_id, task_instance.dag_id AS task_instance_dag_id, 
task_instance.execution_date AS task_instance_execution_date, 
task_instance.start_date AS task_instance_start_date, task_instance.end_date AS 
task_instance_end_date, task_instance.duration AS task_instance_duration, 
task_instance.state AS task_instance_state, task_instance.try_number AS 
task_instance_try_number, task_instance.hostname AS task_instance_hostname, 
task_instance.unixname AS task_instance_unixname, task_instance.job_id AS 
task_instance_job_id, task_instance.pool AS task_instance_pool, 
task_instance.queue AS task_instance_queue, task_instance.priority_weight AS 
task_instance_priority_weight, task_instance.operator AS 
task_instance_operator, task_instance.queued_dttm AS task_instance_queued_dttm, 
task_instance.pid AS task_instance_pid 
-> FROM task_instance ORDER BY task_instance.job_id DESC 
-> LIMIT 500;
+----+-------------+---------------+------+---------------+------+---------+------+---------+----------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows 
| Extra |
+----+-------------+---------------+------+---------------+------+---------+------+---------+----------------+
| 1 | SIMPLE | task_instance | ALL | NULL | NULL | NULL | NULL | 2542776 | 
Using filesort |
+----+-------------+---------------+------+---------------+------+---------+------+---------+----------------+
1 row in set (0.00 sec)

<redacted>:airflow> select count(*) from task_instance;
+----------+
| count(*) |
+----------+
| 2984749 |
+----------+
1 row in set (1.67 sec)


<redacted>:airflow> show profile for query 2;
+----------------------+------------+
| Status | Duration |
+----------------------+------------+
| starting | 0.000157 |
| checking permissions | 0.000017 |
| Opening tables | 0.000033 |
| init | 0.000046 |
| System lock | 0.000017 |
| optimizing | 0.000010 |
| statistics | 0.000022 |
| preparing | 0.000020 |
| Sorting result | 0.000010 |
| executing | 0.000008 |
| Sending data | 0.000151 |
| Creating sort index | 248.955841 |
| end | 0.015358 |
| query end | 0.000012 |
| closing tables | 0.000019 |
| freeing items | 0.000549 |
| logging slow query | 0.000007 |
| logging slow query | 0.000901 |
| cleaning up | 0.000030 |
+----------------------+------------+
19 rows in set, 1 warning (0.00 sec)

{code}
 



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

Reply via email to