vchiapaikeo opened a new issue, #40674:
URL: https://github.com/apache/airflow/issues/40674

   ### Apache Airflow version
   
   2.9.2
   
   ### If "Other Airflow 2 version" selected, which one?
   
   _No response_
   
   ### What happened?
   
   We are planning to upgrade from 2.7.3 to 2.9.2. However, we've observed 
issues in our sandbox MySQL db while doing so. Specifically, [this 
query](https://github.com/apache/airflow/blob/main/airflow/models/trigger.py#L306-L314)
 fails to use the ti_trigger_id index on the task_instance table. This is 
likely a result of the addition of the `coalesce(TaskInstance.priority_weight, 
0).desc()` which forces MySQL to need to perform lookups after the fact.
   
   Slow Query Log (showing that this query takes over 11s to run):
   
   ```
   # Query_time: 11.081314  Lock_time: 0.000299 Rows_sent: 0  Rows_examined: 
2819837
   SET timestamp=1720544726;
   SELECT `trigger`.id 
   FROM `trigger` INNER JOIN task_instance ON `trigger`.id = 
task_instance.trigger_id 
   WHERE `trigger`.triggerer_id IS NULL OR (`trigger`.triggerer_id NOT IN 
(SELECT job.id 
   FROM job 
   WHERE job.end_date IS NULL AND job.latest_heartbeat > '2024-07-09 
17:04:56.519611' AND job.job_type = 'TriggererJob')) ORDER BY 
coalesce(task_instance.priority_weight, 0) DESC, `trigger`.created_date 
    LIMIT 9968 FOR UPDATE SKIP LOCKED;
   ```
   
   ### What you think should happen instead?
   
   The index on trigger_id should be modified to include priority_weight so 
that the query chooses to use the index. Currently, the index is defined as 
   
   ```
   CREATE INDEX ti_trigger_id
      ON task_instance (trigger_id ASC);
   ```
   
   However, it should be defined as the following so that the index gets used:
   
   ```
   CREATE INDEX idx_trigger_priority ON task_instance(trigger_id, 
priority_weight);
   ```
   
   ### How to reproduce
   
   Run Airflow w/ a MySQL backend and turn on slow query logging. Airflow 
should be bootstrapped with a signficant number of tasks in the task_instance 
table
   
   ### Operating System
   
   Debian 11
   
   ### Versions of Apache Airflow Providers
   
   
https://raw.githubusercontent.com/apache/airflow/constraints-2.9.2/constraints-3.11.txt
   
   ### Deployment
   
   Official Apache Airflow Helm Chart
   
   ### Deployment details
   
   KubernetesExecutor on GKE
   
   ### Anything else?
   
   _No response_
   
   ### Are you willing to submit PR?
   
   - [X] Yes I am willing to submit a PR!
   
   ### Code of Conduct
   
   - [X] I agree to follow this project's [Code of 
Conduct](https://github.com/apache/airflow/blob/main/CODE_OF_CONDUCT.md)
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

Reply via email to