[GitHub] [airflow] vemikhaylov commented on a change in pull request #14500: Clear tasks by task ids in REST API

GitBox Wed, 03 Mar 2021 04:01:31 -0800


vemikhaylov commented on a change in pull request #14500:
URL: https://github.com/apache/airflow/pull/14500#discussion_r584191749




##########
File path: airflow/models/dag.py
##########
@@ -1227,6 +1230,8 @@ def clear(
             tis = tis.filter(or_(TI.state == State.FAILED, TI.state == State.UPSTREAM_FAILED))
         if only_running:
             tis = tis.filter(TI.state == State.RUNNING)
+        if task_ids:
+            tis = tis.filter(TI.task_id.in_(task_ids))

Review comment:
       Actually, the conditions are simply combined with a conjunction (`AND`):
   
   ```python
   # tst_double_in_query.py
   from sqlalchemy.orm import Session
   
   from airflow.models import TaskInstance
   
   session = Session()
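   query = session.query(TaskInstance.task_id).filter(TaskInstance.task_id.in_(["foo"]))
   print("First query:")
   print(str(query.statement))
   # A second .filter() call is ANDed with the first one.
   query = query.filter(TaskInstance.task_id.in_(["bar"]))
   print("\nSecond query:")
   print(str(query.statement))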
   ```
   
   ```
   $ python tst_double_in_query.py
   First query:
   SELECT task_instance.task_id
   FROM task_instance
   WHERE task_instance.task_id IN (:task_id_1)
   
   Second query:
   SELECT task_instance.task_id
   FROM task_instance
   WHERE task_instance.task_id IN (:task_id_1) AND task_instance.task_id IN (:task_id_2)
   ```
   
   So the second filter narrows the result set further when `task_ids` is provided.
   
   Alternatively, we could intersect the sets beforehand and apply the filter only once, which might make the generated SQL a little more efficient. Do you think that would be better?
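
For illustration, the "intersect beforehand" alternative could look roughly like the sketch below. The helper name `intersect_task_ids` and the convention that `None` means "no restriction" are assumptions made for this example; they are not taken from the pull request.

```python
from typing import Iterable, Optional, Set


def intersect_task_ids(
    already_selected: Optional[Iterable[str]],
    requested: Optional[Iterable[str]],
) -> Optional[Set[str]]:
    """Combine two optional task-id restrictions into a single set.

    Hypothetical helper: None means "no restriction", while an empty
    set means "match nothing".
    """
    if already_selected is None:
        return None if requested is None else set(requested)
    if requested is None:
        return set(already_selected)
    # Both restrictions are present: only ids in both sets survive,
    # which mirrors what the two ANDed IN clauses select.
    return set(already_selected) & set(requested)


combined = intersect_task_ids(["foo", "bar"], ["bar", "baz"])
print(combined)  # {'bar'}

# The query would then need only one filter, for example:
#     if combined is not None:
#         tis = tis.filter(TI.task_id.in_(combined))
```

Both approaches select the same rows. The only difference is whether the intersection is computed in Python or left to the database through two `IN` clauses.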




