ashb commented on a change in pull request #6792: [AIRFLOW-5930] Use cached-SQL query building for hot-path queries
URL: https://github.com/apache/airflow/pull/6792#discussion_r357106226
 
 

 ##########
 File path: airflow/ti_deps/deps/trigger_rule_dep.py
 ##########
 @@ -34,9 +35,38 @@ class TriggerRuleDep(BaseTIDep):
     IGNOREABLE = True
     IS_TASK_DEP = True
 
+    @staticmethod
+    def bake_dep_status_query():
+        TI = airflow.models.TaskInstance
 +        # TODO(unknown): this query becomes quite expensive with dags that have many
 +        # tasks. It should be refactored to let the task report to the dag run and get the
 +        # aggregates from there.
+        q = BAKED_QUERIES(lambda session: session.query(
 +            func.coalesce(func.sum(case([(TI.state == State.SUCCESS, 1)], else_=0)), 0),
 
 Review comment:
   Yeah, I was wondering about having some DB-specific optimizations in places.
Elsewhere in the Scheduler we can avoid a second query by doing `UPDATE
task_instance ... RETURNING *`; I think it helped a bit.
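
   To illustrate the `RETURNING` idea, here is a minimal sketch. It is not the Scheduler's code: it assumes a simplified, hypothetical `task_instance` table with only `dag_id`, `task_id`, and `state` columns, uses the standard-library `sqlite3` module as a stand-in for the real database, and the helper names `make_demo_db` and `queue_scheduled` are made up for the example. Where `RETURNING` is unavailable (SQLite older than 3.35.0), it falls back to the two-statement form that the optimization is meant to avoid.

```python
import sqlite3

# RETURNING is available in SQLite 3.35.0 and later; older builds need the
# two-statement fallback below.
SUPPORTS_RETURNING = sqlite3.sqlite_version_info >= (3, 35, 0)


def make_demo_db():
    """Build an in-memory database with a simplified, hypothetical schema."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE task_instance (dag_id TEXT, task_id TEXT, state TEXT)"
    )
    conn.executemany(
        "INSERT INTO task_instance VALUES (?, ?, ?)",
        [
            ("etl", "extract", "scheduled"),
            ("etl", "transform", "running"),
            ("etl", "load", "scheduled"),
            ("other", "extract", "scheduled"),
        ],
    )
    conn.commit()
    return conn


def queue_scheduled(conn, dag_id):
    """Move one DAG's 'scheduled' rows to 'queued' and return the changed rows."""
    if SUPPORTS_RETURNING:
        # One statement: the UPDATE itself hands back the rows it touched.
        cur = conn.execute(
            "UPDATE task_instance SET state = 'queued' "
            "WHERE dag_id = ? AND state = 'scheduled' "
            "RETURNING task_id, state",
            (dag_id,),
        )
        rows = cur.fetchall()
    else:
        # Two statements: find the rows first, then update them.
        task_ids = [
            row[0]
            for row in conn.execute(
                "SELECT task_id FROM task_instance "
                "WHERE dag_id = ? AND state = 'scheduled'",
                (dag_id,),
            )
        ]
        conn.execute(
            "UPDATE task_instance SET state = 'queued' "
            "WHERE dag_id = ? AND state = 'scheduled'",
            (dag_id,),
        )
        rows = [(task_id, "queued") for task_id in task_ids]
    conn.commit()
    return sorted(rows)


if __name__ == "__main__":
    demo = make_demo_db()
    print(queue_scheduled(demo, "etl"))
```

   The saving is one round trip per update. Since `RETURNING` is not supported by every backend, a real implementation would presumably need a per-dialect switch similar to the version check above.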

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
[email protected]


With regards,
Apache Git Services
