ashb commented on a change in pull request #6792: [AIRFLOW-5930] Use cached-SQL
query building for hot-path queries
URL: https://github.com/apache/airflow/pull/6792#discussion_r357106226
##########
File path: airflow/ti_deps/deps/trigger_rule_dep.py
##########
@@ -34,9 +35,38 @@ class TriggerRuleDep(BaseTIDep):
IGNOREABLE = True
IS_TASK_DEP = True
+ @staticmethod
+ def bake_dep_status_query():
+ TI = airflow.models.TaskInstance
+ # TODO(unknown): this query becomes quite expensive with dags that have many
+ # tasks. It should be refactored to let the task report to the dag run and get the
+ # aggregates from there.
+ q = BAKED_QUERIES(lambda session: session.query(
+ func.coalesce(func.sum(case([(TI.state == State.SUCCESS, 1)], else_=0)), 0),
Review comment:
Yeah, I was wondering about having some DB-specific optimizations in places.
Somewhere else in the Scheduler we can speed things up by doing
`UPDATE task_instance ... RETURNING *` to avoid a second query; I think it helped a bit.
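
For illustration, here is a rough sketch of the `UPDATE ... RETURNING` idea. It is not the Scheduler's code: it uses the standard-library `sqlite3` module as a stand-in database, and the two-column `task_instance` schema plus the helper names `make_demo_db` and `queue_scheduled` are made up for the example. SQLite only understands `RETURNING` from version 3.35, so the sketch falls back to the two-statement form on older builds, which is exactly the extra round trip the single statement is meant to avoid.

```python
import sqlite3

# RETURNING was added in SQLite 3.35; older builds need the two-statement form.
HAS_RETURNING = sqlite3.sqlite_version_info >= (3, 35, 0)


def make_demo_db():
    """Build an in-memory task_instance table (hypothetical, simplified schema)."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE task_instance (task_id TEXT PRIMARY KEY, state TEXT)")
    conn.executemany(
        "INSERT INTO task_instance VALUES (?, ?)",
        [("a", "scheduled"), ("b", "scheduled"), ("c", "success")],
    )
    conn.commit()
    return conn


def queue_scheduled(conn):
    """Move 'scheduled' rows to 'queued' and return the rows that changed."""
    if HAS_RETURNING:
        # One statement: the UPDATE itself hands back the modified rows.
        rows = conn.execute(
            "UPDATE task_instance SET state = 'queued' "
            "WHERE state = 'scheduled' RETURNING task_id, state"
        ).fetchall()
    else:
        # Fallback: a SELECT to find the rows, then the UPDATE.
        ids = [r[0] for r in conn.execute(
            "SELECT task_id FROM task_instance WHERE state = 'scheduled'")]
        conn.execute(
            "UPDATE task_instance SET state = 'queued' WHERE state = 'scheduled'")
        rows = [(task_id, "queued") for task_id in ids]
    conn.commit()
    return sorted(rows)
```

Calling `queue_scheduled(make_demo_db())` returns the two rows that were in the `scheduled` state, now marked `queued`, and leaves the `success` row alone. A DB-specific optimization like this would need the same kind of capability check (or a dialect check) so that backends without `RETURNING` keep working.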