amichai07 commented on a change in pull request #4751: [AIRFLOW-3607] collected
trigger rule dep check per dag run
URL: https://github.com/apache/airflow/pull/4751#discussion_r347493137
##########
File path: airflow/ti_deps/deps/trigger_rule_dep.py
##########
@@ -49,33 +75,33 @@ def _get_dep_statuses(self, ti, session, dep_context):
yield self._passing_status(reason="The task had a dummy trigger rule set.")
return
- # TODO(unknown): this query becomes quite expensive with dags that have many
- # tasks. It should be refactored to let the task report to the dag run and get the
- # aggregates from there.
- qry = (
- session
- .query(
- func.coalesce(func.sum(
- case([(TI.state == State.SUCCESS, 1)], else_=0)), 0),
- func.coalesce(func.sum(
- case([(TI.state == State.SKIPPED, 1)], else_=0)), 0),
- func.coalesce(func.sum(
- case([(TI.state == State.FAILED, 1)], else_=0)), 0),
- func.coalesce(func.sum(
- case([(TI.state == State.UPSTREAM_FAILED, 1)], else_=0)), 0),
- func.count(TI.task_id),
+ if dep_context.finished_tasks is None:
Review comment:
OK, I checked it. I suggest a change that will address both this comment and
the following one about the "dag_run_finished_ti_map" collection. I think it
will be both cleaner and more efficient.
Let me know what you think about this direction; I will tidy it up if you
think it's good.
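As a rough illustration of the direction only (not the actual change in this
PR): the per-state counts that the removed query computed in SQL could instead
be derived in memory from an already collected list of finished task
instances. In the sketch below, `FinishedTI` and `count_upstream_states` are
hypothetical names, plain strings stand in for the `State.*` constants, and
filtering by upstream task id is an assumption, since the query's filter is
not visible in the hunk above.

```python
from collections import Counter, namedtuple

# Hypothetical stand-in for a task instance; only the fields used here.
FinishedTI = namedtuple("FinishedTI", ["task_id", "state"])


def count_upstream_states(finished_tasks, upstream_task_ids):
    """Count finished upstream tasks per state without querying the database.

    Returns (successes, skipped, failed, upstream_failed, done), mirroring the
    five aggregates selected by the removed query.
    """
    counter = Counter(
        ti.state for ti in finished_tasks if ti.task_id in upstream_task_ids
    )
    return (
        counter.get("success", 0),
        counter.get("skipped", 0),
        counter.get("failed", 0),
        counter.get("upstream_failed", 0),
        sum(counter.values()),  # total finished upstream tasks
    )


# Example: task "c" is not upstream of the task being checked, so it is ignored.
finished = [
    FinishedTI("a", "success"),
    FinishedTI("b", "failed"),
    FinishedTI("c", "success"),
    FinishedTI("d", "skipped"),
]
counts = count_upstream_states(finished, {"a", "b", "d"})
```

The list would be collected once per dag run and reused for every task
instance, so the cost of the dependency check no longer grows with one
aggregate query per task.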