ashb commented on a change in pull request #4751: [AIRFLOW-3607] collected trigger rule dep check per dag run
URL: https://github.com/apache/airflow/pull/4751#discussion_r346970934
##########
File path: airflow/jobs/scheduler_job.py
##########
@@ -717,7 +718,10 @@ def _process_task_instances(self, dag, task_instances_list, session=None):
run.dag = dag
# todo: preferably the integrity check happens at dag collection time
run.verify_integrity(session=session)
- run.update_state(session=session)
+ finished_tasks = run.get_task_instances(state=State.finished() + [State.UPSTREAM_FAILED],
+ session=session)
Review comment:
This works, but it loads a lot more columns and rows than we need.
We could try changing the return inside `get_task_instances` from `return
tis.all()` to just `return tis`, so that it returns the query rather than a
list, and this line could become:
```python
from sqlalchemy.orm import load_only

finished_tasks = run.get_task_instances(
    state=State.finished() + [State.UPSTREAM_FAILED],
    session=session,
).options(load_only("task_id", "state"))
```
https://docs.sqlalchemy.org/en/13/orm/loading_columns.html#load-only-and-wildcard-options
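To make the difference concrete at the SQL level, here is a minimal sketch using the standard-library `sqlite3` module rather than SQLAlchemy or Airflow itself. The table layout, column names, and state strings are simplified assumptions for illustration, not the real `task_instance` schema. Note that `load_only` would also always fetch the primary-key columns, which this sketch leaves out.

```python
import sqlite3

# Hypothetical, simplified stand-in for the task_instance table.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE task_instance ("
    "task_id TEXT, dag_id TEXT, state TEXT, hostname TEXT, executor_config BLOB)"
)
conn.executemany(
    "INSERT INTO task_instance VALUES (?, ?, ?, ?, ?)",
    [
        ("extract", "example_dag", "success", "worker-1", b"\x00" * 1024),
        ("transform", "example_dag", "upstream_failed", "worker-2", b"\x00" * 1024),
        ("load", "example_dag", "running", "worker-3", b"\x00" * 1024),
    ],
)

# Illustrative "finished" states plus upstream_failed (assumed values).
finished_states = ("success", "failed", "skipped", "upstream_failed")
placeholders = ", ".join("?" for _ in finished_states)
where = f"WHERE state IN ({placeholders})"

# Roughly what loading full ORM objects asks for: every column.
full_rows = conn.execute(
    f"SELECT * FROM task_instance {where} ORDER BY task_id", finished_states
).fetchall()

# Roughly what restricting the load to task_id and state narrows it to.
narrow_rows = conn.execute(
    f"SELECT task_id, state FROM task_instance {where} ORDER BY task_id",
    finished_states,
).fetchall()

print(len(full_rows[0]), "columns per row vs", len(narrow_rows[0]))
```

Both queries return the same rows; only the per-row payload shrinks, which matters most when wide columns such as pickled configuration are present.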
Do you think this is worth it?