blag commented on code in PR #26347:
URL: https://github.com/apache/airflow/pull/26347#discussion_r974652668
##########
airflow/jobs/scheduler_job.py:
##########
@@ -1227,6 +1222,19 @@ def _update_state(dag: DAG, dag_run: DagRun):
active_runs_of_dags[dag_run.dag_id] += 1
_update_state(dag, dag_run)
+    @retry_db_transaction
+    def _schedule_all_dag_runs(self, guard, dag_runs, session):
+        """Makes scheduling decisions for all `dag_runs`"""
+        callback_tuples = []
+        callback_to_run = None
+        for dag_run in dag_runs:
+            callback_to_run = self._schedule_dag_run(dag_run, session)
+            callback_tuples.append((dag_run, callback_to_run))
+
+        guard.commit()
+
+        return callback_tuples, callback_to_run
Review Comment:
Is there a reason you're including `callback_to_run` in the return statement
here? It seems cleaner to just `return callback_tuples`, especially because
the last `callback_to_run` is going to be included in the last tuple of
`callback_tuples` anyway.
The next line of code overwrites `callback_to_run` as well, so it just seems
unnecessary to pass it back:
```python
callback_tuples, callback_to_run = self._schedule_all_dag_runs(guard, dag_runs, session)
# ...
for dag_run, callback_to_run in callback_tuples:  # <-- callback_to_run overwritten immediately
    ...
```
Also, would a dictionary be a better option to map dag runs to callbacks?
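
To make the suggestion concrete, here is a minimal standalone sketch of the single-return shape. It is not the actual scheduler code: `schedule_all_dag_runs`, `schedule_one`, and `commit` are hypothetical stand-ins for `_schedule_all_dag_runs`, `self._schedule_dag_run(dag_run, session)`, and `guard.commit()`, and plain strings stand in for dag runs and callbacks.

```python
from typing import Any, Callable, Iterable, List, Tuple


def schedule_all_dag_runs(
    dag_runs: Iterable[Any],
    schedule_one: Callable[[Any], Any],  # hypothetical stand-in for self._schedule_dag_run
    commit: Callable[[], None],          # hypothetical stand-in for guard.commit
) -> List[Tuple[Any, Any]]:
    """Pair every dag run with its callback and return only the list of pairs."""
    callback_tuples = []
    for dag_run in dag_runs:
        callback_tuples.append((dag_run, schedule_one(dag_run)))

    commit()

    # Only the list is returned; the last callback is already in the last tuple.
    return callback_tuples


# The caller unpacks each pair in the loop, so no second return value is needed.
callback_tuples = schedule_all_dag_runs(
    ["run_a", "run_b"],
    lambda dag_run: f"callback_for_{dag_run}",
    lambda: None,
)
for dag_run, callback_to_run in callback_tuples:
    print(dag_run, callback_to_run)
```

A dict would iterate the same way via `.items()`, but it would require the dag run objects to be hashable and would collapse duplicate keys. I haven't checked whether either matters here, so the list of tuples may be the safer default.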