pingzh commented on code in PR #21877:
URL: https://github.com/apache/airflow/pull/21877#discussion_r843054923
##########
airflow/jobs/scheduler_job.py:
##########
@@ -779,6 +779,22 @@ def _execute(self) -> None:
self.log.exception("Exception when executing
DagFileProcessorAgent.end")
self.log.info("Exited execute loop")
+ def _update_dag_run_state_for_paused_dags(self):
Review Comment:
good point on having the LocalTaskJob to mark the dag run quickly. my
concern is that it leaks the responsibility of the scheduler, also there is
chance that the LTJ fails before marking the dag run state, which leaves the
unhandled dag run states (this could be rare). also, since the dag is paused,
it would be a good deal if the dag run state is updated a little bit late.
as we discussed in this email thread: `[DISCUSSION] let scheduler heal tasks
stuck in queued state`, we will need to define the responsibility of each
component in airflow (scheduler, executor, LTJ (airflow run --local), `airflow
run --raw`) in terms of the state machine.
we can have a thorough discussion there. let me know your thoughts.
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]