ashb commented on a change in pull request #20349:
URL: https://github.com/apache/airflow/pull/20349#discussion_r781064291
##########
File path: airflow/jobs/scheduler_job.py
##########
@@ -402,6 +402,14 @@ def _executable_task_instances_to_queued(self, max_tis: int, session: Session =
                 # Many dags don't have a task_concurrency, so where we can avoid loading the full
                 # serialized DAG the better.
                 serialized_dag = self.dagbag.get_dag(dag_id, session=session)
+                # If the dag is missing, continue to the next task.
+                if not serialized_dag:
+                    self.log.error(
+                        "DAG '%s' for taskinstance %s not found in serialized_dag table",
+                        dag_id,
+                        task_instance,
+                    )
+                    continue
Review comment:
This error/reproduction step is not quite right, but the same idea can trigger this behaviour: if the DAG is deleted at the "right" time, this bit of the scheduler will fail.
I think in that case, though, we should fail the task instances, since the DAG doesn't exist anymore and, as TP said, it can't run successfully.
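For reference, a minimal sketch of what "fail the task instances" could look like. This is a hypothetical helper, not the code in this PR: it assumes the same `dag_id` and `session` that the diff above already has in scope, and the bulk-update approach is just one possible way to do it.

    from sqlalchemy.orm import Session

    from airflow.models.taskinstance import TaskInstance as TI
    from airflow.utils.state import State


    def fail_tis_for_missing_dag(dag_id: str, session: Session) -> int:
        # Hypothetical helper: instead of silently skipping the task
        # instance, mark every SCHEDULED TI of the missing DAG as FAILED
        # so nothing is left stuck in SCHEDULED. Returns the number of
        # rows updated.
        return (
            session.query(TI)
            .filter(TI.dag_id == dag_id, TI.state == State.SCHEDULED)
            .update({TI.state: State.FAILED}, synchronize_session="fetch")
        )

The scheduler loop would then call something like `fail_tis_for_missing_dag(dag_id, session=session)` right before the `continue`, after logging the error.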