uranusjr commented on a change in pull request #20349:
URL: https://github.com/apache/airflow/pull/20349#discussion_r782689415



##########
File path: airflow/jobs/scheduler_job.py
##########
@@ -402,6 +402,14 @@ def _executable_task_instances_to_queued(self, max_tis: int, session: Session =
                     # Many dags don't have a task_concurrency, so where we can avoid loading the full
                     # serialized DAG the better.
                     serialized_dag = self.dagbag.get_dag(dag_id, session=session)
+                    # If the dag is missing, continue to the next task.
+                    if not serialized_dag:
+                        self.log.error(
+                            "DAG '%s' for taskinstance %s not found in serialized_dag table",
+                            dag_id,
+                            task_instance,
+                        )
+                        continue

Review comment:
       Yeah, if the DAG is missing entirely, we should stop the entire run from continuing, because nothing after this point would run anyway. If I understand correctly, skipping the task instance (as this PR currently does) means the ti would stay queued and be retried needlessly (again and again…), which seems suboptimal.
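
To make the concern concrete, here is a toy simulation of the two options. It is only a sketch and not Airflow code: `ToyTaskInstance`, `scheduler_pass`, `count_missing_hits`, and the state names `pending`, `dispatched`, and `failed` are all hypothetical stand-ins. It counts how often each scheduling pass runs into a task instance whose DAG cannot be found.

```python
from dataclasses import dataclass


@dataclass
class ToyTaskInstance:
    # Hypothetical stand-in for a task instance; not the Airflow model.
    dag_id: str
    task_id: str
    state: str = "pending"


def scheduler_pass(task_instances, known_dags, fail_missing):
    """Run one toy scheduling pass and return how many times a missing DAG was hit."""
    missing_hits = 0
    for ti in task_instances:
        if ti.state != "pending":
            continue
        if ti.dag_id not in known_dags:
            missing_hits += 1
            if fail_missing:
                # Stop the whole run: fail every pending ti of the missing DAG.
                for other in task_instances:
                    if other.dag_id == ti.dag_id and other.state == "pending":
                        other.state = "failed"
            # With fail_missing=False this is the "log and continue" behavior:
            # the ti stays pending and is picked up again on the next pass.
            continue
        ti.state = "dispatched"
    return missing_hits


def count_missing_hits(fail_missing, passes=3):
    """Simulate several passes over two tis of a missing DAG and one healthy ti."""
    tis = [
        ToyTaskInstance("gone_dag", "a"),
        ToyTaskInstance("gone_dag", "b"),
        ToyTaskInstance("ok_dag", "c"),
    ]
    return [scheduler_pass(tis, {"ok_dag"}, fail_missing) for _ in range(passes)]


if __name__ == "__main__":
    print("skip only:", count_missing_hits(fail_missing=False))
    print("fail run: ", count_missing_hits(fail_missing=True))
```

In this toy model, skipping alone hits the missing DAG on every pass, while failing the run's pending task instances hits it once and then never again.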




