juroVee commented on issue #18843: URL: https://github.com/apache/airflow/issues/18843#issuecomment-944521156
> I'm experiencing a similar issue. If a task is in the `scheduled` state and the DAG code is temporarily removed (e.g. as part of DAG CI/CD), the DAG processor will delete the associated information in `serialized_dag` while the task still exists. Then, if the scheduler tries to transition the task from `scheduled` to `queued` before the new code is serialized, the scheduler crashes. Upon reboot, one of the first things the scheduler tries to do is adopt orphan tasks, and adoption is attempted before DAG serialization, resulting in a crash loop.

Can confirm this kind of behavior on our deployment as well (Airflow 2.1.4, Python 3.9). The scheduler went down after a DAG file was deleted from the DAGs folder. The error messages went from

`airflow.exceptions.SerializedDagNotFound: DAG 'some_dag' not found in serialized_dag table`

to

`AttributeError: 'NoneType' object has no attribute 'dag_id'`

We are also seeing roughly 10x higher CPU usage from a single scheduler on 2.1.4 compared with 2.1.2 (same number of DAGs, same settings). This seems unrelated, and we will probably open a new issue for it if it has not already been answered. Posting it here in case someone else experiences the same.
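To make the failure mode concrete, here is a minimal, self-contained sketch of the race and of one possible guard. It does not use Airflow's actual scheduler code: `TaskInstance`, `get_dag`, `queue_scheduled_tasks`, and the plain dict standing in for the `serialized_dag` table are all hypothetical, and the local `SerializedDagNotFound` class is only a stand-in for the real `airflow.exceptions.SerializedDagNotFound`. The idea is simply that a task whose DAG has no serialized row is skipped and left in `scheduled` rather than letting the exception take the loop down.

```python
from dataclasses import dataclass


class SerializedDagNotFound(Exception):
    """Local stand-in for airflow.exceptions.SerializedDagNotFound."""


@dataclass
class TaskInstance:
    """Hypothetical, simplified task instance (not Airflow's model)."""
    dag_id: str
    task_id: str
    state: str = "scheduled"


def get_dag(serialized_dags, dag_id):
    """Look up a DAG in a dict that stands in for the serialized_dag table."""
    try:
        return serialized_dags[dag_id]
    except KeyError:
        raise SerializedDagNotFound(
            f"DAG '{dag_id}' not found in serialized_dag table"
        ) from None


def queue_scheduled_tasks(serialized_dags, task_instances):
    """Move tasks from scheduled to queued, skipping tasks whose DAG is missing.

    Returns the skipped task instances so the caller can retry them once
    the DAG has been serialized again.
    """
    skipped = []
    for ti in task_instances:
        try:
            get_dag(serialized_dags, ti.dag_id)
        except SerializedDagNotFound:
            # The DAG file was removed and not yet re-serialized:
            # leave the task in "scheduled" instead of crashing.
            skipped.append(ti)
            continue
        ti.state = "queued"
    return skipped


if __name__ == "__main__":
    tis = [TaskInstance("some_dag", "t1"), TaskInstance("other_dag", "t2")]
    # "some_dag" has been deleted from the table; "other_dag" is still there.
    skipped = queue_scheduled_tasks({"other_dag": object()}, tis)
    print([ti.dag_id for ti in skipped], [ti.state for ti in tis])
```

Whether skipping, failing the task, or something else is the right behavior for the real scheduler is a separate design question; the sketch only shows why an unguarded lookup turns a temporarily missing DAG into a crash.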
