juroVee commented on issue #18843:
URL: https://github.com/apache/airflow/issues/18843#issuecomment-944521156


   > I'm experiencing a similar issue. If a task is in the `scheduled` state
   > and the DAG code is temporarily removed (e.g. part of DAG CICD), the DAG
   > processor will delete the associated information in `serialized_dag` while the
   > task still exists. Then if the scheduler tries to transition the task from
   > `scheduled` to `queued` before the new code is serialized, it will crash the
   > scheduler. Upon reboot, one of the first things the scheduler tries to do is
   > adopt orphan tasks, and adoption is attempted before DAG serialization,
   > resulting in a crashloop.
   
   Can confirm this kind of behavior on our deployment as well (Airflow 2.1.4,
Python 3.9): the scheduler went down after a DAG file was deleted from the DAGs
folder. The error messages went from:
   
   airflow.exceptions.SerializedDagNotFound: DAG 'some_dag' not found in serialized_dag table
   
   to 
   
   AttributeError: 'NoneType' object has no attribute 'dag_id'.
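
   The second error is what you would expect if a DAG lookup returns `None` and the caller then reads an attribute from the result. Below is a minimal, self-contained sketch of that failure mode and of one possible guard. It is not Airflow's actual scheduler code: `Dag`, `TaskInstance`, `get_dag`, `queue_unguarded`, and `queue_guarded` are hypothetical names, and a plain dict stands in for the `serialized_dag` table.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class Dag:
    dag_id: str


@dataclass
class TaskInstance:
    dag_id: str
    task_id: str
    state: str = "scheduled"


def get_dag(serialized: Dict[str, Dag], dag_id: str) -> Optional[Dag]:
    """Return the serialized DAG, or None if its row has been deleted.

    Hypothetical stand-in for a lookup against the serialized_dag table.
    """
    return serialized.get(dag_id)


def queue_unguarded(serialized: Dict[str, Dag], tis: List[TaskInstance]) -> None:
    """Move tasks from scheduled to queued without checking the lookup result."""
    for ti in tis:
        dag = get_dag(serialized, ti.dag_id)
        # Raises AttributeError: 'NoneType' object has no attribute 'dag_id'
        # when the DAG row is missing.
        assert dag.dag_id == ti.dag_id
        ti.state = "queued"


def queue_guarded(serialized: Dict[str, Dag], tis: List[TaskInstance]) -> List[TaskInstance]:
    """Queue tasks whose DAG is still serialized; return the ones skipped."""
    skipped = []
    for ti in tis:
        dag = get_dag(serialized, ti.dag_id)
        if dag is None:
            # The DAG file was removed and not yet re-serialized:
            # leave the task in its current state instead of crashing.
            skipped.append(ti)
            continue
        ti.state = "queued"
    return skipped
```

   With `some_dag` missing from the dict, `queue_unguarded` raises the `AttributeError` shown above, while `queue_guarded` leaves that task in `scheduled` and carries on with the rest. Whether skipping is the right behaviour for the real scheduler is a separate question.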
   
   We are also seeing ~10x higher CPU usage from a single scheduler on 2.1.4
compared with 2.1.2 (same number of DAGs, same settings), but this seems
unrelated, and we will probably open a separate issue for it if one does not
already exist. Posting it here in case someone else experiences the same.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
