boushphong opened a new issue, #35231:
URL: https://github.com/apache/airflow/issues/35231

   ### Description
   
   Basically, we'd want the dag-processor component to run all its operations 
within the `_run_parsing_loop` method only once and then terminate gracefully 
after it saves the parsing results to the metadata db.
   
   ### Use case/motivation
   
   We have an Airflow deployment that parses a large number of DAGs (some are 
dynamic DAGs), and parsing often takes a very long time. To save resources, it 
would be nice to have the dag-processor parse all the DAG files only once and 
then terminate itself instead of running continuously.
   
   The use case would be:
   - Users push their code to the DAG repo. 
   - The dag processor would run in the CI/CD process and save the DAG parsing 
results to the metadata database. This can happen incrementally.
   
   Currently, we run the airflow scheduler and the airflow dag-processor 
separately. However, I've noticed that the scheduler also deactivates stale dags 
when `standalone_dag_processor=True`, so we cannot run the dag-processor in 
the CI process yet. A workaround is to set `dag_stale_not_seen_duration` to a 
very large number so that the scheduler never deactivates stale dags.
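
For illustration, the workaround could look roughly like the sketch below, which builds the environment for the scheduler process. It assumes both options live in the `[scheduler]` config section, that the duration is expressed in seconds, and that Airflow's `AIRFLOW__{SECTION}__{KEY}` environment-variable convention is used. The helper name `scheduler_env` and the ten-year value are made up for this example.

```python
import os
from typing import Dict, Mapping, Optional

# Arbitrary "very large" value chosen for this example: roughly ten years, in seconds.
TEN_YEARS_IN_SECONDS = 10 * 365 * 24 * 60 * 60


def scheduler_env(
    base: Optional[Mapping[str, str]] = None,
    stale_seconds: int = TEN_YEARS_IN_SECONDS,
) -> Dict[str, str]:
    """Return a copy of the environment with the workaround settings applied.

    Hypothetical helper, not part of Airflow.
    """
    env = dict(os.environ if base is None else base)
    # Assumption: both options belong to the [scheduler] section.
    env["AIRFLOW__SCHEDULER__STANDALONE_DAG_PROCESSOR"] = "True"
    env["AIRFLOW__SCHEDULER__DAG_STALE_NOT_SEEN_DURATION"] = str(stale_seconds)
    return env
```

The returned mapping could then be passed as `env=` to whatever launches the scheduler. This only postpones deactivation; it does not turn it off, which is why a dedicated option is requested below.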
   
   Modifications would be:
   - Give the scheduler an option to not deactivate stale dags.
   - Enable the dag-processor to do all its operations in `_run_parsing_loop` 
only once and then it would gracefully terminate itself.
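
To make the second modification concrete, here is a toy sketch of a parsing loop that stops after a bounded number of passes. It is not Airflow's actual `_run_parsing_loop`; the class `ParsingLoop`, its callables, and the `max_runs` parameter are all hypothetical names used only to show the intended control flow: parse every file, save the results, then exit cleanly.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, Iterable, Optional


@dataclass
class ParsingLoop:
    """Toy stand-in for a DAG-file parsing loop (not Airflow's real class)."""

    list_files: Callable[[], Iterable[str]]    # hypothetical: discovers DAG files
    parse_file: Callable[[str], dict]          # hypothetical: parses one file
    save_results: Callable[[Dict[str, dict]], None]  # hypothetical: writes to the metadata db
    max_runs: Optional[int] = None             # None keeps looping, as today
    runs_completed: int = field(default=0, init=False)

    def run(self) -> int:
        """Run parsing passes until max_runs is reached, then return exit code 0."""
        while self.max_runs is None or self.runs_completed < self.max_runs:
            results = {path: self.parse_file(path) for path in self.list_files()}
            # Persist results before deciding whether to exit, so a single
            # pass never terminates with unsaved work.
            self.save_results(results)
            self.runs_completed += 1
        return 0
```

A CI job would construct the loop with `max_runs=1`, call `run()`, and use the return value as the process exit code.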
   
   ### Related issues
   
   _No response_
   
   ### Are you willing to submit a PR?
   
   - [X] Yes I am willing to submit a PR!
   
   ### Code of Conduct
   
   - [X] I agree to follow this project's [Code of 
Conduct](https://github.com/apache/airflow/blob/main/CODE_OF_CONDUCT.md)
   

