BasPH commented on PR #34487: URL: https://github.com/apache/airflow/pull/34487#issuecomment-2120314813
Hi @jscheffl, unfortunately life got in the way of investing time in OSS work over the last year. The essentials are there, but some time ago I was still working on integrating this code nicely with Airflow.

My idea was to make the DAG processor a configurable class, so that the user could choose between the "current" DAG processor and this "new" DAG processor, which uses watchdog to handle DAG code changes. This turned out to be troublesome because the DAG processor doesn't just process DAGs but also has other responsibilities, such as handling task & SLA callbacks. The `DagFileProcessorManager` is actually a `DagFileProcessorManagerAndTaskAndSlaCallbackHandler`... I assume this was the result of "hacking" the DAG processor to perform recurring system operations because it's a recurring process, but architecturally I don't think the DAG processor should handle callbacks.

I see two options at the moment:

1. Extract all the non-DAG-processing logic (handling callbacks, SLAs, etc.) into a new dedicated component so that the DAG processor _only_ processes DAG files. This would allow us to deprecate/remove current settings related to the DAG processor such as `dag_dir_list_interval` and `min_file_process_interval`. I think this is architecturally better but requires more work.
2. Keep the current DAG processor and add watchdog alongside the current implementation. This keeps the current settings such as `dag_dir_list_interval` and `min_file_process_interval`, but requires less work.

Curious about your thoughts.

--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at: [email protected]
