huozhanfeng edited a comment on issue #17437: URL: https://github.com/apache/airflow/issues/17437#issuecomment-895702439
> Raise the `min_file_process_interval` to `600` (10 mins) or even `6000` (100 mins). Newly added or modified files will already be parsed, the dag parser will skip the `min_file_process_interval` check if a file is recently modified.
>
> We had benchmarked this with more than 10k dag files

The PR is good, but I wonder whether it fully solves this problem. Suppose there are 10k DAGs and it takes 10 minutes to consume and process all 10k entries in `_file_path_queue`. If the first DAG file has just been put into `_file_path_queue` by a call to `prepare_file_path_queue`, and we modify that file right afterwards, can the modified DAG be parsed in time?

The logic this PR improves is in the method `prepare_file_path_queue`, but that method is called only when `_file_path_queue` is empty in the method `_run_parsing_loop`. So parsing the modified first DAG may be delayed by almost 10 minutes.

@ashb could you please take a look when you have free time?
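To make the concern concrete, here is a rough sketch of the timing. It is a toy model, not Airflow's actual code: the function `simulate_reparse_delay` and its parameters are hypothetical, it assumes a single parser working through the queue in order, and it assumes the file is modified only after it has already been parsed in the current pass. The one behaviour it borrows from the description above is that the queue is refilled only when it is empty.

```python
from collections import deque


def simulate_reparse_delay(num_files, ms_per_file, modified_index, modified_at_ms):
    """Toy model of a parse loop whose queue is refilled only when empty.

    Returns the milliseconds between the modification of one file and the
    start of the next parse of that file. All names are hypothetical.
    """
    queue = deque()
    clock_ms = 0
    while True:
        if not queue:
            # Stands in for prepare_file_path_queue: only runs on an empty queue.
            queue.extend(range(num_files))
        file_index = queue.popleft()
        parse_start_ms = clock_ms
        clock_ms += ms_per_file
        # The first parse that starts at or after the modification picks it up.
        if file_index == modified_index and parse_start_ms >= modified_at_ms:
            return parse_start_ms - modified_at_ms


# 10k files at 60 ms each is 10 minutes per full pass. The first file is
# modified right after its own parse finishes (at 60 ms).
delay_ms = simulate_reparse_delay(10_000, 60, 0, 60)
print(delay_ms / 60_000, "minutes")
```

Under these assumptions the modified file waits for the rest of the pass to drain before the queue is rebuilt, which is where the "almost 10 minutes" comes from. With several parsing processes the full pass would be shorter, but the shape of the delay would be the same.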
