kaxil commented on a change in pull request #15046:
URL: https://github.com/apache/airflow/pull/15046#discussion_r603359429
##########
File path: airflow/configuration.py
##########
@@ -253,6 +253,18 @@ def _validate_config_dependencies(self):
+ ", ".join(start_method_options)
)
+ if self.has_option("scheduler", "file_parsing_sort_mode"):
+ list_mode = self.get("scheduler", "file_parsing_sort_mode")
+ file_parser_modes = {"modified_time", "random_seeded_by_host", "alphabetical"}
+
+ if list_mode not in file_parser_modes:
+ raise AirflowConfigException(
+ "`[scheduler] file_parsing_sort_mode` should not be "
+ + list_mode
Review comment:
fixed in
https://github.com/apache/airflow/pull/15046/commits/e394febf64baffbeee34bc0383c042262659b424
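
For readers without the full hunk, here is a minimal standalone sketch of the check being added. It is not the code in the commit: `ConfigError` is a hypothetical stand-in for `AirflowConfigException`, `validate_file_parsing_sort_mode` is an illustrative helper name, and the wording after "should not be" is an assumption, since the hunk above is cut off at that point.

```python
# Hypothetical stand-in for AirflowConfigException so the sketch runs without Airflow.
class ConfigError(Exception):
    pass


# The three modes named in the diff above.
FILE_PARSER_MODES = {"modified_time", "random_seeded_by_host", "alphabetical"}


def validate_file_parsing_sort_mode(list_mode: str) -> str:
    """Return list_mode unchanged if it is a known mode, otherwise raise ConfigError."""
    if list_mode not in FILE_PARSER_MODES:
        # Assumed message tail: the original hunk ends right after "should not be".
        raise ConfigError(
            "`[scheduler] file_parsing_sort_mode` should not be "
            f"{list_mode!r}. Possible values are "
            f"{', '.join(sorted(FILE_PARSER_MODES))}."
        )
    return list_mode
```

Listing the accepted values in the message lets the user correct the setting without looking up the config reference.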
##########
File path: airflow/config_templates/config.yml
##########
@@ -1834,6 +1834,22 @@
type: string
example: ~
default: "2"
+ - name: file_parsing_sort_mode
+ description: |
+ One of ``modified_time``, ``random_seeded_by_host`` and ``alphabetical``.
+ The scheduler will list and sort the dag files to decide the parsing order.
+
+ * ``modified_time``: Sort by modified time of the files. This is useful on large scale to parse the
+ recently modified DAGs first.
+ * ``random_seeded_by_host``: Sort randomly across multiple Schedulers but with same order on the
+ same host. This is useful when running with Scheduler in HA mode where each scheduler can
+ parse different DAG files.
+ * ``alphabetical``: Sort by filename
+ * ``alphabetical``: Sort by filename
+
+ version_added: 2.0.2
Review comment:
fixed in
https://github.com/apache/airflow/pull/15046/commits/e394febf64baffbeee34bc0383c042262659b424
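
To make the three modes in the description concrete, here is a rough sketch of how each ordering could be produced. It is an illustration under stated assumptions, not the scheduler's implementation: `sort_dag_files` is a hypothetical helper, the `hostname` and `mtime` parameters exist only to make the sketch testable, and seeding with a CRC32 of the hostname is one assumed way to get a stable per-host order.

```python
import os
import random
import socket
import zlib
from typing import Callable, List, Optional


def sort_dag_files(
    paths: List[str],
    mode: str,
    hostname: Optional[str] = None,
    mtime: Callable[[str], float] = os.path.getmtime,
) -> List[str]:
    """Return paths ordered according to mode (hypothetical helper)."""
    if mode == "modified_time":
        # Most recently modified files first.
        return sorted(paths, key=mtime, reverse=True)
    if mode == "alphabetical":
        return sorted(paths)
    if mode == "random_seeded_by_host":
        # Assumption: derive a stable seed from the hostname so that one host
        # always gets the same order while different hosts usually differ.
        host = hostname if hostname is not None else socket.gethostname()
        seed = zlib.crc32(host.encode("utf-8"))
        shuffled = sorted(paths)  # fixed starting order before shuffling
        random.Random(seed).shuffle(shuffled)
        return shuffled
    raise ValueError(f"Unknown file_parsing_sort_mode: {mode!r}")
```

Sorting before the shuffle makes the result depend only on the hostname and the set of files, not on the order in which the files were listed.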