o-nikolas commented on code in PR #32604:
URL: https://github.com/apache/airflow/pull/32604#discussion_r1265730580


##########
TESTING.rst:
##########
@@ -55,8 +55,33 @@ Follow the guidelines when writing unit tests:
   tests, so we run Pytest with ``--disable-warnings`` but instead we have 
``pytest-capture-warnings`` plugin that
   overrides ``recwarn`` fixture behaviour.
 
-**NOTE:** We plan to convert all unit tests to standard "asserts" 
semi-automatically, but this will be done later
-in Airflow 2.0 development phase. That will include setUp/tearDown/context 
managers and decorators.
+
+.. note::
+
+  We are in the process of convert all unit tests to standard "asserts" and 
pytest fixtures
+  so if you find some tests that are still using classic setUp/tearDown 
approach or unittest asserts, feel
+  free to convert them to pytest.
+
+Airflow configuration for unit tests
+------------------------------------
+
+Some of the unit tests require special configuration set as "default" one. 
This is done automatically by
+automatically adding ``AIRFLOW__CORE__UNIT_TEST_MODE=True`` to the environment 
variables in Pytest auto-used

Review Comment:
   ```suggestion
   Some of the unit tests require special configuration set as the "default". 
This is done automatically by
    adding ``AIRFLOW__CORE__UNIT_TEST_MODE=True`` to the environment variables 
in Pytest auto-used
   ```



##########
TESTING.rst:
##########
@@ -55,8 +55,33 @@ Follow the guidelines when writing unit tests:
   tests, so we run Pytest with ``--disable-warnings`` but instead we have 
``pytest-capture-warnings`` plugin that
   overrides ``recwarn`` fixture behaviour.
 
-**NOTE:** We plan to convert all unit tests to standard "asserts" 
semi-automatically, but this will be done later
-in Airflow 2.0 development phase. That will include setUp/tearDown/context 
managers and decorators.
+
+.. note::
+
+  We are in the process of convert all unit tests to standard "asserts" and 
pytest fixtures
+  so if you find some tests that are still using classic setUp/tearDown 
approach or unittest asserts, feel
+  free to convert them to pytest.
+
+Airflow configuration for unit tests
+------------------------------------
+
+Some of the unit tests require special configuration set as "default" one. 
This is done automatically by
+automatically adding ``AIRFLOW__CORE__UNIT_TEST_MODE=True`` to the environment 
variables in Pytest auto-used
+fixture. This in turn makes airflow configuration load test configuration from
+``airflow/config_templates/config_unit_test.yml`` - defaults from the test 
configuration replace the original
+defaults stored in ``airflow/config_templates/config.yml``. If you want to add 
some test-only configuration,
+you should copy the option definition to ``config_unit_test.yml`` and change 
default value there.
+
+.. note::
+
+  The test configuration for Airflow before July 2023 was automatically 
generated in a file named
+  ``AIRFLOW_HOME/unittest.cfg`` and template for it was stored in 
"config_templates" next to the yaml file,

Review Comment:
   ```suggestion
     ``AIRFLOW_HOME/unittest.cfg`` and the template for it was stored in 
"config_templates" next to the yaml file,
   ```



##########
TESTING.rst:
##########
@@ -55,8 +55,33 @@ Follow the guidelines when writing unit tests:
   tests, so we run Pytest with ``--disable-warnings`` but instead we have 
``pytest-capture-warnings`` plugin that
   overrides ``recwarn`` fixture behaviour.
 
-**NOTE:** We plan to convert all unit tests to standard "asserts" 
semi-automatically, but this will be done later
-in Airflow 2.0 development phase. That will include setUp/tearDown/context 
managers and decorators.
+
+.. note::
+
+  We are in the process of convert all unit tests to standard "asserts" and 
pytest fixtures
+  so if you find some tests that are still using classic setUp/tearDown 
approach or unittest asserts, feel
+  free to convert them to pytest.
+
+Airflow configuration for unit tests
+------------------------------------
+
+Some of the unit tests require special configuration set as "default" one. 
This is done automatically by
+automatically adding ``AIRFLOW__CORE__UNIT_TEST_MODE=True`` to the environment 
variables in Pytest auto-used
+fixture. This in turn makes airflow configuration load test configuration from
+``airflow/config_templates/config_unit_test.yml`` - defaults from the test 
configuration replace the original
+defaults stored in ``airflow/config_templates/config.yml``. If you want to add 
some test-only configuration,
+you should copy the option definition to ``config_unit_test.yml`` and change 
default value there.
+
+.. note::
+
+  The test configuration for Airflow before July 2023 was automatically 
generated in a file named
+  ``AIRFLOW_HOME/unittest.cfg`` and template for it was stored in 
"config_templates" next to the yaml file,
+  however this was only done for the first time you run airflow and you had to 
manually maintain the file,
+  and it was pretty cumbrsome and arcane knowledge, also this file has been 
overwritten in Breeze environment

Review Comment:
   ```suggestion
     however this was only done for the first time you run airflow and you had 
to manually maintain the file,
     and it was pretty cumbersome and arcane knowledge, also this file has been 
overwritten in Breeze environment
   ```



##########
airflow/cli/cli_config.py:
##########
@@ -90,7 +91,7 @@ def _check_value(self, action, value):
                         f"executors derived from them, your current executor: 
{executor}, subclassed from: "
                         f'{", ".join([base_cls.__qualname__ for base_cls in 
executor_cls.__bases__])}'
                     )
-                    raise ArgumentError(action, message)
+                    warnings.warn(message)

Review Comment:
   Most of this code will leave soon anyways when I finish the CLI abstraction 
for Airflow CLI (see our [convo 
here](https://github.com/apache/airflow/pull/29055#discussion_r1082091328)).



##########
airflow/cli/cli_config.py:
##########
@@ -793,6 +814,13 @@ def string_lower_type(val):
     action="store_true",
 )
 
+# IMPORTANT NOTE!
+#
+# Celery configs below have explicit fallback values because celery provider 
defaults are not yet loaded
+# via provider at the time we parse the command line, so in case it is not 
set, we need to have manual
+# fallback
+# DO NOT REMOVE THE FALLBACKS even if you are tempted to.
+# TODO: possibly move the commands to providers but that could be big 
performance hit on the CLI

Review Comment:
   @potiuk 
   
   AIP-51 PR https://github.com/apache/airflow/pull/29055 will move the 
commands to the executor modules (which will be in providers now) so this will 
actually be done by that.
   
   I've spent time optimizing the import times of the executor modules so it 
should not slow down the CLI much at all.
   
   Was there another concern you had with regards to performance?



##########
airflow/config_templates/default_airflow.cfg:
##########
@@ -16,1507 +15,22 @@
 # specific language governing permissions and limitations
 # under the License.
 
-# This is the template for Airflow's default configuration. When Airflow is
-# imported, it looks for a configuration file at $AIRFLOW_HOME/airflow.cfg. If
-# it doesn't exist, Airflow uses this template to generate it by replacing
-# variables in curly braces with their global values from configuration.py.
-
-# Users should not modify this file; they should customize the generated
-# airflow.cfg instead.
-
-
-# ----------------------- TEMPLATE BEGINS HERE -----------------------
-
-[core]
-# The folder where your airflow pipelines live, most likely a
-# subfolder in a code repository. This path must be absolute.
-dags_folder = {AIRFLOW_HOME}/dags
-
-# Hostname by providing a path to a callable, which will resolve the hostname.
-# The format is "package.function".
 #
-# For example, default value "airflow.utils.net.getfqdn" means that result 
from patched
-# version of socket.getfqdn() - see 
https://github.com/python/cpython/issues/49254.
+# NOTE IF YOU ARE LOOKING FOR DEFAULT CONFIGURATION FILE HERE !!! LOOK NO 
MORE. READ NOTE BELOW!
 #
-# No argument should be required in the function specified.
-# If using IP address as hostname is preferred, use value 
``airflow.utils.net.get_host_ip_address``
-hostname_callable = airflow.utils.net.getfqdn
-
-# A callable to check if a python file has airflow dags defined or not
-# with argument as: `(file_path: str, zip_file: zipfile.ZipFile | None = None)`
-# return True if it has dags otherwise False
-# If this is not provided, Airflow uses its own heuristic rules.
-might_contain_dag_callable = 
airflow.utils.file.might_contain_dag_via_default_heuristic
-
-# Default timezone in case supplied date times are naive
-# can be utc (default), system, or any IANA timezone string (e.g. 
Europe/Amsterdam)
-default_timezone = utc
-
-# The executor class that airflow should use. Choices include
-# ``SequentialExecutor``, ``LocalExecutor``, ``CeleryExecutor``, 
``DaskExecutor``,
-# ``KubernetesExecutor``, ``CeleryKubernetesExecutor`` or the
-# full import path to the class when using a custom executor.
-executor = SequentialExecutor
-
-# The auth manager class that airflow should use. Full import path to the auth 
manager class.
-auth_manager = airflow.auth.managers.fab.fab_auth_manager.FabAuthManager
-
-# This defines the maximum number of task instances that can run concurrently 
per scheduler in
-# Airflow, regardless of the worker count. Generally this value, multiplied by 
the number of
-# schedulers in your cluster, is the maximum number of task instances with the 
running
-# state in the metadata database.
-parallelism = 32
-
-# The maximum number of task instances allowed to run concurrently in each 
DAG. To calculate
-# the number of tasks that is running concurrently for a DAG, add up the 
number of running
-# tasks for all DAG runs of the DAG. This is configurable at the DAG level 
with ``max_active_tasks``,
-# which is defaulted as ``max_active_tasks_per_dag``.
+# This file used to have something that reminded default Airflow configuration 
but it was really a template
+# That was used to generate it and it was confusing if you copied it to your 
configuration and some
+# configuration keys were wrong.
 #
-# An example scenario when this would be useful is when you want to stop a new 
dag with an early
-# start date from stealing all the executor slots in a cluster.
-max_active_tasks_per_dag = 16
-
-# Are DAGs paused by default at creation
-dags_are_paused_at_creation = True
-
-# The maximum number of active DAG runs per DAG. The scheduler will not create 
more DAG runs
-# if it reaches the limit. This is configurable at the DAG level with 
``max_active_runs``,
-# which is defaulted as ``max_active_runs_per_dag``.
-max_active_runs_per_dag = 16
-
-# The name of the method used in order to start Python processes via the 
multiprocessing module.
-# This corresponds directly with the options available in the Python docs:
-# 
https://docs.python.org/3/library/multiprocessing.html#multiprocessing.set_start_method.
-# Must be one of the values returned by:
-# 
https://docs.python.org/3/library/multiprocessing.html#multiprocessing.get_all_start_methods.
-# Example: mp_start_method = fork
-# mp_start_method =
-
-# Whether to load the DAG examples that ship with Airflow. It's good to
-# get started, but you probably want to set this to ``False`` in a production
-# environment
-load_examples = True
-
-# Path to the folder containing Airflow plugins
-plugins_folder = {AIRFLOW_HOME}/plugins
-
-# Should tasks be executed via forking of the parent process ("False",
-# the speedier option) or by spawning a new python process ("True" slow,
-# but means plugin changes picked up by tasks straight away)
-execute_tasks_new_python_interpreter = False
-
-# Secret key to save connection passwords in the db
-fernet_key = {FERNET_KEY}
-
-# Whether to disable pickling dags
-donot_pickle = True
-
-# How long before timing out a python file import
-dagbag_import_timeout = 30.0
-
-# Should a traceback be shown in the UI for dagbag import errors,
-# instead of just the exception message
-dagbag_import_error_tracebacks = True
-
-# If tracebacks are shown, how many entries from the traceback should be shown
-dagbag_import_error_traceback_depth = 2
-
-# How long before timing out a DagFileProcessor, which processes a dag file
-dag_file_processor_timeout = 50
-
-# The class to use for running task instances in a subprocess.
-# Choices include StandardTaskRunner, CgroupTaskRunner or the full import path 
to the class
-# when using a custom task runner.
-task_runner = StandardTaskRunner
-
-# If set, tasks without a ``run_as_user`` argument will be run with this user
-# Can be used to de-elevate a sudo user running Airflow when executing tasks
-default_impersonation =
-
-# What security module to use (for example kerberos)
-security =
-
-# Turn unit test mode on (overwrites many configuration options with test
-# values at runtime)
-unit_test_mode = False
-
-# Whether to enable pickling for xcom (note that this is insecure and allows 
for
-# RCE exploits).
-enable_xcom_pickling = False
-
-# What classes can be imported during deserialization. This is a multi line 
value.
-# The individual items will be parsed as regexp. Python built-in classes (like 
dict)
-# are always allowed. Bare "." will be replaced so you can set airflow.* .
-allowed_deserialization_classes = airflow\..*
-
-# When a task is killed forcefully, this is the amount of time in seconds that
-# it has to cleanup after it is sent a SIGTERM, before it is SIGKILLED
-killed_task_cleanup_time = 60
-
-# Whether to override params with dag_run.conf. If you pass some key-value 
pairs
-# through ``airflow dags backfill -c`` or
-# ``airflow dags trigger -c``, the key-value pairs will override the existing 
ones in params.
-dag_run_conf_overrides_params = True
-
-# If enabled, Airflow will only scan files containing both ``DAG`` and 
``airflow`` (case-insensitive).
-dag_discovery_safe_mode = True
-
-# The pattern syntax used in the ".airflowignore" files in the DAG 
directories. Valid values are
-# ``regexp`` or ``glob``.
-dag_ignore_file_syntax = regexp
-
-# The number of retries each task is going to have by default. Can be 
overridden at dag or task level.
-default_task_retries = 0
-
-# The number of seconds each task is going to wait by default between retries. 
Can be overridden at
-# dag or task level.
-default_task_retry_delay = 300
-
-# The maximum delay (in seconds) each task is going to wait by default between 
retries.
-# This is a global setting and cannot be overridden at task or DAG level.
-max_task_retry_delay = 86400
-
-# The weighting method used for the effective total priority weight of the task
-default_task_weight_rule = downstream
-
-# The default task execution_timeout value for the operators. Expected an 
integer value to
-# be passed into timedelta as seconds. If not specified, then the value is 
considered as None,
-# meaning that the operators are never timed out by default.
-default_task_execution_timeout =
-
-# Updating serialized DAG can not be faster than a minimum interval to reduce 
database write rate.
-min_serialized_dag_update_interval = 30
-
-# If True, serialized DAGs are compressed before writing to DB.
-# Note: this will disable the DAG dependencies view
-compress_serialized_dags = False
-
-# Fetching serialized DAG can not be faster than a minimum interval to reduce 
database
-# read rate. This config controls when your DAGs are updated in the Webserver
-min_serialized_dag_fetch_interval = 10
-
-# Maximum number of Rendered Task Instance Fields (Template Fields) per task 
to store
-# in the Database.
-# All the template_fields for each of Task Instance are stored in the Database.
-# Keeping this number small may cause an error when you try to view 
``Rendered`` tab in
-# TaskInstance view for older tasks.
-max_num_rendered_ti_fields_per_task = 30
-
-# On each dagrun check against defined SLAs
-check_slas = True
-
-# Path to custom XCom class that will be used to store and resolve operators 
results
-# Example: xcom_backend = path.to.CustomXCom
-xcom_backend = airflow.models.xcom.BaseXCom
-
-# By default Airflow plugins are lazily-loaded (only loaded when required). 
Set it to ``False``,
-# if you want to load plugins whenever 'airflow' is invoked via cli or loaded 
from module.
-lazy_load_plugins = True
-
-# By default Airflow providers are lazily-discovered (discovery and imports 
happen only when required).
-# Set it to False, if you want to discover providers whenever 'airflow' is 
invoked via cli or
-# loaded from module.
-lazy_discover_providers = True
-
-# Hide sensitive Variables or Connection extra json keys from UI and task logs 
when set to True
-#
-# (Connection passwords are always hidden in logs)
-hide_sensitive_var_conn_fields = True
-
-# A comma-separated list of extra sensitive keywords to look for in variables 
names or connection's
-# extra JSON.
-sensitive_var_conn_names =
-
-# Task Slot counts for ``default_pool``. This setting would not have any 
effect in an existing
-# deployment where the ``default_pool`` is already created. For existing 
deployments, users can
-# change the number of slots using Webserver, API or the CLI
-default_pool_task_slot_count = 128
-
-# The maximum list/dict length an XCom can push to trigger task mapping. If 
the pushed list/dict has a
-# length exceeding this value, the task pushing the XCom will be failed 
automatically to prevent the
-# mapped tasks from clogging the scheduler.
-max_map_length = 1024
-
-# The default umask to use for process when run in daemon mode (scheduler, 
worker,  etc.)
-#
-# This controls the file-creation mode mask which determines the initial value 
of file permission bits
-# for newly created files.
+# Airflow will generate default configuration for you when you run it for the 
first time in

Review Comment:
   ```suggestion
   # Airflow will generate the default configuration for you when you run it 
for the first time in
   ```



##########
airflow/config_templates/default_airflow.cfg:
##########
@@ -16,1507 +15,22 @@
 # specific language governing permissions and limitations
 # under the License.
 
-# This is the template for Airflow's default configuration. When Airflow is
-# imported, it looks for a configuration file at $AIRFLOW_HOME/airflow.cfg. If
-# it doesn't exist, Airflow uses this template to generate it by replacing
-# variables in curly braces with their global values from configuration.py.
-
-# Users should not modify this file; they should customize the generated
-# airflow.cfg instead.
-
-
-# ----------------------- TEMPLATE BEGINS HERE -----------------------
-
-[core]
-# The folder where your airflow pipelines live, most likely a
-# subfolder in a code repository. This path must be absolute.
-dags_folder = {AIRFLOW_HOME}/dags
-
-# Hostname by providing a path to a callable, which will resolve the hostname.
-# The format is "package.function".
 #
-# For example, default value "airflow.utils.net.getfqdn" means that result 
from patched
-# version of socket.getfqdn() - see 
https://github.com/python/cpython/issues/49254.
+# NOTE IF YOU ARE LOOKING FOR DEFAULT CONFIGURATION FILE HERE !!! LOOK NO 
MORE. READ NOTE BELOW!
 #
-# No argument should be required in the function specified.
-# If using IP address as hostname is preferred, use value 
``airflow.utils.net.get_host_ip_address``
-hostname_callable = airflow.utils.net.getfqdn
-
-# A callable to check if a python file has airflow dags defined or not
-# with argument as: `(file_path: str, zip_file: zipfile.ZipFile | None = None)`
-# return True if it has dags otherwise False
-# If this is not provided, Airflow uses its own heuristic rules.
-might_contain_dag_callable = 
airflow.utils.file.might_contain_dag_via_default_heuristic
-
-# Default timezone in case supplied date times are naive
-# can be utc (default), system, or any IANA timezone string (e.g. 
Europe/Amsterdam)
-default_timezone = utc
-
-# The executor class that airflow should use. Choices include
-# ``SequentialExecutor``, ``LocalExecutor``, ``CeleryExecutor``, 
``DaskExecutor``,
-# ``KubernetesExecutor``, ``CeleryKubernetesExecutor`` or the
-# full import path to the class when using a custom executor.
-executor = SequentialExecutor
-
-# The auth manager class that airflow should use. Full import path to the auth 
manager class.
-auth_manager = airflow.auth.managers.fab.fab_auth_manager.FabAuthManager
-
-# This defines the maximum number of task instances that can run concurrently 
per scheduler in
-# Airflow, regardless of the worker count. Generally this value, multiplied by 
the number of
-# schedulers in your cluster, is the maximum number of task instances with the 
running
-# state in the metadata database.
-parallelism = 32
-
-# The maximum number of task instances allowed to run concurrently in each 
DAG. To calculate
-# the number of tasks that is running concurrently for a DAG, add up the 
number of running
-# tasks for all DAG runs of the DAG. This is configurable at the DAG level 
with ``max_active_tasks``,
-# which is defaulted as ``max_active_tasks_per_dag``.
+# This file used to have something that reminded default Airflow configuration 
but it was really a template
+# That was used to generate it and it was confusing if you copied it to your 
configuration and some
+# configuration keys were wrong.

Review Comment:
   ```suggestion
   # This file used to have something that was similar to the default Airflow 
configuration but it was really just a template.
   # It was used to generate the final configuration and it was confusing if 
you copied it to your configuration and some of 
   # keys were wrong.
   ```



##########
TESTING.rst:
##########
@@ -55,8 +55,33 @@ Follow the guidelines when writing unit tests:
   tests, so we run Pytest with ``--disable-warnings`` but instead we have 
``pytest-capture-warnings`` plugin that
   overrides ``recwarn`` fixture behaviour.
 
-**NOTE:** We plan to convert all unit tests to standard "asserts" 
semi-automatically, but this will be done later
-in Airflow 2.0 development phase. That will include setUp/tearDown/context 
managers and decorators.
+
+.. note::
+
+  We are in the process of convert all unit tests to standard "asserts" and 
pytest fixtures
+  so if you find some tests that are still using classic setUp/tearDown 
approach or unittest asserts, feel
+  free to convert them to pytest.
+
+Airflow configuration for unit tests
+------------------------------------
+
+Some of the unit tests require special configuration set as "default" one. 
This is done automatically by
+automatically adding ``AIRFLOW__CORE__UNIT_TEST_MODE=True`` to the environment 
variables in Pytest auto-used
+fixture. This in turn makes airflow configuration load test configuration from
+``airflow/config_templates/config_unit_test.yml`` - defaults from the test 
configuration replace the original
+defaults stored in ``airflow/config_templates/config.yml``. If you want to add 
some test-only configuration,
+you should copy the option definition to ``config_unit_test.yml`` and change 
default value there.

Review Comment:
   ```suggestion
   you should copy the option definition to ``config_unit_test.yml`` and change 
the default value there.
   ```



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

Reply via email to