jedcunningham commented on code in PR #32604:
URL: https://github.com/apache/airflow/pull/32604#discussion_r1269951356
##########
airflow/__main__.py:
##########
@@ -45,6 +45,13 @@ def main():
parser = cli_parser.get_parser()
argcomplete.autocomplete(parser)
args = parser.parse_args()
+
+ # Here we ensure that the default configuration is written if needed
before running any command
+ # that might need it. This used to be done during configuration
initialization but having it
+ # im main ensures that it is not done during tests and other ways airflow
imports are used
Review Comment:
```suggestion
# in main ensures that it is not done during tests and other ways
airflow imports are used
```
##########
docs/providers-configurations-ref.rst:
##########
Review Comment:
Since this is basically the same as the core page, should we add comments to
remind folks to keep them in sync? Is there a way we can reuse it instead of
duplicating it?
##########
docs/providers-configurations-ref.rst:
##########
@@ -0,0 +1,99 @@
+ .. Licensed to the Apache Software Foundation (ASF) under one
+ or more contributor license agreements. See the NOTICE file
+ distributed with this work for additional information
+ regarding copyright ownership. The ASF licenses this file
+ to you under the Apache License, Version 2.0 (the
+ "License"); you may not use this file except in compliance
+ with the License. You may obtain a copy of the License at
+
+ .. http://www.apache.org/licenses/LICENSE-2.0
+
+ .. Unless required by applicable law or agreed to in writing,
+ software distributed under the License is distributed on an
+ "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ KIND, either express or implied. See the License for the
+ specific language governing permissions and limitations
+ under the License.
+
+.. jinja:: config_ctx
+
+ Configuration Reference
+ .......................
+
+ This page contains the list of all available Airflow configurations for the
+ ``{{ package_name }}`` provider that can be set in ``airflow.cfg`` file or
using environment variables.
Review Comment:
```suggestion
``{{ package_name }}`` provider that can be set in the ``airflow.cfg``
file or using environment variables.
```
##########
docs/providers-configurations-ref.rst:
##########
@@ -0,0 +1,99 @@
+ .. Licensed to the Apache Software Foundation (ASF) under one
+ or more contributor license agreements. See the NOTICE file
+ distributed with this work for additional information
+ regarding copyright ownership. The ASF licenses this file
+ to you under the Apache License, Version 2.0 (the
+ "License"); you may not use this file except in compliance
+ with the License. You may obtain a copy of the License at
+
+ .. http://www.apache.org/licenses/LICENSE-2.0
+
+ .. Unless required by applicable law or agreed to in writing,
+ software distributed under the License is distributed on an
+ "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ KIND, either express or implied. See the License for the
+ specific language governing permissions and limitations
+ under the License.
+
+.. jinja:: config_ctx
+
+ Configuration Reference
+ .......................
+
+ This page contains the list of all available Airflow configurations for the
+ ``{{ package_name }}`` provider that can be set in ``airflow.cfg`` file or
using environment variables.
+
+ .. note::
+ For more information on setting the configuration, see
:doc:`apache-airflow:howto/set-config`
Review Comment:
```suggestion
For more information see :doc:`apache-airflow:howto/set-config`.
```
I think this would render better: "For more information see <Setting
Configuration Options>."
##########
docs/apache-airflow/configurations-ref.rst:
##########
@@ -32,6 +32,13 @@ generated using the secret key has a short expiry time
though - make sure that t
that you run airflow components on is synchronized (for example using ntpd)
otherwise you might get
"forbidden" errors when the logs are accessed.
+Some of the providers have their own configuration options, you will find
details of their configuration
+in the provider's documentation. The pre-installed providers that you likely
want to configure are:
Review Comment:
```suggestion
in the provider's documentation. The pre-installed providers that you may
want to configure are:
```
I think we should soften this, 2 of the 3 common executors wouldn't need to
touch this.
##########
airflow/configuration.py:
##########
@@ -1644,6 +1765,61 @@ def remove_all_read_configurations(self):
for section in self.sections():
self.remove_section(section)
+ @property
+ def providers_configuration_loaded(self) -> bool:
+ """Checks if providers have been loaded."""
+ return self._providers_configuration_loaded
+
+ def load_providers_configuration(self):
+ """
+ Loads configuration for providers.
+
+ This should be done after initial configuration have been performed.
Initializing and discovering
+ providers is an expensive operation and cannot be performed when we
load configuration for the first
+ time when airflow starts, because we initialize configuration very
early, during importing of the
+ `airflow` package and the module is not yet ready to be used when it
happens and until configuration
+ and settings are loaded. Therefore, in order to reload provider
configuration we need to additionally
+ load provider - specific configuration.
+ """
+ log.debug("Loading providers configuration")
+ from airflow.providers_manager import ProvidersManager
+
+ for provider, config in
ProvidersManager().already_initialized_provider_configs:
+ for provider_section, provider_section_content in config.items():
+ provider_options = provider_section_content["options"]
+ section_in_current_config =
self.configuration_description.get(provider_section)
+ if not section_in_current_config:
+ # merge whole sections into existing config
+ self.configuration_description[provider_section] =
deepcopy(provider_section_content)
+ section_in_current_config =
self.configuration_description.get(provider_section)
+ for option in provider_options:
+ section_in_current_config["options"][option]["source"]
= f"default-{provider}"
+ else:
+ for option in provider_options:
+ option_in_current_config =
section_in_current_config["options"].get(option)
+ if not option_in_current_config:
+ # merge new options into existing sections
+
self.configuration_description[provider_section]["options"][option] = deepcopy(
+ provider_options[option]
+ )
+ option_in_current_config =
section_in_current_config["options"].get(option)
+ option_in_current_config["source"] =
f"default-{provider}"
+ else:
+ # replace default values for existing options
Review Comment:
Yeah, this feels wrong to me.
Is this not a race condition too, if more than 1 provider decided to
override a certain value?
I'd rather we raise here.
##########
airflow/cli/commands/config_command.py:
##########
@@ -48,6 +49,14 @@ def show_config(args):
def get_value(args):
"""Get one value from configuration."""
+ # while this will make get_value quite a bit slower we must initialize
configuration
+ # for providers because we do not know what sections and options will be
available after
+ # providers are initialized. Theoretically Providers might add new
sections and options
+ # but also override defaults for existing options, so without loading all
providers we
Review Comment:
Allowing providers to override core defaults feels risky... Do we really
need that behavior?
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]