This is an automated email from the ASF dual-hosted git repository.

potiuk pushed a commit to branch v2-6-test
in repository https://gitbox.apache.org/repos/asf/airflow.git
commit 1cc4c091f5e2b64603a120b04356dc66b7d583f6
Author: Jarek Potiuk <[email protected]>
AuthorDate: Fri Jun 30 11:15:17 2023 +0200

    Separate out advanced logging configuration (#32131)

    The "advanced logging configuration" applies not only to task logs but
    also to component logs, and you can use it to customize not only how
    task logs are created but also how "regular" component logs are created.
    This has been a source of confusion for those who wanted to configure
    (for example) Elasticsearch or OpenSearch for the whole Airflow
    deployment, because the "advanced configuration" chapter on how to
    modify the standard configuration was a small section in "task logging".

    This change extracts "advanced logging configuration" to a separate page
    right under "logging and monitoring" and directs the user from the
    "task" logging section to this page. It also adds a bit more explanation
    of how the standard Python logging framework is leveraged here, and
    links to the Python logging documentation for those who have never used
    it before, to learn more about Loggers, Handlers and Formatters.
Co-authored-by: Akash Sharma <[email protected]>
Co-authored-by: Tzu-ping Chung <[email protected]>
(cherry picked from commit ead2530d3500dd27df54383a0802b6c94828c359)
---
 .../logging/stackdriver.rst                        |  3 +-
 .../core-extensions/logging.rst                    |  2 +-
 .../core-extensions/secrets-backends.rst           |  2 +-
 .../advanced-logging-configuration.rst             | 90 ++++++++++++++++++++++
 .../logging-monitoring/index.rst                   |  1 +
 .../logging-monitoring/logging-architecture.rst    | 10 ++-
 .../logging-monitoring/logging-tasks.rst           | 53 +++---------
 docs/spelling_wordlist.txt                         |  1 +
 8 files changed, 113 insertions(+), 49 deletions(-)

diff --git a/docs/apache-airflow-providers-google/logging/stackdriver.rst b/docs/apache-airflow-providers-google/logging/stackdriver.rst
index fa3f3391af..12be70cf65 100644
--- a/docs/apache-airflow-providers-google/logging/stackdriver.rst
+++ b/docs/apache-airflow-providers-google/logging/stackdriver.rst
@@ -66,7 +66,8 @@ be used. Make sure that with those credentials, you can read and write
 the logs. For security reasons, limiting the access of the log reader to only allow log
 reading and writing is an important security measure.

-By using the ``logging_config_class`` option you can get :ref:`advanced features <write-logs-advanced>` of
+By using the ``logging_config_class`` option you can get
+:doc:`advanced features <apache-airflow:administration-and-deployment/logging-monitoring/advanced-logging-configuration>` of
 this handler. Details are available in the handler's documentation -
 :class:`~airflow.providers.google.cloud.log.stackdriver_task_handler.StackdriverTaskHandler`.
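[Editor's illustration, not part of the patch.] The commit message points readers at the standard Python logging framework (Loggers, Handlers, Formatters) that Airflow's configuration dictionary ultimately wires together. A minimal sketch of that chain, built by hand with only the standard library (the logger name is a hypothetical example in Airflow's ``<package>.<module_name>`` style):

```python
import logging

# Build a logger -> handler -> formatter chain by hand; this is what
# logging.config.dictConfig assembles from a configuration dictionary.
logger = logging.getLogger("airflow.task.example")  # hypothetical logger name
handler = logging.StreamHandler()
handler.setFormatter(
    logging.Formatter("%(asctime)s %(name)s %(levelname)s - %(message)s")
)
logger.addHandler(handler)
logger.setLevel(logging.INFO)
logger.propagate = False  # do not also emit through ancestor loggers

logger.info("task started")  # formatted and written by the handler above
```

Airflow's ``DEFAULT_LOGGING_CONFIG`` describes many such chains declaratively; the advanced configuration described in the patch amounts to editing that description.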
diff --git a/docs/apache-airflow-providers/core-extensions/logging.rst b/docs/apache-airflow-providers/core-extensions/logging.rst
index 822cf54d2b..0fa10d98c4 100644
--- a/docs/apache-airflow-providers/core-extensions/logging.rst
+++ b/docs/apache-airflow-providers/core-extensions/logging.rst
@@ -20,7 +20,7 @@ Writing logs
 This is a summary of all Apache Airflow Community provided implementations of writing task logs
 exposed via community-managed providers.

 You can also see logging options available in the core Airflow in
-:doc:`apache-airflow:administration-and-deployment/logging-monitoring/logging-tasks` and here you can see those
+:doc:`/administration-and-deployment/logging-monitoring/logging-tasks` and here you can see those
 provided by the community-managed providers:

 .. airflow-logging::
diff --git a/docs/apache-airflow-providers/core-extensions/secrets-backends.rst b/docs/apache-airflow-providers/core-extensions/secrets-backends.rst
index 22fae4b8a6..c1cc5a0c05 100644
--- a/docs/apache-airflow-providers/core-extensions/secrets-backends.rst
+++ b/docs/apache-airflow-providers/core-extensions/secrets-backends.rst
@@ -28,7 +28,7 @@ via providers that implement secrets backends for services Airflow integrates wi
 You can also take a look at Secret backends available in the core Airflow in
-:doc:`apache-airflow:administration-and-deployment/security/secrets/secrets-backend/index` and here you can see the ones
+:doc:`/administration-and-deployment/security/secrets/secrets-backend/index` and here you can see the ones
 provided by the community-managed providers:

 .. airflow-secrets-backends::
diff --git a/docs/apache-airflow/administration-and-deployment/logging-monitoring/advanced-logging-configuration.rst b/docs/apache-airflow/administration-and-deployment/logging-monitoring/advanced-logging-configuration.rst
new file mode 100644
index 0000000000..739b5380e3
--- /dev/null
+++ b/docs/apache-airflow/administration-and-deployment/logging-monitoring/advanced-logging-configuration.rst
@@ -0,0 +1,90 @@
+ .. Licensed to the Apache Software Foundation (ASF) under one
+    or more contributor license agreements.  See the NOTICE file
+    distributed with this work for additional information
+    regarding copyright ownership.  The ASF licenses this file
+    to you under the Apache License, Version 2.0 (the
+    "License"); you may not use this file except in compliance
+    with the License.  You may obtain a copy of the License at
+
+ ..   http://www.apache.org/licenses/LICENSE-2.0
+
+ .. Unless required by applicable law or agreed to in writing,
+    software distributed under the License is distributed on an
+    "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+    KIND, either express or implied.  See the License for the
+    specific language governing permissions and limitations
+    under the License.
+
+
+
+Advanced logging configuration
+------------------------------
+
+Not all configuration options are available from the ``airflow.cfg`` file. The config file describes
+how to configure logging for tasks, because the logs generated by tasks are not only logged in separate
+files by default but also have to be accessible via the webserver.
+
+By default, standard Airflow component logs are written to the ``$AIRFLOW_HOME/logs`` directory, but you
+can customize and configure them as you want by overriding the Python logger configuration, which can
+be done by providing a custom logging configuration object. Some configuration options require
+that the logging config class be overwritten.
+You can do it by copying the default
+configuration of Airflow and modifying it to suit your needs. The default configuration can be seen in the
+`airflow_local_settings.py template <https://github.com/apache/airflow/blob/|airflow_version|/airflow/config_templates/airflow_local_settings.py>`_
+and you can see the loggers and handlers used there. Apart from the custom loggers and handlers configurable there
+via ``airflow.cfg``, the logging methods in Airflow follow the usual Python logging convention
+that Python objects log to loggers following the naming convention of ``<package>.<module_name>``.
+
+You can read more about the standard Python logging classes (Loggers, Handlers, Formatters) in the
+`Python logging documentation <https://docs.python.org/library/logging.html>`_.
+
+Configuring your logging classes can be done via the ``logging_config_class`` option in the ``airflow.cfg`` file.
+This configuration should specify the import path to a configuration compatible with
+:func:`logging.config.dictConfig`. If your file is not in a standard import location, then you should set the
+:envvar:`PYTHONPATH` environment variable.
+
+Follow the steps below to enable a custom logging config class:
+
+#. Start by setting the environment variable to a known directory, e.g. ``~/airflow/``
+
+   .. code-block:: bash
+
+      export PYTHONPATH=~/airflow/
+
+#. Create a directory to store the config file, e.g. ``~/airflow/config``
+#. Create a file called ``~/airflow/config/log_config.py`` with the following contents:
+
+   .. code-block:: python
+
+      from copy import deepcopy
+      from airflow.config_templates.airflow_local_settings import DEFAULT_LOGGING_CONFIG
+
+      LOGGING_CONFIG = deepcopy(DEFAULT_LOGGING_CONFIG)
+
+#. At the end of the file, add code to modify the default dictionary configuration.
+#. Update ``$AIRFLOW_HOME/airflow.cfg`` to contain:
+
+   .. code-block:: ini
+
+      [logging]
+      logging_config_class = log_config.LOGGING_CONFIG
+
+   You can also use ``logging_config_class`` together with remote logging if you plan to just extend/update
+   the configuration with remote logging enabled. Then the deep-copied dictionary will contain the remote logging
+   configuration generated for you, and your modifications will apply after the remote logging configuration has
+   been added:
+
+   .. code-block:: ini
+
+      [logging]
+      remote_logging = True
+      logging_config_class = log_config.LOGGING_CONFIG
+
+#. Restart the application.
+
+See :doc:`../modules_management` for details on how Python and Airflow manage modules.
+
+
+.. note::
+
+   You can override the way both standard logs of the components and "task" logs are handled.
diff --git a/docs/apache-airflow/administration-and-deployment/logging-monitoring/index.rst b/docs/apache-airflow/administration-and-deployment/logging-monitoring/index.rst
index 58edec4be0..850bfa77a9 100644
--- a/docs/apache-airflow/administration-and-deployment/logging-monitoring/index.rst
+++ b/docs/apache-airflow/administration-and-deployment/logging-monitoring/index.rst
@@ -30,6 +30,7 @@ In addition to the standard logging and metrics capabilities, Airflow supports t

     logging-architecture
     logging-tasks
+    advanced-logging-configuration
     metrics
     callbacks
diff --git a/docs/apache-airflow/administration-and-deployment/logging-monitoring/logging-architecture.rst b/docs/apache-airflow/administration-and-deployment/logging-monitoring/logging-architecture.rst
index 07cd3f42cc..ae000ddfe0 100644
--- a/docs/apache-airflow/administration-and-deployment/logging-monitoring/logging-architecture.rst
+++ b/docs/apache-airflow/administration-and-deployment/logging-monitoring/logging-architecture.rst
@@ -26,9 +26,15 @@ Airflow supports a variety of logging and monitoring mechanisms as shown below.

 By default, Airflow supports logging into the local file system.
 These include logs from the Web server, the Scheduler, and the Workers running tasks.
 This is suitable for development environments and for quick debugging.

-For cloud deployments, Airflow also has handlers contributed by the Community for logging to cloud storage such as AWS, Google Cloud, and Azure.
+For cloud deployments, Airflow also has task handlers contributed by the Community for
+logging to cloud storage such as AWS, Google Cloud, and Azure.

-The logging settings and options can be specified in the Airflow Configuration file, which as usual needs to be available to all the Airflow process: Web server, Scheduler, and Workers.
+The logging settings and options can be specified in the Airflow Configuration file,
+which as usual needs to be available to all the Airflow processes: Web server, Scheduler, and Workers.
+
+You can customize the logging settings for each of the Airflow components by specifying the logging settings
+in the Airflow Configuration file, or, for advanced configuration, by using the
+:doc:`advanced features </administration-and-deployment/logging-monitoring/advanced-logging-configuration>`.

 For production deployments, we recommend using FluentD to capture logs and send it to destinations such as ElasticSearch or Splunk.
diff --git a/docs/apache-airflow/administration-and-deployment/logging-monitoring/logging-tasks.rst b/docs/apache-airflow/administration-and-deployment/logging-monitoring/logging-tasks.rst
index 45fcb1eaca..d4ba6fd2df 100644
--- a/docs/apache-airflow/administration-and-deployment/logging-monitoring/logging-tasks.rst
+++ b/docs/apache-airflow/administration-and-deployment/logging-monitoring/logging-tasks.rst
@@ -117,53 +117,11 @@ the example below.
 The output of ``airflow info`` above is truncated to only display the section that pertains to the logging configuration.
 You can also run ``airflow config list`` to check that the logging configuration options have valid values.

-.. _write-logs-advanced:
-
 Advanced configuration
 ----------------------

-Not all configuration options are available from the ``airflow.cfg`` file. Some configuration options require
-that the logging config class be overwritten. This can be done via the ``logging_config_class`` option
-in ``airflow.cfg`` file. This option should specify the import path to a configuration compatible with
-:func:`logging.config.dictConfig`. If your file is a standard import location, then you should set a :envvar:`PYTHONPATH` environment variable.
-
-Follow the steps below to enable custom logging config class:
-
-#. Start by setting environment variable to known directory e.g. ``~/airflow/``
-
-   .. code-block:: bash
-
-      export PYTHONPATH=~/airflow/
-
-#. Create a directory to store the config file e.g. ``~/airflow/config``
-#. Create file called ``~/airflow/config/log_config.py`` with following the contents:
-
-   .. code-block:: python
-
-      from copy import deepcopy
-      from airflow.config_templates.airflow_local_settings import DEFAULT_LOGGING_CONFIG
-
-      LOGGING_CONFIG = deepcopy(DEFAULT_LOGGING_CONFIG)
-
-#. At the end of the file, add code to modify the default dictionary configuration.
-#. Update ``$AIRFLOW_HOME/airflow.cfg`` to contain:
-
-   .. code-block:: ini
-
-      [logging]
-      remote_logging = True
-      logging_config_class = log_config.LOGGING_CONFIG
-
-#. Restart the application.
-
-See :doc:`../modules_management` for details on how Python and Airflow manage modules.
-
-External Links
---------------
-
-When using remote logging, you can configure Airflow to show a link to an external UI within the Airflow Web UI. Clicking the link redirects you to the external UI.
-
-Some external systems require specific configuration in Airflow for redirection to work but others do not.
+You can configure :doc:`advanced features </administration-and-deployment/logging-monitoring/advanced-logging-configuration>`
+- including adding your own custom task log handlers (but also log handlers for all Airflow components).

 .. _serving-worker-trigger-logs:

@@ -197,3 +155,10 @@ To accomplish this we have a few attributes that may be set on the handler, eith
 - ``trigger_should_queue``: Controls whether the triggerer should put a QueueListener between the event loop and the handler, to ensure blocking IO in the handler does not disrupt the event loop.
 - ``trigger_send_end_marker``: Controls whether an END signal should be sent to the logger when trigger completes. It is used to tell the wrapper to close and remove the individual file handler specific to the trigger that just completed.
 - ``trigger_supported``: If ``trigger_should_wrap`` and ``trigger_should_queue`` are not True, we generally assume that the handler does not support triggers. But if in this case the handler has ``trigger_supported`` set to True, then we'll still move the handler to root at triggerer start so that it will process trigger messages. Essentially, this should be true for handlers that "natively" support triggers. One such example of this is the StackdriverTaskHandler.
+
+External Links
+--------------
+
+When using remote logging, you can configure Airflow to show a link to an external UI within the Airflow Web UI. Clicking the link redirects you to the external UI.
+
+Some external systems require specific configuration in Airflow for redirection to work but others do not.
diff --git a/docs/spelling_wordlist.txt b/docs/spelling_wordlist.txt
index 5a386cf3ea..c046d7c53c 100644
--- a/docs/spelling_wordlist.txt
+++ b/docs/spelling_wordlist.txt
@@ -591,6 +591,7 @@ fn
 fo
 followsa
 formatter
+formatters
 Formaturas
 forwardability
 forwardable
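[Editor's illustration, not part of the patch.] The pattern the new page documents — deep-copy the default configuration dictionary, modify it, and point ``logging_config_class`` at the result — can be sketched standalone. The base dictionary below is a simplified stand-in for Airflow's ``DEFAULT_LOGGING_CONFIG`` (the real one has many more loggers, handlers and formatters), so this runs without Airflow installed:

```python
import logging
import logging.config
from copy import deepcopy

# Stand-in for airflow.config_templates.airflow_local_settings.DEFAULT_LOGGING_CONFIG;
# the keys mirror its shape but the contents here are illustrative only.
BASE_LOGGING_CONFIG = {
    "version": 1,
    "disable_existing_loggers": False,
    "formatters": {
        "airflow": {
            "format": "[%(asctime)s] {%(filename)s:%(lineno)d} %(levelname)s - %(message)s"
        },
    },
    "handlers": {
        "console": {"class": "logging.StreamHandler", "formatter": "airflow"},
    },
    "loggers": {
        "airflow.task": {"handlers": ["console"], "level": "INFO", "propagate": False},
    },
}

# The documented pattern: deep-copy the default config, then modify the copy
# at the end of log_config.py (here: bump the task logger to DEBUG).
LOGGING_CONFIG = deepcopy(BASE_LOGGING_CONFIG)
LOGGING_CONFIG["loggers"]["airflow.task"]["level"] = "DEBUG"

# Airflow applies the dict via logging.config.dictConfig at startup.
logging.config.dictConfig(LOGGING_CONFIG)
```

In a real deployment, ``LOGGING_CONFIG`` would live in ``~/airflow/config/log_config.py`` and be referenced via ``logging_config_class = log_config.LOGGING_CONFIG``; ``deepcopy`` matters because it leaves the imported default dictionary untouched.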
