JacobHayes commented on a change in pull request #3952: [AIRFLOW-XXX] Update GCS logging docs for latest code
URL: https://github.com/apache/incubator-airflow/pull/3952#discussion_r220782677
##########
File path: docs/howto/write-logs.rst
##########
@@ -89,54 +89,21 @@ Writing Logs to Google Cloud Storage
Follow the steps below to enable Google Cloud Storage logging.
-#. Airflow's logging system requires a custom .py file to be located in the
-   ``PYTHONPATH``, so that it's importable from Airflow. Start by creating a
-   directory to store the config file. ``$AIRFLOW_HOME/config`` is recommended.
-#. Create empty files called ``$AIRFLOW_HOME/config/log_config.py`` and
-   ``$AIRFLOW_HOME/config/__init__.py``.
-#. Copy the contents of ``airflow/config_templates/airflow_local_settings.py``
-   into the ``log_config.py`` file that was just created in the step above.
-#. Customize the following portions of the template:
-
- .. code-block:: bash
-
- # Add this variable to the top of the file. Note the trailing slash.
- GCS_LOG_FOLDER = 'gs://<bucket where logs should be persisted>/'
-
- # Rename DEFAULT_LOGGING_CONFIG to LOGGING_CONFIG
- LOGGING_CONFIG = ...
-
- # Add a GCSTaskHandler to the 'handlers' block of the LOGGING_CONFIG variable
- 'gcs.task': {
- 'class': 'airflow.utils.log.gcs_task_handler.GCSTaskHandler',
- 'formatter': 'airflow.task',
- 'base_log_folder': os.path.expanduser(BASE_LOG_FOLDER),
- 'gcs_log_folder': GCS_LOG_FOLDER,
- 'filename_template': FILENAME_TEMPLATE,
- },
-
- # Update the airflow.task and airflow.task_runner blocks to be 'gcs.task' instead of 'file.task'.
- 'loggers': {
- 'airflow.task': {
- 'handlers': ['gcs.task'],
- ...
- },
- 'airflow.task_runner': {
- 'handlers': ['gcs.task'],
- ...
- },
- 'airflow': {
- 'handlers': ['console'],
- ...
- },
- }
-
-#. Make sure a Google Cloud Platform connection hook has been defined in
-   Airflow. The hook should have read and write access to the Google Cloud
-   Storage bucket defined above in ``GCS_LOG_FOLDER``.
-
-#. Update ``$AIRFLOW_HOME/airflow.cfg`` to contain:
+To enable this feature, ``airflow.cfg`` must be configured as in this
+example:
- .. code-block:: bash
+.. code-block:: bash
- task_log_reader = gcs.task
- logging_config_class = log_config.LOGGING_CONFIG
- remote_log_conn_id = <name of the Google cloud platform hook>
+ [core]
+ # Airflow can store logs remotely in AWS S3. Users must supply a remote
+ # location URL (starting with either 's3://...') and an Airflow connection
+ # id that provides access to the storage location.
+ remote_logging_enabled = True
Review comment:
This variable should be `remote_logging`
https://github.com/apache/incubator-airflow/blob/53b89b98371c7bb993b242c341d3941e9ce09f9a/airflow/config_templates/airflow_local_settings.py#L173
Also, the comment above this line should probably be changed to reference
GCS (and `gs://`) instead of S3.
Other than that, this seems to match my working setup with Airflow GCS logging.
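
Folding the suggestions above into the added snippet, the ``airflow.cfg`` fragment would presumably end up looking something like this. The bucket path and connection id below are placeholders, and the ``remote_base_log_folder`` / ``remote_log_conn_id`` option names are assumptions based on the linked ``airflow_local_settings.py``, not part of the quoted diff:

```ini
[core]
# Airflow can store logs remotely in Google Cloud Storage. Users must supply
# a remote location URL (starting with 'gs://') and an Airflow connection
# id that provides access to the storage location.
remote_logging = True

# Placeholder values; substitute your own bucket and GCP connection id.
remote_base_log_folder = gs://my-bucket/path/to/logs
remote_log_conn_id = my_gcp_connection
```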
----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at:
[email protected]
With regards,
Apache Git Services