kaxil closed pull request #3952: [AIRFLOW-3159] Update GCS logging docs for latest code
URL: https://github.com/apache/incubator-airflow/pull/3952
 
 
   

This is a PR merged from a forked repository.
As GitHub hides the original diff on merge, it is displayed below for
the sake of provenance:

diff --git a/docs/howto/write-logs.rst b/docs/howto/write-logs.rst
index ac30e84f39..272985a7b1 100644
--- a/docs/howto/write-logs.rst
+++ b/docs/howto/write-logs.rst
@@ -89,54 +89,22 @@ Writing Logs to Google Cloud Storage
 
 Follow the steps below to enable Google Cloud Storage logging.
 
-#. Airflow's logging system requires a custom .py file to be located in the ``PYTHONPATH``, so that it's importable from Airflow. Start by creating a directory to store the config file. ``$AIRFLOW_HOME/config`` is recommended.
-#. Create empty files called ``$AIRFLOW_HOME/config/log_config.py`` and ``$AIRFLOW_HOME/config/__init__.py``.
-#. Copy the contents of ``airflow/config_templates/airflow_local_settings.py`` into the ``log_config.py`` file that was just created in the step above.
-#. Customize the following portions of the template:
-
-    .. code-block:: bash
-
-        # Add this variable to the top of the file. Note the trailing slash.
-        GCS_LOG_FOLDER = 'gs://<bucket where logs should be persisted>/'
-
-        # Rename DEFAULT_LOGGING_CONFIG to LOGGING CONFIG
-        LOGGING_CONFIG = ...
-
-        # Add a GCSTaskHandler to the 'handlers' block of the LOGGING_CONFIG variable
-        'gcs.task': {
-            'class': 'airflow.utils.log.gcs_task_handler.GCSTaskHandler',
-            'formatter': 'airflow.task',
-            'base_log_folder': os.path.expanduser(BASE_LOG_FOLDER),
-            'gcs_log_folder': GCS_LOG_FOLDER,
-            'filename_template': FILENAME_TEMPLATE,
-        },
-
-        # Update the airflow.task and airflow.task_runner blocks to be 'gcs.task' instead of 'file.task'.
-        'loggers': {
-            'airflow.task': {
-                'handlers': ['gcs.task'],
-                ...
-            },
-            'airflow.task_runner': {
-                'handlers': ['gcs.task'],
-                ...
-            },
-            'airflow': {
-                'handlers': ['console'],
-                ...
-            },
-        }
-
-#. Make sure a Google Cloud Platform connection hook has been defined in Airflow. The hook should have read and write access to the Google Cloud Storage bucket defined above in ``GCS_LOG_FOLDER``.
-
-#. Update ``$AIRFLOW_HOME/airflow.cfg`` to contain:
-
-    .. code-block:: bash
+To enable this feature, ``airflow.cfg`` must be configured as in this
+example:
 
-        task_log_reader = gcs.task
-        logging_config_class = log_config.LOGGING_CONFIG
-        remote_log_conn_id = <name of the Google cloud platform hook>
+.. code-block:: bash
 
+    [core]
+    # Airflow can store logs remotely in AWS S3, Google Cloud Storage or Elastic Search.
+    # Users must supply an Airflow connection id that provides access to the storage
+    # location. If remote_logging is set to true, see UPDATING.md for additional
+    # configuration requirements.
+    remote_logging = True
+    remote_base_log_folder = gs://my-bucket/path/to/logs
+    remote_log_conn_id = MyGCSConn
+
+#. Install the ``gcp_api`` package first, like so: ``pip install apache-airflow[gcp_api]``.
+#. Make sure a Google Cloud Platform connection hook has been defined in Airflow. The hook should have read and write access to the Google Cloud Storage bucket defined above in ``remote_base_log_folder``.
 #. Restart the Airflow webserver and scheduler, and trigger (or wait for) a new task execution.
 #. Verify that logs are showing up for newly executed tasks in the bucket you've defined.
 #. Verify that the Google Cloud Storage viewer is working in the UI. Pull up a newly executed task, and verify that you see something like:
@@ -150,10 +118,3 @@ Follow the steps below to enable Google Cloud Storage logging.
         [2017-10-03 21:57:51,306] {base_task_runner.py:98} INFO - Subtask: [2017-10-03 21:57:51,306] {models.py:186} INFO - Filling up the DagBag from /airflow/dags/example_dags/example_bash_operator.py
 
 Note the top line that says it's reading from the remote log file.
-
-Please be aware that if you were persisting logs to Google Cloud Storage
-using the old-style airflow.cfg configuration method, the old logs will no
-longer be visible in the Airflow UI, though they'll still exist in Google
-Cloud Storage. This is a backwards incompatible change. If you are unhappy
-with it, you can change the ``FILENAME_TEMPLATE`` to reflect the old-style
-log filename format.
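
For readers following the updated doc, here is a minimal sketch (not part of
this PR) of one way to create the ``MyGCSConn`` connection that
``remote_log_conn_id`` points at, by writing directly to Airflow's metadata
database. The key-file path and scope are placeholders, and the
``extra__google_cloud_platform__*`` field names assume the Google Cloud
Platform connection form used by Airflow 1.10-era releases:

.. code-block:: python

    # Hypothetical helper: register a "MyGCSConn" connection so the GCS task
    # handler can authenticate. Run once against the same metadata database
    # your Airflow instance uses.
    import json

    from airflow import settings
    from airflow.models import Connection

    conn = Connection(
        conn_id="MyGCSConn",
        conn_type="google_cloud_platform",
        extra=json.dumps({
            # Service-account key with read/write access to the bucket from
            # remote_base_log_folder (placeholder path).
            "extra__google_cloud_platform__key_path": "/path/to/service-account.json",
            "extra__google_cloud_platform__scope": "https://www.googleapis.com/auth/cloud-platform",
        }),
    )

    session = settings.Session()
    session.add(conn)
    session.commit()
    session.close()

The same connection can also be created through the Airflow UI
(Admin -> Connections) instead of a script.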
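
Similarly, a small hedged sketch (also not from the PR) for checking that task
logs are actually landing under ``remote_base_log_folder``. It uses the
``google-cloud-storage`` client, which may need to be installed separately if
the ``gcp_api`` extra does not pull it in; the bucket and prefix are
placeholders matching the example value ``gs://my-bucket/path/to/logs``:

.. code-block:: python

    # Hypothetical check: list objects under the remote log prefix to confirm
    # that newly executed tasks are uploading their logs.
    from google.cloud import storage

    client = storage.Client()  # uses application-default credentials
    bucket = client.bucket("my-bucket")

    for blob in bucket.list_blobs(prefix="path/to/logs/"):
        print(blob.name, blob.size)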


 
