[ https://issues.apache.org/jira/browse/AIRFLOW-2970?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16774953#comment-16774953 ]
Ash Berlin-Taylor commented on AIRFLOW-2970:
--------------------------------------------
PRs welcome! An easy option that doesn't need a code change is to configure
Airflow to write task logs to S3/GCS:
https://airflow.apache.org/howto/write-logs.html
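
For illustration, a minimal sketch of the settings involved, assuming the
option names used by Airflow 1.10 (remote_logging, remote_base_log_folder and
remote_log_conn_id in the [core] section, expressed here as AIRFLOW__CORE__*
environment variables; later releases moved them to a [logging] section). The
bucket name and connection id are hypothetical placeholders, and
remote_logging_env is a helper made up for this example, not part of Airflow.

```python
def remote_logging_env(base_log_folder, conn_id):
    """Build the environment variables that enable remote task logging.

    Assumes Airflow 1.10-style option names in the [core] section.
    """
    # Remote task logs go to an object store URL, not a local path.
    if not base_log_folder.startswith(("s3://", "gs://")):
        raise ValueError("expected an s3:// or gs:// URL")
    return {
        "AIRFLOW__CORE__REMOTE_LOGGING": "True",
        "AIRFLOW__CORE__REMOTE_BASE_LOG_FOLDER": base_log_folder,
        "AIRFLOW__CORE__REMOTE_LOG_CONN_ID": conn_id,
    }


if __name__ == "__main__":
    # Hypothetical bucket and connection id, for illustration only.
    env = remote_logging_env("s3://my-airflow-logs/prod", "my_s3_conn")
    for name, value in sorted(env.items()):
        print("{}={}".format(name, value))
```

With the Kubernetes executor, these settings have to be visible to the worker
pods as well as to the scheduler and webserver, since the worker is the
process that writes the task log.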
> Kubernetes logging is broken
> ----------------------------
>
> Key: AIRFLOW-2970
> URL: https://issues.apache.org/jira/browse/AIRFLOW-2970
> Project: Apache Airflow
> Issue Type: Bug
> Reporter: Jon Davies
> Assignee: Daniel Imberman
> Priority: Major
>
> I'm using Airflow with the Kubernetes executor and the pod operator. My DAGs
> are configured with get_logs=True and are all set to log to stdout, and I
> can see all the logs in kubectl logs.
> I can see that the scheduler logs things to:
> $AIRFLOW_HOME/logs/scheduler/2018-08-28/*
> However, this just consists of:
> {code:java}
> [2018-08-28 13:03:27,695] {jobs.py:385} INFO - Started process (PID=16994) to work on /home/airflow/dags/dag.py
> [2018-08-28 13:03:27,697] {jobs.py:1782} INFO - Processing file /home/airflow/dags/dag.py for tasks to queue
> [2018-08-28 13:03:27,697] {logging_mixin.py:95} INFO - [2018-08-28 13:03:27,697] {models.py:258} INFO - Filling up the DagBag from /home/airflow/dags/dag.py
> {code}
> If I quickly exec into the executor pod that the scheduler spins up, I can
> see that things are properly logged to:
> {code:java}
> /home/airflow/logs/dag$ tail -f dag-downloader/2018-08-28T13\:05\:07.704072+00\:00/1.log
> [2018-08-28 13:05:24,399] {logging_mixin.py:95} INFO - [2018-08-28 13:05:24,399] {pod_launcher.py:112} INFO - Event: dag-downloader-015ca48c had an event of type Pending
> ...
> [2018-08-28 13:05:37,193] {logging_mixin.py:95} INFO - [2018-08-28 13:05:37,193] {pod_launcher.py:95} INFO - b'INFO:botocore.vendored.requests.packages.urllib3.connectionpool:Starting new HTTPS connection (7): blah-blah.s3.eu-west-1.amazonaws.com\n'
> ...
> ...all other log lines from pod...
> {code}
> However, this executor pod only exists for the lifetime of the task pod, so
> the logs are lost almost immediately after the task runs. Nothing ships the
> logs back to the scheduler or the web UI.
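
To make the workaround concrete: with remote logging enabled, the per-task
log file is uploaded to object storage, so it outlives the executor pod. The
sketch below shows where such a file would be expected to land, assuming the
remote path mirrors the local layout visible in the report
(dag_id/task_id/execution timestamp/try_number.log). The bucket is a
hypothetical placeholder, and remote_log_location is a helper written for
this example, not an Airflow function.

```python
import posixpath


def remote_log_location(remote_base, dag_id, task_id, execution_ts, try_number):
    """Join a remote base folder with the per-task relative log path.

    Assumes the layout dag_id/task_id/execution_ts/try_number.log, as seen
    in the local path in the report.
    """
    relative = posixpath.join(
        dag_id, task_id, execution_ts, "{}.log".format(try_number)
    )
    # Strip a trailing slash so the join does not produce a double slash.
    return posixpath.join(remote_base.rstrip("/"), relative)


if __name__ == "__main__":
    # DAG id, task id and timestamp taken from the report; bucket is made up.
    print(remote_log_location(
        "s3://my-airflow-logs/prod",
        "dag",
        "dag-downloader",
        "2018-08-28T13:05:07.704072+00:00",
        1,
    ))
```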