chrisrfu opened a new issue #15954: URL: https://github.com/apache/airflow/issues/15954
**Apache Airflow version**: v2.0.2

**Environment**:
- **Cloud provider or hardware configuration**: local machine using docker-compose
- **OS** (e.g. from /etc/os-release): macOS Catalina 10.15.7
- **Kernel** (e.g. `uname -a`): Darwin CFU.local 19.6.0 Darwin Kernel Version 19.6.0: Tue Jan 12 22:13:05 PST 2021; root:xnu-6153.141.16~1/RELEASE_X86_64 x86_64

**What happened**:

I configured remote logging to an S3 bucket and ran a DAG consisting of 3 tasks. Sometimes only the log of the last task appears in the bucket; sometimes no log appears in the bucket at all.

**What you expected to happen**:

Every time I run the DAG, I expect log files to appear in the S3 bucket, with each task getting its own folder of logs.

**How to reproduce it**:

1. Using a combination of [**_Writing Logs to Amazon S3_** from the v1.10.10 docs](https://airflow.apache.org/docs/apache-airflow/1.10.10/howto/write-logs.html) and the [**_[logging]_** configuration reference from v2.0.2](https://airflow.apache.org/docs/apache-airflow/stable/configurations-ref.html#config-logging-base-log-folder), I have this in `airflow.cfg`:

   ```ini
   [logging]
   base_log_folder = /opt/airflow/logs
   remote_logging = True
   remote_log_conn_id = s3_logs
   remote_base_log_folder = s3://my-bucket/logs
   ```

2. `docker build -t name:tag -f Dockerfile .`
3. `docker-compose --env-file ./config/.env.local up airflow-init`
4. `docker-compose --env-file ./config/.env.local up`
5. In the Airflow UI, go to **Admin** --> **Connections** and create:

   ```
   Conn Id   = s3_logs
   Conn Type = S3
   Extra     = {"aws_access_key_id": "<key_here>", "aws_secret_access_key": "<secret_here>"}
   ```

6. Run the DAG.

Log of the first task:

```
*** Falling back to local log
*** Log file does not exist: /opt/airflow/logs/update_org_2_accounts_deploy/update_accounts_activity/2021-05-19T20:00:39.231872+00:00/1.log
*** Fetching from: http://4371804b14d3:8793/log/update_org_2_accounts_deploy/update_accounts_activity/2021-05-19T20:00:39.231872+00:00/1.log
*** Failed to fetch log file from worker. HTTPConnectionPool(host='4371804b14d3', port=8793): Max retries exceeded with url: /log/update_org_2_accounts_deploy/update_accounts_activity/2021-05-19T20:00:39.231872+00:00/1.log (Caused by NewConnectionError('<urllib3.connection.HTTPConnection object at 0x7ff65926a160>: Failed to establish a new connection: [Errno -2] Name or service not known',))
```

Log of the second task:

```
*** Falling back to local log
*** Log file does not exist: /opt/airflow/logs/update_org_2_accounts_deploy/update_vintage/2021-05-19T20:00:39.231872+00:00/1.log
*** Fetching from: http://4371804b14d3:8793/log/update_org_2_accounts_deploy/update_vintage/2021-05-19T20:00:39.231872+00:00/1.log
*** Failed to fetch log file from worker. HTTPConnectionPool(host='4371804b14d3', port=8793): Max retries exceeded with url: /log/update_org_2_accounts_deploy/update_vintage/2021-05-19T20:00:39.231872+00:00/1.log (Caused by NewConnectionError('<urllib3.connection.HTTPConnection object at 0x7ff65925e438>: Failed to establish a new connection: [Errno -2] Name or service not known',))
```

Log of the third task:

```
*** Reading remote log from s3://convene-dw-industry/logs/update_org_2_accounts_deploy/update_industry/2021-05-19T20:00:39.231872+00:00/1.log.
[2021-05-19 20:00:54,158] {taskinstance.py:877} INFO - Dependencies all met for <TaskInstance: update_org_2_accounts_deploy.update_industry 2021-05-19T20:00:39.231872+00:00 [queued]>
[2021-05-19 20:00:54,711] {taskinstance.py:877} INFO - Dependencies all met for <TaskInstance: update_org_2_accounts_deploy.update_industry 2021-05-19T20:00:39.231872+00:00 [queued]>
[2021-05-19 20:00:54,712] {taskinstance.py:1068} INFO -
```

The rest of the third task's log is good, and it does exist in S3.

**Anything else we need to know**:

In the Airflow UI I also tried:

```
Conn Id   = s3_logs
Conn Type = S3
Login     = <key_here>
Password  = <secret_here>
```

and the same error occurred.

I do have a Redshift database set up for the Airflow metadata, so the connection from Step 5 persists; on subsequent runs I skip Step 5 and go straight to Step 6.
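As a hedged aside for anyone reproducing this: under docker-compose the edited `airflow.cfg` has to be visible to *every* container (webserver, scheduler, and workers), and it is easy for one service to miss it. Airflow also reads `AIRFLOW__<SECTION>__<KEY>` environment variables as config overrides, so one way to rule out a config-propagation problem (this is a diagnostic sketch, not a confirmed cause of this issue) is to put the same `[logging]` settings from Step 1 into the env file that docker-compose passes to all services:

```shell
# Airflow config overrides follow the AIRFLOW__<SECTION>__<KEY> naming convention.
# Placing these in the docker-compose env file (e.g. ./config/.env.local) applies
# them uniformly to every container, so workers and webserver agree on logging.
export AIRFLOW__LOGGING__BASE_LOG_FOLDER=/opt/airflow/logs
export AIRFLOW__LOGGING__REMOTE_LOGGING=True
export AIRFLOW__LOGGING__REMOTE_LOG_CONN_ID=s3_logs
export AIRFLOW__LOGGING__REMOTE_BASE_LOG_FOLDER=s3://my-bucket/logs
```

The values mirror the `airflow.cfg` snippet from the report; environment variables take precedence over the config file, which makes them a convenient way to verify that all containers actually run with remote logging enabled.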
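Similarly, the `s3_logs` connection itself can be supplied to every container through an `AIRFLOW_CONN_<CONN_ID>` environment variable holding a connection URI, instead of depending on the row created in the UI. A minimal Python sketch of building such a URI (the credentials are placeholders; the exact URI shape Airflow accepts for S3 connections can vary by version, so treat this as an illustration):

```python
import os
from urllib.parse import quote

# Placeholder credentials -- substitute real values, and never commit them.
access_key = "AKIAEXAMPLE"
secret_key = "abc/def+ghi"  # secrets often contain '/' or '+', so percent-encode them

# Airflow resolves connections from AIRFLOW_CONN_<CONN_ID> environment variables
# given as URIs; for an S3 connection the credentials can ride in the query string.
conn_uri = "s3://?aws_access_key_id={}&aws_secret_access_key={}".format(
    quote(access_key, safe=""), quote(secret_key, safe="")
)
os.environ["AIRFLOW_CONN_S3_LOGS"] = conn_uri
print(conn_uri)
```

Percent-encoding matters because an unescaped `/` or `+` in the secret silently corrupts the URI, which would produce exactly the kind of intermittent "falls back to local log" behavior seen above when authentication then fails.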
-- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: [email protected]
