[ https://issues.apache.org/jira/browse/AIRFLOW-1667?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16195581#comment-16195581 ]
Allison Wang edited comment on AIRFLOW-1667 at 10/7/17 5:24 AM:
----------------------------------------------------------------

I agree that we shouldn't rely on the logging module's close to upload the log, since we have no control over when it's called. Instead of calling close, we could explicitly invoke a post_task_run method in handlers that handles any additional cleanup or other operations upon task completion. This change only requires modifying a small amount of current code. I am not exactly sure how to upload the log to remote storage like S3/GCS periodically during task execution, but it's possible to use a log collector (e.g. Filebeat) to ship the log to centralized storage (e.g. Elasticsearch) in real time.

was (Author: allisonwang):
I agree that we shouldn't rely on the logging module's close to upload the log, since we have no control over when it's called. Instead of calling close, we could explicitly invoke a post_task_run method that handles any additional cleanup or other operations upon task completion. This change only requires modifying a small amount of current code. I am not exactly sure how to upload the log to remote storage like S3/GCS periodically during task execution, but it's possible to use a log collector (e.g. Filebeat) to ship the log to centralized storage (e.g. Elasticsearch) in real time.

> Remote log handlers don't upload logs
> -------------------------------------
>
>                 Key: AIRFLOW-1667
>                 URL: https://issues.apache.org/jira/browse/AIRFLOW-1667
>             Project: Apache Airflow
>          Issue Type: Bug
>          Components: logging
>    Affects Versions: 1.9.0, 1.10.0
>            Reporter: Arthur Vigil
>
> AIRFLOW-1385 revised logging for configurability, but the provided remote log
> handlers (S3TaskHandler and GCSTaskHandler) only upload on close (flush is
> left at the default implementation provided by `logging.FileHandler`).
> A handler will be closed on process exit by `logging.shutdown()`, but depending
> on the Executor used, worker processes may not shut down regularly and can
> very likely persist between tasks. This means that during normal execution log
> files are never uploaded.
> Need to find a way to flush remote log handlers in a timely manner, but
> without hitting the target resources unnecessarily.
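
For illustration, here is a minimal sketch of the post_task_run idea from the comment above. It assumes a plain `logging.FileHandler` subclass. `RemoteUploadHandler`, `run_task` and the `upload` callable are hypothetical names, not Airflow APIs, and `upload` merely stands in for an S3/GCS client call. The point is only that the upload happens when the task finishes, not when `close()` or `logging.shutdown()` eventually runs.

```python
import logging
import os
import tempfile


class RemoteUploadHandler(logging.FileHandler):
    """Hypothetical handler: writes to a local file, uploads only when asked."""

    def __init__(self, filename, upload):
        super().__init__(filename)
        # upload(local_path, contents) is an assumed stand-in for a
        # remote storage client (S3, GCS, ...).
        self._upload = upload

    def post_task_run(self):
        # Hypothetical hook: invoked explicitly when a task completes,
        # so the upload does not depend on close() being called.
        self.flush()
        with open(self.baseFilename) as f:
            self._upload(self.baseFilename, f.read())


def run_task(task, handlers):
    """Run `task`, then call post_task_run on every handler that defines it."""
    try:
        task()
    finally:
        for h in handlers:
            hook = getattr(h, "post_task_run", None)
            if callable(hook):
                hook()


# Usage: a dict plays the role of remote storage.
uploaded = {}
log_path = os.path.join(tempfile.mkdtemp(), "task.log")
handler = RemoteUploadHandler(log_path, upload=uploaded.__setitem__)

logger = logging.getLogger("post_task_run_sketch")
logger.setLevel(logging.INFO)
logger.propagate = False
logger.addHandler(handler)

run_task(lambda: logger.info("task finished"), logger.handlers)
```

Handlers without the hook are skipped, so existing handlers would keep working unchanged. Periodic upload during a long-running task is not covered by this sketch.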