[ 
https://issues.apache.org/jira/browse/AIRFLOW-1667?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16195581#comment-16195581
 ] 

Allison Wang edited comment on AIRFLOW-1667 at 10/7/17 5:24 AM:
----------------------------------------------------------------

I agree that we shouldn't rely on the logging module's close to upload the log, 
since we have no control over when it's called. Instead of calling close, we 
could explicitly invoke a post_task_run method in handlers that handles any 
additional cleanup/operations upon task completion. This change only requires 
modifying a small amount of current code. I am not exactly sure how to 
upload the log to remote storage like S3/GCS periodically during task 
execution, but it's possible to use a log collector (e.g. Filebeat) to ship the 
log to a centralized storage (e.g. ElasticSearch) in real time. 
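
A minimal sketch of the post_task_run idea, assuming Python 3 and a handler 
built on `logging.FileHandler`. The class name `PostRunFileHandler`, the 
`upload_fn` callable, and the `run_task` helper are hypothetical and only 
illustrate the shape of the change; they are not existing Airflow code.

```python
import logging
import os


class PostRunFileHandler(logging.FileHandler):
    """File handler with an explicit hook to call when a task finishes."""

    def __init__(self, filename, upload_fn):
        # delay=True: the local file is only created once a record is emitted.
        super().__init__(filename, delay=True)
        # Hypothetical callable taking the local log path, e.g. an S3/GCS upload.
        self.upload_fn = upload_fn

    def post_task_run(self):
        # Flush buffered records, then upload if anything was written.
        self.flush()
        if os.path.exists(self.baseFilename):
            self.upload_fn(self.baseFilename)


def run_task(task_fn, handler, logger):
    """Run task_fn with the handler attached, then invoke the hook explicitly."""
    logger.addHandler(handler)
    try:
        task_fn()
    finally:
        # Called on success and on failure, independent of logging.shutdown().
        handler.post_task_run()
        logger.removeHandler(handler)
```

The point of the sketch is that the upload is triggered by whoever runs the 
task, not by the logging module deciding to close the handler.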


was (Author: allisonwang):
I agree that we shouldn't rely on the logging module's close to upload the log, 
since we have no control over when it's called. Instead of calling close, we 
could explicitly invoke a post_task_run method that handles any additional 
cleanup/operations upon task completion. This change only requires modifying a 
small amount of current code. I am not exactly sure how to upload the log to 
remote storage like S3/GCS periodically during task execution, but it's 
possible to use a log collector (e.g. Filebeat) to ship the log to a 
centralized storage (e.g. ElasticSearch) in real time. 

> Remote log handlers don't upload logs
> -------------------------------------
>
>                 Key: AIRFLOW-1667
>                 URL: https://issues.apache.org/jira/browse/AIRFLOW-1667
>             Project: Apache Airflow
>          Issue Type: Bug
>          Components: logging
>    Affects Versions: 1.9.0, 1.10.0
>            Reporter: Arthur Vigil
>
> AIRFLOW-1385 revised logging for configurability, but the provided remote log 
> handlers (S3TaskHandler and GCSTaskHandler) only upload on close (flush is 
> left at the default implementation provided by `logging.FileHandler`). A 
> handler will be closed on process exit by `logging.shutdown()`, but depending 
> on the Executor used, worker processes may not regularly shut down and can 
> very likely persist between tasks. This means that during normal execution, log 
> files are never uploaded.
> Need to find a way to flush remote log handlers in a timely manner, but 
> without hitting the target resources unnecessarily.
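
One possible way to meet both constraints in the description above is to 
upload from `flush`, but at most once per interval. The following is only a 
sketch under assumptions: Python 3, a handler derived from 
`logging.FileHandler`, and hypothetical names (`ThrottledUploadHandler`, 
`upload_fn`, `min_interval`, `clock`) that do not come from Airflow.

```python
import logging
import time


class ThrottledUploadHandler(logging.FileHandler):
    """Uploads the local log file on flush, at most once per min_interval."""

    def __init__(self, filename, upload_fn, min_interval=60.0,
                 clock=time.monotonic):
        super().__init__(filename, delay=True)
        self.upload_fn = upload_fn        # hypothetical: upload_fn(local_path)
        self.min_interval = min_interval  # seconds between remote uploads
        self.clock = clock                # injectable for testing
        self._last_upload = clock()
        self._dirty = False               # True if records were written since last upload

    def emit(self, record):
        self._dirty = True
        # StreamHandler.emit writes the record and then calls self.flush().
        super().emit(record)

    def flush(self):
        super().flush()
        self._maybe_upload(force=False)

    def close(self):
        # Flushes and closes the local file, then uploads whatever is pending.
        super().close()
        self._maybe_upload(force=True)

    def _maybe_upload(self, force):
        if not self._dirty:
            return
        now = self.clock()
        if force or now - self._last_upload >= self.min_interval:
            # Note: when reached from emit(), this runs under the handler lock.
            self.upload_fn(self.baseFilename)
            self._last_upload = now
            self._dirty = False
```

With this shape, a long-lived worker process still ships logs roughly every 
`min_interval` seconds while records are being written, and the final upload 
on `close` covers whatever was written after the last periodic upload.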



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
