Sam Bock created AIRFLOW-3853:
---------------------------------

             Summary: Duplicate Logs appearing in S3
                 Key: AIRFLOW-3853
                 URL: https://issues.apache.org/jira/browse/AIRFLOW-3853
             Project: Apache Airflow
          Issue Type: Bug
          Components: logging
    Affects Versions: 1.10.2
            Reporter: Sam Bock
            Assignee: Sam Bock


We've recently started to see duplicate logs in S3. After digging into it, we 
discovered that this was due to our use of the new `reschedule` mode on our 
sensors. Because the same `try_number` is used when a task reschedules, the 
local log file frequently contains results from previous attempts. 
Additionally, because the `s3_task_helper.py` always tries to `append` the 
local log file to the remove log file, this can result in massive logs (we 
found one that 400 mb).

To fix this, we'd like to remove the local log after a successful upload.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

Reply via email to