[
https://issues.apache.org/jira/browse/AIRFLOW-6522?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17022412#comment-17022412
]
ASF subversion and git services commented on AIRFLOW-6522:
----------------------------------------------------------
Commit c7ad2c3f1a5c39999fd46edbbc390b1229e01e64 in airflow's branch
refs/heads/v1-10-test from rconroy293
[ https://gitbox.apache.org/repos/asf?p=airflow.git;h=c7ad2c3 ]
[AIRFLOW-6522] Clear task log file before starting to fix duplication in
S3TaskHandler (#7120)
The same task instance (including try number) can be run on a worker
when using a sensor in "reschedule" mode. Accordingly, this clears the
local log file when re-initializing the logger so that the old log
lines aren't uploaded again when the logger is closed.
(cherry picked from commit 88608caa56bf3621807af860a6a378242220de47)
> Sensors in reschedule mode with S3TaskHandler can cause log duplication
> -----------------------------------------------------------------------
>
> Key: AIRFLOW-6522
> URL: https://issues.apache.org/jira/browse/AIRFLOW-6522
> Project: Apache Airflow
> Issue Type: Bug
> Components: logging
> Affects Versions: 1.10.6
> Reporter: Robert Conroy
> Assignee: Robert Conroy
> Priority: Minor
> Fix For: 1.10.8
>
>
> With sensors using {{reschedule}} mode and {{S3TaskHandler}} for logging, the
> task instance log accumulates duplicate messages. I believe this is
> happening because the contents of the local log file are appended to what's
> already in S3. The local log file may contain log messages that have already
> been uploaded to S3 if the task is sent back to a worker that had already
> processed a poke for that task instance.
--
This message was sent by Atlassian Jira
(v8.3.4#803005)