[
https://issues.apache.org/jira/browse/FLUME-3219?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16429014#comment-16429014
]
John P. Kiffmeyer edited comment on FLUME-3219 at 4/6/18 9:27 PM:
--
I'm seeing this too. It means a plain old logrotate(8) setup on the log
directory TailDir is pointed at will cause lots of reprocessing.
Specifically, a logrotate config like this one would cause TailDir to reprocess
a file every time the _n_ in "thing.log.n" gets bumped, so each rotation
produces duplicates.
{code:none}
/var/log/thing/thing.log {
    # Rotate a file when it gets bigger than 25MiB
    maxsize 26214400
    # Keep at most 40 files
    rotate 40
    ...
}
{code}
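To make the renaming concrete, here is a rough Python sketch (not Flume code) that mimics a rename-based rotation, assuming logrotate is not using copytruncate and the filesystem is POSIX-like, so a rename keeps the inode. The `rotate`, `snapshot` and `demo` helpers and the temporary directory are made up for illustration.

```python
import os
import tempfile


def rotate(directory, base="thing.log", keep=40):
    """Shift base.n -> base.(n+1) and base -> base.1, like a rename-based rotation."""
    # Drop the oldest file if present, then shift from the highest suffix down.
    oldest = os.path.join(directory, f"{base}.{keep}")
    if os.path.exists(oldest):
        os.remove(oldest)
    for n in range(keep - 1, 0, -1):
        src = os.path.join(directory, f"{base}.{n}")
        if os.path.exists(src):
            os.rename(src, os.path.join(directory, f"{base}.{n + 1}"))
    live = os.path.join(directory, base)
    if os.path.exists(live):
        os.rename(live, os.path.join(directory, f"{base}.1"))
    # Recreate an empty live file for the application to keep writing to.
    open(live, "w").close()


def snapshot(directory):
    """Map inode -> file name for every file in the directory."""
    return {os.stat(os.path.join(directory, name)).st_ino: name
            for name in os.listdir(directory)}


def demo():
    """Rotate three times and record the name the first file's inode carries each time."""
    with tempfile.TemporaryDirectory() as d:
        with open(os.path.join(d, "thing.log"), "w") as f:
            f.write("first batch of log lines\n")
        first_inode = os.stat(os.path.join(d, "thing.log")).st_ino
        names = []
        for _ in range(3):
            rotate(d)
            names.append(snapshot(d)[first_inode])
        return names
```

Calling `demo()` returns the successive names of one and the same inode. A reader that treats a changed path as a new file would read that content again after every rotation.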
was (Author: jpk):
I'm seeing this too. This means a plain old logrotate(8) setup on the log
directory TailDir is pointed at will cause massive duplication.
> Taildir source: if file is renamed, it is consumed again
>
>
> Key: FLUME-3219
> URL: https://issues.apache.org/jira/browse/FLUME-3219
> Project: Flume
> Issue Type: Improvement
> Components: Sinks+Sources
> Affects Versions: 1.8.0
> Reporter: Daniel Lanza García
> Priority: Major
>
> Current behavior of Taildir is such that if a file is renamed (e.g. by log
> rotation), it is consumed again.
> https://github.com/apache/flume/blob/d1f24f56ce9714bb3e1edc671da290c75a17dead/flume-ng-sources/flume-taildir-source/src/main/java/org/apache/flume/source/taildir/ReliableTaildirEventReader.java#L247
> Would it not be better to follow the inode and, if that inode has already
> been consumed, not consume it again? With the current implementation, once a
> file is rotated, you get duplicates if the path includes previous days' data
> (which you want if the agent fails and needs to consume data from previous
> days).
>
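The inode-following idea proposed above could look roughly like the Python sketch below. It is only an illustration under stated assumptions, not Flume's implementation or API: `InodeTracker`, `read_new` and `demo` are hypothetical names, offsets live in memory only, and the sketch ignores truncation and the fact that an inode number can be reused after a file is deleted.

```python
import os
import tempfile


class InodeTracker:
    """Toy offset store keyed by inode instead of path (hypothetical, not Flume's API)."""

    def __init__(self):
        self.offsets = {}  # inode -> number of bytes already consumed

    def read_new(self, path):
        """Return only the bytes of `path` not yet consumed for its inode."""
        with open(path, "rb") as f:
            inode = os.fstat(f.fileno()).st_ino
            f.seek(self.offsets.get(inode, 0))
            data = f.read()
            self.offsets[inode] = f.tell()
        return data


def demo():
    tracker = InodeTracker()
    with tempfile.TemporaryDirectory() as d:
        live = os.path.join(d, "thing.log")
        rotated = os.path.join(d, "thing.log.1")
        with open(live, "wb") as f:
            f.write(b"line 1\n")
        first = tracker.read_new(live)     # first read of this inode
        os.rename(live, rotated)           # rotation: same inode, new path
        again = tracker.read_new(rotated)  # same inode, nothing appended since
        with open(rotated, "ab") as f:
            f.write(b"line 2\n")
        tail = tracker.read_new(rotated)   # only what was appended after the rename
    return first, again, tail
```

Because the offset is looked up by inode, the rename alone does not trigger a second read. A real source would also need to persist the offsets and decide what to do when a file shrinks or an inode number comes back for a different file.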