[
https://issues.apache.org/jira/browse/NIFI-8344?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17308913#comment-17308913
]
Mark Payne commented on NIFI-8344:
----------------------------------
Re-opening issue because there still exists a corner case that is not addressed.
After a file is rolled over, if data is written to it, it's possible that we
can consume data up until some point that is not a new-line character, and then
emit that as a FlowFile. As a result, we can have a situation where we consume
only part of a line from the file being tailed.
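
To illustrate the corner case, here is a minimal sketch of one way to avoid
emitting a partial line: only emit bytes up to and including the last new-line
character, and leave the remainder to be consumed on a later pass. The class
and method names (CompleteLines, emittableLength) are hypothetical and are not
taken from the TailFile code; this is not necessarily how the fix is
implemented.

```java
// Hypothetical helper for illustration only; not the actual TailFile code.
final class CompleteLines {
    private CompleteLines() {}

    /**
     * Returns how many of the first `length` bytes in `buffer` can be emitted
     * without splitting a line: everything up to and including the last '\n'.
     * Returns 0 if the data read so far contains no new-line character.
     */
    static int emittableLength(byte[] buffer, int length) {
        // Scan backwards so we stop at the last new-line in the data read.
        for (int i = length - 1; i >= 0; i--) {
            if (buffer[i] == '\n') {
                return i + 1;
            }
        }
        return 0;
    }
}
```

For example, if the bytes read so far are "line one\nline tw", the method
returns 9, so only "line one\n" would be emitted and "line tw" would be held
back until the rest of that line has been written.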
> Allow TailFile to continue tailing a file for some time after it has been
> rolled over
> -------------------------------------------------------------------------------------
>
> Key: NIFI-8344
> URL: https://issues.apache.org/jira/browse/NIFI-8344
> Project: Apache NiFi
> Issue Type: Improvement
> Components: Extensions
> Reporter: Mark Payne
> Assignee: Mark Payne
> Priority: Major
> Fix For: 1.14.0
>
> Time Spent: 40m
> Remaining Estimate: 0h
>
> TailFile makes the assumption that once a file has been rolled over, it will
> never be appended to. If the file's Last Modified timestamp changes, the
> processor assumes that it's a new file and imports the entire contents of the
> file again.
> However, one practice that I've encountered is that users have a syslog
> server that rotates periodically. To rotate, they rename the existing file,
> and then restart the server. When that happens, the server will flush out any
> data that it has buffered to the file that was just rolled over, and then
> begin writing to the new file.
> This results in the TailFile processor ingesting the entire file that has
> been rolled over. Because we can't keep state about every file that is rolled
> over, we should introduce a property that allows the user to indicate that
> upon rollover they want to continue tailing that rolled over file until it is
> no longer being written to, and then begin tailing the new file.