[ https://issues.apache.org/jira/browse/NIFI-994?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14937018#comment-14937018 ]
Joseph Percivall commented on NIFI-994:
---------------------------------------
Adding an email chain that relates to this processor to the comments:
For a NiFi processor, I think "tail -F" makes more sense. Unlike the
default behavior, which follows an existing file descriptor, "tail -F"
follows by filename (or pattern), so it tracks the current instance of a
file and can handle new files created during the run, log rotations, etc.
I definitely agree that it should take a regex or a fixed filename.
I think the biggest question is granularity. Though tail is normally a
line-oriented operation, in NiFi it should probably be "chunk" oriented,
with each pass creating a new flow file containing whatever new full lines
are available.
Joe Skora
-------------
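The "chunk" idea above could be sketched roughly as follows. This is only an illustration in Python, not the processor's actual code: the function name read_new_full_lines and the byte-offset bookkeeping are assumptions made for the example. Each pass returns the new complete lines and leaves any trailing partial line for the next pass.

```python
def read_new_full_lines(path, offset):
    """Return (chunk, new_offset) for the complete lines found after offset.

    Hypothetical helper for illustration; a trailing line without a newline
    is not returned and will be picked up on a later pass.
    """
    with open(path, "rb") as f:
        f.seek(offset)
        data = f.read()

    last_newline = data.rfind(b"\n")
    if last_newline == -1:
        # No complete line yet: emit nothing and keep the old offset.
        return b"", offset

    chunk = data[: last_newline + 1]
    return chunk, offset + len(chunk)
```

Under this sketch, each non-empty chunk would become the content of one flow file, and the returned offset would be fed into the next pass.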
Joe,
The problem with "tail -F" is that if NiFi is restarted and then we do
essentially "tail -F", we may have missed a lot of data that was written to
the log file while NiFi was down.
The idea behind this Processor is to be able to recover that data, even if it
was written to a log file (or any other sort of file) while NiFi was not
running or while the Processor was not running.
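One way to picture that recovery is sketched below in Python. It is a sketch under stated assumptions, not the processor's design: the JSON state file, the helper names (load_offset, save_offset, resume_tail), and the "file is smaller than the saved offset, so start over" rotation check are all hypothetical simplifications.

```python
import json
import os


def load_offset(state_path):
    """Read the last saved byte offset, or 0 if no usable state exists."""
    try:
        with open(state_path, "r", encoding="utf-8") as f:
            return int(json.load(f)["offset"])
    except (FileNotFoundError, KeyError, TypeError, ValueError):
        return 0


def save_offset(state_path, offset):
    """Persist the offset via a temp file and rename, so a crash cannot
    leave a half-written state file behind."""
    tmp_path = state_path + ".tmp"
    with open(tmp_path, "w", encoding="utf-8") as f:
        json.dump({"offset": offset}, f)
    os.replace(tmp_path, state_path)


def resume_tail(log_path, state_path):
    """Return complete lines written since the saved offset, including
    anything appended while the reader was not running."""
    offset = load_offset(state_path)
    if os.path.getsize(log_path) < offset:
        # Assumption: a shorter file means it was rotated or truncated.
        offset = 0
    with open(log_path, "rb") as f:
        f.seek(offset)
        data = f.read()
    end = data.rfind(b"\n") + 1  # 0 when there is no complete line
    save_offset(state_path, offset + end)
    return data[:end]
```

Because the offset survives a restart, the first call after coming back up returns whatever was appended during the downtime. Data left in an already rotated file is not recovered by this simplified check.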
I agree that it should be "chunk oriented" - likely would need a property that
indicates how long to tail for a single chunk. E.g., tail for 1 second and
create a FlowFile with the content received.
-Mark
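A time-bounded chunk like the "tail for 1 second" example might look like the Python sketch below. The names tail_for, duration_s, and poll_s are hypothetical stand-ins for whatever property the processor would expose, and trimming to the last newline is an added assumption carried over from the "full lines" suggestion earlier in the thread.

```python
import time


def tail_for(path, offset, duration_s=1.0, poll_s=0.1):
    """Collect bytes appended after offset for about duration_s seconds.

    Returns (chunk, new_offset); the chunk ends at the last newline seen,
    so a partial trailing line is left for the next call.
    """
    deadline = time.monotonic() + duration_s
    buf = b""
    with open(path, "rb") as f:
        f.seek(offset)
        while True:
            buf += f.read()  # returns b"" when nothing new was written
            if time.monotonic() >= deadline:
                break
            time.sleep(poll_s)
    end = buf.rfind(b"\n") + 1  # 0 when there is no complete line
    return buf[:end], offset + end
```

A longer duration means fewer, larger flow files at the cost of latency; a shorter one does the opposite.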
> Processor to tail files
> -----------------------
>
> Key: NIFI-994
> URL: https://issues.apache.org/jira/browse/NIFI-994
> Project: Apache NiFi
> Issue Type: New Feature
> Affects Versions: 0.4.0
> Reporter: Joseph Percivall
> Assignee: Joseph Percivall
>
> It's a very common data ingest situation to want to input text into the
> system by "tailing" a file, most commonly log files. Currently we don't have
> an easy way to do this.
> A simple processor to tail a file would benefit many users. There would need
> to be an option to not just tail a file but pick up where the processor left
> off if it is interrupted.
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)