Mark Payne created NIFI-8773:
--------------------------------
Summary: Allow TailFile to hold off on ingesting lines of text if
the full message is not available
Key: NIFI-8773
URL: https://issues.apache.org/jira/browse/NIFI-8773
Project: Apache NiFi
Issue Type: New Feature
Components: Extensions
Reporter: Mark Payne
Assignee: Mark Payne
When using TailFile, there are times when multi-line messages are written to a
file. For example, we may have something like:
{code}
<1> My Message
<2> My Message
<3> My Message
A continuation of my message
{code}
If TailFile runs at this point, it will ingest these 4 lines of text as a
FlowFile. The next lines to be written, though, may be something like:
{code}
Another continuation of my message
A final continuation
<4> Another Message
<5> Yet another Message
{code}
In that case, we may want to avoid pulling in the lines "<3> My Message" and
"A continuation of my message" until the entire message is available.
We should enable this capability by allowing for a new property that specifies
a Regular Expression to run against the start of a line. If we read a line from
the file and it matches that Regex, then we know the previous message is
complete. Otherwise, the previous message may not be complete and should be
buffered (up to some configurable limit, in order to avoid exhausting the Java
heap).
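As a rough illustration of the buffering logic described above, a minimal sketch in Java might look like the following. This is not the TailFile implementation: the class name MultiLineBuffer, the method names, and the character-based limit are all hypothetical, and flushing the buffer once the limit is exceeded is an assumption, since the behavior at the limit is not specified here.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Pattern;

// Hypothetical sketch only: not the actual TailFile implementation.
class MultiLineBuffer {
    private final Pattern startOfMessage;   // regex applied at the start of each line
    private final int maxBufferedChars;     // assumed limit, counted in chars for simplicity
    private final StringBuilder pending = new StringBuilder();

    MultiLineBuffer(String startOfMessageRegex, int maxBufferedChars) {
        this.startOfMessage = Pattern.compile(startOfMessageRegex);
        this.maxBufferedChars = maxBufferedChars;
    }

    // Feeds one line and returns any text that is now safe to ingest.
    List<String> accept(String line) {
        List<String> ready = new ArrayList<>();
        // A line that starts a new message means the buffered message is complete.
        if (startOfMessage.matcher(line).lookingAt() && pending.length() > 0) {
            ready.add(pending.toString());
            pending.setLength(0);
        }
        pending.append(line).append('\n');
        // Assumption: once the limit is exceeded, flush rather than keep growing.
        if (pending.length() > maxBufferedChars) {
            ready.add(pending.toString());
            pending.setLength(0);
        }
        return ready;
    }

    // Text that is still being held back because the message may be incomplete.
    String heldBack() {
        return pending.toString();
    }

    public static void main(String[] args) {
        MultiLineBuffer buffer = new MultiLineBuffer("<\\d+>", 1024);
        String[] lines = {
            "<1> My Message",
            "<2> My Message",
            "<3> My Message",
            "A continuation of my message"
        };
        for (String line : lines) {
            for (String message : buffer.accept(line)) {
                System.out.print("READY: " + message);
            }
        }
        System.out.print("HELD BACK: " + buffer.heldBack());
    }
}
```

With the first example above and a pattern such as <\d+>, messages 1 and 2 become available as soon as the following line arrives, while "<3> My Message" and its continuation stay buffered until a line such as "<4> Another Message" is read.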