Mark Payne created NIFI-8773:
--------------------------------

             Summary: Allow TailFile to hold off on ingesting lines of text if 
the full message is not available
                 Key: NIFI-8773
                 URL: https://issues.apache.org/jira/browse/NIFI-8773
             Project: Apache NiFi
          Issue Type: New Feature
          Components: Extensions
            Reporter: Mark Payne
            Assignee: Mark Payne


When using TailFile, there are times when multi-line messages are written to a 
file. For example, we may have something like:

{code}
<1> My Message
<2> My Message
<3> My Message
   A continuation of my message
{code}

If TailFile runs at this point, it will ingest these 4 lines of text as a 
FlowFile. The next lines to be written, though, may be something like:

{code}
  Another continuation of my message
  A final continuation
<4> Another Message
<5> Yet another Message
{code}

In that case, we may want to avoid pulling in the lines "<3> My Message" and 
"   A continuation of my message" until the entire message is available.

We should enable this capability by adding a new property that specifies a 
Regular Expression to match against the start of each line. If a line read 
from the file matches that Regex, then we know the previous message is 
complete. Otherwise, the previous message may not be complete and should be 
buffered (up to some configurable limit, in order to avoid exhausting the Java 
heap).
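
The buffering rule described above could be sketched roughly as follows. This is only an illustration of the idea, not the TailFile implementation: the class name MultilineBuffer, the accept and pending methods, and the character-based limit are all hypothetical, and the pattern "<\d+>" is assumed from the example lines above. When the limit is exceeded, this sketch simply releases what it holds; the actual behavior on overflow is left open by this issue.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Pattern;

/** Hypothetical helper (not part of NiFi): groups tailed lines into complete messages. */
class MultilineBuffer {
    private final Pattern startOfMessage;   // assumed value of the proposed property
    private final int maxBufferChars;       // assumed form of the configurable limit
    private final List<String> pendingLines = new ArrayList<>();
    private int pendingChars = 0;

    MultilineBuffer(Pattern startOfMessage, int maxBufferChars) {
        this.startOfMessage = startOfMessage;
        this.maxBufferChars = maxBufferChars;
    }

    /** Feeds one line (without its terminator); returns any messages now known to be complete. */
    List<String> accept(String line) {
        List<String> complete = new ArrayList<>();
        // lookingAt() only matches at the start of the line, as the proposal describes.
        if (startOfMessage.matcher(line).lookingAt() && !pendingLines.isEmpty()) {
            complete.add(drain());          // a new message begins, so the previous one is complete
        }
        pendingLines.add(line);
        pendingChars += line.length();
        // Assumption: on overflow, release the buffered text rather than keep growing the heap.
        if (pendingChars > maxBufferChars) {
            complete.add(drain());
        }
        return complete;
    }

    /** Text still being held back because the message may not be complete. */
    String pending() {
        return String.join("\n", pendingLines);
    }

    private String drain() {
        String message = String.join("\n", pendingLines);
        pendingLines.clear();
        pendingChars = 0;
        return message;
    }
}
```

Fed the first four example lines, this sketch would release messages <1> and <2> and hold back "<3> My Message" together with its continuation until a line starting with "<4>" arrives.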



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
