Mark Payne created NIFI-942:
-------------------------------
Summary: Create RouteText processor
Key: NIFI-942
URL: https://issues.apache.org/jira/browse/NIFI-942
Project: Apache NiFi
Issue Type: New Feature
Components: Extensions
Reporter: Mark Payne
Fix For: 0.4.0
The idea is to route individual lines of a text file to different
relationships. This allows for splitting lines based on some criteria or
filtering out specific lines, and would be a much more convenient alternative
to RouteOnContent for textual data.
A discussion of this took place on the users mailing list
(http://mail-archives.apache.org/mod_mbox/nifi-users/201509.mbox/%3CCAKpk5PxjszdX-NXMMf6Pcet4x7Y5GmrT7_jn9uyzS-h_a9TG3A%40mail.gmail.com%3E).
The way that I could see this working is to have a few different properties:
Routing Strategy:
- Route each line to matching Property Name (default)
- Route matching lines to 'matched' if all match
- Route matching lines to 'matched' if any match
- Route FlowFile to 'matched' if all lines match
- Route FlowFile to 'matched' if any line matches
Match Strategy:
- Starts With
- Ends With
- Contains
- Equals
- Matches Regular Expression
- Contains Regular Expression
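
To make the Match Strategy options concrete, here is a minimal sketch of how each one could be expressed as a per-line test. It is only an illustration of the proposal, not the processor's implementation: the names MATCH_STRATEGIES and line_matches are hypothetical, and I am assuming that "Matches Regular Expression" means the whole line must match while "Contains Regular Expression" means the expression may match anywhere in the line. It also uses Python's re module, whose syntax differs in places from Java regular expressions.

```python
import re

# Hypothetical mapping from the proposed Match Strategy names to per-line
# predicates. Each predicate takes the line and the user-supplied value.
MATCH_STRATEGIES = {
    "Starts With": lambda line, value: line.startswith(value),
    "Ends With": lambda line, value: line.endswith(value),
    "Contains": lambda line, value: value in line,
    "Equals": lambda line, value: line == value,
    # Assumption: the entire line must match the expression.
    "Matches Regular Expression":
        lambda line, value: re.fullmatch(value, line) is not None,
    # Assumption: the expression may match anywhere within the line.
    "Contains Regular Expression":
        lambda line, value: re.search(value, line) is not None,
}


def line_matches(strategy, line, value):
    """Return True if `line` satisfies `value` under the named strategy."""
    return MATCH_STRATEGIES[strategy](line, value)
```

For example, line_matches("Starts With", "<root>", "<") is true, while line_matches("Equals", "<root>", "<") is not.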
And then user-defined properties that indicate what to search for in each line
of text.
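
As a rough sketch of how the five Routing Strategy options might differ, the function below routes a list of lines given a dict of user-defined properties. Several things here are my assumptions rather than part of the proposal: the function name route_lines, the short strategy keys ("each", "line_all", and so on), the 'unmatched' relationship for lines or FlowFiles that do not qualify, the reading of "all match" as "the line matches every property", and the choice to send a line to every property it matches. Starts With matching is used by default just to keep the example small.

```python
def route_lines(lines, properties, routing, matches=str.startswith):
    """Return {relationship name: [lines]} for the chosen routing strategy.

    `properties` maps a user-defined property name to its search value.
    `matches(line, value)` is the per-line test (Starts With by default).
    """
    routed = {}

    def send(relationship, items):
        routed.setdefault(relationship, []).extend(items)

    # hits[i] lists the property names that line i satisfies.
    hits = [[name for name, value in properties.items() if matches(line, value)]
            for line in lines]

    if routing == "each":
        # Route each line to the relationship named after each matching property.
        for line, names in zip(lines, hits):
            for relationship in names or ["unmatched"]:
                send(relationship, [line])
    elif routing in ("line_all", "line_any"):
        # Route a line to 'matched' if it matches all (or any) of the properties.
        for line, names in zip(lines, hits):
            if routing == "line_all":
                ok = len(names) == len(properties)
            else:
                ok = bool(names)
            send("matched" if ok else "unmatched", [line])
    elif routing in ("flowfile_all", "flowfile_any"):
        # Route the whole FlowFile (all lines together) based on its lines.
        combine = all if routing == "flowfile_all" else any
        ok = combine(bool(names) for names in hits)
        send("matched" if ok else "unmatched", lines)
    else:
        raise ValueError("unknown routing strategy: " + routing)
    return routed
```

With the per-line strategies the input is split across relationships, whereas with the two FlowFile strategies all lines stay together and only the destination changes.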
So, to find lines that begin with the < character, you would simply add a
property named "Begins with Less Than" and set its value to: <
Then set the Match Strategy to "Starts With" and the Routing Strategy to
"Route each line to matching Property Name".
Then, any line that begins with < will be routed to the "Begins with Less Than"
relationship.
This would be a simple way to pull out any particular lines of interest in a
text file.
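
The "Begins with Less Than" example above could behave roughly like the following sketch. The function name route_starts_with, the sample text, and the 'unmatched' relationship for non-matching lines are assumptions made for illustration only.

```python
def route_starts_with(text, properties):
    """Route each line of `text` to the relationship named after every
    property whose value the line starts with."""
    routed = {}
    for line in text.splitlines():
        targets = [name for name, prefix in properties.items()
                   if line.startswith(prefix)]
        # Assumption: lines that match nothing go to 'unmatched'.
        for relationship in targets or ["unmatched"]:
            routed.setdefault(relationship, []).append(line)
    return routed


sample = "<html>\nplain text\n<body>\n"
result = route_starts_with(sample, {"Begins with Less Than": "<"})
# result["Begins with Less Than"] holds "<html>" and "<body>";
# "plain text" ends up under "unmatched".
```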
I can see this being very useful for processing log files, CSV, etc.