[
https://issues.apache.org/jira/browse/NIFI-942?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14902942#comment-14902942
]
Joseph Percivall commented on NIFI-942:
---------------------------------------
Yeah, you're right about not routing a FlowFile for each, but I need to add one
more option to that separation strategy. Ultimately there are 9 different
options we are thinking of supporting, broken out by Route_Strategy:
Properties ->
  1. Route line to property name if line matches property
  2. Route FlowFile to property name if all lines match property
  3. Route FlowFile to property name if any line matches property
AllMatch ->
  1. Route line to "matched" if line matches all properties
  2. Route FlowFile to "matched" if all lines match all properties
  3. Route FlowFile to "matched" if any line matches all properties
AnyMatch ->
  1. Route line to "matched" if line matches any property
  2. Route FlowFile to "matched" if all lines match any property
  3. Route FlowFile to "matched" if any line matches any property
So the options for separation will be (corresponding to the numbered options
above):
1. Route each line individually
2. Route FlowFile as a whole if all lines match the match strategy
3. Route FlowFile as a whole if any line matches the match strategy.
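
To make the 3 x 3 combination concrete, here is a rough sketch of how the
route strategy and the separation option could compose. It is plain Python
rather than NiFi's Java processor API, and the names route,
build_relationships, the strategy strings, and the "unmatched" fallback
relationship are all assumptions made for illustration, not part of the
proposal.

```python
from typing import Callable, Dict, List

Predicate = Callable[[str], bool]


def build_relationships(properties: Dict[str, Predicate],
                        route_strategy: str) -> Dict[str, Predicate]:
    """Map each output relationship to a per-line predicate."""
    if route_strategy == "properties":
        # One relationship per user-defined property.
        return dict(properties)
    checks = list(properties.values())
    if route_strategy == "all_match":
        return {"matched": lambda line: all(c(line) for c in checks)}
    if route_strategy == "any_match":
        return {"matched": lambda line: any(c(line) for c in checks)}
    raise ValueError(f"unknown route strategy: {route_strategy}")


def route(lines: List[str], properties: Dict[str, Predicate],
          route_strategy: str, separation: str) -> Dict[str, List[str]]:
    """Return relationship name -> lines sent there.

    separation is "line" (option 1), "all_lines" (option 2),
    or "any_line" (option 3).
    """
    relationships = build_relationships(properties, route_strategy)
    routed: Dict[str, List[str]] = {}

    if separation == "line":
        # Option 1: every line is routed on its own.
        for line in lines:
            hits = [n for n, pred in relationships.items() if pred(line)]
            # "unmatched" is a hypothetical fallback relationship.
            for name in hits or ["unmatched"]:
                routed.setdefault(name, []).append(line)
        return routed

    if separation == "all_lines":
        combine = all   # Option 2: every line must satisfy the predicate.
    elif separation == "any_line":
        combine = any   # Option 3: one satisfying line is enough.
    else:
        raise ValueError(f"unknown separation: {separation}")

    # Options 2 and 3: the FlowFile (all of its lines) moves as a whole.
    hits = [n for n, pred in relationships.items()
            if combine(pred(line) for line in lines)]
    for name in hits or ["unmatched"]:
        routed[name] = list(lines)
    return routed
```

With two properties, one checking for a leading < and one for a trailing >,
route(lines, props, "all_match", "line") sends only the lines that satisfy
both checks to "matched", while route(lines, props, "any_match", "any_line")
sends the whole FlowFile to "matched" as soon as one line satisfies either.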
> Create RouteText processor
> --------------------------
>
> Key: NIFI-942
> URL: https://issues.apache.org/jira/browse/NIFI-942
> Project: Apache NiFi
> Issue Type: New Feature
> Components: Extensions
> Reporter: Mark Payne
> Assignee: Joseph Percivall
> Fix For: 0.4.0
>
>
> The idea is to route individual lines of a text file to different
> relationships. This allows for splitting lines based on some criteria or
> filtering out specific lines, and would be a much more convenient alternative
> to RouteOnContent for textual data.
> A discussion about this took place on the users mailing list
> (http://mail-archives.apache.org/mod_mbox/nifi-users/201509.mbox/%3CCAKpk5PxjszdX-NXMMf6Pcet4x7Y5GmrT7_jn9uyzS-h_a9TG3A%40mail.gmail.com%3E)
> The way that I could see this working is to have a few different properties:
> Routing Strategy:
> - Route each line to matching Property Name (default)
> - Route matching lines to 'matched' if all match
> - Route matching lines to 'matched' if any match
> - Route FlowFile to 'matched' if all lines match
> - Route FlowFile to 'matched' if any line matches
> Match Strategy:
> - Starts With
> - Ends With
> - Contains
> - Equals
> - Matches Regular Expression
> - Contains Regular Expression
> And then user-defined properties that indicate what to search each line of
> text for.
> So to find lines that begin with the < character, you would simply add a
> property named "Begins with Less Than" and set its value to: <
> Then set the Match Strategy to "Starts With" and the Routing Strategy to
> "Route each line to matching Property Name".
> Then, any line that begins with < will be routed to the "Begins with Less Than"
> relationship.
> This would be a simple way to pull out any particular lines of interest in a
> text file.
> I can see this being very useful for processing log files, CSV, etc.
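
The six Match Strategy options in the quoted description could each be read
as a per-line predicate built from the property value. The sketch below is
Python, not the processor's actual Java code, and make_matcher is a
hypothetical helper. It assumes "Matches Regular Expression" means the whole
line must match and "Contains Regular Expression" means a match anywhere in
the line; the description does not spell that out.

```python
import re
from typing import Callable


def make_matcher(strategy: str, value: str) -> Callable[[str], bool]:
    """Build a line predicate for one user-defined property value."""
    if strategy == "Starts With":
        return lambda line: line.startswith(value)
    if strategy == "Ends With":
        return lambda line: line.endswith(value)
    if strategy == "Contains":
        return lambda line: value in line
    if strategy == "Equals":
        return lambda line: line == value
    if strategy == "Matches Regular Expression":
        pattern = re.compile(value)
        # Assumption: the entire line has to match the expression.
        return lambda line: pattern.fullmatch(line) is not None
    if strategy == "Contains Regular Expression":
        pattern = re.compile(value)
        # Assumption: a match anywhere in the line is enough.
        return lambda line: pattern.search(line) is not None
    raise ValueError(f"unknown match strategy: {strategy}")


# The example from the description: property "Begins with Less Than" = "<"
begins_with_less_than = make_matcher("Starts With", "<")
text = ["<root>", "plain text", "<child/>"]
selected = [line for line in text if begins_with_less_than(line)]
```

Here selected holds the two lines that start with <, which are the lines
that would go to the "Begins with Less Than" relationship under "Route each
line to matching Property Name".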