[ 
https://issues.apache.org/jira/browse/NIFI-942?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14804461#comment-14804461
 ] 

Joseph Percivall commented on NIFI-942:
---------------------------------------

Giving the option to route either the FlowFile as a whole or its lines 
individually makes the properties a bit more complicated. For example, how 
should the last two Routing Strategies behave with multiple user-defined 
properties? We could route the FlowFile when its lines match any of the 
user-defined properties, or only when they match all of them.

What I'm thinking is adding another property that allows the user to choose the 
separation strategy:

* Route each line individually
* Route FlowFile as a whole for each matched line
* Route FlowFile as a whole once

So in general, for the first three Routing Strategies, when there is a match 
the user can choose to route only the line that matched, route the entire 
FlowFile each time there is a matched line, or route the entire FlowFile once 
if any of the lines is a match.
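
As a rough illustration of the three options above, here is a minimal sketch 
in plain Java. It is not NiFi code: the class, enum, and method names are 
hypothetical, a FlowFile is modelled as a String, and the matcher is passed in 
as a Predicate.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Predicate;

class SeparationSketch {
    // Hypothetical names for the three proposed separation options.
    enum SeparationStrategy { EACH_LINE, WHOLE_PER_MATCHED_LINE, WHOLE_ONCE }

    // Returns the pieces of content that would be sent to a matching relationship.
    static List<String> route(String content, Predicate<String> matcher,
                              SeparationStrategy strategy) {
        List<String> out = new ArrayList<>();
        for (String line : content.split("\n")) {
            if (!matcher.test(line)) {
                continue; // non-matching lines are ignored in this sketch
            }
            switch (strategy) {
                case EACH_LINE:
                    out.add(line);      // route only the line that matched
                    break;
                case WHOLE_PER_MATCHED_LINE:
                    out.add(content);   // one copy of the whole content per matched line
                    break;
                case WHOLE_ONCE:
                    out.add(content);   // whole content once, then stop scanning
                    return out;
            }
        }
        return out;
    }

    public static void main(String[] args) {
        String content = "<a>\nplain\n<b>";
        Predicate<String> startsWithLessThan = line -> line.startsWith("<");
        for (SeparationStrategy s : SeparationStrategy.values()) {
            System.out.println(s + " -> " + route(content, startsWithLessThan, s));
        }
    }
}
```

The only difference between the last two options is whether scanning stops 
after the first matching line.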

> Create RouteText processor
> --------------------------
>
>                 Key: NIFI-942
>                 URL: https://issues.apache.org/jira/browse/NIFI-942
>             Project: Apache NiFi
>          Issue Type: New Feature
>          Components: Extensions
>            Reporter: Mark Payne
>            Assignee: Joseph Percivall
>             Fix For: 0.4.0
>
>
> The idea is to route individual lines of a text file to different 
> relationships. This allows for splitting lines based on some criteria or 
> filtering out specific lines, and would be a much more convenient alternative 
> to RouteOnContent for textual data.
> A discussion for this took place on the users mailing list 
> (http://mail-archives.apache.org/mod_mbox/nifi-users/201509.mbox/%3CCAKpk5PxjszdX-NXMMf6Pcet4x7Y5GmrT7_jn9uyzS-h_a9TG3A%40mail.gmail.com%3E)
> The way that I could see this working is to have a few different properties:
> Routing Strategy:
> - Route each line to matching Property Name (default)
> - Route matching lines to 'matched' if all match
> - Route matching lines to 'matched' if any match
> - Route FlowFile to 'matched' if all lines match
> - Route FlowFile to 'matched' if any line matches
> Match Strategy:
> - Starts With
> - Ends With
> - Contains
> - Equals
> - Matches Regular Expression
> - Contains Regular Expression
> And then user-defined properties that indicate what to search each line of 
> text for.
> So to find lines that begin with the < character, you would simply add a 
> property named "Begins with Less Than" and set the value to: <
> Then set the Match Strategy to Starts With and the Routing Strategy to 
> "Route each line to matching Property Name".
> Then, any line that begins with < will be routed to the Begins with Less Than 
> relationship.
> This would be a simple way to pull out any particular lines of interest in a 
> text file.
> I can see this being very useful for processing log files, CSV, etc.
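
For reference, the Match Strategy options and the default Routing Strategy 
from the quoted description could be sketched roughly as below. This is plain 
Java rather than the NiFi API; the class, enum, and method names are 
hypothetical, and sending lines that match nothing to an "unmatched" key is an 
assumption, since the description does not say what happens to them.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.regex.Pattern;

class MatchSketch {
    // Hypothetical enum mirroring the six Match Strategy options.
    enum MatchStrategy { STARTS_WITH, ENDS_WITH, CONTAINS, EQUALS, MATCHES_REGEX, CONTAINS_REGEX }

    static boolean matches(String line, String value, MatchStrategy strategy) {
        switch (strategy) {
            case STARTS_WITH:    return line.startsWith(value);
            case ENDS_WITH:      return line.endsWith(value);
            case CONTAINS:       return line.contains(value);
            case EQUALS:         return line.equals(value);
            // The whole line must match the expression.
            case MATCHES_REGEX:  return Pattern.compile(value).matcher(line).matches();
            // Any part of the line may match the expression.
            case CONTAINS_REGEX: return Pattern.compile(value).matcher(line).find();
            default: throw new IllegalArgumentException("Unknown strategy: " + strategy);
        }
    }

    // "Route each line to matching Property Name": property name -> matched lines.
    static Map<String, List<String>> routeLines(String content,
                                                Map<String, String> properties,
                                                MatchStrategy strategy) {
        Map<String, List<String>> routed = new LinkedHashMap<>();
        for (String line : content.split("\n")) {
            boolean matchedAny = false;
            for (Map.Entry<String, String> property : properties.entrySet()) {
                if (matches(line, property.getValue(), strategy)) {
                    routed.computeIfAbsent(property.getKey(), k -> new ArrayList<>()).add(line);
                    matchedAny = true;
                }
            }
            if (!matchedAny) {
                // Assumption: lines that match no property go to "unmatched".
                routed.computeIfAbsent("unmatched", k -> new ArrayList<>()).add(line);
            }
        }
        return routed;
    }

    public static void main(String[] args) {
        Map<String, String> properties = new LinkedHashMap<>();
        properties.put("Begins with Less Than", "<");
        System.out.println(routeLines("<a>\nplain\n<b>", properties, MatchStrategy.STARTS_WITH));
    }
}
```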



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
