[ https://issues.apache.org/jira/browse/NIFI-436?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14367204#comment-14367204 ]
Joseph Witt commented on NIFI-436:
----------------------------------
Why close the ticket though? It seems totally reasonable that someone would
try to use 'SplitText' for this, and similarly reasonable for it to let the
user select a line-ending style. We should also be able to auto-detect the
line-ending style by searching for the first 'common' line-ending sequence.
I could imagine the following modes:
- AUTO-DETECT
- CRLF
- LF
- ...
We could seal the deal and support the full range of line endings listed at
http://en.wikipedia.org/wiki/Newline
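
To make the idea concrete, here is a rough sketch of how the modes above might
behave. It is plain Java and not NiFi code: the class, the enum, and the
method names are all hypothetical, and it works on an in-memory String rather
than a stream, which a real processor would probably not do.

```java
import java.util.ArrayList;
import java.util.List;

public class LineEndingSketch {

    /** Hypothetical modes mirroring the list above; not an existing NiFi API. */
    enum LineEnding {
        CRLF("\r\n"), LF("\n"), CR("\r");

        final String sequence;

        LineEnding(String sequence) {
            this.sequence = sequence;
        }
    }

    /** AUTO-DETECT: returns the style of the first line ending found, or LF if there is none. */
    static LineEnding detect(String text) {
        for (int i = 0; i < text.length(); i++) {
            char c = text.charAt(i);
            if (c == '\n') {
                return LineEnding.LF;
            }
            if (c == '\r') {
                boolean followedByLf = i + 1 < text.length() && text.charAt(i + 1) == '\n';
                return followedByLf ? LineEnding.CRLF : LineEnding.CR;
            }
        }
        return LineEnding.LF;
    }

    /** Splits only on the chosen sequence, so other newline characters stay inside a line. */
    static List<String> split(String text, LineEnding ending) {
        List<String> lines = new ArrayList<>();
        int start = 0;
        int idx;
        while ((idx = text.indexOf(ending.sequence, start)) >= 0) {
            lines.add(text.substring(start, idx));
            start = idx + ending.sequence.length();
        }
        if (start < text.length()) {
            lines.add(text.substring(start)); // trailing text without a final line ending
        }
        return lines;
    }

    public static void main(String[] args) {
        // CRLF-terminated CSV with a bare LF embedded in a quoted comment field.
        String csv = "id,comment\r\n1,\"first\nsecond\"\r\n";
        LineEnding ending = detect(csv);
        System.out.println(ending + " " + split(csv, ending).size());
    }
}
```

One caveat with detecting from the first line ending only: if a bare LF shows
up before the first CRLF (say, an embedded comment in the very first record),
this sketch would guess LF, so an explicit CRLF mode would still be needed.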
It is awesome that SplitContent works for his immediate case but SplitText is
almost certainly going to be more desirable for people dealing with text data.
This is also a great ticket to leave as it is, mark as beginner, and invite
contribs on.
Thanks
Joe
> SplitText should allow changing the endline regex
> -------------------------------------------------
>
> Key: NIFI-436
> URL: https://issues.apache.org/jira/browse/NIFI-436
> Project: Apache NiFi
> Issue Type: Improvement
> Reporter: Jon Parise
> Labels: beginner
>
> I have a CSV file in a format that indicates the end of a line with a CRLF.
> This file has embedded comments that have LF in them.
> When I run this file through the split text processor, it is splitting at the
> LF characters.
> I think it would be nice to have a setting to change the line ending
> characters for splitting text.
> I can't find anything in the documentation that indicates how I would change
> this behavior, so I assume it does not exist.
> Also, I would be willing to try and implement this improvement, but I can't
> seem to find the source for the SplitTextProcessor.