[
https://issues.apache.org/jira/browse/NIFI-436?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15336159#comment-15336159
]
Michael Moser commented on NIFI-436:
------------------------------------
Hello [[email protected]], thanks for your continued interest and
contribution.
I reviewed your latest patch and I'm worried that the new End Of Line Character
property changes existing behavior of SplitText too much. Currently, SplitText
handles both LF and CRLF line endings equally. The new End Of Line Character
property requires the user to know which line endings are being used. As a
user, if I don't know the line endings in my files, and I guess incorrectly on
my End Of Line Character choice, then SplitText fails to function like it did
in the past. Also, if my data flow has some files with LF endings mixed in
with files that have CRLF endings, then I don't want SplitText to work properly
on some files and not others.
I believe we need SplitText to support an "AUTO-DETECT" or "ANY" option for End
Of Line Character, as suggested by [~joewitt], and make that the default.
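To illustrate the trade-off, here is a minimal sketch in Python (not NiFi
code, and not taken from the patch). The split_lines helper and the option
names ANY, LF, and CRLF are hypothetical, chosen only for this example.

```python
import re

# Hypothetical option names for illustration; these are not the actual
# SplitText property values.
EOL_PATTERNS = {
    "ANY": r"\r\n|\n",   # accept either ending, like the current behavior
    "LF": r"\n",         # split only on a line feed
    "CRLF": r"\r\n",     # split only on carriage return + line feed
}


def split_lines(text, eol="ANY"):
    """Split text into lines using the chosen end-of-line convention."""
    parts = re.split(EOL_PATTERNS[eol], text)
    # Drop the empty trailing piece left by a final line terminator.
    if parts and parts[-1] == "":
        parts.pop()
    return parts


mixed = "a\nb\r\nc\r\n"
print(split_lines(mixed))          # default ANY handles both endings
print(split_lines("a\nb\n", "CRLF"))  # wrong guess: nothing is split

# The reporter's case: CRLF-terminated records with an embedded LF.
csv_text = 'id,comment\r\n1,"line one\nline two"\r\n'
print(split_lines(csv_text, "CRLF"))  # embedded LF stays inside the record
```

With ANY as the default, flows that mix LF and CRLF files keep working as
before, while a user with the reporter's CSV can opt in to CRLF explicitly.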
> SplitText should allow changing the endline regex
> -------------------------------------------------
>
> Key: NIFI-436
> URL: https://issues.apache.org/jira/browse/NIFI-436
> Project: Apache NiFi
> Issue Type: Improvement
> Affects Versions: 0.6.1
> Reporter: Jon Parise
> Assignee: Karthik Narayanan
> Labels: beginner
> Fix For: 1.0.0
>
> Attachments: nifi-4361x.patch
>
>
> I have a CSV file in a format that indicates the end of a line with a CRLF.
> This file has embedded comments that have LF in them.
> When I run this file through the split text processor, it is splitting at the
> LF characters.
> I think it would be nice to have a setting to change the line ending
> characters for splitting text.
> I can't find anything in the documentation that indicates how I would change
> this behavior, so I assume it does not exist.
> Also, I would be willing to try and implement this improvement, but I can't
> seem to find the source for the SplitTextProcessor.