[
https://issues.apache.org/jira/browse/NIFI-478?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14387830#comment-14387830
]
Joseph Witt commented on NIFI-478:
----------------------------------
Mark,
There appears to be an edge-condition bug. I've attached the template of my
test flow; the general logic is as follows.
Given this input text:
{quote}
This is a test. This is another test. And this is yet another test. Finally
this is the last Test.
{quote}
And this 'Byte Sequence':
{quote}
test
{quote}
I expected *and did successfully receive* the following output per split under
the default configuration (not keeping the sequence):
{quote}
This is a
. This is another
. And this is yet another
. Finally this is the last Test.
{quote}
I expected *and did successfully receive* the following output per split under
a configuration that keeps the sequence as a trailing indicator
{quote}
This is a test
. This is another test
. And this is yet another test
. Finally this is the last Test.
{quote}
I expected *but did not receive* the following output per split under a
configuration that keeps the sequence as a leading indicator
{quote}
This is a
test. This is another
test. And this is yet another
test. Finally this is the last Test.
{quote}
Instead, what I got was the following (note that the last split is missing
its leading 'test'):
{quote}
This is a
test. This is another
test. And this is yet another
. Finally this is the last Test.
{quote}
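To make the three expected behaviors concrete, here is a minimal Python sketch of the splitting logic as I understand it from the cases above. It is not the SplitContent implementation: the function name `split_content`, the `keep` and `position` parameters, and the choice to drop empty splits are all assumptions made for illustration.

```python
from typing import List


def split_content(data: bytes, seq: bytes, keep: bool = False,
                  position: str = "trailing") -> List[bytes]:
    """Hypothetical reference model of the expected split behavior.

    keep=False          -> the byte sequence is removed from every split
    position="trailing" -> the sequence stays at the end of the preceding split
    position="leading"  -> the sequence stays at the start of the following split
    """
    if not seq:
        raise ValueError("byte sequence must not be empty")
    if position not in ("trailing", "leading"):
        raise ValueError("position must be 'trailing' or 'leading'")

    pieces = data.split(seq)
    last = len(pieces) - 1
    splits = []
    for i, piece in enumerate(pieces):
        if keep and position == "trailing" and i < last:
            piece = piece + seq   # sequence closes every split but the last
        elif keep and position == "leading" and i > 0:
            piece = seq + piece   # sequence opens every split but the first
        if piece:                 # assumption: empty splits are dropped
            splits.append(piece)
    return splits


text = (b"This is a test. This is another test. "
        b"And this is yet another test. Finally this is the last Test.")

for chunk in split_content(text, b"test", keep=True, position="leading"):
    print(chunk.decode("utf-8"))
```

In this model the leading case prefixes every split after the first, including the final one, which is the split that came back without its 'test' in my run. The match is case-sensitive, so the capitalized 'Test' at the end is never treated as a boundary.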
> Allow SplitContent to split based on text and allow byte sequence to be
> leading or trailing
> -------------------------------------------------------------------------------------------
>
> Key: NIFI-478
> URL: https://issues.apache.org/jira/browse/NIFI-478
> Project: Apache NiFi
> Issue Type: Improvement
> Components: Extensions
> Reporter: Mark Payne
> Assignee: Mark Payne
> Fix For: 0.1.0
>
> Attachments:
> 0001-NIFI-478-Added-byte-sequence-position-property-and-a.patch
>
>
> Currently SplitContent requires that the user encode the byte sequence to
> split on in hexadecimal notation. The user should instead be able to enter
> UTF-8 formatted text.
> Also, if the byte sequence is kept, it is currently always appended to the
> first of the two splits it separates; the user should be allowed to have it
> added as a leading byte sequence on the second split as well.