[ 
https://issues.apache.org/jira/browse/NIFI-478?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14387830#comment-14387830
 ] 

Joseph Witt commented on NIFI-478:
----------------------------------

Mark,

There appears to be an edge-case bug. I've attached the template of my
test flow, but the general logic is as follows.

Given this input text:

{quote}
This is a test.  This is another test.  And this is yet another test.  Finally 
this is the last Test.
{quote}

And this 'Byte Sequence':
{quote}
test
{quote}

I expected *and did successfully receive* the following output per split under
the default configuration (not keeping the sequence):

{quote}
This is a 
.  This is another 
.  And this is yet another 
.  Finally this is the last Test.
{quote}

I expected *and did successfully receive* the following output per split under
a configuration that keeps the sequence as a trailing indicator:
{quote}
This is a test
.  This is another test
.  And this is yet another test
.  Finally this is the last Test.
{quote}

I expected *but did not receive* the following output per split under a
configuration that keeps the sequence as a leading indicator:
{quote}
This is a 
test.  This is another 
test.  And this is yet another 
test.  Finally this is the last Test.
{quote}
Instead, what I got was the following (note that the last split is missing its leading 'test'):
{quote}
This is a 
test.  This is another 
test.  And this is yet another 
.  Finally this is the last Test.
{quote}
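
To make the expected behavior concrete, here is a minimal Python sketch of the three modes described above. It is not the SplitContent implementation; the function name `split_content` and the mode names `"drop"`, `"trailing"`, and `"leading"` are hypothetical and exist only for illustration. It also assumes the input text is a single line, i.e. that the line break in the quoted input above is just mail wrapping.

```python
def split_content(data: bytes, seq: bytes, keep: str = "drop") -> list[bytes]:
    """Split data on every occurrence of seq.

    keep="drop"     -> the byte sequence is removed from the output
    keep="trailing" -> the byte sequence ends the split that precedes it
    keep="leading"  -> the byte sequence starts the split that follows it
    (Hypothetical mode names, not NiFi property values.)
    """
    if not seq:
        raise ValueError("byte sequence must not be empty")
    if keep not in ("drop", "trailing", "leading"):
        raise ValueError(f"unknown mode: {keep!r}")

    parts = data.split(seq)  # matching is case-sensitive, so b"Test" is not a match
    if keep == "trailing":
        # Every split except the last was followed by a match.
        return [p + seq for p in parts[:-1]] + [parts[-1]]
    if keep == "leading":
        # Every split except the first was preceded by a match,
        # including the final one, which is the case that fails above.
        return [parts[0]] + [seq + p for p in parts[1:]]
    return parts


text = (b"This is a test.  This is another test.  "
        b"And this is yet another test.  Finally this is the last Test.")
for split in split_content(text, b"test", keep="leading"):
    print(split.decode("utf-8"))
```

In leading mode the sketch prepends the sequence to every split after the first, so the final split begins with 'test' as in the expected output above.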



> Allow SplitContent to split based on text and allow byte sequence to be 
> leading or trailing
> -------------------------------------------------------------------------------------------
>
>                 Key: NIFI-478
>                 URL: https://issues.apache.org/jira/browse/NIFI-478
>             Project: Apache NiFi
>          Issue Type: Improvement
>          Components: Extensions
>            Reporter: Mark Payne
>            Assignee: Mark Payne
>             Fix For: 0.1.0
>
>         Attachments: 
> 0001-NIFI-478-Added-byte-sequence-position-property-and-a.patch
>
>
> Currently, SplitContent requires that the user encode the byte sequence to
> split on in hexadecimal notation. The user should instead be able to enter
> UTF-8 formatted text.
> Also, if we keep the byte sequence, the byte sequence between two splits is
> always appended to the first split; the user should be allowed to add it as
> a leading byte sequence on the second split as well.



