[
https://issues.apache.org/jira/browse/NIFI-2874?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Mark Payne updated NIFI-2874:
-----------------------------
Status: Patch Available (was: Open)
> StreamDemarcator can return wrong data for token
> ------------------------------------------------
>
> Key: NIFI-2874
> URL: https://issues.apache.org/jira/browse/NIFI-2874
> Project: Apache NiFi
> Issue Type: Bug
> Components: Extensions
> Reporter: Mark Payne
> Assignee: Mark Payne
> Priority: Critical
> Fix For: 1.1.0, 0.7.1
>
>
> There is a case where StreamDemarcator can return the wrong data for a token.
> If a token ends exactly at the end of the buffer and the next token is shorter
> than the previous one, the next token can retain part of the buffer's old
> content. The unit test below exposes this:
> {code}
> @Test
> public void testOnBufferSplitNoTrailingDelimiter() throws IOException {
>     final byte[] inputData = "Yes\nNo".getBytes(StandardCharsets.UTF_8);
>     final ByteArrayInputStream is = new ByteArrayInputStream(inputData);
>     // The last argument (3) sizes the buffer so that "Yes" ends exactly at its end.
>     final StreamDemarcator scanner = new StreamDemarcator(is,
>             "\n".getBytes(StandardCharsets.UTF_8), 1000, 3);
>
>     final byte[] first = scanner.nextToken();
>     final byte[] second = scanner.nextToken();
>     assertNotNull(first);
>     assertNotNull(second);
>
>     // Expected values come first, actual tokens second.
>     assertArrayEquals(new byte[] {'Y', 'e', 's'}, first);
>     assertArrayEquals(new byte[] {'N', 'o'}, second);
> }
> {code}
> In this case, the second token, which should be 'No', comes back as 'Nos'
> because it contains the trailing 's' from the previous token.
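For readers unfamiliar with this class of bug, the sketch below shows one way a reused buffer can leak a trailing byte into a shorter token. It is not the StreamDemarcator code and makes no claim about the actual root cause; the class StaleBufferSketch, the method secondToken, and the trackLength flag are hypothetical names chosen for illustration. It assumes a 3-byte buffer that is overwritten in place and compares using the buffer's full length against using the number of bytes actually written.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical illustration only; this is not the StreamDemarcator implementation.
class StaleBufferSketch {

    // Writes "Yes" and then "No" into one reused 3-byte buffer and returns
    // the second token. When trackLength is false, the full buffer length is
    // used, so the leftover byte from the first token is included.
    static String secondToken(boolean trackLength) {
        final byte[] buffer = new byte[3];

        // First token fills the buffer completely.
        final byte[] firstBytes = "Yes".getBytes(StandardCharsets.UTF_8);
        System.arraycopy(firstBytes, 0, buffer, 0, firstBytes.length);

        // Second token is shorter, so the last byte of the buffer is not overwritten.
        final byte[] secondBytes = "No".getBytes(StandardCharsets.UTF_8);
        System.arraycopy(secondBytes, 0, buffer, 0, secondBytes.length);
        // buffer now holds {'N', 'o', 's'}

        final int length = trackLength ? secondBytes.length : buffer.length;
        return new String(buffer, 0, length, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        System.out.println(secondToken(false)); // prints "Nos"
        System.out.println(secondToken(true));  // prints "No"
    }
}
```

The point of the sketch is only that a token's length has to be tracked separately from the buffer's capacity whenever the buffer is reused across tokens.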