[ 
https://issues.apache.org/jira/browse/NIFI-12238?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Gábor Gyimesi updated NIFI-12238:
---------------------------------
    Status: Patch Available  (was: In Progress)

https://github.com/apache/nifi/pull/7892

> SplitText trims text ending character when max fragment size is specified and 
> multiple endlines are present
> -----------------------------------------------------------------------------------------------------------
>
>                 Key: NIFI-12238
>                 URL: https://issues.apache.org/jira/browse/NIFI-12238
>             Project: Apache NiFi
>          Issue Type: Bug
>            Reporter: Gábor Gyimesi
>            Assignee: Gábor Gyimesi
>            Priority: Major
>          Time Spent: 10m
>  Remaining Estimate: 0h
>
> There seems to be an issue with the SplitText processor when a maximum 
> fragment size is specified and the end of a fragment contains multiple 
> newline characters to be trimmed. In this case, not only the trailing 
> newline characters but also the last character of the text is trimmed.
> Test case:
> {code:java}
> @Test
> public void testMaxFragmentSizeWithTrimmedEndlines() {
>     final TestRunner splitRunner = TestRunners.newTestRunner(new SplitText());
>     splitRunner.setProperty(SplitText.HEADER_LINE_COUNT, "2");
>     splitRunner.setProperty(SplitText.LINE_SPLIT_COUNT, "0");
>     splitRunner.setProperty(SplitText.FRAGMENT_MAX_SIZE, "30 B");
>     splitRunner.setProperty(SplitText.REMOVE_TRAILING_NEWLINES, "true");
>     splitRunner.enqueue("header1\nheader2\nline1 longer than limit\nline2\nline3\n\n\n\n\n");
>     splitRunner.run();
>     splitRunner.assertTransferCount(SplitText.REL_SPLITS, 3);
>     splitRunner.assertTransferCount(SplitText.REL_ORIGINAL, 1);
>     splitRunner.assertTransferCount(SplitText.REL_FAILURE, 0);
>     final List<MockFlowFile> splits = splitRunner.getFlowFilesForRelationship(SplitText.REL_SPLITS);
>     splits.get(0).assertContentEquals("header1\nheader2\nline1 longer than limit");
>     splits.get(1).assertContentEquals("header1\nheader2\nline2\nline3");
>     splits.get(2).assertContentEquals("header1\nheader2");
> }
> {code}
> Result:
> {code:java}
> expected: <header1
> header2
> line2
> line3> but was: <header1
> header2
> line2
> line>
> {code}
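
For reference, the expected trimming behaviour from the description can be sketched independently of SplitText's internals. This is only an illustration under stated assumptions: the class name TrailingNewlineTrim and the helper trimmedLength are hypothetical, they are not part of NiFi, and the sketch makes no claim about where the actual defect is. It shows that removing trailing newline bytes should leave the last text character in place.

```java
import java.nio.charset.StandardCharsets;

public class TrailingNewlineTrim {

    // Hypothetical helper, not NiFi's implementation: returns the number of
    // bytes that remain once trailing '\n' and '\r' bytes are excluded.
    static int trimmedLength(final byte[] content) {
        int end = content.length;
        // Step back over newline bytes only; stop at the first other byte.
        while (end > 0 && (content[end - 1] == '\n' || content[end - 1] == '\r')) {
            end--;
        }
        return end;
    }

    public static void main(String[] args) {
        // Body of the second split from the test case, with its trailing newlines.
        final byte[] fragment = "line2\nline3\n\n\n\n\n".getBytes(StandardCharsets.UTF_8);
        final String trimmed = new String(fragment, 0, trimmedLength(fragment), StandardCharsets.UTF_8);
        // The final '3' of "line3" should be kept.
        System.out.println(trimmed.endsWith("line3"));
    }
}
```

With the input from the test case, the trimmed body ends with "line3", which matches the expected content in the failing assertion.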



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
