[
https://issues.apache.org/jira/browse/NIFI-12238?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Gábor Gyimesi updated NIFI-12238:
---------------------------------
Status: Patch Available (was: In Progress)
https://github.com/apache/nifi/pull/7892
> SplitText trims text ending character when max fragment size is specified and
> multiple endlines are present
> -----------------------------------------------------------------------------------------------------------
>
> Key: NIFI-12238
> URL: https://issues.apache.org/jira/browse/NIFI-12238
> Project: Apache NiFi
> Issue Type: Bug
> Reporter: Gábor Gyimesi
> Assignee: Gábor Gyimesi
> Priority: Major
> Time Spent: 10m
> Remaining Estimate: 0h
>
> The SplitText processor appears to trim too much when a maximum fragment size
> is set and the end of a fragment contains multiple newline characters to be
> removed. In that case, the last character of the text is trimmed along with
> the trailing newline characters.
> Test case:
> {code:java}
> @Test
> public void testMaxFragmentSizeWithTrimmedEndlines() {
>     final TestRunner splitRunner = TestRunners.newTestRunner(new SplitText());
>     splitRunner.setProperty(SplitText.HEADER_LINE_COUNT, "2");
>     splitRunner.setProperty(SplitText.LINE_SPLIT_COUNT, "0");
>     splitRunner.setProperty(SplitText.FRAGMENT_MAX_SIZE, "30 B");
>     splitRunner.setProperty(SplitText.REMOVE_TRAILING_NEWLINES, "true");
>     splitRunner.enqueue("header1\nheader2\nline1 longer than limit\nline2\nline3\n\n\n\n\n");
>     splitRunner.run();
>     splitRunner.assertTransferCount(SplitText.REL_SPLITS, 3);
>     splitRunner.assertTransferCount(SplitText.REL_ORIGINAL, 1);
>     splitRunner.assertTransferCount(SplitText.REL_FAILURE, 0);
>     final List<MockFlowFile> splits = splitRunner.getFlowFilesForRelationship(SplitText.REL_SPLITS);
>     splits.get(0).assertContentEquals("header1\nheader2\nline1 longer than limit");
>     splits.get(1).assertContentEquals("header1\nheader2\nline2\nline3");
>     splits.get(2).assertContentEquals("header1\nheader2");
> }
> {code}
> Result:
> {code:java}
> expected: <header1
> header2
> line2
> line3> but was: <header1
> header2
> line2
> line>
> {code}
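For comparison, here is a minimal sketch of the trimming the test case expects: only trailing newline characters are dropped, and the last character of the text is kept. The class `TrailingNewlineTrim` and the helper `trimTrailingNewlines` are hypothetical names used for illustration; they are not taken from SplitText, and treating '\r' as a line ending is an assumption.

```java
final class TrailingNewlineTrim {

    private TrailingNewlineTrim() {
    }

    // Hypothetical helper, not NiFi code: removes only trailing '\n' and '\r'
    // characters (handling '\r' is an assumption) and leaves all other text intact.
    static String trimTrailingNewlines(final String text) {
        int end = text.length();
        while (end > 0 && (text.charAt(end - 1) == '\n' || text.charAt(end - 1) == '\r')) {
            end--;
        }
        return text.substring(0, end);
    }

    public static void main(final String[] args) {
        // Content of the second split from the test case, before trimming.
        final String fragment = "header1\nheader2\nline2\nline3\n\n\n\n\n";
        final String trimmed = trimTrailingNewlines(fragment);
        // The final '3' must survive the trim.
        System.out.println(trimmed.endsWith("line3"));
    }
}
```

With the input above, the sketch keeps the final "3" of "line3", which is the character the reported result is missing.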