Gábor Gyimesi created NIFI-12238:
------------------------------------
Summary: SplitText trims text ending character when max fragment
size is specified and multiple endlines are present
Key: NIFI-12238
URL: https://issues.apache.org/jira/browse/NIFI-12238
Project: Apache NiFi
Issue Type: Bug
Reporter: Gábor Gyimesi
Assignee: Gábor Gyimesi
The SplitText processor appears to mishandle the case where a maximum fragment
size (FRAGMENT_MAX_SIZE) is set, REMOVE_TRAILING_NEWLINES is enabled, and a
fragment ends with multiple newline characters that have to be trimmed. In this
case not only the trailing newline characters are removed, but also the last
character of the text itself.
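For reference, the expected trimming behavior can be sketched as follows. This is only an illustration under stated assumptions, not the SplitText implementation: the class TrailingNewlineTrim and its methods are hypothetical names, and treating '\r' as an end-of-line byte alongside '\n' is an assumption.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical helper for illustration only; not part of NiFi.
class TrailingNewlineTrim {

    // Returns the length of the content once all trailing end-of-line
    // bytes are dropped. Only '\n' and '\r' are skipped, so the last
    // real character of the text is always kept.
    public static int trimmedLength(byte[] data) {
        int end = data.length;
        while (end > 0 && (data[end - 1] == '\n' || data[end - 1] == '\r')) {
            end--;
        }
        return end;
    }

    // Convenience wrapper that applies trimmedLength to a String.
    public static String trim(String text) {
        byte[] bytes = text.getBytes(StandardCharsets.UTF_8);
        return new String(bytes, 0, trimmedLength(bytes), StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        // The content of the second split before trimming.
        System.out.println(trim("header1\nheader2\nline2\nline3\n\n\n\n\n"));
    }
}
```

With this logic the second split would end in "line3", no matter how many newline characters follow it.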
Test case:
{code:java}
@Test
public void testMaxFragmentSizeWithTrimmedEndlines() {
    final TestRunner splitRunner = TestRunners.newTestRunner(new SplitText());
    // Two header lines, no line-count limit, 30-byte fragment limit, trailing newlines removed
    splitRunner.setProperty(SplitText.HEADER_LINE_COUNT, "2");
    splitRunner.setProperty(SplitText.LINE_SPLIT_COUNT, "0");
    splitRunner.setProperty(SplitText.FRAGMENT_MAX_SIZE, "30 B");
    splitRunner.setProperty(SplitText.REMOVE_TRAILING_NEWLINES, "true");

    // Input ends with several consecutive newline characters
    splitRunner.enqueue("header1\nheader2\nline1 longer than limit\nline2\nline3\n\n\n\n\n");
    splitRunner.run();

    splitRunner.assertTransferCount(SplitText.REL_SPLITS, 3);
    splitRunner.assertTransferCount(SplitText.REL_ORIGINAL, 1);
    splitRunner.assertTransferCount(SplitText.REL_FAILURE, 0);

    final List<MockFlowFile> splits = splitRunner.getFlowFilesForRelationship(SplitText.REL_SPLITS);
    splits.get(0).assertContentEquals("header1\nheader2\nline1 longer than limit");
    // This assertion fails: the split ends in "line" instead of "line3"
    splits.get(1).assertContentEquals("header1\nheader2\nline2\nline3");
    splits.get(2).assertContentEquals("header1\nheader2");
}
{code}
Result:
{code:java}
expected: <header1
header2
line2
line3> but was: <header1
header2
line2
line>
{code}