[ https://issues.apache.org/jira/browse/NIFI-1118?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15311560#comment-15311560 ]

ASF GitHub Bot commented on NIFI-1118:
--------------------------------------

Github user markobean commented on a diff in the pull request:

    https://github.com/apache/nifi/pull/444#discussion_r65471941
  
    --- Diff: nifi-nar-bundles/nifi-standard-bundle/nifi-standard-processors/src/test/java/org/apache/nifi/processors/standard/TestSplitText.java ---
    @@ -39,6 +39,303 @@
                 + 
"\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\nLastLine\n";
     
         @Test
    +    public void testLastLineExceedsSizeLimit() {
    +        final TestRunner runner = TestRunners.newTestRunner(new SplitText());
    +        runner.setProperty(SplitText.HEADER_LINE_COUNT, "0");
    +        runner.setProperty(SplitText.LINE_SPLIT_COUNT, "2");
    +        runner.setProperty(SplitText.FRAGMENT_MAX_SIZE, "20 B");
    +
    +        runner.enqueue("Line #1\nLine #2\nLine #3\nLong line exceeding limit");
    +        runner.run();
    +
    +        runner.assertTransferCount(SplitText.REL_FAILURE, 0);
    +        runner.assertTransferCount(SplitText.REL_ORIGINAL, 1);
    +        runner.assertTransferCount(SplitText.REL_SPLITS, 3);
    +    }
    --- End diff --
    
    Talk about thorough! I don't know how you managed to catch this edge case.
    However, for completeness: the Remove Trailing Newlines (RTN) functionality
    once again increased complexity, and another buffer count had to be added.
    Now, if the size limit is exceeded (with RTN = true and no header lines),
    the number of EOL bytes previously added to info.lengthBytes is subtracted
    from the previous line, effectively removing the final EOL characters of
    the split.
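
    To make the expected count in the test above concrete, here is a minimal
    standalone sketch of the splitting rule. It is not the SplitText
    implementation: the class SplitSketch and the methods split and finish are
    hypothetical names, header lines are ignored, and trailing EOL characters
    are stripped from the finished string rather than by adjusting a byte
    count such as info.lengthBytes. It assumes that a line which exceeds the
    size limit on its own still becomes a split by itself.

```java
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

// Hypothetical, simplified model of the splitting rule; not NiFi code.
final class SplitSketch {

    // Splits text into fragments of at most maxLines lines. A fragment is also
    // closed early when adding the next line would push it past maxBytes.
    static List<String> split(String text, int maxLines, int maxBytes,
                              boolean removeTrailingNewlines) {
        List<String> splits = new ArrayList<>();
        StringBuilder current = new StringBuilder();
        int lines = 0;
        int bytes = 0;
        int pos = 0;
        while (pos < text.length()) {
            int eol = text.indexOf('\n', pos);
            int end = (eol < 0) ? text.length() : eol + 1; // keep the EOL with its line
            String line = text.substring(pos, end);
            int lineBytes = line.getBytes(StandardCharsets.UTF_8).length;

            // Size limit: close the current fragment before it would overflow.
            if (lines > 0 && bytes + lineBytes > maxBytes) {
                splits.add(finish(current, removeTrailingNewlines));
                current.setLength(0);
                lines = 0;
                bytes = 0;
            }

            current.append(line);
            lines++;
            bytes += lineBytes;

            // Line-count limit: close the fragment once it is full.
            if (lines == maxLines) {
                splits.add(finish(current, removeTrailingNewlines));
                current.setLength(0);
                lines = 0;
                bytes = 0;
            }
            pos = end;
        }
        if (lines > 0) {
            splits.add(finish(current, removeTrailingNewlines));
        }
        return splits;
    }

    // Returns the fragment text, optionally without its trailing EOL characters.
    static String finish(StringBuilder fragment, boolean removeTrailingNewlines) {
        int end = fragment.length();
        if (removeTrailingNewlines) {
            while (end > 0 && (fragment.charAt(end - 1) == '\n'
                    || fragment.charAt(end - 1) == '\r')) {
                end--;
            }
        }
        return fragment.substring(0, end);
    }

    public static void main(String[] args) {
        List<String> out = split(
                "Line #1\nLine #2\nLine #3\nLong line exceeding limit", 2, 20, true);
        System.out.println(out.size()); // prints 3
    }
}
```

    With the test input, the first two lines (16 bytes) fill one split by line
    count, "Line #3" is closed early because the 25-byte last line would not
    fit within 20 bytes, and the last line forms the third split on its own.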


> Enable SplitText processor to limit line length and filter header lines
> -----------------------------------------------------------------------
>
>                 Key: NIFI-1118
>                 URL: https://issues.apache.org/jira/browse/NIFI-1118
>             Project: Apache NiFi
>          Issue Type: Improvement
>          Components: Extensions
>            Reporter: Mark Bean
>            Assignee: Mark Bean
>             Fix For: 0.7.0
>
>
> Add the following functionality to the SplitText processor:
> 1) Maximum size limit of the split file(s)
> A new split file is created if adding the next line to the current 
> split file would exceed a user-defined maximum file size.
> 2) Header line marker
> User-defined character(s) can be used to identify the header line(s) of the 
> data file, rather than a predetermined number of lines.
> These changes are additions, not replacements of any property or behavior. 
> In the case of the header line marker, the existing property "Header Line 
> Count" must be zero for the new property and behavior to be used.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
