[jira] [Commented] (NIFI-1118) Enable SplitText processor to limit line length and filter header lines

ASF GitHub Bot (JIRA) Wed, 20 Jan 2016 14:07:42 -0800

    [ https://issues.apache.org/jira/browse/NIFI-1118?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15109574#comment-15109574 ]


ASF GitHub Bot commented on NIFI-1118:
--------------------------------------

Github user markap14 commented on a diff in the pull request:

    https://github.com/apache/nifi/pull/135#discussion_r50327323
  
    --- Diff: nifi-nar-bundles/nifi-standard-bundle/nifi-standard-processors/src/main/java/org/apache/nifi/processors/standard/SplitText.java ---
    @@ -143,26 +165,12 @@ protected void init(final ProcessorInitializationContext context) {
             return properties;
         }
     
    -    private int readLines(final InputStream in, final int maxNumLines, final OutputStream out, final boolean keepAllNewLines) throws IOException {
    -        int numLines = 0;
    -        for (int i = 0; i < maxNumLines; i++) {
    -            final long bytes = countBytesToSplitPoint(in, out, keepAllNewLines || (i != maxNumLines - 1));
    -            if (bytes <= 0) {
    -                return numLines;
    -            }
    -
    -            numLines++;
    -        }
    -
    -        return numLines;
    -    }
    -
    -    private long countBytesToSplitPoint(final InputStream in, final OutputStream out, final boolean includeLineDelimiter) throws IOException {
    +    private int readLine(final InputStream in, final OutputStream out,
    +                          final boolean includeLineDelimiter) throws IOException {
             int lastByte = -1;
    -        long bytesRead = 0L;
    +        int bytesRead = 0;
     
             while (true) {
    -            in.mark(1);
    --- End diff --
    
    Trying to understand the logic here: why was this line removed? It looks 
like the stream is marked here so that further down, on line 206, we can call 
in.reset(). With the mark removed, if we reach the line where in.reset() is 
called, I believe it will throw an IOException.
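
    To make the concern concrete, here is a minimal, self-contained sketch of 
the mark/reset pattern. It is not the SplitText code: the class name 
MarkResetSketch and both helper methods are hypothetical, and it assumes the 
stream is a java.io.BufferedInputStream, whose reset() throws an IOException 
when no mark is set. Other InputStream implementations may behave differently.

```java
import java.io.BufferedInputStream;
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;

// Hypothetical illustration only; not taken from SplitText.java.
public class MarkResetSketch {

    // Reads one line terminated by \n, \r, or \r\n and returns it without the
    // terminator. Returns null once the stream is exhausted.
    static String readLine(final InputStream in) throws IOException {
        final ByteArrayOutputStream line = new ByteArrayOutputStream();
        while (true) {
            final int b = in.read();
            if (b == -1) {
                return line.size() == 0 ? null : toText(line);
            }
            if (b == '\n') {
                return toText(line);
            }
            if (b == '\r') {
                // Mark before peeking so the peeked byte can be pushed back.
                in.mark(1);
                final int next = in.read();
                if (next != '\n' && next != -1) {
                    // Lone \r: the peeked byte belongs to the next line.
                    in.reset();
                }
                return toText(line);
            }
            line.write(b);
        }
    }

    // Calls reset() without a preceding mark() and reports whether it failed.
    static boolean resetWithoutMarkFails(final InputStream in) throws IOException {
        in.read();
        try {
            in.reset();
            return false;
        } catch (final IOException e) {
            return true;
        }
    }

    private static String toText(final ByteArrayOutputStream bytes) {
        return new String(bytes.toByteArray(), StandardCharsets.UTF_8);
    }

    private static InputStream buffered(final String text) {
        return new BufferedInputStream(
                new ByteArrayInputStream(text.getBytes(StandardCharsets.UTF_8)));
    }

    public static void main(final String[] args) throws IOException {
        final InputStream in = buffered("a\rb\r\nc");
        String line;
        while ((line = readLine(in)) != null) {
            System.out.println("line: " + line);
        }
        System.out.println("reset without mark fails: "
                + resetWithoutMarkFails(buffered("xyz")));
    }
}
```

    The point of the sketch is only that mark() and reset() come as a pair: 
if the diff drops in.mark(1) but keeps a reachable in.reset(), the reset has 
no valid mark to return to.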


> Enable SplitText processor to limit line length and filter header lines
> -----------------------------------------------------------------------
>
>                 Key: NIFI-1118
>                 URL: https://issues.apache.org/jira/browse/NIFI-1118
>             Project: Apache NiFi
>          Issue Type: Improvement
>          Components: Extensions
>            Reporter: Mark Bean
>            Assignee: Joe Skora
>             Fix For: 0.5.0
>
>
> Include the following functionality to the SplitText processor:
> 1) Maximum size limit of the split file(s)
> A new split file will be created if the next line to be added to the current 
> split file exceeds a user-defined maximum file size
> 2) Header line marker
> User-defined character(s) can be used to identify the header line(s) of the 
> data file rather than a predetermined number of lines
> These changes are additions, not a replacement of any property or behavior. 
> In the case of header line marker, the existing property "Header Line Count" 
> must be zero for the new property and behavior to be used.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
