GitHub user markap14 commented on a diff in the pull request:

    https://github.com/apache/nifi/pull/1116#discussion_r83262537
  
    --- Diff: nifi-commons/nifi-utils/src/main/java/org/apache/nifi/stream/io/util/TextLineDemarcator.java ---
    @@ -0,0 +1,227 @@
    +/*
    + * Licensed to the Apache Software Foundation (ASF) under one or more
    + * contributor license agreements.  See the NOTICE file distributed with
    + * this work for additional information regarding copyright ownership.
    + * The ASF licenses this file to You under the Apache License, Version 2.0
    + * (the "License"); you may not use this file except in compliance with
    + * the License.  You may obtain a copy of the License at
    + *
    + *     http://www.apache.org/licenses/LICENSE-2.0
    + *
    + * Unless required by applicable law or agreed to in writing, software
    + * distributed under the License is distributed on an "AS IS" BASIS,
    + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
    + * See the License for the specific language governing permissions and
    + * limitations under the License.
    + */
    +package org.apache.nifi.stream.io.util;
    +
    +import java.io.BufferedReader;
    +import java.io.IOException;
    +import java.io.InputStream;
    +
    +/**
    + * Implementation of demarcator of text lines in the provided
    + * {@link InputStream}. It works similar to the {@link BufferedReader} and its
    + * {@link BufferedReader#readLine()} methods except that it does not create a
    + * String representing the text line and instead returns the offset info for the
    + * computed text line. See {@link #nextOffsetInfo()} and
    + * {@link #nextOffsetInfo(byte[])} for more details.
    + * <p>
    + * This class is NOT thread-safe.
    + * </p>
    + */
    +public class TextLineDemarcator {
    +
    +    private final static int INIT_BUFFER_SIZE = 8192;
    +
    +    private final InputStream is;
    +
    +    private final int initialBufferSize;
    +
    +    private byte[] buffer;
    +
    +    private int index;
    +
    +    private int mark;
    +
    +    private long offset;
    +
    +    private int bufferLength;
    +
    +    /**
    +     * Constructs an instance of demarcator with provided {@link InputStream}
    +     * and default buffer size.
    +     */
    +    public TextLineDemarcator(InputStream is) {
    +        this(is, INIT_BUFFER_SIZE);
    +    }
    +
    +    /**
    +     * Constructs an instance of demarcator with provided {@link InputStream}
    +     * and initial buffer size.
    +     */
    +    public TextLineDemarcator(InputStream is, int initialBufferSize) {
    +        if (is == null) {
    +            throw new IllegalArgumentException("'is' must not be null.");
    +        }
    +        if (initialBufferSize < 1) {
    +            throw new IllegalArgumentException("'initialBufferSize' must be > 0.");
    +        }
    +        this.is = is;
    +        this.initialBufferSize = initialBufferSize;
    +        this.buffer = new byte[initialBufferSize];
    +    }
    +
    +    /**
    +     * Will compute the next <i>offset info</i> for a
    +     * text line (line terminated by either '\r', '\n' or '\r\n').
    +     * <br>
    +     * The <i>offset info</i> computed and returned as <code>long[]</code> consisting of
    +     * 4 elements <code>{startOffset, length, crlfLength, startsWithMatch}</code>.
    +     *  <ul>
    +     *    <li><i>startOffset</i> - the offset in the overall stream which represents the beginning of the text line</li>
    +     *    <li><i>length</i> - length of the text line including CRLF characters</li>
    +     *    <li><i>crlfLength</i> - the length of the CRLF. Could be either 1 (if line ends with '\n' or '\r')
    +     *                                          or 2 (if line ends with '\r\n').</li>
    +     *    <li><i>startsWithMatch</i> - value is always 1. See {@link #nextOffsetInfo(byte[])} for more info.</li>
    +     *  </ul>
    +     *
    +     * @return offset info as <code>long[]</code>
    +     */
    +    public long[] nextOffsetInfo() {
    --- End diff --
    
    Why are we returning a long[] here instead of a POJO? This makes the code more difficult to follow.
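    
    For illustration only, a POJO along these lines could replace the array. This is a rough sketch, not code from the PR: the class name `OffsetInfo`, its accessors, and the `fromArray` adapter are all hypothetical, and treating `startsWithMatch == 1` as "true" is an assumption based on the Javadoc above.

```java
/**
 * Hypothetical value object standing in for the long[] returned by
 * nextOffsetInfo(). Field names mirror the Javadoc in the diff; the class
 * itself is an illustration and is not part of the PR.
 */
final class OffsetInfo {
    private final long startOffset;        // offset of the line start in the overall stream
    private final long length;             // line length, including the line terminator
    private final int crlfLength;          // 1 for '\n' or '\r', 2 for "\r\n"
    private final boolean startsWithMatch; // assumption: array value 1 means "matched"

    OffsetInfo(long startOffset, long length, int crlfLength, boolean startsWithMatch) {
        this.startOffset = startOffset;
        this.length = length;
        this.crlfLength = crlfLength;
        this.startsWithMatch = startsWithMatch;
    }

    /**
     * Adapts the 4-element array described in the Javadoc:
     * {startOffset, length, crlfLength, startsWithMatch}.
     */
    static OffsetInfo fromArray(long[] raw) {
        if (raw == null || raw.length != 4) {
            throw new IllegalArgumentException("'raw' must contain exactly 4 elements.");
        }
        return new OffsetInfo(raw[0], raw[1], (int) raw[2], raw[3] == 1);
    }

    long getStartOffset() {
        return startOffset;
    }

    long getLength() {
        return length;
    }

    int getCrlfLength() {
        return crlfLength;
    }

    boolean isStartsWithMatch() {
        return startsWithMatch;
    }
}
```

    Callers would then read `info.getCrlfLength()` rather than `info[2]`, so the meaning of each value is visible at the call site instead of living only in the Javadoc.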

