David Mollitor created CSV-283:
----------------------------------

             Summary: Whitespace Check Determines Delimiter Twice
                 Key: CSV-283
                 URL: https://issues.apache.org/jira/browse/CSV-283
             Project: Commons CSV
          Issue Type: Improvement
            Reporter: David Mollitor


{code:java|title=Lexer.java}
    /**
     * Tests if the given char is a whitespace character.
     *
     * @return true if the given char is a whitespace character.
     * @throws IOException If an I/O error occurs.
     */
    boolean isWhitespace(final int ch) throws IOException {
        return !isDelimiter(ch) && Character.isWhitespace((char) ch);
    }

                    while (true) {
                        c = reader.read();
                        if (isDelimiter(c)) {
                            token.type = TOKEN;
                            return token;
                        }
                        if (isEndOfFile(c)) {
                            ...
                        }
                        if (readEndOfLine(c)) {
                            ...
                        }
                        if (!isWhitespace(c)) {
                            // error invalid char between token and next delimiter
                            throw new IOException("(line " + getCurrentLineNumber() +
                                    ") invalid char between encapsulated token and delimiter");
                        }
                    }
{code}

The loop first checks for the delimiter and returns immediately if it finds
one. Past that point, the character is known NOT to be a delimiter, so there is
no need to re-check it in {{isWhitespace}}. The delimiter check can be somewhat
expensive, given that it may involve a look-ahead I/O read.

Remove the redundancy. 
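
One possible shape of the change, as a minimal self-contained sketch rather than the actual {{Lexer}} patch. {{TrailingCharScanner}}, {{skipToDelimiter}} and the {{delimiterChecks}} counter are hypothetical names introduced only for illustration, and the single-character comparison stands in for the real {{isDelimiter}}, which may read ahead. The point is that {{isWhitespace}} no longer calls {{isDelimiter}}, so each character is tested against the delimiter once.

```java
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;

/** Hypothetical, simplified stand-in for the Lexer loop; not the Commons CSV code. */
final class TrailingCharScanner {
    private final Reader reader;
    private final char delimiter;

    /** Instrumentation only: counts how often the delimiter check runs. */
    int delimiterChecks;

    TrailingCharScanner(final Reader reader, final char delimiter) {
        this.reader = reader;
        this.delimiter = delimiter;
    }

    /** Assumption: a plain comparison; the real check may involve look-ahead I/O. */
    boolean isDelimiter(final int ch) {
        delimiterChecks++;
        return ch == delimiter;
    }

    /** Whitespace test without the redundant delimiter re-check. */
    boolean isWhitespace(final int ch) {
        return Character.isWhitespace((char) ch);
    }

    /**
     * Consumes characters after a closing quote until a delimiter or end of
     * input is reached, and returns how many characters were consumed.
     */
    int skipToDelimiter() throws IOException {
        int consumed = 0;
        while (true) {
            final int c = reader.read();
            if (c == -1) {
                return consumed; // end of input
            }
            consumed++;
            if (isDelimiter(c)) {
                return consumed; // the only delimiter test for this char
            }
            // c is known not to be a delimiter here.
            if (!isWhitespace(c)) {
                throw new IOException("invalid char between encapsulated token and delimiter");
            }
        }
    }

    public static void main(final String[] args) throws IOException {
        final TrailingCharScanner scanner =
                new TrailingCharScanner(new StringReader("  ,x"), ',');
        final int consumed = scanner.skipToDelimiter();
        // prints "consumed=3, delimiterChecks=3"
        System.out.println("consumed=" + consumed
                + ", delimiterChecks=" + scanner.delimiterChecks);
    }
}
```

With the delimiter re-check still inside {{isWhitespace}}, the same input would run the delimiter test five times instead of three: once more for each of the two whitespace characters.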


