[jira] [Updated] (CSV-283) Remove Whitespace Check Determines Delimiter Twice

David Mollitor (Jira) Tue, 13 Jul 2021 14:07:04 -0700


     [ 
https://issues.apache.org/jira/browse/CSV-283?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]


David Mollitor updated CSV-283:
-------------------------------
    Summary: Remove Whitespace Check Determines Delimiter Twice  (was: 
Whitespace Check Determines Delimiter Twice)

> Remove Whitespace Check Determines Delimiter Twice
> --------------------------------------------------
>
>                 Key: CSV-283
>                 URL: https://issues.apache.org/jira/browse/CSV-283
>             Project: Commons CSV
>          Issue Type: Improvement
>            Reporter: David Mollitor
>            Priority: Minor
>
> {code:java|title=Lexer.java}
>     /**
>      * Tests if the given char is a whitespace character.
>      *
>      * @return true if the given char is a whitespace character.
>      * @throws IOException If an I/O error occurs.
>      */
>     boolean isWhitespace(final int ch) throws IOException {
>         return !isDelimiter(ch) && Character.isWhitespace((char) ch);
>     }
>     ...
>                     while (true) {
>                         c = reader.read();
>                         if (isDelimiter(c)) {
>                             token.type = TOKEN;
>                             return token;
>                         }
>                         if (isEndOfFile(c)) {
>                             ...
>                         }
>                         if (readEndOfLine(c)) {
>                             ...
>                         }
>                         if (!isWhitespace(c)) {
>                             // error invalid char between token and next delimiter
>                             throw new IOException("(line " + getCurrentLineNumber() +
>                                     ") invalid char between encapsulated token and delimiter");
>                         }
>                     }
> {code}
> The first check in the loop is for the delimiter, and it returns quickly if it 
> finds one. After that point the character is known NOT to be a delimiter, so 
> there is no need to re-check it in {{isWhitespace}}. The delimiter check can be 
> somewhat expensive, since it may involve a look-ahead IO read.
> Remove the redundancy.
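
The redundancy described above can be illustrated with a small standalone sketch. This is not the Commons CSV Lexer or the actual patch: the class WhitespaceCheckSketch, the call counter, the single-character delimiter and the helper names are all assumptions made for illustration. It only counts how often a delimiter test runs when the whitespace check repeats it versus when it does not.

```java
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;

// Hypothetical stand-in for the loop quoted in the issue; not the real Lexer.
public class WhitespaceCheckSketch {

    // Assumed single-character delimiter; the real check may need a look-ahead read.
    static final char DELIMITER = ',';

    // Counts delimiter tests so the extra work is visible.
    static int delimiterChecks = 0;

    static boolean isDelimiter(final int ch) {
        delimiterChecks++;
        return ch == DELIMITER;
    }

    // Shape of the quoted method: tests the delimiter again.
    static boolean isWhitespaceWithDelimiterCheck(final int ch) {
        return !isDelimiter(ch) && Character.isWhitespace((char) ch);
    }

    // Shape of the suggested change: the caller has already ruled out the delimiter.
    static boolean isWhitespaceOnly(final int ch) {
        return Character.isWhitespace((char) ch);
    }

    // Reads up to the next delimiter or end of input and returns the number
    // of delimiter tests performed along the way.
    public static int skipToDelimiter(final Reader reader, final boolean redundant)
            throws IOException {
        delimiterChecks = 0;
        while (true) {
            final int c = reader.read();
            if (isDelimiter(c)) {
                return delimiterChecks;
            }
            if (c == -1) { // end of input
                return delimiterChecks;
            }
            final boolean whitespace = redundant
                    ? isWhitespaceWithDelimiterCheck(c)
                    : isWhitespaceOnly(c);
            if (!whitespace) {
                throw new IOException("invalid char between encapsulated token and delimiter");
            }
        }
    }

    public static void main(final String[] args) throws IOException {
        // Three spaces followed by the delimiter.
        System.out.println(skipToDelimiter(new StringReader("   ,"), true));  // prints 7
        System.out.println(skipToDelimiter(new StringReader("   ,"), false)); // prints 4
    }
}
```

In this sketch every non-delimiter character costs two delimiter tests in the redundant variant and one in the other. Whether the delimiter test can be dropped from isWhitespace itself, or only skipped at this call site, depends on the method's other callers, which the issue text does not show.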



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
