David Mollitor created CSV-283:
----------------------------------
Summary: Whitespace Check Determines Delimiter Twice
Key: CSV-283
URL: https://issues.apache.org/jira/browse/CSV-283
Project: Commons CSV
Issue Type: Improvement
Reporter: David Mollitor
{code:java|title=Lexer.java}
/**
 * Tests if the given char is a whitespace character.
 *
 * @return true if the given char is a whitespace character.
 * @throws IOException If an I/O error occurs.
 */
boolean isWhitespace(final int ch) throws IOException {
    return !isDelimiter(ch) && Character.isWhitespace((char) ch);
}

while (true) {
    c = reader.read();
    if (isDelimiter(c)) {
        token.type = TOKEN;
        return token;
    }
    if (isEndOfFile(c)) {
        ...
    }
    if (readEndOfLine(c)) {
        ...
    }
    if (!isWhitespace(c)) {
        // error invalid char between token and next delimiter
        throw new IOException("(line " + getCurrentLineNumber() +
                ") invalid char between encapsulated token and delimiter");
    }
}
{code}
The loop first checks for the delimiter and returns immediately if it finds
one. Past that point the character is known NOT to be a delimiter, so there is
no need to re-check it in {{isWhitespace}}. The delimiter check can be somewhat
expensive, since it may involve a look-ahead I/O read.
Remove the redundancy.
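
To make the cost concrete, here is a minimal standalone sketch, not the actual {{Lexer}} code. It assumes a single-character delimiter, reads from a {{String}} instead of a reader, and throws an unchecked exception to keep the example short. The class {{WhitespaceCheckSketch}}, the counter, and all method names are hypothetical. It counts how often the delimiter test runs with and without the re-check.

```java
// Hypothetical sketch: counts delimiter checks with and without the
// redundant re-check inside the whitespace test. Not the real Lexer.
final class WhitespaceCheckSketch {

    // Assumption: a single-character delimiter; the real check may look ahead.
    static final char DELIMITER = ',';

    // Number of isDelimiter calls made during the last scan.
    static int delimiterChecks = 0;

    static boolean isDelimiter(final int ch) {
        delimiterChecks++;
        return ch == DELIMITER;
    }

    // Mirrors the current method: tests for the delimiter a second time.
    static boolean isWhitespaceWithRecheck(final int ch) {
        return !isDelimiter(ch) && Character.isWhitespace((char) ch);
    }

    // Possible replacement for call sites that already ruled out a delimiter.
    static boolean isWhitespaceOnly(final int ch) {
        return Character.isWhitespace((char) ch);
    }

    // Scans up to the first delimiter and returns the number of delimiter checks.
    static int countDelimiterChecks(final String input, final boolean recheck) {
        delimiterChecks = 0;
        for (int i = 0; i < input.length(); i++) {
            final int c = input.charAt(i);
            if (isDelimiter(c)) {
                return delimiterChecks;
            }
            final boolean ws = recheck ? isWhitespaceWithRecheck(c) : isWhitespaceOnly(c);
            if (!ws) {
                throw new IllegalArgumentException(
                        "invalid char between encapsulated token and delimiter");
            }
        }
        return delimiterChecks;
    }

    public static void main(final String[] args) {
        // Three spaces followed by the delimiter.
        System.out.println(countDelimiterChecks("   ,", true));  // prints 7
        System.out.println(countDelimiterChecks("   ,", false)); // prints 4
    }
}
```

In this sketch each whitespace character costs two delimiter checks with the re-check and one without it. Any other caller of {{isWhitespace}} that has not already tested for the delimiter would still need the combined check.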