[
https://issues.apache.org/jira/browse/CSV-150?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17813977#comment-17813977
]
Gary D. Gregory commented on CSV-150:
-------------------------------------
This seems to be about parsing noncharacters like \ufffe, which feels like a
garbage-in problem.
> Escaping is not disableable
> ---------------------------
>
> Key: CSV-150
> URL: https://issues.apache.org/jira/browse/CSV-150
> Project: Commons CSV
> Issue Type: Bug
> Components: Parser
> Affects Versions: 1.1
> Reporter: Georg Tsakumagos
> Priority: Major
> Fix For: Review
>
> Attachments: CSV-150.patch
>
> Time Spent: 1h 10m
> Remaining Estimate: 0h
>
> h6. Problem
> If escaping is disabled, the Lexer maps the null escape Character to the
> magic char '\ufffe'. My data occasionally contains this char, which leads to
> a RuntimeException inside
> org.apache.commons.csv.Lexer.parseEncapsulatedToken(Token) with the message
> "invalid char between encapsulated token and delimiter".
> h6. Solution
> Don't map the Character object to a magic char; keep it as-is and check it
> for null before use.
> {code:title=Lexer.java|borderStyle=solid}
> Lexer(final CSVFormat format, final ExtendedBufferedReader reader) {
>     this.reader = reader;
>     this.delimiter = format.getDelimiter();
>     this.escape = format.getEscapeCharacter();
>     .
>     .
>     .
> }
>
> boolean isEscape(final int ch) {
>     return null != this.escape && escape.charValue() == ch;
> }
> {code}
> h6. Hint
> This pattern is used in other cases too. It seems to be a systematic error.
> These cases should also be refactored.
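
For illustration only, the difference between the two approaches described in
the report can be sketched as below. This is not the actual Commons CSV
source: the class name, the method names, and the DISABLED constant are
hypothetical, and the sentinel value is assumed to be the '\ufffe' mentioned
in the Problem section.

```java
public class EscapeCheckSketch {

    // Assumed sentinel, mirroring the "magic char" named in the report.
    static final char DISABLED = '\ufffe';

    // Sentinel approach: a null escape Character is mapped to DISABLED,
    // so input that happens to contain that char looks like an escape.
    static boolean isEscapeWithSentinel(final Character escape, final int ch) {
        final char mapped = (escape == null) ? DISABLED : escape.charValue();
        return ch == mapped;
    }

    // Nullable approach, following the isEscape proposed in the report:
    // a null escape means "disabled" and never matches any input char.
    static boolean isEscapeNullable(final Character escape, final int ch) {
        return escape != null && escape.charValue() == ch;
    }

    public static void main(final String[] args) {
        final Character noEscape = null; // escaping disabled
        final int fromData = '\ufffe';   // char that appears in the input

        System.out.println("sentinel: " + isEscapeWithSentinel(noEscape, fromData));
        System.out.println("nullable: " + isEscapeNullable(noEscape, fromData));
    }
}
```

With escaping disabled, the sentinel variant reports the input char as an
escape while the nullable variant does not, which is the collision the report
describes.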
--
This message was sent by Atlassian Jira
(v8.20.10#820010)