[ https://issues.apache.org/jira/browse/CSV-58?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13611681#comment-13611681 ]
Benedikt Ritter commented on CSV-58:
------------------------------------
I agree with Sebb. If we find an escape character, we should only change the
content of the token being parsed if it is followed by a special character
(defined by the CSVFormat) that needs escaping. Otherwise we should leave the
character sequence unchanged.
Anirudha, I'm having trouble applying your patch. It seems to contain a lot of
whitespace changes. Please try to create diffs only for lines that are
actually affected :)
I have only looked through the diff file. It looks like you are defining an
additional collection of escape characters. I don't understand this. IIUC it is
sufficient to tweak the Lexer so that it just doesn't remove escape characters
if nothing follows that has to be escaped.
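As a rough illustration of that idea, here is a minimal sketch. It is not the
actual Lexer code: the class EscapeSketch, the method unescape and the way the
special characters are passed in as a Set are all assumptions made for the
example. The escape character is dropped only when the next character is one
of the special characters; any other sequence is copied through untouched.

```java
import java.util.Set;

// Hypothetical helper for illustration only; not part of Commons CSV.
final class EscapeSketch {

    /**
     * Removes the escape character only when it directly precedes a
     * character contained in {@code special}. Every other character,
     * including an escape in front of a non-special character or at the
     * very end of the token, is copied unchanged.
     */
    static String unescape(String token, char escape, Set<Character> special) {
        StringBuilder out = new StringBuilder(token.length());
        for (int i = 0; i < token.length(); i++) {
            char c = token.charAt(i);
            boolean escapesSpecial = c == escape
                    && i + 1 < token.length()
                    && special.contains(token.charAt(i + 1));
            if (escapesSpecial) {
                // Drop the escape and keep the special character it protects.
                out.append(token.charAt(++i));
            } else {
                // Ordinary character, or an escape with nothing to escape.
                out.append(c);
            }
        }
        return out.toString();
    }
}
```

With backslash as the escape and comma, quote and backslash as the special
characters, the token a\,b would become a,b while \u0041 would pass through
unchanged. Whether the escape character itself counts as special is left to
the caller here.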
> Escape handling needs rethinking
> --------------------------------
>
> Key: CSV-58
> URL: https://issues.apache.org/jira/browse/CSV-58
> Project: Commons CSV
> Issue Type: Bug
> Components: Parser
> Reporter: Sebb
> Fix For: 1.0
>
> Attachments: commons-csv.diff
>
>
> The current escape parsing converts <esc><char> to plain <char> if the <char>
> is not one of the special characters to be escaped.
> This can affect unicode escapes if the <esc> character is backslash.
> One way round this is to specifically check for <char> == 'u', but it seems
> wrong to only do this for 'u'.
> Another solution would be to leave <esc><char> as is unless the <char> is one
> of the special characters.
> There are several possible ways to treat unrecognised escapes:
> - treat it as if the escape char had not been present (current behaviour)
> - leave the escape char as is
> - throw an exception
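The three options listed above can be compared side by side with a small
sketch. Everything in it is hypothetical (the class UnknownEscapeSketch, the
Policy enum and its constant names, and passing the special characters as a
String); it only mirrors the behaviours described in the issue and is not how
Commons CSV is implemented. Copying a trailing escape through unchanged is
also an assumption.

```java
// Hypothetical names throughout; none of this is Commons CSV API.
final class UnknownEscapeSketch {

    /** How to treat an escape that precedes a non-special character. */
    enum Policy { DROP_ESCAPE, KEEP_ESCAPE, FAIL }

    static String unescape(String token, char escape, String special, Policy policy) {
        StringBuilder out = new StringBuilder(token.length());
        for (int i = 0; i < token.length(); i++) {
            char c = token.charAt(i);
            if (c != escape || i + 1 == token.length()) {
                // Ordinary character, or a trailing escape (assumed to be kept).
                out.append(c);
                continue;
            }
            char next = token.charAt(++i);
            if (special.indexOf(next) >= 0) {
                // Recognised escape: always reduced to the special character.
                out.append(next);
                continue;
            }
            switch (policy) {
                case DROP_ESCAPE:
                    // Behaviour described as current in the issue.
                    out.append(next);
                    break;
                case KEEP_ESCAPE:
                    // Leave the escape character as is.
                    out.append(escape).append(next);
                    break;
                default:
                    throw new IllegalArgumentException(
                            "Unrecognised escape before '" + next + "' at index " + (i - 1));
            }
        }
        return out.toString();
    }
}
```

For the token \u0041 with backslash as the escape, DROP_ESCAPE yields u0041
(which is what breaks unicode escapes), KEEP_ESCAPE leaves \u0041 intact, and
FAIL throws.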