[ https://issues.apache.org/jira/browse/CSV-58?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13229415#comment-13229415 ]
Sebb commented on CSV-58:
-------------------------
If Unicode parsing is not selected, Unicode escape sequences still lose their
escape character, so they cannot be parsed later.
This is really about more than just Unicode escape sequences, though that is
what alerted me to the issue.
The whole business of escape handling needs to be very carefully documented
(and tested!) to ensure predictable behaviour.
> Unicode escapes are lost if escape character is backslash
> ---------------------------------------------------------
>
> Key: CSV-58
> URL: https://issues.apache.org/jira/browse/CSV-58
> Project: Commons CSV
> Issue Type: Bug
> Reporter: Sebb
>
> The current escape parsing converts <esc><char> to plain <char> if the <char>
> is not one of the special characters to be escaped.
> This can affect Unicode escapes if the <esc> character is backslash.
> One way round this is to check specifically for <char> == 'u', but it seems
> wrong to do this only for 'u'.
> Another solution would be to leave <esc><char> as is unless the <char> is one
> of the special characters.
> There are several possible ways to treat unrecognised escapes:
> - treat it as if the escape char had not been present (current behaviour)
> - leave the escape char as is
> - throw an exception
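
To make the three options concrete, here is a rough sketch in plain Java. It is not the Commons CSV parser code: the class name EscapeSketch, the Policy enum, the unescape method and the choice of special characters are all invented for illustration. It only shows how the same input comes out under each policy when the escape character is backslash.

```java
public class EscapeSketch {

    /** Hypothetical policies for an <esc><char> pair whose <char> is not special. */
    public enum Policy { DROP_ESCAPE, KEEP_ESCAPE, THROW }

    /**
     * Resolves escapes in input. 'special' is an assumed list of the characters
     * the escape is meant to protect (delimiter, quote, the escape itself).
     */
    public static String unescape(String input, char esc, String special, Policy policy) {
        StringBuilder out = new StringBuilder(input.length());
        for (int i = 0; i < input.length(); i++) {
            char c = input.charAt(i);
            if (c != esc || i + 1 == input.length()) {
                out.append(c); // ordinary character, or a trailing escape with nothing after it
                continue;
            }
            char next = input.charAt(++i);
            if (special.indexOf(next) >= 0) {
                out.append(next); // recognised escape: emit the bare character
                continue;
            }
            switch (policy) {
                case DROP_ESCAPE: // behaves as if the escape char had not been present
                    out.append(next);
                    break;
                case KEEP_ESCAPE: // leave the pair untouched for a later decoding pass
                    out.append(esc).append(next);
                    break;
                case THROW:
                    throw new IllegalArgumentException(
                            "Unrecognised escape at index " + (i - 1));
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        String special = ",\"\\";          // assumed: comma, double quote, backslash
        String field = "A=\\u0041\\,B";    // backslash before 'u' and before ','
        // Backslash before 'u' is lost: A=u0041,B
        System.out.println(unescape(field, '\\', special, Policy.DROP_ESCAPE));
        // Backslash before 'u' survives, so a later Unicode pass can still decode it
        System.out.println(unescape(field, '\\', special, Policy.KEEP_ESCAPE));
    }
}
```

In this sketch only the second policy keeps enough information for a later Unicode decoding step. The third at least makes the loss visible instead of silent.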