[
https://issues.apache.org/jira/browse/CSV-58?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13611715#comment-13611715
]
Benedikt Ritter commented on CSV-58:
------------------------------------
I just became aware of another problem. If one escapes an ordinary character
that also forms part of a control-character escape (such as the r in <esc>r),
the sequence is replaced by the control character. So the following test will
fail:
{code}
@Test
public void testEscaping2() throws Exception {
    final String code = "plain," +
                        "character!rEscaped";
    final Lexer lexer = getLexer(code,
            CSVFormat.newBuilder().withEscape('!').build());
    assertTokenEquals(TOKEN, "plain", lexer.nextToken(new Token()));
    assertTokenEquals(EOF, "character!rEscaped", lexer.nextToken(new Token()));
}
{code}
It fails with:
{code}
org.junit.ComparisonFailure: Token content expected:<character[!r]Escaped> but was:<character[
]Escaped>
{code}
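To illustrate, here is a minimal sketch of decoding logic that would produce
this kind of failure. It is not the actual Lexer code; the class and method
names (EscapeDecodeSketch, decode) are made up for this example, and the set of
recognised letters (r, n, t) is an assumption. The point is that the letter
after the escape character is mapped to a control character no matter which
escape character is configured:

```java
// Hypothetical sketch, not the Commons CSV Lexer: it decodes <esc>r, <esc>n and
// <esc>t as control characters before considering anything else.
class EscapeDecodeSketch {

    static String decode(String input, char escape) {
        StringBuilder out = new StringBuilder();
        for (int i = 0; i < input.length(); i++) {
            char c = input.charAt(i);
            if (c != escape || i + 1 == input.length()) {
                // Ordinary character, or a trailing escape with nothing after it.
                out.append(c);
                continue;
            }
            char next = input.charAt(++i);
            switch (next) {
                case 'r': out.append('\r'); break;
                case 'n': out.append('\n'); break;
                case 't': out.append('\t'); break;
                default:  out.append(next); break; // unrecognised: escape char is dropped
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        String decoded = decode("character!rEscaped", '!');
        // The 'r' after '!' has become a carriage return, matching the failure above.
        System.out.println(decoded.equals("character\rEscaped")); // prints "true"
    }
}
```

With '!' as the escape character, the user most likely did not intend "!r" to
mean a carriage return, yet the sketch (like the failing test) yields one.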
> Escape handling needs rethinking
> --------------------------------
>
> Key: CSV-58
> URL: https://issues.apache.org/jira/browse/CSV-58
> Project: Commons CSV
> Issue Type: Bug
> Components: Parser
> Reporter: Sebb
> Fix For: 1.0
>
> Attachments: commons-csv.diff
>
>
> The current escape parsing converts <esc><char> to plain <char> if the <char>
> is not one of the special characters to be escaped.
> This can affect Unicode escapes if the <esc> character is backslash.
> One way round this is to specifically check for <char> == 'u', but it seems
> wrong to only do this for 'u'.
> Another solution would be to leave <esc><char> as is unless the <char> is one
> of the special characters.
> There are several possible ways to treat unrecognised escapes:
> - treat it as if the escape char had not been present (current behaviour)
> - leave the escape char as is
> - throw an exception
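
The three options for unrecognised escapes listed in the issue description can
be sketched as below. This is only an illustration under assumptions: the
names (UnrecognisedEscapeSketch, Policy, decode) and the idea of passing the
special characters as a string are hypothetical and not part of Commons CSV.

```java
// Hypothetical sketch of the three policies; all names are illustrative only.
class UnrecognisedEscapeSketch {

    enum Policy { DROP_ESCAPE, KEEP_ESCAPE, FAIL }

    static String decode(String input, char escape, String specials, Policy policy) {
        StringBuilder out = new StringBuilder();
        for (int i = 0; i < input.length(); i++) {
            char c = input.charAt(i);
            if (c != escape || i + 1 == input.length()) {
                out.append(c); // ordinary character, or a trailing escape
                continue;
            }
            char next = input.charAt(++i);
            if (specials.indexOf(next) >= 0) {
                out.append(next); // recognised escape: emit the special character literally
                continue;
            }
            switch (policy) {
                case DROP_ESCAPE: // as if the escape char had not been present
                    out.append(next);
                    break;
                case KEEP_ESCAPE: // leave <esc><char> as is
                    out.append(escape).append(next);
                    break;
                case FAIL:
                    throw new IllegalArgumentException("Unrecognised escape: " + escape + next);
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        // Backslash as escape character, comma as the only special character.
        String input = "x\\u0041\\,y";
        System.out.println(decode(input, '\\', ",", Policy.DROP_ESCAPE));
        System.out.println(decode(input, '\\', ",", Policy.KEEP_ESCAPE));
    }
}
```

Under KEEP_ESCAPE the backslash in front of u0041 survives, so a later Unicode
unescaping step could still see it; under DROP_ESCAPE it is lost, which is the
problem the issue describes.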