[ 
https://issues.apache.org/jira/browse/CSV-294?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17460861#comment-17460861
 ] 

Joern Huxhorn commented on CSV-294:
-----------------------------------

Added a test reproducing the issue.

> CSVFormat does not support explicit " as escape char
> ----------------------------------------------------
>
>                 Key: CSV-294
>                 URL: https://issues.apache.org/jira/browse/CSV-294
>             Project: Commons CSV
>          Issue Type: Bug
>    Affects Versions: 1.9.0
>            Reporter: Joern Huxhorn
>            Priority: Major
>         Attachments: JiraCsv294Test.java
>
>
> Reading data that contains " does not work if escape character is *manually 
> set to {{'"'}}* as specified in [RFC 
> 4180|https://datatracker.ietf.org/doc/html/rfc4180].
> *It works for other escape characters or if no escape character is explicitly 
> defined in the format.*
> This line in {{Lexer.java}} is what led to the original, quite erroneous 
> version of this ticket:
> {{this.escape = mapNullToDisabled(format.getEscapeCharacter());}}
> From this line I (wrongly) deduced that an unspecified escape character would 
> actually disable escaping. Because of that I wanted to enable it by setting 
> it to {{'"'}} which causes exceptions in the Lexer for perfectly valid input. 
> That in turn convinced me that this is a way bigger issue than it is. Sorry 
> about that.
> I don't think that the current situation is ideal, though.
> I would not have been this confused if {{CSVFormat}} were more explicit 
> about the escape char that will be used, i.e. if {{toString()}} showed 
> the implicitly used quote character or, in case of {{null}}, printed that 
> the quote character is being used. The escape character is currently omitted 
> from the output if it is not set explicitly.
> There is also no documentation about what {{null}} as escape character 
> actually means. It may be documented somewhere, but it isn't documented for 
> the {{CSVFormat.getEscapeCharacter()}} or 
> {{CSVFormat.Builder.set/getEscape()}} methods.
> And setting the escape character explicitly to the value specified in the RFC 
> should certainly not fail, even if setting it to that value is superfluous 
> since {{null}} behaves exactly the same. 
> h4. Relevant part of the RFC:
> 7. If double-quotes are used to enclose fields, then a double-quote
> appearing inside a field must be escaped by preceding it with
> another double quote. For example:
> "aaa","b""bb","ccc"
> h4. Related issue:
> https://issues.apache.org/jira/browse/CSV-150
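
For illustration only, here is a minimal sketch of rule 7 from the RFC excerpt quoted above: inside a quoted field, two consecutive double quotes are read as one literal double quote. It is neither the Commons CSV {{Lexer}} nor the attached JiraCsv294Test.java. The class name Rfc4180LineParser and the method parseLine are hypothetical, and the sketch assumes a comma delimiter and a single, well-formed record.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of Commons CSV.
public class Rfc4180LineParser {

    /** Splits one record into fields; "" inside a quoted field yields a literal ". */
    public static List<String> parseLine(String line) {
        List<String> fields = new ArrayList<>();
        StringBuilder current = new StringBuilder();
        boolean inQuotes = false;
        for (int i = 0; i < line.length(); i++) {
            char c = line.charAt(i);
            if (inQuotes) {
                if (c == '"') {
                    if (i + 1 < line.length() && line.charAt(i + 1) == '"') {
                        // Doubled quote: emit one quote and skip the second.
                        current.append('"');
                        i++;
                    } else {
                        // Single quote: the quoted section ends here.
                        inQuotes = false;
                    }
                } else {
                    current.append(c);
                }
            } else if (c == '"') {
                inQuotes = true;
            } else if (c == ',') {
                // Assumed delimiter; ends the current field.
                fields.add(current.toString());
                current.setLength(0);
            } else {
                current.append(c);
            }
        }
        fields.add(current.toString());
        return fields;
    }

    public static void main(String[] args) {
        // The RFC example line: "aaa","b""bb","ccc"
        System.out.println(parseLine("\"aaa\",\"b\"\"bb\",\"ccc\""));
        // prints [aaa, b"bb, ccc]
    }
}
```

The sketch does not report malformed input such as an unterminated quoted field. It is only meant to show that the quote character doubles as its own escape under the RFC, which is why an explicit escape of {{'"'}} should be harmless.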



--
This message was sent by Atlassian Jira
(v8.20.1#820001)
