[
https://issues.apache.org/jira/browse/CSV-294?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17460861#comment-17460861
]
Joern Huxhorn commented on CSV-294:
-----------------------------------
Added a test reproducing the issue.
> CSVFormat does not support explicit " as escape char
> ----------------------------------------------------
>
> Key: CSV-294
> URL: https://issues.apache.org/jira/browse/CSV-294
> Project: Commons CSV
> Issue Type: Bug
> Affects Versions: 1.9.0
> Reporter: Joern Huxhorn
> Priority: Major
> Attachments: JiraCsv294Test.java
>
>
> Reading data that contains " does not work if the escape character is
> *manually set to {{'"'}}* as specified in [RFC
> 4180|https://datatracker.ietf.org/doc/html/rfc4180].
> *It works for other escape characters or if no escape character is explicitly
> defined in the format.*
> This line in {{Lexer.java}} is responsible for the originally quite erroneous
> ticket:
> {{this.escape = mapNullToDisabled(format.getEscapeCharacter());}}
> From this line I (wrongly) deduced that an unspecified escape character would
> actually disable escaping. Because of that I wanted to enable it by setting
> it to {{'"'}} which causes exceptions in the Lexer for perfectly valid input.
> That in turn convinced me that this is a way bigger issue than it is. Sorry
> about that.
> I don't think that the current situation is ideal, though.
> I would not have been this confused if {{CSVFormat}} were more explicit
> about the escape char that will be used, i.e. if {{toString()}} showed
> the implicitly used quote character or printed - in case of {{null}} - that
> it is using the quote character. The escape character is currently omitted
> from the output if it is not set explicitly.
> There is also no documentation about what {{null}} as escape character
> actually means - it may be documented somewhere but isn't documented for
> {{CSVFormat.getEscapeCharacter()}} or {{CSVFormat.Builder.set/getEscape()}}
> methods.
> And setting the escape character explicitly to the value specified in the RFC
> should certainly not fail, even if setting it to that value is superfluous
> since {{null}} behaves exactly the same.
> h4. Relevant part of the RFC:
> 7. If double-quotes are used to enclose fields, then a double-quote
>    appearing inside a field must be escaped by preceding it with
>    another double quote. For example:
>
>    "aaa","b""bb","ccc"
> h4. Related issue:
> https://issues.apache.org/jira/browse/CSV-150
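
For reference, the RFC 4180 rule quoted above can be illustrated without
Commons CSV. The following is only a minimal sketch: the class name
Rfc4180Sketch and the method parseRecord are made up for illustration, it
is not the attached JiraCsv294Test.java, and it does not reflect how the
Commons CSV Lexer is implemented. It assumes a single-line record with ','
as the delimiter and does no error handling for malformed input.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of Commons CSV.
final class Rfc4180Sketch {

    /**
     * Splits one CSV record into fields. Inside a quoted field, a doubled
     * quote ("") is read as one literal quote, as RFC 4180 rule 7 describes.
     */
    static List<String> parseRecord(String line) {
        List<String> fields = new ArrayList<>();
        StringBuilder field = new StringBuilder();
        boolean inQuotes = false;
        for (int i = 0; i < line.length(); i++) {
            char c = line.charAt(i);
            if (inQuotes) {
                if (c == '"') {
                    if (i + 1 < line.length() && line.charAt(i + 1) == '"') {
                        field.append('"'); // doubled quote: keep one literal quote
                        i++;               // skip the second quote of the pair
                    } else {
                        inQuotes = false;  // single quote closes the field
                    }
                } else {
                    field.append(c);
                }
            } else if (c == '"') {
                inQuotes = true;           // opening quote of a quoted field
            } else if (c == ',') {
                fields.add(field.toString());
                field.setLength(0);
            } else {
                field.append(c);
            }
        }
        fields.add(field.toString());      // last field has no trailing comma
        return fields;
    }

    public static void main(String[] args) {
        // The example record from the RFC: "aaa","b""bb","ccc"
        System.out.println(parseRecord("\"aaa\",\"b\"\"bb\",\"ccc\""));
    }
}
```

Run on the RFC's example record, this prints [aaa, b"bb, ccc]. In this
sketch the quote character serves as its own escape, which is the behavior
the issue description attributes to a {{null}} escape character.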
--
This message was sent by Atlassian Jira
(v8.20.1#820001)