[
https://issues.apache.org/jira/browse/CSV-294?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Joern Huxhorn updated CSV-294:
------------------------------
Description:
Writing and reading data that contains " does not work even if the escape
character is set to '"', as specified in [RFC
4180|https://datatracker.ietf.org/doc/html/rfc4180]. It works for other escape
characters.
It *does not work* if no escape character is specified at all, which was
reported in CSV-150.
This means that the default {{CSVFormat}} constants are unable to handle data
that contains " somewhere in the middle of the string.
{{CSVFormat.DEFAULT}}, or at least {{CSVFormat.RFC4180}} and {{CSVFormat.EXCEL}},
should have the escape character set to '"' by default, as defined in the RFC.
This is also the way Excel escapes ", i.e. Excel behaves as specified in
RFC 4180 but commons-csv doesn't.
I upgraded this ticket to *Critical* because the current default behavior
produces broken CSV files that can't be consumed with commons-csv, and changing
the default to what the RFC defines has a similar effect.
h4. Relevant part of the RFC:
7. If double-quotes are used to enclose fields, then a double-quote
appearing inside a field must be escaped by preceding it with
another double quote. For example:
"aaa","b""bb","ccc"
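To make the rule concrete, here is a minimal sketch of the quote-doubling it describes, written in plain Java without commons-csv. The class name {{Rfc4180Quoting}} and the methods {{quote}} and {{unquote}} are hypothetical names chosen for illustration; they are not part of any library, and this is not how commons-csv implements quoting.

```java
// Hypothetical helper illustrating RFC 4180 rule 7; not part of commons-csv.
final class Rfc4180Quoting {

    private Rfc4180Quoting() {
    }

    // Encloses a field in double quotes and doubles every embedded double quote.
    static String quote(String field) {
        return "\"" + field.replace("\"", "\"\"") + "\"";
    }

    // Reverses quote(): strips the enclosing quotes and collapses "" back to ".
    // Assumes the input was produced by quote(); stray single quotes inside
    // the field are passed through unchanged rather than rejected.
    static String unquote(String quoted) {
        int last = quoted.length() - 1;
        if (last < 1 || quoted.charAt(0) != '"' || quoted.charAt(last) != '"') {
            throw new IllegalArgumentException("Field is not enclosed in double quotes");
        }
        return quoted.substring(1, last).replace("\"\"", "\"");
    }

    public static void main(String[] args) {
        // Reproduces the record from the RFC example.
        String record = String.join(",", quote("aaa"), quote("b\"bb"), quote("ccc"));
        System.out.println(record); // prints "aaa","b""bb","ccc"
        System.out.println(unquote(quote("b\"bb"))); // prints b"bb
    }
}
```

A field containing " survives the round trip through {{quote}} and {{unquote}} unchanged, which is the behavior this ticket expects from the default {{CSVFormat}} constants.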
h4. Related issue:
https://issues.apache.org/jira/browse/CSV-150
was:
CSVFormat.DEFAULT or at least CSVFormat.RFC4180 and CSVFormat.EXCEL should have
escape character set to '"' as defined in [RFC
4180|https://datatracker.ietf.org/doc/html/rfc4180].
7. If double-quotes are used to enclose fields, then a double-quote
appearing inside a field must be escaped by preceding it with
another double quote. For example:
"aaa","b""bb","ccc"
I just realized that '"' {*}doesn't work as escape character at all{*}. This...
isn't good.
This is also the way Excel escapes ", i.e. Excel is behaving as specified in
RFC 4180 but commons-csv isn't.
Related issue:
https://issues.apache.org/jira/browse/CSV-150
> CSVFormat does not support " as escape char
> -------------------------------------------
>
> Key: CSV-294
> URL: https://issues.apache.org/jira/browse/CSV-294
> Project: Commons CSV
> Issue Type: Bug
> Affects Versions: 1.9.0
> Reporter: Joern Huxhorn
> Priority: Critical