[ https://issues.apache.org/jira/browse/CSV-294?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Joern Huxhorn updated CSV-294:
------------------------------

This is the exception thrown while parsing with '"' as the escape character:

Caused by: java.io.IOException: (startline 3) EOF reached before encapsulated token finished
    at org.apache.commons.csv.Lexer.parseEncapsulatedToken(Lexer.java:371)
    at org.apache.commons.csv.Lexer.nextToken(Lexer.java:285)
    at org.apache.commons.csv.CSVParser.nextRecord(CSVParser.java:701)
    at org.apache.commons.csv.CSVParser$CSVRecordIterator.getNextRecord(CSVParser.java:146)
    ... 3 more

> CSVFormat does not support " as escape char
> -------------------------------------------
>
>                 Key: CSV-294
>                 URL: https://issues.apache.org/jira/browse/CSV-294
>             Project: Commons CSV
>          Issue Type: Bug
>    Affects Versions: 1.9.0
>            Reporter: Joern Huxhorn
>            Priority: Critical
>
> Writing and reading data that contains " does not work even if the escape 
> character is set to '"' as specified in [RFC 
> 4180|https://datatracker.ietf.org/doc/html/rfc4180]. It works for other 
> escape characters.
> It also *does not work* if no escape character is specified at all, which 
> was reported in CSV-150.
> This means that the default {{CSVFormat}} constants are unable to handle data 
> that contains " somewhere in the middle of the string.
> {{CSVFormat.DEFAULT}}, or at least {{CSVFormat.RFC4180}} and 
> {{CSVFormat.EXCEL}}, should have the escape character set to '"' by default, 
> as defined in the RFC.
> This is also the way Excel escapes ", i.e. Excel is behaving as specified in 
> RFC 4180 but commons-csv isn't.
> I upgraded this ticket to *Critical* since the current default behavior 
> produces broken CSV files that can't be consumed with commons-csv, and 
> changing the default to what the RFC defines has a similar effect.
> h4. Relevant part of the RFC:
> 7. If double-quotes are used to enclose fields, then a double-quote
> appearing inside a field must be escaped by preceding it with
> another double quote. For example:
> "aaa","b""bb","ccc"
> h4. Related issue:
> https://issues.apache.org/jira/browse/CSV-150
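
For illustration only, the RFC rule quoted above can be sketched in plain Java without any CSV library. This is not the commons-csv implementation and says nothing about how its Lexer works internally; the class name Rfc4180Quoting and its two methods are hypothetical helpers that only show what "escaped by preceding it with another double quote" means for a single field.

```java
// Minimal sketch of RFC 4180 rule 7 for one field.
// Hypothetical helper, not part of commons-csv.
final class Rfc4180Quoting {

    private Rfc4180Quoting() {
    }

    /** Encloses a field in double quotes, doubling each embedded double quote. */
    static String quote(String field) {
        StringBuilder sb = new StringBuilder(field.length() + 2);
        sb.append('"');
        for (int i = 0; i < field.length(); i++) {
            char c = field.charAt(i);
            if (c == '"') {
                sb.append('"'); // escape by preceding it with another double quote
            }
            sb.append(c);
        }
        return sb.append('"').toString();
    }

    /** Reverses quote(): strips the enclosing quotes and collapses "" to ". */
    static String unquote(String quoted) {
        int n = quoted.length();
        if (n < 2 || quoted.charAt(0) != '"' || quoted.charAt(n - 1) != '"') {
            throw new IllegalArgumentException("field is not enclosed in double quotes");
        }
        StringBuilder sb = new StringBuilder(n - 2);
        for (int i = 1; i < n - 1; i++) {
            char c = quoted.charAt(i);
            if (c == '"') {
                // A quote inside the field must be followed by a second quote.
                if (i + 1 >= n - 1 || quoted.charAt(i + 1) != '"') {
                    throw new IllegalArgumentException("unescaped double quote at index " + i);
                }
                i++; // skip the second quote of the pair
            }
            sb.append(c);
        }
        return sb.toString();
    }
}
```

With this sketch, Rfc4180Quoting.quote("b\"bb") yields "b""bb", the middle field of the RFC example, and unquote() maps it back to b"bb.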



