Patrick Gäckle created CSV-222:
----------------------------------

             Summary: invalid char between encapsulated token and delimiter
                 Key: CSV-222
                 URL: https://issues.apache.org/jira/browse/CSV-222
             Project: Commons CSV
          Issue Type: Bug
          Components: Parser
    Affects Versions: 1.4
            Reporter: Patrick Gäckle
         Attachments: faulty.csv

When I read the attached file [^faulty.csv] and parse it, I get the following
error:

{code}
java.io.IOException: (line 1) invalid char between encapsulated token and delimiter
        at org.apache.commons.csv.Lexer.parseEncapsulatedToken(Lexer.java:275)
        at org.apache.commons.csv.Lexer.nextToken(Lexer.java:152)
        at org.apache.commons.csv.CSVParser.nextRecord(CSVParser.java:500)
        at org.apache.commons.csv.CSVParser.initializeHeader(CSVParser.java:389)
        at org.apache.commons.csv.CSVParser.<init>(CSVParser.java:284)
        at org.apache.commons.csv.CSVParser.<init>(CSVParser.java:252)
        at org.apache.commons.csv.CSVFormat.parse(CSVFormat.java:846)
{code}

The failing call is the one that parses the input and returns the record iterator:

{code:java}
csvFormat = CSVFormat.DEFAULT.withHeader().withDelimiter(';').withIgnoreHeaderCase();
iterator = csvFormat.parse(reader).iterator();
{code}

The invalid characters are the non-printable control characters SOH (U+0001)
and STX (U+0002) at the end of the line.
I debugged through the source and hit the exception in
{noformat}Lexer#parseEncapsulatedToken{noformat}.
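
To confirm which characters follow the closing quote, a small stand-alone check can list the positions of control characters in a line. This is only a diagnostic sketch: the class name {{ControlCharScan}}, the method {{findControlChars}} and the sample line are hypothetical and are not taken from [^faulty.csv] or from Commons CSV.

```java
import java.util.ArrayList;
import java.util.List;

class ControlCharScan {

    /**
     * Returns the indices of ISO control characters in the line,
     * ignoring tab, carriage return and line feed.
     */
    static List<Integer> findControlChars(String line) {
        List<Integer> positions = new ArrayList<>();
        for (int i = 0; i < line.length(); i++) {
            char c = line.charAt(i);
            boolean ordinaryWhitespace = c == '\t' || c == '\r' || c == '\n';
            if (Character.isISOControl(c) && !ordinaryWhitespace) {
                positions.add(i);
            }
        }
        return positions;
    }

    public static void main(String[] args) {
        // Hypothetical sample: two quoted fields followed by SOH and STX.
        String sample = "\"a\";\"b\"\u0001\u0002";
        System.out.println(findControlChars(sample)); // prints [7, 8]
    }
}
```

A non-empty result directly after a closing quote would be consistent with the error message above, since the characters sit between the encapsulated token and the next delimiter or line end.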

Unfortunately, I cannot offer any hints on fixing this, as I am not
familiar with this type of character or with how the parser should handle it.
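
Until the expected behaviour is decided, one possible caller-side workaround is to drop such control characters before the data reaches the parser. The sketch below is an assumption, not a proposed fix for the Lexer: {{ControlCharStrippingReader}} and its {{strip}} helper are hypothetical names, and the filter silently changes the input, which may not be acceptable for every file.

```java
import java.io.FilterReader;
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.io.UncheckedIOException;

class ControlCharStrippingReader extends FilterReader {

    ControlCharStrippingReader(Reader in) {
        super(in);
    }

    /** Keeps tab, CR, LF and every character that is not an ISO control character. */
    private static boolean keep(int c) {
        return c == '\t' || c == '\r' || c == '\n' || !Character.isISOControl(c);
    }

    @Override
    public int read() throws IOException {
        int c;
        do {
            c = super.read();
        } while (c != -1 && !keep(c));
        return c;
    }

    @Override
    public int read(char[] buf, int off, int len) throws IOException {
        if (len == 0) {
            return 0;
        }
        while (true) {
            int n = super.read(buf, off, len);
            if (n == -1) {
                return -1;
            }
            // Compact the kept characters to the front of the filled region.
            int kept = 0;
            for (int i = 0; i < n; i++) {
                char ch = buf[off + i];
                if (keep(ch)) {
                    buf[off + kept++] = ch;
                }
            }
            if (kept > 0) {
                return kept;
            }
            // The whole chunk was filtered out; read the next chunk.
        }
    }

    /** Convenience helper for trying the filter on an in-memory string. */
    static String strip(String text) {
        try (Reader r = new ControlCharStrippingReader(new StringReader(text))) {
            StringBuilder sb = new StringBuilder();
            int c;
            while ((c = r.read()) != -1) {
                sb.append((char) c);
            }
            return sb.toString();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

With the snippet from this report, the wrapper would presumably be used as {{csvFormat.parse(new ControlCharStrippingReader(reader))}}. I have not verified this against [^faulty.csv].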

Sincerely



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
