[ https://issues.apache.org/jira/browse/CSV-222?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Patrick Gäckle updated CSV-222:
-------------------------------
Description:
When trying to read the file [^faulty.csv] and parse it I get the following
error:
{code}
java.io.IOException: (line 1) invalid char between encapsulated token and delimiter
    at org.apache.commons.csv.Lexer.parseEncapsulatedToken(Lexer.java:275)
    at org.apache.commons.csv.Lexer.nextToken(Lexer.java:152)
    at org.apache.commons.csv.CSVParser.nextRecord(CSVParser.java:500)
    at org.apache.commons.csv.CSVParser.initializeHeader(CSVParser.java:389)
    at org.apache.commons.csv.CSVParser.<init>(CSVParser.java:284)
    at org.apache.commons.csv.CSVParser.<init>(CSVParser.java:252)
    at org.apache.commons.csv.CSVFormat.parse(CSVFormat.java:846)
{code}
The failing code is the part that parses the file and returns its iterator:
{code:java}
csvFormat = CSVFormat.DEFAULT.withHeader().withDelimiter(';').withIgnoreHeaderCase();
iterator = csvFormat.parse(reader).iterator();
{code}
The invalid characters are the non-printable SOH and STX characters at the
end of the line.
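To see where such characters sit in a line, a small check using only the standard library may help. This is a sketch: the class and method names are hypothetical and not part of Commons CSV, and the sample line is made up rather than taken from [^faulty.csv].

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of Commons CSV.
final class ControlCharFinder {

    // Returns the zero-based indexes of ISO control characters in the line,
    // ignoring tab, CR and LF, which are legitimate in CSV input.
    static List<Integer> findControlChars(String line) {
        List<Integer> positions = new ArrayList<>();
        for (int i = 0; i < line.length(); i++) {
            char c = line.charAt(i);
            if (Character.isISOControl(c) && c != '\t' && c != '\r' && c != '\n') {
                positions.add(i);
            }
        }
        return positions;
    }

    public static void main(String[] args) {
        // Made-up sample: two quoted fields followed by SOH (U+0001) and STX (U+0002).
        String sample = "\"a\";\"b\"\u0001\u0002";
        System.out.println(findControlChars(sample)); // prints [7, 8]
    }
}
```

Any index reported after a closing quote and before the next delimiter or line break would match the position the error message complains about.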
I debugged the source and hit the exception in the Lexer, which does not
handle these special characters.
Unfortunately, I cannot offer any hints on fixing this, as I'm not
familiar with this type of character or what behaviour it should have.
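Until the intended behaviour is settled, one possible workaround is to drop the control characters before the reader reaches the parser. This assumes the SOH and STX characters carry no meaning in the data, which may not hold for every file. The class below is a hypothetical sketch using only the standard library; it is not part of Commons CSV and has not been run against [^faulty.csv].

```java
import java.io.FilterReader;
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.io.UncheckedIOException;

// Hypothetical wrapper: removes ISO control characters except tab, CR and LF.
final class ControlCharStrippingReader extends FilterReader {

    ControlCharStrippingReader(Reader in) {
        super(in);
    }

    private static boolean keep(int c) {
        return !Character.isISOControl(c) || c == '\t' || c == '\r' || c == '\n';
    }

    @Override
    public int read() throws IOException {
        int c;
        do {
            c = super.read();
        } while (c != -1 && !keep(c));
        return c;
    }

    @Override
    public int read(char[] buf, int off, int len) throws IOException {
        if (len == 0) {
            return 0;
        }
        while (true) {
            int n = super.read(buf, off, len);
            if (n == -1) {
                return -1;
            }
            // Compact the kept characters to the front of the filled region.
            int kept = 0;
            for (int i = 0; i < n; i++) {
                char ch = buf[off + i];
                if (keep(ch)) {
                    buf[off + kept++] = ch;
                }
            }
            if (kept > 0) {
                return kept;
            }
            // Everything read was filtered out; read again instead of returning 0.
        }
    }

    // Convenience method for demonstration: filters a whole string.
    static String strip(String input) {
        StringBuilder out = new StringBuilder();
        try (Reader r = new ControlCharStrippingReader(new StringReader(input))) {
            char[] buf = new char[256];
            int n;
            while ((n = r.read(buf, 0, buf.length)) != -1) {
                out.append(buf, 0, n);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return out.toString();
    }

    public static void main(String[] args) {
        // Made-up sample line ending in SOH (U+0001) and STX (U+0002).
        System.out.print(strip("\"a\";\"b\"\u0001\u0002\n"));
    }
}
```

With the snippet from this report, the wrapped reader would be passed in place of the original, for example `csvFormat.parse(new ControlCharStrippingReader(reader))`. Note that `skip()` is not overridden here, so skipped characters are counted before filtering.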
Sincerely
was:
When trying to read the file [^faulty.csv] and parse it I get the following
error:
{code}
java.io.IOException: (line 1) invalid char between encapsulated token and delimiter
    at org.apache.commons.csv.Lexer.parseEncapsulatedToken(Lexer.java:275)
    at org.apache.commons.csv.Lexer.nextToken(Lexer.java:152)
    at org.apache.commons.csv.CSVParser.nextRecord(CSVParser.java:500)
    at org.apache.commons.csv.CSVParser.initializeHeader(CSVParser.java:389)
    at org.apache.commons.csv.CSVParser.<init>(CSVParser.java:284)
    at org.apache.commons.csv.CSVParser.<init>(CSVParser.java:252)
    at org.apache.commons.csv.CSVFormat.parse(CSVFormat.java:846)
{code}
The failing code is the part that parses the file and returns its iterator:
{code:java}
csvFormat = CSVFormat.DEFAULT.withHeader().withDelimiter(';').withIgnoreHeaderCase();
iterator = csvFormat.parse(reader).iterator();
{code}
The invalid characters are the non-printable SOH and STX characters at the
end of the line.
I debugged the source and hit the exception in
{noformat}Lexer#parseEncapsulatedToken{noformat}.
Unfortunately, I cannot offer any hints on fixing this, as I'm not
familiar with this type of character or what behaviour it should have.
Sincerely
> invalid char between encapsulated token and delimiter
> -----------------------------------------------------
>
> Key: CSV-222
> URL: https://issues.apache.org/jira/browse/CSV-222
> Project: Commons CSV
> Issue Type: Bug
> Components: Parser
> Affects Versions: 1.4
> Reporter: Patrick Gäckle
> Priority: Major
> Attachments: faulty.csv