[CSV] Invalid char between encapsulated token and delimiter

Christopher Dodunski (Apache Tomcat) Thu, 02 Jan 2025 17:56:18 -0800

Hi all,

I have a web application which uses Apache Commons CSV for processing uploaded CSV files.


Occasionally, users are experiencing an error when using this feature:

"IOException reading next record: java.io.IOException: (line 6) invalid char between encapsulated token and delimiter"


On inspecting the problematic CSV file, however, line 6 looks just fine.

By gradually modifying this CSV along with another which uploaded fine, I eventually had two CSV files that were visually identical, though one would still throw the above error.

Whilst both appeared identical, I noticed that one was 3 bytes larger. It turns out that the problematic CSV begins with "<EF><BB><BF>" (discovered using the Linux 'less' command). That's a UTF-8 byte-order mark (BOM).
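For anyone wanting to repeat the check without 'less', here is a minimal sketch that tests whether a file starts with those three bytes. The class and method names (BomCheck, startsWithUtf8Bom, fileStartsWithUtf8Bom) are made up for illustration and are not part of Commons CSV. It only looks for the UTF-8 BOM, not the UTF-16 or UTF-32 ones.

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical helper, for illustration only.
class BomCheck {
    // UTF-8 byte-order mark: EF BB BF
    private static final byte[] UTF8_BOM = {(byte) 0xEF, (byte) 0xBB, (byte) 0xBF};

    /** Returns true if the given bytes begin with the UTF-8 BOM. */
    static boolean startsWithUtf8Bom(byte[] head) {
        if (head.length < UTF8_BOM.length) {
            return false; // too short to contain a BOM
        }
        for (int i = 0; i < UTF8_BOM.length; i++) {
            if (head[i] != UTF8_BOM[i]) {
                return false;
            }
        }
        return true;
    }

    /** Reads up to the first three bytes of a file and checks them for the BOM. */
    static boolean fileStartsWithUtf8Bom(Path file) throws IOException {
        try (InputStream in = Files.newInputStream(file)) {
            return startsWithUtf8Bom(in.readNBytes(3)); // readNBytes needs Java 11+
        }
    }
}
```

Calling BomCheck.fileStartsWithUtf8Bom(crewList.toPath()) on the two files should, if the above is right, return true only for the one that is 3 bytes larger.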


Here is my section of code that reads these CSVs:

    Reader in = new FileReader(crewList);
    CSVFormat csvFormat = CSVFormat.DEFAULT.builder().build();
    Iterable<CSVRecord> records = csvFormat.parse(in);
    Iterator<CSVRecord> iterator = records.iterator();

1) I'm puzzled as to why the presence of a BOM seems to have resulted in a misleading error pointing at line 6.

2) If the presence of a BOM is indeed the culprit, how best to resolve this without creating a problem for CSVs not containing a BOM?
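To make question 2 concrete, one possible direction is sketched below using only the JDK: peek at the first three bytes, drop them if they are the UTF-8 BOM, and push them back otherwise, so files without a BOM are read unchanged. This is an untested idea rather than a known fix. The class name BomSkippingReader and its open method are hypothetical, and the sketch assumes the uploaded files are UTF-8, which may not hold for every upload.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.PushbackInputStream;
import java.io.Reader;
import java.nio.charset.StandardCharsets;

// Hypothetical helper, for illustration only.
class BomSkippingReader {
    /**
     * Wraps the stream in a UTF-8 Reader, skipping a leading UTF-8 BOM if present.
     * Assumption: the content is UTF-8 encoded.
     */
    static Reader open(InputStream raw) throws IOException {
        PushbackInputStream in = new PushbackInputStream(raw, 3);
        byte[] head = new byte[3];
        int n = in.readNBytes(head, 0, 3); // may be fewer than 3 for tiny files

        boolean hasBom = n == 3
                && head[0] == (byte) 0xEF
                && head[1] == (byte) 0xBB
                && head[2] == (byte) 0xBF;

        if (!hasBom && n > 0) {
            in.unread(head, 0, n); // not a BOM: put the bytes back untouched
        }
        return new InputStreamReader(in, StandardCharsets.UTF_8);
    }
}
```

With this in place, the reader passed to csvFormat.parse(...) would come from BomSkippingReader.open(new FileInputStream(crewList)) instead of new FileReader(crewList). I understand Commons IO also has a BOMInputStream class that may serve the same purpose, though I have not tried it here.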


Your suggestions are much appreciated.

Kind regards,

Chris.

