Hi all,

I have a web application which uses Apache Commons CSV for processing uploaded CSV files.

Occasionally, users are experiencing an error when using this feature:

"IOException reading next record: java.io.IOException: (line 6) invalid char between encapsulated token and delimiter"

On inspecting the problematic CSV file, however, line 6 looks just fine.

By gradually modifying this CSV alongside another that uploaded fine, I eventually ended up with two CSV files that were visually identical, yet one of them still threw the above error.

Whilst both appeared identical, I noticed that one was 3 bytes larger. It turns out that the problematic CSV begins with the bytes "<EF><BB><BF>" (discovered using the Linux 'less' command). That's a UTF-8 byte-order mark (BOM).
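For anyone who wants to check a file for this without 'less', here is a minimal sketch using only the JDK (Java 11+). The class and method names (BomCheck, startsWithUtf8Bom, fileStartsWithUtf8Bom) are hypothetical names of mine, not part of Commons CSV:

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical helper, not part of any library.
class BomCheck {
    // The three bytes of a UTF-8 byte-order mark.
    private static final int[] UTF8_BOM = {0xEF, 0xBB, 0xBF};

    // Returns true if the given bytes start with EF BB BF.
    static boolean startsWithUtf8Bom(byte[] head) {
        if (head.length < UTF8_BOM.length) {
            return false;
        }
        for (int i = 0; i < UTF8_BOM.length; i++) {
            // Mask to compare as unsigned values; Java bytes are signed.
            if ((head[i] & 0xFF) != UTF8_BOM[i]) {
                return false;
            }
        }
        return true;
    }

    // Reads at most the first three bytes of the file and checks them.
    static boolean fileStartsWithUtf8Bom(Path file) throws IOException {
        try (InputStream in = Files.newInputStream(file)) {
            return startsWithUtf8Bom(in.readNBytes(3));
        }
    }
}
```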

Here is my section of code that reads these CSVs:

    Reader in = new FileReader(crewList);
    CSVFormat csvFormat = CSVFormat.DEFAULT.builder().build();
    Iterable<CSVRecord> records = csvFormat.parse(in);
    Iterator<CSVRecord> iterator = records.iterator();

1) I'm puzzled as to why the presence of a BOM at the very start of the file would result in a misleading error that points at line 6.

2) If the BOM is indeed the culprit, how best can I resolve this without creating a problem for CSVs that do not contain a BOM?
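One workaround I am considering for 2), as an untested sketch only: peek at the first three bytes and push them back if they are not a BOM, so files without one are read unchanged. It uses only the JDK (Java 9+) and assumes the uploads are UTF-8 encoded, which I have not verified for every file. BomSkippingReader and open are hypothetical names of mine:

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.PushbackInputStream;
import java.io.Reader;
import java.nio.charset.StandardCharsets;

// Hypothetical helper, not part of Commons CSV.
class BomSkippingReader {
    // Wraps the stream in a UTF-8 Reader, dropping a leading EF BB BF if present.
    static Reader open(InputStream raw) throws IOException {
        // Pushback buffer of 3 bytes: enough to return what we peeked at.
        PushbackInputStream in = new PushbackInputStream(raw, 3);
        byte[] head = new byte[3];
        int n = in.readNBytes(head, 0, 3); // may be fewer than 3 for tiny files

        boolean hasBom = n == 3
                && (head[0] & 0xFF) == 0xEF
                && (head[1] & 0xFF) == 0xBB
                && (head[2] & 0xFF) == 0xBF;

        // No BOM: put the bytes back so no data is lost.
        if (!hasBom && n > 0) {
            in.unread(head, 0, n);
        }
        // Assumption: the file content is UTF-8.
        return new InputStreamReader(in, StandardCharsets.UTF_8);
    }
}
```

In my code above this would replace the FileReader line with something like Reader in = BomSkippingReader.open(new FileInputStream(crewList)); and the rest would stay as it is. I would still like to know whether there is a more standard way to handle this.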

Your suggestions are much appreciated.

Kind regards,

Chris.
