Hi all,
I have a web application which uses Apache Commons CSV for processing
uploaded CSV files.
Occasionally, users are experiencing an error when using this feature:
"IOException reading next record: java.io.IOException: (line 6)
invalid char between encapsulated token and delimiter"
On inspecting the problematic CSV file, however, line 6 looks just fine.
By gradually modifying this CSV alongside another that uploaded fine,
I eventually had two CSV files that were visually identical, yet one
still threw the above error.
Whilst both appeared identical, I noticed that one was 3 bytes larger.
It turns out that the problematic CSV begins with "<EF><BB><BF>"
(discovered using Linux 'less' command). That's a byte-order mark
(BOM).
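For anyone who wants to repeat the check without 'less', the same test can
be done in Java. This is only a minimal sketch: the class name BomCheck and
its method are hypothetical names of my own, and it looks only for the
UTF-8 BOM (EF BB BF), not the UTF-16 or UTF-32 variants.

```java
import java.io.IOException;
import java.io.InputStream;

final class BomCheck {
    // UTF-8 encoding of U+FEFF, the byte-order mark.
    private static final int[] UTF8_BOM = {0xEF, 0xBB, 0xBF};

    /**
     * Returns true if the stream's first three bytes are EF BB BF.
     * Note: this consumes up to three bytes from the stream, so it is
     * meant for a throwaway stream opened only for the check.
     */
    static boolean startsWithUtf8Bom(InputStream in) throws IOException {
        for (int expected : UTF8_BOM) {
            // read() returns 0-255, or -1 at end of stream.
            if (in.read() != expected) {
                return false;
            }
        }
        return true;
    }
}
```

Calling BomCheck.startsWithUtf8Bom(new FileInputStream(crewList)) on each
of the two files should then show which one carries the extra 3 bytes.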
Here is my section of code that reads these CSVs:

    Reader in = new FileReader(crewList);
    CSVFormat csvFormat = CSVFormat.DEFAULT.builder().build();
    Iterable<CSVRecord> records = csvFormat.parse(in);
    Iterator<CSVRecord> iterator = records.iterator();

1) I'm puzzled as to why the presence of a BOM seems to have resulted in
a misleading error that points at line 6.
2) If the BOM is indeed the culprit, what is the best way to resolve
this without creating a problem for CSVs that do not contain a BOM?
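
To make question 2 concrete, one workaround I could imagine is to peek at
the first three bytes and skip them only when they are a UTF-8 BOM. The
sketch below uses only the JDK (Java 9+ for readNBytes). BomSkipper and
utf8ReaderSkippingBom are hypothetical names, it assumes the uploads are
UTF-8, and I have not verified that it cures the Commons CSV error.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.PushbackInputStream;
import java.io.Reader;
import java.nio.charset.StandardCharsets;

final class BomSkipper {
    /**
     * Wraps the raw stream in a UTF-8 Reader, dropping a leading
     * EF BB BF if present and leaving all other input untouched.
     */
    static Reader utf8ReaderSkippingBom(InputStream raw) throws IOException {
        // A 3-byte pushback buffer lets us "un-read" the peeked bytes.
        PushbackInputStream in = new PushbackInputStream(raw, 3);
        byte[] head = new byte[3];
        int n = in.readNBytes(head, 0, 3); // may be < 3 for tiny files

        boolean hasBom = n == 3
                && (head[0] & 0xFF) == 0xEF
                && (head[1] & 0xFF) == 0xBB
                && (head[2] & 0xFF) == 0xBF;

        // No BOM: push back whatever was read so no data is lost.
        if (!hasBom && n > 0) {
            in.unread(head, 0, n);
        }
        return new InputStreamReader(in, StandardCharsets.UTF_8);
    }
}
```

The Reader it returns would replace the FileReader above, e.g.
BomSkipper.utf8ReaderSkippingBom(new FileInputStream(crewList)), with the
result passed to csvFormat.parse(...) as before. Files without a BOM should
pass through unchanged, which is the property I am after.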
Your suggestions are much appreciated.
Kind regards,
Chris.