[issue23178] csv.reader does not handle BOM

Jon Dufresne Tue, 06 Jan 2015 11:05:30 -0800

New submission from Jon Dufresne:

The following test script demonstrates that Python's csv library does not 
handle a BOM. I would expect the returned row to be equal to expected and to 
print 'True' to stdout.


In the wild, it is typical for other CSV writers to add a BOM. MS Excel is 
especially picky about the BOM when reading a UTF-8 encoded file, so many 
writers add a BOM for interoperability with MS Excel.

If a Python program accepts a CSV file as input (often the case in web apps), 
these files will not be handled correctly without preprocessing. In my opinion, 
this should "just work" when reading the file.

---
import codecs
import csv

f = open('foo.csv', 'wb')
f.write(codecs.BOM_UTF8 + b'a,b,c')
f.close()

expected = ['a', 'b', 'c']
f = open('foo.csv')
r = csv.reader(f)
row = next(r)

print(row)
print(row == expected)
---

Output
---
$ ./python ~/test.py
['\ufeffa', 'b', 'c']
False
---
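
For comparison, here is a minimal sketch of the kind of preprocessing referred 
to above, assuming the input is known to be UTF-8. The standard 'utf-8-sig' 
codec drops a leading BOM while decoding, so csv.reader never sees it. The 
temporary path is only illustrative.

```python
import codecs
import csv
import os
import tempfile

# Write a small UTF-8 CSV file that starts with a BOM (illustrative temp path).
path = os.path.join(tempfile.mkdtemp(), 'foo.csv')
with open(path, 'wb') as f:
    f.write(codecs.BOM_UTF8 + b'a,b,c')

# 'utf-8-sig' decodes UTF-8 and skips a leading BOM if one is present.
# newline='' is what the csv documentation recommends for csv.reader input.
with open(path, encoding='utf-8-sig', newline='') as f:
    row = next(csv.reader(f))

print(row)
print(row == ['a', 'b', 'c'])
```

This only works when the caller controls how the file is opened and knows the 
encoding, which is why it is a workaround and not the behavior requested here.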

----------
components: Library (Lib)
messages: 233549
nosy: jdufresne
priority: normal
severity: normal
status: open
title: csv.reader does not handle BOM
type: behavior
versions: Python 3.5

_______________________________________
Python tracker <rep...@bugs.python.org>
<http://bugs.python.org/issue23178>
_______________________________________
