New submission from Jon Dufresne:

The following test script demonstrates that Python's csv library does not handle a BOM. I would expect the returned row to be equal to expected and to print 'True' to stdout.
In the wild, it is typical for other CSV writers to add a BOM. MS Excel is especially picky about the BOM when reading a UTF-8 encoded file, so many writers add a BOM for interoperability with MS Excel. If a Python program accepts a CSV file as input (often the case in web apps), these files will not be handled correctly without preprocessing. In my opinion, this should "just work" when reading the file.

---
import codecs
import csv

# Write a UTF-8 file that starts with a BOM, as many CSV writers do.
f = open('foo.csv', 'wb')
f.write(codecs.BOM_UTF8 + b'a,b,c')
f.close()

expected = ['a', 'b', 'c']

# Reopen in text mode with the default encoding and read the first row.
f = open('foo.csv')
r = csv.reader(f)
row = next(r)

print(row)
print(row == expected)
---

Output
---
$ ./python ~/test.py
['\ufeffa', 'b', 'c']
False
---

----------
components: Library (Lib)
messages: 233549
nosy: jdufresne
priority: normal
severity: normal
status: open
title: csv.reader does not handle BOM
type: behavior
versions: Python 3.5

_______________________________________
Python tracker <rep...@bugs.python.org>
<http://bugs.python.org/issue23178>
_______________________________________
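For reference, one way to do the preprocessing mentioned in the report is to open the file with the standard 'utf-8-sig' codec, which drops a leading BOM while decoding. This is only a sketch of a workaround, not the resolution of this issue; the helper name read_first_row and the temporary file location are made up for illustration, and it assumes the input really is UTF-8.

```python
import codecs
import csv
import os
import tempfile


def read_first_row(path):
    """Return the first CSV row of a UTF-8 file, ignoring a leading BOM.

    Hypothetical helper for illustration; assumes the file is UTF-8.
    """
    # 'utf-8-sig' decodes UTF-8 and skips a BOM at the start if present.
    # newline='' is what the csv module documentation recommends for readers.
    with open(path, encoding='utf-8-sig', newline='') as f:
        return next(csv.reader(f))


# Write the same BOM-prefixed bytes as in the report, to a temporary file.
tmpdir = tempfile.mkdtemp()
path = os.path.join(tmpdir, 'foo.csv')
with open(path, 'wb') as f:
    f.write(codecs.BOM_UTF8 + b'a,b,c')

row = read_first_row(path)
print(row)                     # ['a', 'b', 'c']
print(row == ['a', 'b', 'c'])  # True
```

Files without a BOM decode the same way under 'utf-8-sig', so the same call can be used for both kinds of input. It does not help with files in other encodings.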