Guido van Rossum <guido <at> python.org> writes:
>
> On Thu, Jan 7, 2010 at 10:12 PM, Tres Seaver <tseaver <at> palladion.com> wrote:
> > The BOM should not be seekable if the file is opened with the proposed
> > "guess encoding from BOM" mode: it isn't properly part of the stream at
> > all in that case.
>
> This feels about right to me. There are still questions though:
> immediately after opening a file with a BOM, what should .tell()
> return?

tell() in the context of text I/O is specified to return an "opaque
cookie". So whatever value it returns would probably be fine, as long as
seeking to that value leaves the file in an acceptable state.

Rewinding (seeking to 0) in the presence of a BOM is already reasonably
supported by the TextIOWrapper object:

>>> import codecs, io
>>> dec = codecs.getincrementaldecoder('utf-16')()
>>> dec.decode(b'\xff\xfea\x00b\x00')
'ab'
>>> dec.decode(b'\xff\xfea\x00b\x00')
'\ufeffab'
>>>
>>> bio = io.BytesIO(b'\xff\xfea\x00b\x00')
>>> f = io.TextIOWrapper(bio, encoding='utf-16')
>>> f.read()
'ab'
>>> f.seek(0)
0
>>> f.read()
'ab'

There are tests for this in test_io.py (test_encoded_writes, line 1929,
and test_append_bom and test_seek_bom, line 2045).

Regards

Antoine.
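
For anyone who wants to try the "opaque cookie" behaviour as a script
rather than at the prompt, here is a minimal sketch. It assumes a recent
CPython 3.x; the variable names (data, start, cookie and so on) are
purely illustrative and not taken from test_io.py. It deliberately checks
only what is read back after each seek, never the numeric value of a
cookie, since that value is not meant to be interpreted.

```python
import codecs
import io

# UTF-16-LE BOM followed by "ab": b'\xff\xfea\x00b\x00'
data = codecs.BOM_UTF16_LE + "ab".encode("utf-16-le")

# A bare incremental decoder consumes the BOM only once. Feeding it the
# same bytes again makes the second BOM come out as U+FEFF.
dec = codecs.getincrementaldecoder("utf-16")()
first_pass = dec.decode(data)
second_pass = dec.decode(data)

# TextIOWrapper handles rewinding itself.
f = io.TextIOWrapper(io.BytesIO(data), encoding="utf-16")

start = f.tell()          # cookie taken right after opening
text = f.read()

f.seek(start)             # seek back to that cookie
reread = f.read()

f.seek(0)                 # plain rewind
rewound = f.read()

# A cookie taken mid-stream should also be usable with seek().
f.seek(0)
head = f.read(1)
cookie = f.tell()
tail = f.read()
f.seek(cookie)
tail_again = f.read()

print(repr(first_pass), repr(second_pass))
print(repr(text), repr(reread), repr(rewound))
print(repr(head), repr(tail), repr(tail_again))
```

The point of the sketch is only that every value handed back by tell()
can be passed to seek() and leaves the stream in a sane state, whatever
the cookie happens to contain.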