Martin Panter <vadmium...@gmail.com> added the comment:

I suspect this is caused by TextIOWrapper guessing whether it is writing at the start of a file or in the middle, and being confused by “seekable” returning False. GzipFile implements some “seek” calls in write mode, but LZMAFile and BZ2File do not.
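The difference between the three classes can be checked directly. The following is only a minimal sketch, not part of the original comment: the helper name write_mode_seekable is made up for illustration, and each file object is opened for writing over an in-memory BytesIO.

```python
import bz2
import gzip
import io
import lzma


def write_mode_seekable():
    """Return what seekable() reports for each class when opened for writing.

    Hypothetical helper for illustration only.
    """
    factories = [
        ("GzipFile", lambda raw: gzip.GzipFile(fileobj=raw, mode="wb")),
        ("BZ2File", lambda raw: bz2.BZ2File(raw, "wb")),
        ("LZMAFile", lambda raw: lzma.LZMAFile(raw, "wb")),
    ]
    results = {}
    for name, factory in factories:
        raw = io.BytesIO()  # the underlying stream itself is seekable
        with factory(raw) as f:
            results[name] = f.seekable()
    return results


if __name__ == "__main__":
    for name, seekable in write_mode_seekable().items():
        print(name, "seekable in write mode ->", seekable)
```

Only GzipFile should report itself as seekable here, so it is the only one of the three for which TextIOWrapper goes on to call “tell”.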
Using this test class:

import _pyio
import io
from io import BufferedIOBase

class Writer(BufferedIOBase):
    def writable(self):
        return True
    def __init__(self, offset):
        self.offset = offset
    def seekable(self):
        result = self.offset is not None
        print('seekable ->', result)
        return result
    def tell(self):
        print('tell ->', self.offset)
        return self.offset
    def write(self, data):
        print('write', repr(data))

a BOM is inserted when “tell” returns zero:

>>> t = io.TextIOWrapper(Writer(0), 'utf-16')
seekable -> True
tell -> 0
>>> t.write('HI'); t.flush()  # Writes BOM
2
write b'\xff\xfeH\x00I\x00'

and not when “tell” returns a positive number:

>>> t = io.TextIOWrapper(Writer(1), 'utf-16')
seekable -> True
tell -> 1
>>> t.write('HI'); t.flush()  # Omits BOM
2
write b'H\x00I\x00'

However the “io” and “_pyio” behaviours differ when “seekable” returns False:

>>> t = io.TextIOWrapper(Writer(None), 'utf-16')
seekable -> False
>>> t.write('HI'); t.flush()  # io omits BOM
2
write b'H\x00I\x00'
>>> t = _pyio.TextIOWrapper(Writer(None), 'utf-16')
seekable -> False
>>> t.write('HI'); t.flush()  # _pyio writes BOM
write b'\xff\xfeH\x00I\x00'
2

IMO the “_pyio” behaviour is more sensible: write a BOM because that’s what the UTF-16 codec produces.

----------
nosy: +martin.panter

_______________________________________
Python tracker <rep...@bugs.python.org>
<https://bugs.python.org/issue36304>
_______________________________________