New submission from STINNER Victor <victor.stin...@haypocalc.com>:

Python 2.x allows compressing any byte string (str) and any ASCII-only unicode string (unicode):
$ python
Python 2.5.1 (r251:54863, Jul 31 2008, 23:17:40)
>>> import zlib
>>> zlib.compress('abc')
"x\x9cKLJ\x06\x00\x02M\x01'"
>>> zlib.compress(u'abc')
"x\x9cKLJ\x06\x00\x02M\x01'"
>>> zlib.compress(u'abc\xe9')
...
UnicodeEncodeError: 'ascii' codec can't encode character u'\xe9' ...

I'm not sure that this behaviour was really wanted, because the decompress operation is not symmetric (the result type is always a byte string):

$ python
Python 2.5.1 (r251:54863, Jul 31 2008, 23:17:40)
>>> import zlib
>>> zlib.decompress("x\x9cKLJ\x06\x00\x02M\x01'")
'abc'

---

Python 3.0 accepts any string, bytes or characters, but decompress always produces a byte string:

$ ./python
Python 3.1a0 (py3k:67926M, Dec 26 2008, 23:59:07)
>>> import zlib
>>> zlib.compress(b'abc')
b"x\x9cKLJ\x06\x00\x02M\x01'"
>>> zlib.compress('abc')
b"x\x9cKLJ\x06\x00\x02M\x01'"
>>> zlib.compress('abc\xe9')
b'x\x9cKLJ>\xbc\x12\x00\x06\xca\x02\x93'
>>> zlib.compress('abc\xe9'.encode('utf-8'))
b'x\x9cKLJ>\xbc\x12\x00\x06\xca\x02\x93'
>>> zlib.decompress(b'x\x9cKLJ>\xbc\x12\x00\x06\xca\x02\x93')
b'abc\xc3\xa9'

The strangest operation is the decompression of a unicode string:

$ ./python
>>> zlib.decompress('x\x9cKLJ>\xbc\x12\x00\x06\xca\x02\x93')
...
zlib.error: Error -3 while decompressing data: incorrect header check

---

I propose to change the zlib API to reject unicode strings and to require explicit conversion to/from bytes. Functions/methods:
 - compress(bytes, ...)
 - decompress(bytes, ...)
 - <compress object>.compress(bytes, ...)
 - <decompress object>.decompress(bytes, ...)
 - crc32(bytes, value=0)
 - adler32(bytes, value=1)

Note: binascii.crc32() already rejects unicode strings.

The current behaviour may be kept in Python 3.0.x and only changed in Python 3.1.
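To illustrate what "explicit conversion to/from bytes" would look like for callers, here is a minimal sketch. It assumes a Python 3 interpreter and UTF-8 as the text encoding; the helper names compress_text and decompress_text are hypothetical and are not part of zlib or of the attached patch.

```python
import zlib


def compress_text(text, encoding="utf-8"):
    """Hypothetical helper: encode the text explicitly, then compress the bytes."""
    return zlib.compress(text.encode(encoding))


def decompress_text(data, encoding="utf-8"):
    """Hypothetical helper: decompress to bytes, then decode explicitly."""
    return zlib.decompress(data).decode(encoding)


original = "abc\xe9"
packed = compress_text(original)      # always bytes
restored = decompress_text(packed)    # back to a character string

# The checksum functions take the same explicitly encoded bytes.
payload = original.encode("utf-8")
crc = zlib.crc32(payload)
adler = zlib.adler32(payload)

# Current CPython 3 releases raise TypeError when given a character string.
try:
    zlib.compress("abc")
    rejects_str = False
except TypeError:
    rejects_str = True
```

With the encoding chosen by the caller, the round trip is symmetric: the text that goes in is the text that comes out, and the compressed form is always bytes.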
----------
components: Extension Modules
files: zlib_bytes.patch
keywords: patch
messages: 78356
nosy: haypo
severity: normal
status: open
title: reject unicode in zlib
type: behavior
versions: Python 3.0, Python 3.1
Added file: http://bugs.python.org/file12472/zlib_bytes.patch

_______________________________________
Python tracker <rep...@bugs.python.org>
<http://bugs.python.org/issue4757>
_______________________________________