New submission from Antoine Pitrou:
In Python 3.2:
>>> pickle.dumps(b'xyz', protocol=2)
b'\x80\x02c__builtin__\nbytes\nq\x00]q\x01(KxKyKze\x85q\x02Rq\x03.'
In Python 2.7:
>>> pickle.loads(b'\x80\x02c__builtin__\nbytes\nq\x00]q\x01(KxKyKze\x85q\x02Rq\x03.')
'[120, 121, 122]'
The problem is that the pickle passes a list of ints to the bytes()
constructor. Under 2.x, where bytes is an alias of str, that call returns the
repr of the list instead of the original byte string:
>>> pickletools.dis(pickle.dumps(b'xyz', protocol=2))
0: \x80 PROTO 2
2: c GLOBAL '__builtin__ bytes'
21: q BINPUT 0
23: ] EMPTY_LIST
24: q BINPUT 1
26: ( MARK
27: K BININT1 120
29: K BININT1 121
31: K BININT1 122
33: e APPENDS (MARK at 26)
34: \x85 TUPLE1
35: q BINPUT 2
37: R REDUCE
38: q BINPUT 3
40: . STOP
highest protocol among opcodes = 2
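The difference in constructor semantics can be reproduced without a 2.x interpreter. The following is a minimal sketch for Python 3, assuming that str() is an acceptable stand-in for the 2.x bytes alias; the variable names are illustrative only.

```python
# The single argument that the protocol-2 pickle hands to the constructor
# via REDUCE: a list of the integer byte values.
args = ([120, 121, 122],)

# Python 3: bytes(list_of_ints) builds a byte string from the values.
py3_result = bytes(*args)

# Python 2: bytes is str, so the same call returns the repr of the list.
# str() is used here as a stand-in for the 2.x alias (an assumption made
# for illustration, this sketch runs under Python 3).
py2_like_result = str(*args)

print(py3_result)       # b'xyz'
print(py2_like_result)  # [120, 121, 122]
```

The second result matches the string that pickle.loads() returned under 2.7 in the report above.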
Bytearray objects use a different trick: they pass a (unicode string, encoding)
pair, which has the same constructor semantics under 2.x and 3.x. This encoding
is also more compact on average: a list of 1-byte ints takes 2 bytes per
encoded byte (the BININT1 opcode plus the value), while a latin-1 string
transcoded to UTF-8 (BINUNICODE uses UTF-8) takes on average 1.5 bytes per
encoded byte, assuming a 50% probability of bytes above 127.
>>> pickletools.dis(pickle.dumps(bytearray(b'xyz'), protocol=2))
0: \x80 PROTO 2
2: c GLOBAL '__builtin__ bytearray'
25: q BINPUT 0
27: X BINUNICODE 'xyz'
35: q BINPUT 1
37: X BINUNICODE 'latin-1'
49: q BINPUT 2
51: \x86 TUPLE2
52: q BINPUT 3
54: R REDUCE
55: q BINPUT 4
57: . STOP
highest protocol among opcodes = 2
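Both claims about the bytearray approach can be checked with a short script. This is a sketch for Python 3 only: it shows that the (text, encoding) pair rebuilds the original data, and it estimates the payload size under the assumption that every byte value is equally likely. The variable names are illustrative.

```python
# Constructor arguments stored in the bytearray pickle: (text, encoding).
text, encoding = 'xyz', 'latin-1'

# The constructor encodes the text rather than iterating over integers.
as_bytearray = bytearray(text, encoding)

# Latin-1 maps code points 0..255 one-to-one onto byte values, so any
# byte string survives the decode/encode round trip.
all_bytes = bytes(range(256))
round_trip = bytearray(all_bytes.decode('latin-1'), 'latin-1')

# Size estimate with each of the 256 byte values occurring once:
# values 0..127 take 1 byte in UTF-8, values 128..255 take 2 bytes.
utf8_payload = all_bytes.decode('latin-1').encode('utf-8')
utf8_per_byte = len(utf8_payload) / len(all_bytes)

# BININT1 is one opcode byte ('K') followed by one value byte.
binint1_per_byte = 2

print(as_bytearray)                     # bytearray(b'xyz')
print(round_trip == bytearray(all_bytes))  # True
print(utf8_per_byte, binint1_per_byte)  # 1.5 2
```

The 1.5 figure only holds for uniformly distributed byte values; pure ASCII data costs 1 byte per byte, and data made up entirely of values above 127 costs 2.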
----------
components: Library (Lib)
messages: 148635
nosy: alexandre.vassalotti, irmen, pitrou
priority: high
severity: normal
status: open
title: Bytes objects pickled in 3.x with protocol <=2 are unpickled incorrectly in 2.x
type: behavior
versions: Python 3.2, Python 3.3
_______________________________________
Python tracker
<http://bugs.python.org/issue13505>
_______________________________________