New submission from Serhiy Storchaka:

Python 2 allows pickling and unpickling non-ASCII persistent ids. In Python 3, 
the C implementation of pickle saves persistent ids with protocol version 0 as 
UTF-8-encoded strings and loads them as bytes.

>>> import pickle, io
>>> class MyPickler(pickle.Pickler):
...     def persistent_id(self, obj):
...         if isinstance(obj, str):
...             return obj
...         return None
... 
>>> class MyUnpickler(pickle.Unpickler):
...     def persistent_load(self, pid):
...         return pid
... 
>>> f = io.BytesIO(); MyPickler(f).dump('\u20ac'); data = f.getvalue()
>>> MyUnpickler(io.BytesIO(data)).load()
'€'
>>> f = io.BytesIO(); MyPickler(f, 0).dump('\u20ac'); data = f.getvalue()
>>> MyUnpickler(io.BytesIO(data)).load()
b'\xe2\x82\xac'
>>> f = io.BytesIO(); MyPickler(f, 0).dump('a'); data = f.getvalue()
>>> MyUnpickler(io.BytesIO(data)).load()
b'a'

The pure-Python implementation in Python 3 doesn't work with non-ASCII 
persistent ids at all.
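
Until this is fixed, one possible workaround is to keep the persistent id 
ASCII-only on the wire and to accept either bytes or str when loading. The 
sketch below is only an illustration, not a proposed patch: the class names 
AsciiPidPickler and AsciiPidUnpickler and the roundtrip() helper are 
hypothetical, and the choice of the unicode_escape codec is an assumption of 
mine.

```python
import io
import pickle


class AsciiPidPickler(pickle.Pickler):
    """Hypothetical pickler that emits ASCII-only persistent ids."""

    def persistent_id(self, obj):
        if isinstance(obj, str):
            # Escape non-ASCII characters, backslashes and newlines so the
            # id survives the line-based protocol 0 encoding.
            return obj.encode('unicode_escape').decode('ascii')
        return None


class AsciiPidUnpickler(pickle.Unpickler):
    """Hypothetical unpickler that reverses the escaping above."""

    def persistent_load(self, pid):
        # Protocol 0 may hand back bytes (as shown in this report) or str,
        # depending on the implementation; normalize to str first.
        if isinstance(pid, bytes):
            pid = pid.decode('ascii')
        return pid.encode('ascii').decode('unicode_escape')


def roundtrip(obj, protocol):
    """Pickle obj with the given protocol and load it back."""
    f = io.BytesIO()
    AsciiPidPickler(f, protocol).dump(obj)
    return AsciiPidUnpickler(io.BytesIO(f.getvalue())).load()
```

With these classes, roundtrip('\u20ac', 0) should give back the original str 
instead of bytes, and the same holds for the binary protocols.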

----------
components: Extension Modules, Library (Lib)
messages: 186705
nosy: alexandre.vassalotti, pitrou, serhiy.storchaka
priority: normal
severity: normal
stage: needs patch
status: open
title: Persistent id in pickle with protocol version 0
type: behavior
versions: Python 3.3, Python 3.4

_______________________________________
Python tracker <rep...@bugs.python.org>
<http://bugs.python.org/issue17711>
_______________________________________