On Tue, 25 Aug 2015 19:12:30 +0300
Pauli Virtanen <p...@iki.fi> wrote:

> 25.08.2015, 01:15, Chris Laumann wrote:
> > Would it be possible then (in relatively short order) to create
> > a py2 -> py3 numpy pickle converter? 
> 
> You probably need to modify the pickle stream directly, replacing
> *STRING opcodes with *BYTES opcodes when it comes to objects that are
> needed for constructing Numpy arrays.
> 
> https://hg.python.org/cpython/file/tip/Modules/_pickle.c#l82
> 
> Or, use a custom pickler class that emits the new opcodes when it comes
> to data that is part of Numpy arrays, as Python 2 pickler doesn't know
> how to write bytes opcodes.
> 
> It's probably doable, although likely annoying to implement. The pickles
> created won't be loadable on Py2, only Py3.
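
To make the opcode-replacement idea concrete, here is a rough sketch
using only the standard library. The helper name rewrite_string_opcodes
and the hand-written sample stream are assumptions for illustration. It
blindly rewrites every SHORT_BINSTRING/BINSTRING opcode, whereas a real
converter would have to restrict itself to the strings that belong to
Numpy arrays, and it does not handle the text-mode STRING opcode.

```python
import pickle
import pickletools

# Opcode pairs with an identical length-prefix layout, so only the
# opcode byte itself has to change.
STRING_TO_BYTES = {
    "SHORT_BINSTRING": pickle.SHORT_BINBYTES,
    "BINSTRING": pickle.BINBYTES,
}

def rewrite_string_opcodes(stream):
    """Hypothetical helper: swap *STRING opcodes for *BYTES opcodes."""
    out = bytearray(stream)
    for opcode, arg, pos in pickletools.genops(stream):
        replacement = STRING_TO_BYTES.get(opcode.name)
        if replacement is not None:
            out[pos] = replacement[0]
    return bytes(out)

# Hand-written protocol 2 stream of the kind Python 2 emits for the
# str '\x00\xff\x01': PROTO 2, SHORT_BINSTRING, BINPUT 0, STOP.
py2_stream = b"\x80\x02U\x03\x00\xff\x01q\x00."

converted = rewrite_string_opcodes(py2_stream)
value = pickle.loads(converted)
print(value)  # b'\x00\xff\x01'
```

As noted above, the converted stream uses opcodes that Python 2 does
not know, so it only loads on Python 3.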

One could take a look at how the built-in bytearray type achieves
pickle compatibility between 2.x and 3.x. The solution is to serialize
the binary data as a latin-1 decoded unicode string, and to return the
right reconstructor from __reduce__.

The solution is less space-efficient than pure bytes pickling, since
the unicode string is serialized as utf-8 (so bytes >= 0x80 are
multibyte-encoded). There's also some CPU overhead, due to the
successive decoding and encoding steps.
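
A minimal sketch of the idea, written for Python 3. The helper
reduce_as_latin1 is hypothetical and only mimics the shape of a
__reduce__ result; it is not the actual CPython code.

```python
import pickle

def reduce_as_latin1(data):
    """Hypothetical helper returning a (callable, args) pair.

    The binary payload travels as a unicode string, and the
    reconstructor encodes it back with the same codec.
    """
    # latin-1 maps each byte 0x00-0xFF to the code point of equal value.
    text = data.decode("latin-1")
    return (bytearray, (text, "latin-1"))

payload = bytes(range(256))
func, args = reduce_as_latin1(payload)

# The arguments are plain unicode strings, so they round-trip through
# a protocol 2 pickle, which has no bytes opcodes.
restored_args = pickle.loads(pickle.dumps(args, protocol=2))
rebuilt = func(*restored_args)
assert bytes(rebuilt) == payload

# Space overhead: the 128 bytes >= 0x80 each take two bytes in utf-8.
utf8_size = len(args[0].encode("utf-8"))
print(len(payload), utf8_size)  # 256 384
```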

You can take a look at the bytearray_reduce() function in
Objects/bytearrayobject.c, both for 2.x and 3.x.

(also note how the 3.x version does it only for protocols < 3, to
achieve better efficiency on newer protocol versions)


Another possibility would be a custom Unpickler class for 3.x, dealing
specifically with 2.x-produced Numpy array pickles. That way the
pickles themselves could be cross-version.
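
A rough sketch of what such an Unpickler could look like, using only
the standard library. The class name Py2CompatUnpickler and the
hand-written sample stream are assumptions for illustration, and no
Numpy-specific handling is implemented; find_class is only marked as
the place where it could go.

```python
import io
import pickle

class Py2CompatUnpickler(pickle.Unpickler):
    """Hypothetical 3.x Unpickler for pickles written by 2.x."""

    def __init__(self, file):
        # encoding="bytes" makes 2.x str objects load as bytes
        # instead of being decoded as ASCII text.
        super().__init__(file, encoding="bytes")

    def find_class(self, module, name):
        # Hook point: reconstructors used by Numpy array pickles
        # could be intercepted here and given special treatment.
        return super().find_class(module, name)

# Hand-written protocol 2 stream of the kind Python 2 emits for the
# str '\x00\xff\x01': PROTO 2, SHORT_BINSTRING, BINPUT 0, STOP.
py2_stream = b"\x80\x02U\x03\x00\xff\x01q\x00."

loaded = Py2CompatUnpickler(io.BytesIO(py2_stream)).load()
print(loaded)  # b'\x00\xff\x01'

# The default 3.x settings decode 2.x strings as ASCII and fail here.
try:
    pickle.loads(py2_stream)
    default_failed = False
except UnicodeDecodeError:
    default_failed = True
```

Note that encoding="bytes" applies to every 2.x string in the stream,
not only to array data, so a real implementation would need to be more
selective.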

Regards

Antoine.

