On Tue, 25 Aug 2015 19:12:30 +0300 Pauli Virtanen <p...@iki.fi> wrote:
> 25.08.2015, 01:15, Chris Laumann wrote:
> > Would it be possible then (in relatively short order) to create
> > a py2 -> py3 numpy pickle converter?
>
> You probably need to modify the pickle stream directly, replacing
> *STRING opcodes with *BYTES opcodes when it comes to objects that are
> needed for constructing Numpy arrays.
>
> https://hg.python.org/cpython/file/tip/Modules/_pickle.c#l82
>
> Or, use a custom pickler class that emits the new opcodes when it comes
> to data that is part of Numpy arrays, as the Python 2 pickler doesn't
> know how to write bytes opcodes.
>
> It's probably doable, although likely annoying to implement. The pickles
> created won't be loadable on Py2, only Py3.

One could take a look at how the built-in bytearray type achieves pickle
compatibility between 2.x and 3.x. The solution is to serialize the
binary data as a latin-1 decoded unicode string, and to return the right
reconstructor from __reduce__.

This approach is less space-efficient than pure bytes pickling, since the
unicode string is serialized as utf-8 (so bytes >= 0x80 are
multibyte-encoded). There's also some CPU overhead, due to the successive
decoding and encoding steps.

You can take a look at the bytearray_reduce() function in
Objects/bytearrayobject.c, both for 2.x and 3.x. (Also note how the 3.x
version does it only for protocols < 3, to achieve better efficiency on
newer protocol versions.)

Another possibility would be a custom Unpickler class for 3.x, dealing
specifically with 2.x-produced Numpy array pickles. That way the pickles
themselves could be cross-version.

Regards

Antoine.
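
To make the two ideas above concrete, here is a rough sketch for Python 3 that uses only the standard library. The first part inspects what bytearray returns from __reduce_ex__ for an old and a new protocol and compares the resulting pickle sizes. The second part reads a hand-written protocol 2 stream containing a Python 2 style 8-bit string. The names `opcode_names`, `load_py2_pickle` and `PY2_STREAM` are made up for this illustration, and the second part only shows the stock `encoding` argument of `pickle.Unpickler`; it is not the Numpy-specific Unpickler subclass suggested above.

```python
import io
import pickle
import pickletools


def opcode_names(stream):
    """Return the set of opcode names that appear in a pickle stream."""
    return {op.name for op, arg, pos in pickletools.genops(stream)}


# Part 1: how bytearray stays loadable on 2.x for protocols < 3.
payload = bytearray(range(256))

# For protocol 2 the reduce tuple carries a latin-1 decoded str plus the
# encoding name, so bytearray(text, "latin-1") rebuilds the data.
reconstructor, args = payload.__reduce_ex__(2)[:2]
text, encoding = args
rebuilt = reconstructor(text, encoding)

# For protocol 3 the reduce tuple carries the raw bytes instead.
modern_args = payload.__reduce_ex__(3)[1]

legacy = pickle.dumps(payload, protocol=2)
modern = pickle.dumps(payload, protocol=3)

# The unicode string is written as utf-8: the 128 code points below 0x80
# take one byte each, the 128 code points from 0x80 up take two.
utf8_size = len(text.encode("utf-8"))  # 128 + 2 * 128 == 384


# Part 2: reading a Python 2 8-bit string on Python 3.
def load_py2_pickle(raw, encoding="bytes"):
    """Hypothetical helper: unpickle `raw`, mapping *STRING opcodes
    through `encoding` ("bytes" keeps them as bytes objects)."""
    return pickle.Unpickler(io.BytesIO(raw), encoding=encoding).load()


# Hand-written protocol 2 stream: SHORT_BINSTRING holding the three
# bytes ff 00 01, comparable to pickling the str '\xff\x00\x01' on 2.x.
PY2_STREAM = b"\x80\x02U\x03\xff\x00\x01q\x00."

if __name__ == "__main__":
    print(sorted(opcode_names(legacy)))
    print(len(legacy), len(modern), utf8_size)
    print(load_py2_pickle(PY2_STREAM))
```

With the default `encoding="ASCII"` the same stream fails to load because of the 0xff byte, which is the underlying problem with 2.x-produced array pickles. Passing `encoding="latin-1"` instead yields a str whose code points match the original byte values.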