Hi there,

I have written a Go package [1] that can read and write simple arrays in the NumPy file format [2]. When I wrote it, it targeted simple interoperability use cases, but now people would like to be able to read back ragged arrays [3].
Unless I am mistaken, this means I need to interpret pieces of pickled data (`ndarray`, `multiarray` and `dtype`).

So I am trying to understand how to unpickle `dtype` values that have been pickled:

```python
import numpy as np
import pickle
import pickletools as pt

pt.dis(pickle.dumps(np.dtype("int32"), protocol=4), annotate=True)
```

gives:

```
    0: \x80 PROTO 4                   Protocol version indicator.
    2: \x95 FRAME 55                  Indicate the beginning of a new frame.
   11: \x8c SHORT_BINUNICODE 'numpy'  Push a Python Unicode string object.
   18: \x94 MEMOIZE (as 0)            Store the stack top into the memo. The stack is not popped.
   19: \x8c SHORT_BINUNICODE 'dtype'  Push a Python Unicode string object.
   26: \x94 MEMOIZE (as 1)            Store the stack top into the memo. The stack is not popped.
   27: \x93 STACK_GLOBAL              Push a global object (module.attr) on the stack.
   28: \x94 MEMOIZE (as 2)            Store the stack top into the memo. The stack is not popped.
   29: \x8c SHORT_BINUNICODE 'i4'     Push a Python Unicode string object.
   33: \x94 MEMOIZE (as 3)            Store the stack top into the memo. The stack is not popped.
   34: \x89 NEWFALSE                  Push False onto the stack.
   35: \x88 NEWTRUE                   Push True onto the stack.
   36: \x87 TUPLE3                    Build a three-tuple out of the top three items on the stack.
   37: \x94 MEMOIZE (as 4)            Store the stack top into the memo. The stack is not popped.
   38: R    REDUCE                    Push an object built from a callable and an argument tuple.
   39: \x94 MEMOIZE (as 5)            Store the stack top into the memo. The stack is not popped.
   40: (    MARK                      Push markobject onto the stack.
   41: K    BININT1 3                 Push a one-byte unsigned integer.
   43: \x8c SHORT_BINUNICODE '<'      Push a Python Unicode string object.
   46: \x94 MEMOIZE (as 6)            Store the stack top into the memo. The stack is not popped.
   47: N    NONE                      Push None on the stack.
   48: N    NONE                      Push None on the stack.
   49: N    NONE                      Push None on the stack.
   50: J    BININT -1                 Push a four-byte signed integer.
   55: J    BININT -1                 Push a four-byte signed integer.
   60: K    BININT1 0                 Push a one-byte unsigned integer.
   62: t    TUPLE (MARK at 40)        Build a tuple out of the topmost stack slice, after markobject.
   63: \x94 MEMOIZE (as 7)            Store the stack top into the memo. The stack is not popped.
   64: b    BUILD                     Finish building an object, via __setstate__ or dict update.
   65: .    STOP                      Stop the unpickling machine.
highest protocol among opcodes = 4
```

I have tried to find the usual `__reduce__` and `__setstate__` methods to understand what the various arguments are, to no avail.

So, in:

```python
>>> np.dtype("int32").__reduce__()[1]
('i4', False, True)
>>> np.dtype("int32").__reduce__()[2]
(3, '<', None, None, None, -1, -1, 0)
```

what is the meaning of the various arguments?

Thanks in advance,
sebastien.

[1] https://github.com/sbinet/npyio
[2] https://numpy.org/neps/nep-0001-npy-format.html
[3] https://github.com/sbinet/npyio/issues/20
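
For reference, here is a minimal sketch of what a reader that cannot import NumPy (such as a Go package) would have to do with the opcode stream shown above. It is an illustration only, not how npyio works: `DATA` and `decode` are hypothetical names, the bytes are transcribed by hand from the disassembly, only the opcodes that appear in this particular stream are handled, and the memo is ignored because nothing here reads it back. It does not answer what the individual arguments mean; it only shows where they end up.

```python
import pickletools

# The 66-byte pickle from the disassembly above, transcribed by hand.
DATA = (
    b"\x80\x04\x95\x37\x00\x00\x00\x00\x00\x00\x00"   # PROTO 4, FRAME 55
    b"\x8c\x05numpy\x94\x8c\x05dtype\x94\x93\x94"     # global numpy.dtype
    b"\x8c\x02i4\x94\x89\x88\x87\x94R\x94"            # ('i4', False, True), REDUCE
    b"(K\x03\x8c\x01<\x94NNNJ\xff\xff\xff\xffJ\xff\xff\xff\xffK\x00t\x94b."
)

MARK = object()  # sentinel standing in for the pickle "markobject"


def decode(data):
    """Run a tiny stack machine over the opcodes, without importing numpy."""
    stack = []
    for op, arg, _pos in pickletools.genops(data):
        name = op.name
        if name in ("PROTO", "FRAME", "MEMOIZE"):
            continue  # framing and memo writes do not change the stack
        elif name in ("SHORT_BINUNICODE", "BININT", "BININT1"):
            stack.append(arg)
        elif name == "NONE":
            stack.append(None)
        elif name == "NEWTRUE":
            stack.append(True)
        elif name == "NEWFALSE":
            stack.append(False)
        elif name == "MARK":
            stack.append(MARK)
        elif name == "STACK_GLOBAL":
            attr = stack.pop()
            module = stack.pop()
            stack.append(("global", module, attr))  # record it, do not import it
        elif name == "TUPLE3":
            items = tuple(stack[-3:])
            del stack[-3:]
            stack.append(items)
        elif name == "TUPLE":
            idx = max(i for i, v in enumerate(stack) if v is MARK)
            items = tuple(stack[idx + 1:])
            del stack[idx:]
            stack.append(items)
        elif name == "REDUCE":
            args = stack.pop()
            func = stack.pop()
            stack.append({"callable": func, "args": args, "state": None})
        elif name == "BUILD":
            state = stack.pop()
            stack[-1]["state"] = state  # what __setstate__ would receive
        elif name == "STOP":
            return stack.pop()
        else:
            raise ValueError(f"opcode not handled in this sketch: {name}")
    raise ValueError("pickle stream ended without STOP")


print(decode(DATA))
```

The `args` and `state` entries of the result are the same two tuples returned by `__reduce__()[1]` and `__reduce__()[2]` above: `REDUCE` consumes the first one as constructor arguments, and `BUILD` hands the second one to `__setstate__`.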