[ https://issues.apache.org/jira/browse/ARROW-8122?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17060752#comment-17060752 ]
Joris Van den Bossche commented on ARROW-8122:
----------------------------------------------

Thanks for the report and PR!

Related issue (reported as serialization of empty dataframe): ARROW-7996

> [Python] Empty numpy arrays with shape cannot be deserialized
> -------------------------------------------------------------
>
>                 Key: ARROW-8122
>                 URL: https://issues.apache.org/jira/browse/ARROW-8122
>             Project: Apache Arrow
>          Issue Type: Bug
>          Components: Python
>    Affects Versions: 0.16.0
>            Reporter: Wenjun Si
>            Priority: Major
>              Labels: pull-request-available
>             Fix For: 0.17.0
>
>          Time Spent: 10m
>  Remaining Estimate: 0h
>
> In PyArrow 0.16.0, deserializing a serialized empty NumPy array that has a
> shape, for instance np.array([[], []]), raises ArrowInvalid.
> Code reproducing this error:
> {code:python}
> import numpy as np
> import pyarrow
> arr = np.array([[], []])
> pyarrow.deserialize(pyarrow.serialize(arr).to_buffer())  # this line raises ArrowInvalid
> {code}
> and the traceback is:
> {code:python}
> Traceback (most recent call last):
>   File "/Users/wenjun/miniconda3/lib/python3.7/site-packages/IPython/core/interactiveshell.py", line 3326, in run_code
>     exec(code_obj, self.user_global_ns, self.user_ns)
>   File "<ipython-input-4-0ace9226dd72>", line 1, in <module>
>     pyarrow.deserialize(pyarrow.serialize(arr).to_buffer())
>   File "pyarrow/serialization.pxi", line 476, in pyarrow.lib.deserialize
>   File "pyarrow/serialization.pxi", line 438, in pyarrow.lib.deserialize_from
>   File "pyarrow/serialization.pxi", line 414, in pyarrow.lib.read_serialized
>   File "pyarrow/error.pxi", line 84, in pyarrow.lib.check_status
> pyarrow.lib.ArrowInvalid: strides must not involve buffer over run
> {code}
> The same code works in PyArrow 0.15.x.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
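
For anyone who has to stay on 0.16.0, one possible stopgap is to round-trip such arrays through NumPy's own .npy format instead of pyarrow.serialize. This is only a sketch and an assumption, not something proposed in the issue or the PR. The helper name roundtrip_npy is hypothetical; np.save and np.load are the standard NumPy calls. It shows that the shape (2, 0) of the empty array from the report survives the round trip:

```python
import io

import numpy as np


def roundtrip_npy(arr):
    """Hypothetical helper: write `arr` to an in-memory .npy blob and read it back."""
    buf = io.BytesIO()
    # allow_pickle=False restricts the payload to plain array data (no Python objects)
    np.save(buf, arr, allow_pickle=False)
    buf.seek(0)
    return np.load(buf, allow_pickle=False)


# Same array as in the report: no elements, but a two-dimensional shape
arr = np.array([[], []])
restored = roundtrip_npy(arr)

# Shape and dtype are stored in the .npy header, so they are kept even with zero elements
print(restored.shape, restored.dtype)
```

The .npy header records shape and dtype explicitly, which is why a zero-element array does not lose its dimensions here. The trade-off is that this path only covers NumPy arrays and not the other Python objects that pyarrow.serialize handles.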