Benjamin created ARROW-7883:
-------------------------------

Summary: pyarrow-serialize-pandas-df-with-nullable-integer-type
Key: ARROW-7883
URL: https://issues.apache.org/jira/browse/ARROW-7883
Project: Apache Arrow
Issue Type: Bug
Reporter: Benjamin

Serializing an IntegerArray doesn't seem to work with the latest version of pandas and pyarrow

{code:java}
import pandas as pd
import pyarrow  # version 0.16
import pyarrow as pa

# workaround suggested in https://issues.apache.org/jira/browse/ARROW-5379
pd.arrays.IntegerArray.__arrow_array__ = lambda self, type: pyarrow.array(self._data, mask=self._mask, type=type)

df = pd.DataFrame([1, 2])
df = df.convert_dtypes()

# following https://arrow.apache.org/docs/python/ipc.html#serializing-pandas-objects
context = pa.default_serialization_context()
context.serialize(df)
{code}

{code:java}
SerializationCallbackError: pyarrow does not know how to serialize objects of type <class 'pandas.core.arrays.integer.IntegerArray'>
{code}

xref https://stackoverflow.com/q/60285486/2146052

--
This message was sent by Atlassian Jira
(v8.3.4#803005)