Li Jin created ARROW-1654:
-----------------------------

             Summary: [Python] pa.DataType cannot be pickled
                 Key: ARROW-1654
                 URL: https://issues.apache.org/jira/browse/ARROW-1654
             Project: Apache Arrow
          Issue Type: Improvement
            Reporter: Li Jin

In [26]: t
Out[26]: DataType(int64)

In [25]: pickle.dumps(t)
---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
<ipython-input-25-f90063f6658b> in <module>()
----> 1 pickle.dumps(t)

/home/icexelloss/miniconda3/envs/spark-dev/lib/python3.5/site-packages/pyarrow/lib.cpython-35m-x86_64-linux-gnu.so in pyarrow.lib.DataType.__reduce_cython__()

TypeError: no default __reduce__ due to non-trivial __cinit__

This was discovered when trying to send a pa.DataType along with a UDF in PySpark. The workaround is to send a PySpark DataType and convert it to a pa.DataType.

It would be nice to be able to pickle pa.DataType.

--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
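
The TypeError above says that no default __reduce__ is available for the type. As a rough illustration of what an explicit __reduce__ does, here is a minimal pure-Python sketch. FakeDataType and type_from_name are hypothetical stand-ins invented for this example; they are not pyarrow APIs, and this is not necessarily how the issue would be resolved in pyarrow itself.

```python
import pickle


class FakeDataType:
    """Hypothetical stand-in for a type such as pa.DataType.

    The object is fully described by its name, so that name is all
    pickle needs to carry across the wire.
    """

    def __init__(self, name):
        self.name = name

    def __eq__(self, other):
        return isinstance(other, FakeDataType) and self.name == other.name

    def __hash__(self):
        return hash(self.name)

    def __repr__(self):
        return "FakeDataType({})".format(self.name)

    def __reduce__(self):
        # Tell pickle to rebuild the object at load time by calling
        # type_from_name(self.name) instead of copying internal state.
        return (type_from_name, (self.name,))


def type_from_name(name):
    """Hypothetical module-level factory; pickle references it by name."""
    return FakeDataType(name)


t = FakeDataType("int64")
restored = pickle.loads(pickle.dumps(t))
print(restored)
```

The design choice in this sketch is to pickle a small description of the object (its name) plus a factory that can rebuild it, which mirrors the workaround described above of sending one representation and converting on the other side.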