GitHub user HyukjinKwon commented on a diff in the pull request:
https://github.com/apache/spark/pull/18659#discussion_r138858332
--- Diff: python/pyspark/serializers.py ---
@@ -199,6 +211,33 @@ def __repr__(self):
         return "ArrowSerializer"
+class ArrowPandasSerializer(ArrowSerializer):
+
+    def __init__(self):
+        super(ArrowPandasSerializer, self).__init__()
+
+    def dumps(self, series):
+        """
+        Make an ArrowRecordBatch from a Pandas Series and serialize
+        """
+        import pyarrow as pa
--- End diff --
Should we catch `ImportError`?
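One possible shape, as a rough sketch rather than a concrete proposal for this PR: wrap the import and re-raise with a message that names the feature needing the package. The helper name `require_module` and the message wording are hypothetical, not existing PySpark code.

```python
import importlib


def require_module(name, feature):
    """Import `name`, or raise an ImportError that says which feature needs it.

    Hypothetical helper for illustration only; not part of pyspark.
    """
    try:
        return importlib.import_module(name)
    except ImportError as e:
        # Re-raise the same exception type so existing callers that
        # catch ImportError keep working, but with a clearer message.
        raise ImportError("%s must be installed to use %s: %s" % (name, feature, e))


# Assumed usage inside dumps(), in place of the bare import:
#     pa = require_module("pyarrow", "ArrowPandasSerializer")
```

Re-raising `ImportError` rather than swallowing it keeps the failure visible while telling the user which package is missing and why it is needed.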
---