Github user BryanCutler commented on a diff in the pull request:
https://github.com/apache/spark/pull/19459#discussion_r149874063
--- Diff: python/pyspark/serializers.py ---
@@ -213,7 +213,15 @@ def __repr__(self):
return "ArrowSerializer"
-def _create_batch(series):
+def _create_batch(series, copy=False):
--- End diff --
Yeah, we don't want to end up copying twice when `copy=True`. Let me try
something; if it makes things too complicated, we can remove the copy flag
altogether and just rely on `fillna(0)` to always make a copy. Not ideal, but
simpler.
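
To make the double-copy concern concrete, here is a rough sketch, not the code in this PR. The helper name `cast_series` and its arguments are hypothetical, and it assumes a pandas version whose `Series.astype` still accepts the `copy` keyword. When nulls are present, `fillna(0)` already produces a new Series, so the following `astype` is asked not to copy again; otherwise the caller's `copy` flag is honored.

```python
import numpy as np
import pandas as pd


def cast_series(s, dtype, copy=False):
    """Hypothetical helper illustrating the idea; not Spark's _create_batch."""
    if s.isnull().any():
        # fillna(0) returns a new Series here, so the data has already been
        # copied once. Pass copy=False so astype does not request a second copy.
        return s.fillna(0).astype(dtype, copy=False)
    # No nulls to fill: let the caller decide whether a copy is made.
    return s.astype(dtype, copy=copy)


s = pd.Series([1.0, None, 3.0])
out = cast_series(s, np.int64, copy=True)
```

The input series is left untouched in both branches when `copy=True`; the difference is only in how many intermediate copies are requested along the way.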
---