Yicong Huang created SPARK-55176:
------------------------------------
Summary: Extract arrow_to_pandas converter logic into
ArrowArrayToPandasConversion
Key: SPARK-55176
URL: https://issues.apache.org/jira/browse/SPARK-55176
Project: Spark
Issue Type: Improvement
Components: PySpark
Affects Versions: 4.2.0
Reporter: Yicong Huang
Problem:
{{ArrowStreamPandasSerializer.arrow_to_pandas}} and
{{ArrowStreamPandasUDFSerializer.arrow_to_pandas}} mix Arrow-to-pandas
conversion logic with serializer instance state. The conversion logic is
duplicated and tightly coupled to serializer classes.
Proposal:
Extract the conversion logic into a
{{ArrowArrayToPandasConversion.create_converter}} factory method. Serializers
can then use the factory to create converters instead of defining their own
{{arrow_to_pandas}} methods.
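
A minimal sketch of the proposed shape, to illustrate the decoupling only. It is not the PySpark implementation: the class names, the {{load_column}} method, and the keyword-option signature of {{create_converter}} below are all hypothetical, and {{FakeArrowColumn}} is a test double so the example runs without pyarrow installed. The only assumption about the input is that it exposes {{to_pandas(**kwargs)}}, as a pyarrow array does.

```python
from typing import Any, Callable, Dict


class ArrowToPandasConverterSketch:
    """Hypothetical stand-in for ArrowArrayToPandasConversion; not the PySpark API."""

    @staticmethod
    def create_converter(**to_pandas_options: Any) -> Callable[[Any], Any]:
        # Copy the options so later mutation by the caller cannot affect the converter.
        options: Dict[str, Any] = dict(to_pandas_options)

        def convert(arrow_column: Any) -> Any:
            # Works with any object exposing to_pandas(**kwargs), e.g. a pyarrow array.
            return arrow_column.to_pandas(**options)

        return convert


class SerializerA:
    """Hypothetical serializer: holds a converter instead of defining arrow_to_pandas."""

    def __init__(self) -> None:
        # date_as_object is used purely as an example option.
        self._convert = ArrowToPandasConverterSketch.create_converter(date_as_object=True)

    def load_column(self, arrow_column: Any) -> Any:
        return self._convert(arrow_column)


class SerializerB:
    """Second hypothetical serializer sharing the same factory with other options."""

    def __init__(self) -> None:
        self._convert = ArrowToPandasConverterSketch.create_converter(date_as_object=False)

    def load_column(self, arrow_column: Any) -> Any:
        return self._convert(arrow_column)


class FakeArrowColumn:
    """Test double that records the options it was converted with."""

    def __init__(self, values: Any) -> None:
        self.values = list(values)

    def to_pandas(self, **kwargs: Any) -> Dict[str, Any]:
        return {"values": self.values, "options": kwargs}
```

In this sketch the conversion settings live in the closure returned by the factory, so the two serializers differ only in the options they pass and neither carries its own copy of the conversion code.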