L. C. Hsieh created SPARK-54134:
-----------------------------------
Summary: Optimize Arrow memory usage
Key: SPARK-54134
URL: https://issues.apache.org/jira/browse/SPARK-54134
Project: Spark
Issue Type: Improvement
Components: PySpark, SQL
Affects Versions: 4.2.0
Reporter: L. C. Hsieh
We have encountered out-of-memory (OOM) errors when loading data and
processing it in PySpark through toArrow or toPandas. The same data can be
loaded directly with PyArrow, but loading it through toArrow or toPandas
fails with an OOM error.