[jira] [Updated] (SPARK-44486) Implement PyArrow `self_destruct` feature for `toPandas`

Xinrong Meng (Jira) Wed, 19 Jul 2023 17:11:05 -0700


     [ 
https://issues.apache.org/jira/browse/SPARK-44486?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]


Xinrong Meng updated SPARK-44486:
---------------------------------
    Description: 
Implement PyArrow `self_destruct` feature for `toPandas`

 

The Spark configuration 
`spark.sql.execution.arrow.pyspark.selfDestruct.enabled` can now be used to 
enable PyArrow’s `self_destruct` feature in Spark Connect. This can save memory 
when creating a pandas DataFrame via `toPandas`, because Arrow-allocated memory 
is freed while the pandas DataFrame is being built. 

  was:Implement PyArrow `self_destruct` feature for `toPandas`


> Implement PyArrow `self_destruct` feature for `toPandas`
> --------------------------------------------------------
>
>                 Key: SPARK-44486
>                 URL: https://issues.apache.org/jira/browse/SPARK-44486
>             Project: Spark
>          Issue Type: Improvement
>          Components: Connect, PySpark
>    Affects Versions: 4.0.0
>            Reporter: Xinrong Meng
>            Priority: Major
>
> Implement PyArrow `self_destruct` feature for `toPandas`
>  
> The Spark configuration 
> `spark.sql.execution.arrow.pyspark.selfDestruct.enabled` can now be used to 
> enable PyArrow’s `self_destruct` feature in Spark Connect. This can save 
> memory when creating a pandas DataFrame via `toPandas`, because 
> Arrow-allocated memory is freed while the pandas DataFrame is being built. 
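
For illustration only, the description above could be exercised roughly as 
follows. This is a minimal sketch, not code from the issue or its patch: the 
helper name `to_pandas_self_destruct` is hypothetical, and it assumes `spark` 
is an existing session (classic or Spark Connect) and `df` is a DataFrame 
created from it. Only the configuration key comes from the issue text.

```python
# Configuration key quoted in the issue description.
SELF_DESTRUCT_KEY = "spark.sql.execution.arrow.pyspark.selfDestruct.enabled"


def to_pandas_self_destruct(spark, df):
    """Hypothetical helper: call df.toPandas() with self_destruct enabled.

    The previous value of the setting is restored afterwards so that the
    change does not leak into unrelated code sharing the same session.
    """
    # Treat an unset key as "false" for the purpose of restoring it later.
    previous = spark.conf.get(SELF_DESTRUCT_KEY, "false")
    spark.conf.set(SELF_DESTRUCT_KEY, "true")
    try:
        # With the flag on, Arrow buffers may be released while the pandas
        # DataFrame is being assembled, which can lower peak memory use.
        return df.toPandas()
    finally:
        spark.conf.set(SELF_DESTRUCT_KEY, previous)
```

With a real session this would be called as 
`pdf = to_pandas_self_destruct(spark, spark.range(10))`. Whether memory is 
actually saved depends on the data and on the PyArrow version, and PyArrow 
documents `self_destruct` as experimental, so treat the flag as opt-in.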



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

