[ 
https://issues.apache.org/jira/browse/SPARK-31793?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Gengliang Wang updated SPARK-31793:
-----------------------------------
    Summary: Reduce the memory usage in file scan location metadata  (was: 
Reduce the memory usage in data source scan metadata)

> Reduce the memory usage in file scan location metadata
> ------------------------------------------------------
>
>                 Key: SPARK-31793
>                 URL: https://issues.apache.org/jira/browse/SPARK-31793
>             Project: Spark
>          Issue Type: Improvement
>          Components: SQL
>    Affects Versions: 3.1.0
>            Reporter: Gengliang Wang
>            Assignee: Gengliang Wang
>            Priority: Major
>
> Currently, the data source scan node stores all of its file paths in its 
> metadata. The metadata is retained when a SparkPlan is converted into 
> SparkPlanInfo, which can be used to construct the Spark plan graph in the UI.
> However, the list of paths can be very large (e.g. many partitions may 
> remain after partition pruning), while the UI pages only require up to 100 
> bytes of location metadata. We can reduce the paths stored in the metadata 
> to reduce memory usage.
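
The idea in the description can be illustrated with a minimal sketch. This is
not Spark's implementation: the function name, the output format, and the use
of 100 bytes as a default budget are assumptions made for illustration only.
It keeps whole paths until the byte budget is reached and records how many
were omitted, so the retained string stays small no matter how many
partitions survive pruning.

```python
def truncate_location_metadata(paths, max_bytes=100):
    """Summarize a list of paths, keeping only about max_bytes of path text.

    Hypothetical helper for illustration; not part of any Spark API.
    The first path is always kept, even if it alone exceeds the budget.
    """
    paths = list(paths)
    kept = []
    used = 0
    for path in paths:
        size = len(path.encode("utf-8"))
        # Stop before the path that would push us over the budget.
        if kept and used + size > max_bytes:
            break
        kept.append(path)
        used += size

    omitted = len(paths) - len(kept)
    summary = "[" + ", ".join(kept)
    if omitted > 0:
        summary += f", ... ({omitted} more)"
    return summary + "]"


# Example input: 1000 partition directories under a made-up table location.
example_paths = [f"hdfs://nn/warehouse/t/part={i}" for i in range(1000)]
example_summary = truncate_location_metadata(example_paths)
```

With this approach the retained metadata is bounded by the budget plus a
short suffix, rather than growing with the number of partitions.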



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org
