Venkata Sai Akhil Gudesa created SPARK-54001:
------------------------------------------------
Summary: Reduce memory footprint from cached local relations upon cloning
Key: SPARK-54001
URL: https://issues.apache.org/jira/browse/SPARK-54001
Project: Spark
Issue Type: Improvement
Components: Connect, Spark Core
Affects Versions: 4.2
Reporter: Venkata Sai Akhil Gudesa
Cloning sessions is a common operation in Spark applications (e.g., for
creating isolated execution contexts). The current approach of duplicating
cached data can significantly increase memory footprint, especially when:
* Sessions are cloned frequently
* Cached relations contain large datasets
* Multiple clones exist simultaneously
Instead of replicating the data, the block manager entries that back cached
local relations could be reference-counted: a clone would share the existing
entry, and the entry would be freed only once no session references it.
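
To illustrate the idea, here is a minimal sketch of reference counting for shared cached blocks. It is written in plain Python and does not use Spark's actual BlockManager or session APIs; RefCountedBlockStore, Session, and the block id "cache_abc" are all hypothetical names made up for this example.

```python
from typing import Dict, Iterable, Optional, Set


class RefCountedBlockStore:
    """Toy stand-in for a block store (hypothetical, not Spark's BlockManager)."""

    def __init__(self) -> None:
        self._blocks: Dict[str, bytes] = {}
        self._refs: Dict[str, int] = {}

    def put(self, block_id: str, data: bytes) -> None:
        # Store the payload once; putting an existing id only adds a reference.
        if block_id not in self._blocks:
            self._blocks[block_id] = data
            self._refs[block_id] = 0
        self._refs[block_id] += 1

    def retain(self, block_id: str) -> None:
        # A clone starts sharing an existing block: bump the count, copy nothing.
        self._refs[block_id] += 1

    def release(self, block_id: str) -> None:
        # Drop one reference; free the payload when no references remain.
        self._refs[block_id] -= 1
        if self._refs[block_id] == 0:
            del self._refs[block_id]
            del self._blocks[block_id]

    def ref_count(self, block_id: str) -> int:
        return self._refs.get(block_id, 0)

    def stored_bytes(self) -> int:
        return sum(len(data) for data in self._blocks.values())


class Session:
    """Hypothetical session that tracks which cached blocks it references."""

    def __init__(self, store: RefCountedBlockStore,
                 block_ids: Optional[Iterable[str]] = None) -> None:
        self.store = store
        self.block_ids: Set[str] = set(block_ids or ())

    def cache_local_relation(self, block_id: str, data: bytes) -> None:
        if block_id not in self.block_ids:
            self.store.put(block_id, data)
            self.block_ids.add(block_id)

    def clone(self) -> "Session":
        # Share the parent's blocks by reference instead of duplicating them.
        for block_id in self.block_ids:
            self.store.retain(block_id)
        return Session(self.store, self.block_ids)

    def close(self) -> None:
        for block_id in self.block_ids:
            self.store.release(block_id)
        self.block_ids.clear()


store = RefCountedBlockStore()
parent = Session(store)
parent.cache_local_relation("cache_abc", b"x" * 1024)
clones = [parent.clone() for _ in range(3)]
print(store.stored_bytes())  # 1024: one copy shared by four sessions
```

With replication, the same four sessions would hold four copies of the payload. Here the payload is released only after the parent and every clone have called close().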