[ https://issues.apache.org/jira/browse/SPARK-10000?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Andrew Or updated SPARK-10000:
------------------------------
    Attachment: unified-memory-management-spark-10000.pdf

> Consolidate cache memory management and execution memory management
> -------------------------------------------------------------------
>
>                 Key: SPARK-10000
>                 URL: https://issues.apache.org/jira/browse/SPARK-10000
>             Project: Spark
>          Issue Type: Story
>          Components: Block Manager, Spark Core
>            Reporter: Reynold Xin
>         Attachments: unified-memory-management-spark-10000.pdf
>
>
> Memory management in Spark is currently broken down into two disjoint
> regions: one for execution and one for storage. The sizes of these regions
> are statically configured and fixed for the duration of the application.
>
> There are several limitations to this approach. It requires user expertise
> to avoid unnecessary spilling, and there are no sensible defaults that work
> for all workloads. As a Spark user, I want Spark to manage memory more
> intelligently so that I do not need to worry about how to statically
> partition the execution (shuffle) memory fraction and the cache memory
> fraction. More importantly, applications that do not use caching use only
> a small fraction of the heap space, resulting in suboptimal performance.
>
> Instead, we should unify these two regions and let one borrow from the
> other when possible.
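The static split described above was configured through fixed fractions (in pre-1.6 Spark, settings such as `spark.storage.memoryFraction` and `spark.shuffle.memoryFraction`). The unified design can be sketched as a single pool in which execution requests may evict cached blocks to reclaim space, while storage only borrows memory that execution is not using. This is a minimal illustrative sketch of the idea, not Spark's actual `MemoryManager` API; the names `UnifiedPool`, `acquireExecution`, and `acquireStorage` are hypothetical.

```scala
// Illustrative sketch of a unified execution/storage memory pool.
// Assumption: execution can force eviction of cached (storage) bytes,
// but storage never forces running tasks to spill. Not Spark's real API.
object UnifiedPoolDemo {
  class UnifiedPool(val maxBytes: Long) {
    private var executionUsed: Long = 0L
    private var storageUsed: Long = 0L

    private def free: Long = maxBytes - executionUsed - storageUsed

    // Execution requests succeed up to the pool limit: if free space is
    // insufficient, shrink storage (evict cached blocks) to make room.
    def acquireExecution(bytes: Long): Long = {
      if (free < bytes) {
        val toEvict = math.min(bytes - free, storageUsed)
        storageUsed -= toEvict // evict cached blocks
      }
      val granted = math.min(bytes, free)
      executionUsed += granted
      granted
    }

    // Storage may only borrow memory execution is not using.
    def acquireStorage(bytes: Long): Boolean =
      if (free >= bytes) { storageUsed += bytes; true } else false

    def executionBytes: Long = executionUsed
    def storageBytes: Long = storageUsed
  }

  def main(args: Array[String]): Unit = {
    val pool = new UnifiedPool(1000L)
    assert(pool.acquireStorage(800L))     // cache fills most of the heap
    val got = pool.acquireExecution(500L) // execution needs memory...
    assert(got == 500L)                   // ...and evicts cache to get it
    assert(pool.storageBytes == 500L)     // storage shrank from 800 to 500
    println(s"execution=${pool.executionBytes} storage=${pool.storageBytes}")
  }
}
```

Under the old static scheme, the `acquireExecution(500L)` call above would have failed or spilled once the execution region's fixed share was exhausted, even though cached data occupied reclaimable heap space.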