[
https://issues.apache.org/jira/browse/SPARK-2025?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Ankur Dave resolved SPARK-2025.
-------------------------------
Resolution: Fixed
Fix Version/s: 1.1.0, 1.0.1
> EdgeRDD persists after pregel iteration
> ---------------------------------------
>
> Key: SPARK-2025
> URL: https://issues.apache.org/jira/browse/SPARK-2025
> Project: Spark
> Issue Type: Bug
> Components: GraphX
> Affects Versions: 1.0.0, 1.0.1
> Environment: RHEL6 on local and on spark cluster
> Reporter: Tim Weninger
> Assignee: Ankur Dave
> Labels: Pregel
> Fix For: 1.0.1, 1.1.0
>
>
> Symptoms: During execution of a Pregel script/function, a copy of an
> intermediate EdgeRDD object persists after each iteration, as shown by the
> Spark WebUI (Storage tab).
> This behaves like a memory leak in the Pregel function.
> For example, after the first iteration there is one extra EdgeRDD in addition
> to the EdgeRDD and VertexRDD that are kept for the next iteration. After 15
> iterations there are 15 extra EdgeRDDs in addition to the current, correct
> state, which is a single set of 1 EdgeRDD and 1 VertexRDD.
> At the end of a Pregel loop the old EdgeRDD and VertexRDD are unpersisted,
> but there seems to be another EdgeRDD that is created somewhere that does not
> get unpersisted.
> I _think_ this comes from the replicateVertex function, but I cannot be sure.
> Update: Ankur Dave says, in comments on SPARK-2011:
> {quote}
> ... is a bug introduced by https://github.com/apache/spark/pull/497.
> It occurs because unpersistVertices used to unpersist both the vertices and
> the replicated vertices, but after unifying replicated vertices with edges,
> there was no way to unpersist only one of them. I think the solution is just
> to unpersist both the vertices and the edges in Pregel.{quote}
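The approach described in the quote above (unpersist both the vertices and the edges of the previous graph) might look roughly like the following in a hand-written iterative loop. This is only a sketch under stated assumptions, not the actual patch to Pregel: the object name, the toy three-vertex graph, the local master setting, and the loop body are all hypothetical. The calls it relies on are GraphX's `unpersistVertices` and `unpersist` on the graph's `edges`.

```scala
import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.graphx._

// Hypothetical stand-alone example; not the code from the Pregel fix.
object UnpersistSketch {
  def main(args: Array[String]): Unit = {
    val conf = new SparkConf().setAppName("unpersist-sketch").setMaster("local[2]")
    val sc = new SparkContext(conf)

    // Toy graph (assumption): a directed cycle over three vertices.
    val edges = sc.parallelize(Seq(Edge(1L, 2L, 1), Edge(2L, 3L, 1), Edge(3L, 1L, 1)))
    var g: Graph[Int, Int] = Graph.fromEdges(edges, defaultValue = 0).cache()

    var i = 0
    while (i < 15) {
      val prevG = g

      // Derive the next graph and cache it before dropping the old one.
      g = prevG.mapVertices((_, attr) => attr + 1).cache()

      // Materialize the new graph so it no longer depends on prevG's cached data.
      g.vertices.count()
      g.edges.count()

      // Unpersist BOTH the vertices and the edges of the previous graph.
      // Dropping only the vertices would leave one cached EdgeRDD per iteration.
      prevG.unpersistVertices(blocking = false)
      prevG.edges.unpersist(blocking = false)

      i += 1
    }

    // The RDDs still marked persistent can be inspected here or in the web UI.
    println(s"Persistent RDDs after loop: ${sc.getPersistentRDDs.size}")

    sc.stop()
  }
}
```

To see the symptom from the report, one could comment out the `prevG.edges.unpersist` line and watch the Storage tab while the loop runs.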