Repository: spark
Updated Branches:
  refs/heads/master 5da21f07d -> fc0a1475e


[SPARK-4672][GraphX] Perform checkpoint() on PartitionsRDD to shorten the lineage

The related JIRA is https://issues.apache.org/jira/browse/SPARK-4672

Iterative GraphX applications build up a long lineage, and calling checkpoint() 
on EdgeRDD or VertexRDD themselves does not shorten it. If we instead call 
checkpoint() on their partitionsRDD, the long lineage is cut off. Moreover, 
existing operations in this code, such as cache(), are already performed on 
the partitionsRDD, so checkpoint() should work the same way. More details and 
explanation can be found in the JIRA.
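
As a rough illustration of how an iterative application might rely on this 
behaviour, the sketch below checkpoints the vertex and edge RDDs of a graph 
every few iterations. The object name CheckpointSketch, the iterate helper, 
the three-edge toy graph, the interval of 10 and the /tmp checkpoint directory 
are all assumptions made for the example; none of them are part of this commit.

```scala
import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.graphx.{Edge, Graph}

// Hypothetical example object; not part of the patch.
object CheckpointSketch {

  // Runs `iterations` rounds of a trivial vertex update and asks for a
  // checkpoint every `checkpointInterval` rounds. Assumes the caller has
  // already called sc.setCheckpointDir(...).
  def iterate(sc: SparkContext, iterations: Int, checkpointInterval: Int): Graph[Int, Int] = {
    // Toy triangle graph; every vertex attribute starts at 0.
    val edges = sc.parallelize(Seq(Edge(1L, 2L, 1), Edge(2L, 3L, 1), Edge(3L, 1L, 1)))
    var graph: Graph[Int, Int] = Graph.fromEdges(edges, defaultValue = 0).cache()

    for (i <- 1 to iterations) {
      // Each round derives a new graph from the previous one, so the
      // lineage grows by one step per iteration.
      val next = graph.mapVertices((_, attr) => attr + 1).cache()

      if (i % checkpointInterval == 0) {
        // With this patch these calls are forwarded to the underlying
        // partitionsRDD of each wrapper.
        next.vertices.checkpoint()
        next.edges.checkpoint()
      }

      // checkpoint() only marks the RDD; an action is needed to
      // materialize it and write the checkpoint files.
      next.vertices.count()
      next.edges.count()
      graph = next
    }
    graph
  }

  def main(args: Array[String]): Unit = {
    val conf = new SparkConf().setAppName("checkpoint-sketch").setMaster("local[2]")
    val sc = new SparkContext(conf)
    sc.setCheckpointDir("/tmp/graphx-checkpoints") // assumed location
    val result = iterate(sc, iterations = 30, checkpointInterval = 10)
    // Print the lineage so it can be inspected by hand.
    println(result.vertices.toDebugString)
    sc.stop()
  }
}
```

Comparing the toDebugString output with and without the two checkpoint() calls 
is one way to see whether the lineage is actually being truncated.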

Author: JerryLead <jerryl...@163.com>
Author: Lijie Xu <csxuli...@gmail.com>

Closes #3549 from JerryLead/my_graphX_checkpoint and squashes the following 
commits:

d1aa8d8 [JerryLead] Perform checkpoint() on PartitionsRDD not VertexRDD and 
EdgeRDD themselves
ff08ed4 [JerryLead] Merge branch 'master' of https://github.com/apache/spark
c0169da [JerryLead] Merge branch 'master' of https://github.com/apache/spark
52799e3 [Lijie Xu] Merge pull request #1 from apache/master


Project: http://git-wip-us.apache.org/repos/asf/spark/repo
Commit: http://git-wip-us.apache.org/repos/asf/spark/commit/fc0a1475
Tree: http://git-wip-us.apache.org/repos/asf/spark/tree/fc0a1475
Diff: http://git-wip-us.apache.org/repos/asf/spark/diff/fc0a1475

Branch: refs/heads/master
Commit: fc0a1475ef7c8b33363d88adfe8e8f28def5afc7
Parents: 5da21f0
Author: JerryLead <jerryl...@163.com>
Authored: Tue Dec 2 17:08:02 2014 -0800
Committer: Ankur Dave <ankurd...@gmail.com>
Committed: Tue Dec 2 17:08:02 2014 -0800

----------------------------------------------------------------------
 .../main/scala/org/apache/spark/graphx/impl/EdgeRDDImpl.scala    | 4 ++++
 .../main/scala/org/apache/spark/graphx/impl/VertexRDDImpl.scala  | 4 ++++
 2 files changed, 8 insertions(+)
----------------------------------------------------------------------


http://git-wip-us.apache.org/repos/asf/spark/blob/fc0a1475/graphx/src/main/scala/org/apache/spark/graphx/impl/EdgeRDDImpl.scala
----------------------------------------------------------------------
diff --git a/graphx/src/main/scala/org/apache/spark/graphx/impl/EdgeRDDImpl.scala b/graphx/src/main/scala/org/apache/spark/graphx/impl/EdgeRDDImpl.scala
index a816961..504559d 100644
--- a/graphx/src/main/scala/org/apache/spark/graphx/impl/EdgeRDDImpl.scala
+++ b/graphx/src/main/scala/org/apache/spark/graphx/impl/EdgeRDDImpl.scala
@@ -70,6 +70,10 @@ class EdgeRDDImpl[ED: ClassTag, VD: ClassTag] private[graphx] (
     this
   }
 
+  override def checkpoint() = {
+    partitionsRDD.checkpoint()
+  }
+    
   /** The number of edges in the RDD. */
   override def count(): Long = {
     partitionsRDD.map(_._2.size.toLong).reduce(_ + _)

http://git-wip-us.apache.org/repos/asf/spark/blob/fc0a1475/graphx/src/main/scala/org/apache/spark/graphx/impl/VertexRDDImpl.scala
----------------------------------------------------------------------
diff --git a/graphx/src/main/scala/org/apache/spark/graphx/impl/VertexRDDImpl.scala b/graphx/src/main/scala/org/apache/spark/graphx/impl/VertexRDDImpl.scala
index d92a55a..c8898b1 100644
--- a/graphx/src/main/scala/org/apache/spark/graphx/impl/VertexRDDImpl.scala
+++ b/graphx/src/main/scala/org/apache/spark/graphx/impl/VertexRDDImpl.scala
@@ -71,6 +71,10 @@ class VertexRDDImpl[VD] private[graphx] (
     this
   }
 
+  override def checkpoint() = {
+    partitionsRDD.checkpoint()
+  }
+    
   /** The number of vertices in the RDD. */
   override def count(): Long = {
     partitionsRDD.map(_.size).reduce(_ + _)

