ding created SPARK-17559:
----------------------------
Summary: PeriodicGraphCheckpointer did not persist edges as
expected in some cases
Key: SPARK-17559
URL: https://issues.apache.org/jira/browse/SPARK-17559
Project: Spark
Issue Type: Bug
Components: MLlib
Reporter: ding
Priority: Minor
When PeriodicGraphCheckpointer is used to persist a graph, the edges are
sometimes not persisted. Currently the graph is persisted only when the
vertices' storage level is NONE. However, the vertices' storage level can be
something other than NONE while the edges' storage level is NONE. For example,
in a graph created by an outerJoinVertices operation, the vertices are cached
automatically while the edges are not. In that case the edges will not be
persisted when PeriodicGraphCheckpointer is used to persist the graph.
See the minimal example below:
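The condition described above can be sketched roughly as follows. This is an illustration only, not the actual Spark source: the object and method names (PersistSketch, persistIfVerticesUnpersisted, persistVerticesAndEdges) are hypothetical, and the second method is just one possible way to address the problem, by checking the two RDDs independently.

```scala
import org.apache.spark.graphx.Graph
import org.apache.spark.storage.StorageLevel

// Hypothetical helpers, written only to illustrate the report.
object PersistSketch {

  // Behaviour as described in this report: only the vertices' storage
  // level is consulted, so a graph whose vertices are already cached is
  // skipped even if its edges are not persisted.
  def persistIfVerticesUnpersisted[VD, ED](graph: Graph[VD, ED]): Unit = {
    if (graph.vertices.getStorageLevel == StorageLevel.NONE) {
      graph.persist()
    }
  }

  // One possible fix (an assumption, not a confirmed patch): check
  // vertices and edges separately and persist whichever is unpersisted.
  def persistVerticesAndEdges[VD, ED](graph: Graph[VD, ED]): Unit = {
    if (graph.vertices.getStorageLevel == StorageLevel.NONE) {
      graph.vertices.persist()
    }
    if (graph.edges.getStorageLevel == StorageLevel.NONE) {
      graph.edges.persist()
    }
  }
}
```

Checking the two RDDs separately avoids calling persist on an RDD that already has a storage level assigned, which Spark does not allow to be changed to a different level.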
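// Checkpoint every 2 updates
val graphCheckpointer = new PeriodicGraphCheckpointer[Array[String], Int](2, sc)
// Each line of users.txt: vertex id followed by comma-separated attributes
val users = sc.textFile("data/graphx/users.txt")
  .map(line => line.split(","))
  .map(parts => (parts.head.toLong, parts.tail))
val followerGraph = GraphLoader.edgeListFile(sc, "data/graphx/followers.txt")
// outerJoinVertices caches the resulting vertices but not the edges
val graph = followerGraph.outerJoinVertices(users) {
  case (uid, deg, Some(attrList)) => attrList
  case (uid, deg, None) => Array.empty[String]
}
// Expected to persist the whole graph; the edges are left unpersisted
graphCheckpointer.update(graph)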
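To observe the state described in this report, the storage levels of the two RDDs can be inspected after the update call. The helper below is a minimal sketch; StorageLevelCheck and its method names are hypothetical and are not part of Spark.

```scala
import org.apache.spark.graphx.Graph
import org.apache.spark.storage.StorageLevel

// Hypothetical diagnostic helper for inspecting a graph's storage levels.
object StorageLevelCheck {

  // Returns a one-line summary of the vertex and edge storage levels.
  def describe[VD, ED](graph: Graph[VD, ED]): String = {
    val v = graph.vertices.getStorageLevel
    val e = graph.edges.getStorageLevel
    s"vertices: $v, edges: $e"
  }

  // True when the edges have no storage level assigned, which is the
  // condition this report says can occur after graphCheckpointer.update.
  def edgesUnpersisted[VD, ED](graph: Graph[VD, ED]): Boolean =
    graph.edges.getStorageLevel == StorageLevel.NONE
}
```

Calling println(StorageLevelCheck.describe(graph)) after graphCheckpointer.update(graph) in the example above would show whether the edges were left at StorageLevel.NONE.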