ding created SPARK-17559:
----------------------------

             Summary: PeriodicGraphCheckpointer did not persist edges as 
expected in some cases
                 Key: SPARK-17559
                 URL: https://issues.apache.org/jira/browse/SPARK-17559
             Project: Spark
          Issue Type: Bug
          Components: MLlib
            Reporter: ding
            Priority: Minor


When PeriodicGraphCheckpointer is used to persist a graph, the edges are 
sometimes not persisted. Currently the graph is persisted only when the 
vertices' storage level is NONE. However, the vertices' storage level can be 
something other than NONE while the edges' storage level is NONE. For example, 
in a graph created by an outerJoinVertices operation, the vertices are cached 
automatically while the edges are not. In that case the edges will not be 
persisted when PeriodicGraphCheckpointer is used to persist the graph.
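
The condition described above can be sketched as follows. This is only an 
illustration, not the Spark source: PersistSketch, persistVerticesOnly and 
persistBoth are hypothetical names, while getStorageLevel, persist() and 
StorageLevel.NONE are existing Spark API. The first helper mirrors the 
behaviour reported here; the second checks vertices and edges independently, 
which is one possible way to avoid the problem.

```scala
import org.apache.spark.graphx.Graph
import org.apache.spark.storage.StorageLevel

// Hypothetical helpers for illustration; not part of Spark.
object PersistSketch {

  // Behaviour as described in this report: only the vertices' storage
  // level is consulted, so edges stay unpersisted if vertices are cached.
  def persistVerticesOnly[VD, ED](g: Graph[VD, ED]): Unit = {
    if (g.vertices.getStorageLevel == StorageLevel.NONE) {
      g.persist()
    }
  }

  // Possible alternative: check each RDD on its own and persist
  // whichever one is still at StorageLevel.NONE.
  def persistBoth[VD, ED](g: Graph[VD, ED]): Unit = {
    if (g.vertices.getStorageLevel == StorageLevel.NONE) {
      g.vertices.persist()
    }
    if (g.edges.getStorageLevel == StorageLevel.NONE) {
      g.edges.persist()
    }
  }
}
```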

See the minimal example below:
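    // Checkpoint every 2 updates; vertex attribute Array[String], edge attribute Int.
    val graphCheckpointer = new PeriodicGraphCheckpointer[Array[String], Int](2, sc)
    // Each line of users.txt is split on "," into an id followed by attributes.
    val users = sc.textFile("data/graphx/users.txt")
      .map(line => line.split(","))
      .map(parts => (parts.head.toLong, parts.tail))
    val followerGraph = GraphLoader.edgeListFile(sc, "data/graphx/followers.txt")

    // outerJoinVertices caches the resulting vertices but not the edges.
    val graph = followerGraph.outerJoinVertices(users) {
      case (uid, deg, Some(attrList)) => attrList
      case (uid, deg, None) => Array.empty[String]
    }
    // The vertices' storage level is not NONE here, so the edges are not persisted.
    graphCheckpointer.update(graph)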
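
To observe the state after the update, the storage levels of both RDDs can 
be inspected directly. The helper below is a small sketch; StorageCheck and 
persisted are hypothetical names, and only getStorageLevel and 
StorageLevel.NONE come from Spark.

```scala
import org.apache.spark.graphx.Graph
import org.apache.spark.storage.StorageLevel

// Hypothetical helper for illustration; not part of Spark.
object StorageCheck {
  // Returns (vertices persisted?, edges persisted?) for the given graph.
  def persisted[VD, ED](g: Graph[VD, ED]): (Boolean, Boolean) = {
    val verticesPersisted = g.vertices.getStorageLevel != StorageLevel.NONE
    val edgesPersisted = g.edges.getStorageLevel != StorageLevel.NONE
    (verticesPersisted, edgesPersisted)
  }
}
```

Calling StorageCheck.persisted(graph) after graphCheckpointer.update(graph) 
and printing the pair shows whether each RDD ended up persisted.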



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
