Hi,

I am adding a bunch of vertices to a graph in GraphX using the method below,
and I am running into a latency problem. The first addition of, say, 400
vertices to a graph with 100,000 nodes takes around 7 seconds; the next one
takes 15 seconds. Each subsequent addition takes longer than the previous one.
Please help me solve this problem.

My cluster currently consists of a single machine with 8 cores and 8 GB of
RAM. I am running in local mode.


def addVertex(rdd: RDD[String], sc: SparkContext, session: String): Long = {
  val defaultUser = (0, 0)
  // gVertices, gEdges and inputGraph are vars defined outside this method.
  // Each incoming ID becomes its own single-element RDD that is unioned in.
  rdd.collect().foreach { x =>
    val aVertex: RDD[(VertexId, (Int, Int))] =
      sc.parallelize(Array((x.toLong, (100, 100))))
    gVertices = gVertices.union(aVertex)
  }
  inputGraph = Graph(gVertices, gEdges, defaultUser)
  inputGraph.cache()
  gVertices = inputGraph.vertices
  gVertices.cache()
  val count = gVertices.count
  println(count)

  1L
}
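
For comparison, here is a minimal sketch of a batched variant that builds all
new vertices with a single map and a single union, rather than one
parallelize and one union per vertex. Each union adds the partitions and
lineage of its inputs to the result, so the per-vertex loop above may be a
contributor to the growing latency; that is an assumption, not something I
have measured. The names BatchAddVertices and addVertices, the sample IDs and
the local[*] master are hypothetical and only there to make the sketch
self-contained.

```scala
import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.graphx.{Edge, Graph, VertexId}
import org.apache.spark.rdd.RDD

// Hypothetical sketch: names and sample data are illustrative only.
object BatchAddVertices {

  // Converts every incoming ID to a vertex in one distributed map,
  // then unions the result with the existing vertices exactly once.
  def addVertices(
      ids: RDD[String],
      vertices: RDD[(VertexId, (Int, Int))]): RDD[(VertexId, (Int, Int))] = {
    val newVertices: RDD[(VertexId, (Int, Int))] =
      ids.map(id => (id.toLong, (100, 100)))
    vertices.union(newVertices)
  }

  def main(args: Array[String]): Unit = {
    val conf = new SparkConf().setAppName("batch-add").setMaster("local[*]")
    val sc = new SparkContext(conf)

    // Tiny stand-ins for the real gVertices and gEdges.
    val vertices: RDD[(VertexId, (Int, Int))] =
      sc.parallelize(Seq((1L, (0, 0)), (2L, (0, 0))))
    val edges: RDD[Edge[Int]] = sc.parallelize(Seq(Edge(1L, 2L, 1)))

    val updated = addVertices(sc.parallelize(Seq("3", "4")), vertices)
    val graph = Graph(updated, edges, (0, 0)).cache()
    println(graph.vertices.count())

    sc.stop()
  }
}
```

With this shape the number of unions per call no longer depends on the number
of vertices being added. If the lineage still grows across many calls,
periodically checkpointing the vertex RDD (after sc.setCheckpointDir) is
another option that could be tried.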


Thanks,

Udbhav
