Hi All,

I was trying to generate a subset of graphs using the GraphGenerators provided in the GraphX package. My code is shown below:

    import org.apache.spark.SparkContext
    import org.apache.spark.graphx.GraphLoader
    import org.apache.spark.graphx.util.GraphGenerators
    import org.apache.spark.storage.StorageLevel

    // Config is my own class holding the job parameters.
    def generateGraph(config: Config, sparkContext: SparkContext) = {
      if (config.graphType == "LogNormal") {
        GraphGenerators.logNormalGraph(sparkContext, config.numVertices,
          config.partitionCount, config.mu, config.sigma, config.seed)
      } else if (config.graphType == "RMAT") {
        GraphGenerators.rmatGraph(sparkContext, config.numVertices, config.numEdges)
      } else if (config.graphType == "Star") {
        GraphGenerators.starGraph(sparkContext, config.numVertices)
      } else {
        // Only this branch sets an explicit storage level.
        GraphLoader.edgeListFile(sparkContext, config.edgeListFile,
          edgeStorageLevel = StorageLevel.MEMORY_AND_DISK_SER,
          vertexStorageLevel = StorageLevel.MEMORY_AND_DISK_SER)
      }
    }

and in main():

    val graph = generateGraph(config, sparkContext)
    graph.persist(StorageLevel.MEMORY_AND_DISK_SER)

I pass a storage level when loading the graph from an edge list file, but not for the graphs generated by GraphGenerators. The persist call shown above throws the following error:

    Exception in thread "main" java.lang.UnsupportedOperationException: Cannot change storage level of an RDD after it was already assigned a level
        at org.apache.spark.rdd.RDD.persist(RDD.scala:159)
        at org.apache.spark.graphx.impl.EdgeRDDImpl.persist(EdgeRDDImpl.scala:58)
        at org.apache.spark.graphx.impl.GraphImpl.persist(GraphImpl.scala:58)
        at org.prasad.bfs.GraphBFS$.generateGraph(GraphBFS.scala:152)
        at org.prasad.bfs.GraphBFS$.main(GraphBFS.scala:186)
        at org.prasad.bfs.GraphBFS.main(GraphBFS.scala)

Can someone tell me what the mistake in my code is? Is there a way for me to set the default storage level by passing it as a config parameter?

Also, I'm trying to perform BFS using the Pregel API provided by GraphX. I'm running on machines with 8 cores and 24 GB RAM each, in Spark standalone mode. What is the ideal value for each of the following?

1) number of executor instances
2) number of cores per executor
3) amount of memory for each executor

Let's say, for the sake of argument, that I'm using 3 machines. I presume the 1 machine running the master (driver) program will not host any executors.
Hence the above numbers have to be set only for the remaining 2 machines.

Any help is appreciated.

Thanks in advance,
Prasad

--
View this message in context: http://apache-spark-user-list.1001560.n3.nabble.com/Spark-GraphX-default-Storage-Level-tp25605.html
Sent from the Apache Spark User List mailing list archive at Nabble.com.