Spark GraphX default Storage Level

prasad223 Sun, 06 Dec 2015 06:50:51 -0800

Hi All,

I am trying to generate a subset of graphs using the GraphGenerators provided
in the GraphX package.
My code is shown below:


    def generateGraph(config: Config, sparkContext: SparkContext) = {
        if (config.graphType == "LogNormal") {
            GraphGenerators.logNormalGraph(sparkContext, config.numVertices,
                config.partitionCount, config.mu, config.sigma, config.seed)
        } else if (config.graphType == "RMAT") {
            GraphGenerators.rmatGraph(sparkContext, config.numVertices,
                config.numEdges)
        } else if (config.graphType == "Star") {
            GraphGenerators.starGraph(sparkContext, config.numVertices)
        } else {
            GraphLoader.edgeListFile(sparkContext, config.edgeListFile,
                edgeStorageLevel = StorageLevel.MEMORY_AND_DISK_SER,
                vertexStorageLevel = StorageLevel.MEMORY_AND_DISK_SER)
        }
    }



In main():

    val graph = generateGraph(config, sparkContext)
    graph.persist(StorageLevel.MEMORY_AND_DISK_SER)

I am passing a storage level when the graph is loaded from an edge list
file, but not for the graphs generated by GraphGenerators.
The persist call shown above throws the following error:
Exception in thread "main" java.lang.UnsupportedOperationException: Cannot change storage level of an RDD after it was already assigned a level
        at org.apache.spark.rdd.RDD.persist(RDD.scala:159)
        at org.apache.spark.graphx.impl.EdgeRDDImpl.persist(EdgeRDDImpl.scala:58)
        at org.apache.spark.graphx.impl.GraphImpl.persist(GraphImpl.scala:58)
        at org.prasad.bfs.GraphBFS$.generateGraph(GraphBFS.scala:152)
        at org.prasad.bfs.GraphBFS$.main(GraphBFS.scala:186)
        at org.prasad.bfs.GraphBFS.main(GraphBFS.scala)
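
For reference, the same exception can be reproduced on a plain RDD, which
suggests the graph's underlying RDDs already had a storage level assigned
before my persist call. This is only a minimal sketch: the object name, the
method name and the local[2] master are assumptions made for illustration,
and whether the GraphGenerators graphs are in fact cached with a different
level internally is my guess, not something I have confirmed.

```scala
import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.storage.StorageLevel

object PersistTwice {
  // Returns the exception message if the second persist call is rejected.
  def persistWithTwoLevels(sc: SparkContext): Option[String] = {
    val rdd = sc.parallelize(1 to 10)
    rdd.persist(StorageLevel.MEMORY_ONLY) // first assignment succeeds
    try {
      rdd.persist(StorageLevel.MEMORY_AND_DISK_SER) // different level: rejected
      None
    } catch {
      case e: UnsupportedOperationException => Some(e.getMessage)
    }
  }

  def main(args: Array[String]): Unit = {
    // local[2] is an assumption so the sketch runs without a cluster.
    val sc = new SparkContext(
      new SparkConf().setAppName("persist-twice").setMaster("local[2]"))
    try {
      println(persistWithTwoLevels(sc))
    } finally {
      sc.stop()
    }
  }
}
```

On a graph, graph.edges.getStorageLevel and graph.vertices.getStorageLevel
should show which level, if any, was already assigned before persist is
called.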

Can someone tell me what the mistake in my code is? Is there a way for me to
set the default storage level by passing it as a config parameter?

I am also trying to perform BFS using the Pregel API provided by GraphX.
I am running on machines with 8 cores and 24 GB of RAM, in Spark
standalone mode.
What is the ideal value for:
1) the number of executor instances
2) the number of cores per executor
3) the amount of memory for each executor

Let's say, for the sake of argument, that I am using 3 machines. I presume
the machine running the master (driver) program will not host any executors,
so the numbers above have to be set only for the remaining 2 machines.
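
To make the question concrete, this is roughly how I understand these
settings would be passed. It is only a sketch that builds a SparkConf: the
object name and application name are hypothetical, and every value is a
placeholder assumption for 2 worker machines, not a recommendation. As far as
I understand, standalone mode has no direct executor-count setting here, so
the count would follow from spark.cores.max divided by spark.executor.cores.

```scala
import org.apache.spark.SparkConf

object ExecutorSettingsSketch {
  // All values are illustrative placeholders, not tuned recommendations.
  def buildConf(): SparkConf =
    new SparkConf()
      .setAppName("graph-bfs")            // hypothetical application name
      .set("spark.executor.cores", "4")   // cores per executor
      .set("spark.executor.memory", "8g") // heap memory per executor
      .set("spark.cores.max", "16")       // total cores the application may use

  def main(args: Array[String]): Unit =
    println(buildConf().toDebugString)    // prints the configured key=value pairs
}
```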

Any help is appreciated
Thanks in advance,
Prasad   



