Hi Harsha,

You could look through the GraphX source to see the approach taken there for ideas for your own implementation. I'd recommend starting at https://github.com/apache/spark/blob/master/graphx/src/main/scala/org/apache/spark/graphx/Graph.scala#L385 to see the storage technique.
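If you do want to stay on plain RDDs rather than GraphX, one common pattern is to keep the adjacency list as a pair RDD keyed by vertex ID, so it is partitioned across the cluster instead of living in a single driver-side HashMap. A minimal sketch (the edge data and app name here are made up for illustration, not from your job):

```scala
import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.rdd.RDD

object AdjacencyListRDD {
  def main(args: Array[String]): Unit = {
    val sc = new SparkContext(
      new SparkConf().setAppName("adjacency-list").setMaster("local[*]"))

    // Edges as (src, dst, weight) triples -- in your case these would be
    // parsed from the HDFS files instead of hard-coded.
    val edges: RDD[(Long, Long, Double)] = sc.parallelize(Seq(
      (1L, 2L, 0.5), (1L, 3L, 1.0), (2L, 3L, 2.0)
    ))

    // Distributed adjacency list: one entry per source vertex, whose value
    // is the list of (neighbor, weight) pairs. groupByKey keeps each
    // vertex's neighbor list on one partition.
    val adjacency: RDD[(Long, Iterable[(Long, Double)])] =
      edges.map { case (src, dst, w) => (src, (dst, w)) }
           .groupByKey()

    adjacency.collect().foreach(println)
    sc.stop()
  }
}
```

Because the data is keyed by vertex ID, you can also call `partitionBy` with a `HashPartitioner` before `groupByKey` to control placement, which matters once the list grows past 20 GB.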
Why do you want to avoid using GraphX?

Good luck!
Andrew

On Wed, Sep 17, 2014 at 6:43 AM, Harsha HN <99harsha.h....@gmail.com> wrote:
> Hello
>
> We are building an adjacency list to represent a graph. Vertices, edges,
> and weights have been extracted from HDFS files by a Spark job.
> We further expect the size of the adjacency list (a HashMap) to grow over
> 20 GB.
> How can we represent this as an RDD, so that it is distributed in nature?
>
> Basically we are trying to fit a HashMap (adjacency list) into a Spark RDD.
> Is there any other way, other than GraphX?
>
> Thanks and Regards,
> Harsha