Hi Harsha,

You could look through the GraphX source to see the approach taken there
and borrow ideas for your own implementation.  I'd recommend starting at
https://github.com/apache/spark/blob/master/graphx/src/main/scala/org/apache/spark/graphx/Graph.scala#L385
to see the storage technique.
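
If you do end up rolling your own, here is a minimal sketch of one way to do
it (this is not GraphX's actual internal layout; the HDFS path, input format,
and names below are just assumptions): key a pair RDD by source vertex so the
adjacency map is partitioned across the cluster instead of sitting in a single
in-memory HashMap.

import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.rdd.RDD

object AdjacencyListSketch {
  def main(args: Array[String]): Unit = {
    val sc = new SparkContext(new SparkConf().setAppName("adjacency-list"))

    // Hypothetical input: one edge per line, "srcId dstId weight".
    val edges: RDD[(Long, (Long, Double))] =
      sc.textFile("hdfs:///path/to/edges")        // placeholder path
        .map(_.split("\\s+"))
        .map(t => (t(0).toLong, (t(1).toLong, t(2).toDouble)))

    // Group outgoing edges by source vertex: a distributed adjacency list.
    val adjacency: RDD[(Long, Iterable[(Long, Double)])] = edges.groupByKey()

    adjacency.cache()
    println(adjacency.count())

    sc.stop()
  }
}

Lookups then become joins or filters on the key rather than HashMap.get, and
Spark handles distributing the data for you; if you join against it
repeatedly, partitioning by vertex id (e.g. with HashPartitioner) can cut
shuffle costs.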

Why do you want to avoid using GraphX?

Good luck!
Andrew

On Wed, Sep 17, 2014 at 6:43 AM, Harsha HN <99harsha.h....@gmail.com> wrote:

> Hello
>
> We are building an adjacency list to represent a graph. Vertices, edges,
> and weights have been extracted from HDFS files by a Spark job.
> Further, we expect the size of the adjacency list (a HashMap) could grow
> beyond 20 GB.
> How can we represent this as an RDD, so that it is distributed in nature?
>
> Basically we are trying to fit a HashMap (adjacency list) into a Spark RDD.
> Is there any way other than GraphX?
>
> Thanks and Regards,
> Harsha
>
