
I want to control the placement of the partitions of a GraphX property graph
across my cluster nodes. As I understand it, to specify the preferred
locations for a partition of an RDD, one needs to create a subclass
that overrides the getPreferredLocations() method. For example,
the ParallelCollectionRDD overrides that method to take into account the
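
To make the question concrete, here is a minimal sketch of the kind of
subclass I mean. The class name PinnedRDD and the hosts map are hypothetical
names of my own, not part of Spark; the sketch assumes a one-to-one wrapper
around an existing RDD and treats the host names as locality hints only,
which the scheduler may or may not honour.

```scala
import scala.reflect.ClassTag

import org.apache.spark.{Partition, TaskContext}
import org.apache.spark.rdd.RDD

// Hypothetical wrapper RDD: keeps the parent's partitions and data unchanged,
// but reports user-chosen hosts as the preferred locations of each partition.
// `hosts` maps a partition index to the host names preferred for it.
class PinnedRDD[T: ClassTag](prev: RDD[T], hosts: Map[Int, Seq[String]])
  extends RDD[T](prev) {

  // Reuse the parent's partitioning as-is (one-to-one dependency).
  override protected def getPartitions: Array[Partition] =
    firstParent[T].partitions

  // Delegate the actual computation to the parent partition.
  override def compute(split: Partition, context: TaskContext): Iterator[T] =
    firstParent[T].iterator(split, context)

  // Locality hint consulted by the scheduler; no preference if unmapped.
  override protected def getPreferredLocations(split: Partition): Seq[String] =
    hosts.getOrElse(split.index, Nil)
}
```

Wrapping a plain RDD this way is straightforward; what I do not see is how
to apply the same idea to the RDDs that make up a graph.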

However, a property graph in GraphX is a combination of multiple RDDs. How
can I provide the preferred locations for it? Would it be enough to implement
a custom EdgeRDD that overrides getPreferredLocations(), since it seems
that VertexRDD is partitioned according to the edge partitions? Or do I need
to implement something else as well?

I have asked the same question on Stack Overflow as well:

