Ankur Dave created SPARK-1931:
---------------------------------
Summary: Graph.partitionBy does not reconstruct routing tables
Key: SPARK-1931
URL: https://issues.apache.org/jira/browse/SPARK-1931
Project: Spark
Issue Type: Bug
Components: GraphX
Affects Versions: 1.0.0
Reporter: Ankur Dave
Commit 905173df57b90f90ebafb22e43f55164445330e6 introduced a bug in
Graph.partitionBy: after repartitioning the edges, it reuses the existing
VertexRDD without rebuilding the routing tables to reflect the new edge
layout. This causes the following test to fail:
{code:scala}
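// Assumes a SparkContext `sc` and ScalaTest's `===`, as in the GraphX test suites.
import org.apache.spark.graphx._
import org.apache.spark.graphx.PartitionStrategy._

val g = Graph(
  sc.parallelize(List((0L, "a"), (1L, "b"), (2L, "c"))),
  sc.parallelize(List(Edge(0L, 1L, 1), Edge(0L, 2L, 1)), 2))
// Triplets are correct on the original edge layout.
assert(g.triplets.collect.map(_.toTuple).toSet ===
  Set(((0L, "a"), (1L, "b"), 1), ((0L, "a"), (2L, "c"), 1)))
// Repartitioning the edges must not change the triplets.
val gPart = g.partitionBy(EdgePartition2D)
assert(gPart.triplets.collect.map(_.toTuple).toSet ===
  Set(((0L, "a"), (1L, "b"), 1), ((0L, "a"), (2L, "c"), 1)))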
{code}
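
To illustrate why a stale routing table breaks triplets, here is a minimal toy model using plain Scala collections. It is not GraphX code: RoutingTableSketch, buildRoutingTable, triplets, and the "after" edge layout are all hypothetical names and data made up for this sketch. The toy simply drops an edge whose endpoint attributes were not shipped to its partition; the actual symptom in GraphX may differ.
{code:scala}
// Toy model only: plain Scala collections, not GraphX classes.
object RoutingTableSketch {
  type VertexId = Long
  type PartitionId = Int
  final case class Edge(src: VertexId, dst: VertexId, attr: Int)
  type Triplet = ((VertexId, String), (VertexId, String), Int)

  // For each vertex, record the edge partitions that reference it.
  def buildRoutingTable(
      edgeParts: Map[PartitionId, Seq[Edge]]): Map[VertexId, Set[PartitionId]] =
    edgeParts.toSeq
      .flatMap { case (pid, edges) =>
        edges.flatMap(e => Seq(e.src -> pid, e.dst -> pid))
      }
      .groupBy(_._1)
      .map { case (vid, pairs) => vid -> pairs.map(_._2).toSet }

  // Ship vertex attributes to edge partitions as directed by the routing
  // table, then join them with the edges. In this toy, an edge whose
  // endpoint attribute was not shipped to its partition is dropped.
  def triplets(
      vertices: Map[VertexId, String],
      edgeParts: Map[PartitionId, Seq[Edge]],
      routing: Map[VertexId, Set[PartitionId]]): Set[Triplet] =
    edgeParts.toSeq.flatMap { case (pid, edges) =>
      val shipped = vertices.filter { case (vid, _) =>
        routing.getOrElse(vid, Set.empty[PartitionId]).contains(pid)
      }
      for {
        e <- edges
        s <- shipped.get(e.src)
        d <- shipped.get(e.dst)
      } yield ((e.src, s), (e.dst, d), e.attr)
    }.toSet

  def main(args: Array[String]): Unit = {
    val vertices = Map(0L -> "a", 1L -> "b", 2L -> "c")
    val before = Map(0 -> Seq(Edge(0L, 1L, 1)), 1 -> Seq(Edge(0L, 2L, 1)))
    // Hypothetical layout after repartitioning: the two edges swap partitions.
    val after = Map(0 -> Seq(Edge(0L, 2L, 1)), 1 -> Seq(Edge(0L, 1L, 1)))

    val stale = buildRoutingTable(before) // reused without rebuilding
    val fresh = buildRoutingTable(after)  // rebuilt for the new layout

    println(triplets(vertices, after, stale).size) // 0: attributes went to the wrong partitions
    println(triplets(vertices, after, fresh).size) // 2: both triplets recovered
  }
}
{code}
In this sketch, rebuilding the routing table from the repartitioned edges is enough to restore both triplets, which is the behavior the test above expects from partitionBy.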