This is caused by an optimization in GraphX that reuses a single triplet
object while iterating. When you call collect directly on triplets, every
element of the result refers to that same object, so you see the last edge
repeated.

It has been fixed in Spark 1.0 here:
https://issues.apache.org/jira/browse/SPARK-1188

To work around this in older versions of Spark, you can add a copy step
before the collect, e.g.

graph.triplets.map(_.copy()).collect()
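
To illustrate why the copy step helps, here is a minimal, Spark-free sketch
of the reuse pattern. It is not GraphX's actual implementation; the names
Triplet, reusingIterator and TripletReuseDemo are hypothetical and only model
an iterator that hands back one mutable object for every element.

```scala
// Simplified model of the object-reuse optimization; NOT GraphX's real code.
object TripletReuseDemo {
  // Hypothetical stand-in for EdgeTriplet: a mutable holder with a copy method.
  final class Triplet(var srcId: Long, var dstId: Long, var attr: Int) {
    def copy(): Triplet = new Triplet(srcId, dstId, attr)
    override def toString: String = s"($srcId,$dstId,$attr)"
  }

  // Iterator that overwrites and returns one shared Triplet for every edge.
  def reusingIterator(edges: Seq[(Long, Long, Int)]): Iterator[Triplet] = {
    val shared = new Triplet(0L, 0L, 0)
    edges.iterator.map { case (src, dst, attr) =>
      shared.srcId = src
      shared.dstId = dst
      shared.attr = attr
      shared
    }
  }

  // Three sample edges (assumed data, chosen only for the demo).
  val edges: Seq[(Long, Long, Int)] = Seq((1L, 4L, 1), (1L, 5L, 1), (7L, 3L, 1))

  // Materializing the references directly: every slot points at the same object.
  val naive: Array[String] =
    reusingIterator(edges).toArray.map(_.toString)

  // Copying each element before materializing preserves the distinct values.
  val copied: Array[String] =
    reusingIterator(edges).map(_.copy()).toArray.map(_.toString)

  def main(args: Array[String]): Unit = {
    println(naive.mkString(", "))  // (7,3,1), (7,3,1), (7,3,1)
    println(copied.mkString(", ")) // (1,4,1), (1,5,1), (7,3,1)
  }
}
```

In this sketch the first array holds the last edge three times because all
three slots reference the one shared object, while the second array holds
independent copies taken before the shared object was overwritten.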



On Mon, May 19, 2014 at 1:09 PM, GlennStrycker <glenn.stryc...@gmail.com> wrote:

> graph.triplets does not work -- it returns incorrect results
>
> I have a graph with the following edges:
>
> orig_graph.edges.collect
> =  Array(Edge(1,4,1), Edge(1,5,1), Edge(1,7,1), Edge(2,5,1), Edge(2,6,1),
> Edge(3,5,1), Edge(3,6,1), Edge(3,7,1), Edge(4,1,1), Edge(5,1,1),
> Edge(5,2,1), Edge(5,3,1), Edge(6,2,1), Edge(6,3,1), Edge(7,1,1),
> Edge(7,3,1))
>
> When I run triplets.collect, I only get the last edge repeated 16 times:
>
> orig_graph.triplets.collect
> = Array(((7,1),(3,1),1), ((7,1),(3,1),1), ((7,1),(3,1),1), ((7,1),(3,1),1),
> ((7,1),(3,1),1), ((7,1),(3,1),1), ((7,1),(3,1),1), ((7,1),(3,1),1),
> ((7,1),(3,1),1), ((7,1),(3,1),1), ((7,1),(3,1),1), ((7,1),(3,1),1),
> ((7,1),(3,1),1), ((7,1),(3,1),1), ((7,1),(3,1),1), ((7,1),(3,1),1))
>
> I've also tried writing various map steps first before calling the triplet
> function, but I get the same results as above.
>
> Similarly, the example on the graphx programming guide page
> (http://spark.apache.org/docs/0.9.0/graphx-programming-guide.html) is
> incorrect.
>
> val facts: RDD[String] =
>   graph.triplets.map(triplet =>
>     triplet.srcAttr._1 + " is the " + triplet.attr + " of " +
> triplet.dstAttr._1)
>
> does not work, but
>
> val facts: RDD[String] =
>   graph.triplets.map(triplet =>
>     triplet.srcAttr + " is the " + triplet.attr + " of " + triplet.dstAttr)
>
> does work, although the results are meaningless.  For my graph example, I
> get the following line repeated 16 times:
>
> 1 is the 1 of 1
>
>
>
> --
> View this message in context:
> http://apache-spark-developers-list.1001551.n3.nabble.com/BUG-graph-triplets-does-not-return-proper-values-tp6693.html
> Sent from the Apache Spark Developers List mailing list archive at
> Nabble.com.
>
