[
https://issues.apache.org/jira/browse/SPARK-5790?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14329252#comment-14329252
]
Brennon York commented on SPARK-5790:
-------------------------------------
[~maropu] this looks very similar to the work I just pushed up for
[SPARK-1955|https://github.com/apache/spark/pull/4705], which was acting as the
overarching issue for this ticket. I didn't write tests, though, and those would
be a major benefit. Would you be willing to refactor your change to include only
the tests so we can close this issue out? That would help tremendously, and I
wouldn't want to lose that effort!
> VertexRDD's won't zip properly for `diff` capability
> ----------------------------------------------------
>
> Key: SPARK-5790
> URL: https://issues.apache.org/jira/browse/SPARK-5790
> Project: Spark
> Issue Type: Bug
> Components: GraphX
> Reporter: Brennon York
> Assignee: Brennon York
>
> For VertexRDDs with differing numbers of partitions, one cannot run commands
> like `diff`, as they will throw an IllegalArgumentException. The code below
> provides an example:
> {code}
> import org.apache.spark.graphx._
> import org.apache.spark.rdd._
> val setA: VertexRDD[Int] = VertexRDD(sc.parallelize(0L until 3L).map(id =>
> (id, id.toInt+1)))
> setA.collect.foreach(println(_))
> val setB: VertexRDD[Int] = VertexRDD(sc.parallelize(2L until 4L).map(id =>
> (id, id.toInt+2)))
> setB.collect.foreach(println(_))
> val diff = setA.diff(setB)
> diff.collect.foreach(println(_))
> val setC: VertexRDD[Int] = VertexRDD(sc.parallelize(2L until 4L).map(id =>
> (id, id.toInt+2)) ++ sc.parallelize(6L until 8L).map(id => (id, id.toInt+2)))
> setA.diff(setC).collect
> // java.lang.IllegalArgumentException: Can't zip RDDs with unequal numbers of
> // partitions
> {code}