GitHub user khayyatzy opened a pull request: https://github.com/apache/incubator-spark/pull/587
Adding RDD unique self cross product

Hi,

I am using Spark in a data analysis project, and I frequently require the unique self cross product of a single RDD. Since I am using Spark's Java API, I added the new function "selfCartesian" to JavaRDDLike.scala. I also modified RDD.scala, where it calls the function "CartesianRDD2". "CartesianRDD2" has an implementation similar to "CartesianRDD", except that it only returns elements (a, b) if a.index <= b.index.

I have been using this modification of Spark for a couple of months, and the function has always returned correct results. I hope this small new feature will be useful to other Spark users.

Regards,
Zuhair Khayyat

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/apache/incubator-spark master

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/incubator-spark/pull/587.patch

----
commit 82a80ef264ad15fa706eb566691470308b30f63a
Author: Zuhair Khayyat <zuhair.khay...@gmail.com>
Date:   2014-02-12T12:46:05Z

    Adding unique self cross product of in a single RDD

commit fb8ad2eee1c4ce175f3cf4227492bbc9f3502db3
Author: Zuhair Khayyat <zuhair.khay...@gmail.com>
Date:   2014-02-12T12:54:11Z

    changing ClassManifest to ClassTag in unique self product classes

commit ea02451bb11274f55be3706ea86e21e43e54fd35
Author: Zuhair Khayyat <zuhair.khay...@gmail.com>
Date:   2014-02-12T12:59:16Z

    adding import scala.reflect.ClassTag to CartesianRDD2.scala

commit 8f81706f374773aeea0d608b6baa9d2164c8f364
Author: Zuhair Khayyat <zuhair.khay...@gmail.com>
Date:   2014-02-12T13:29:27Z

    removing unwanted text from CartesianRDD2.scala

----
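
For readers who want to see what the rule "only return (a, b) if a.index <= b.index" produces, here is a minimal sketch on a plain Java list. It is not the code from this pull request and does not use Spark. The class name SelfCartesianSketch is hypothetical, and it assumes that "index" means the position of an element in the collection, which the description above does not spell out.

```java
import java.util.AbstractMap.SimpleImmutableEntry;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Map;

// Hypothetical helper for illustration only; not part of Spark or of this pull request.
final class SelfCartesianSketch {

    // Returns every pair (a, b) whose positions satisfy index(a) <= index(b).
    // Assumption: "index" is the element's position in the input list.
    static <T> List<Map.Entry<T, T>> selfCartesian(List<T> items) {
        List<Map.Entry<T, T>> pairs = new ArrayList<>();
        for (int i = 0; i < items.size(); i++) {
            // Starting j at i drops the mirrored pair (b, a) but keeps (a, a).
            for (int j = i; j < items.size(); j++) {
                pairs.add(new SimpleImmutableEntry<>(items.get(i), items.get(j)));
            }
        }
        return pairs;
    }

    public static void main(String[] args) {
        // For n elements this yields n * (n + 1) / 2 pairs instead of n * n.
        List<Map.Entry<String, String>> pairs = selfCartesian(Arrays.asList("a", "b", "c"));
        System.out.println(pairs.size() + " pairs: " + pairs);
    }
}
```

Under that assumption, three elements give six pairs rather than the nine a full cartesian product would return, which is the saving the "unique" self cross product is after.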