Since my earlier question is still unanswered, I have decided to dig into the
Spark code myself. However, I am new to both Spark and Scala.
Can someone help me understand the following code snippet:
def cogroup[W](other: RDD[(K, W)], partitioner: Partitioner): RDD[(K, (Seq[V], Seq[W]))] = {
  val cg = new CoGroupedRDD[K](Seq(self, other), partitioner)
  val prfs = new PairRDDFunctions[K, Seq[Seq[_]]](cg)(classTag[K], ClassTags.seqSeqClassTag)
  prfs.mapValues { case Seq(vs, ws) =>
    (vs.asInstanceOf[Seq[V]], ws.asInstanceOf[Seq[W]])
  }
}
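For my own understanding, here is a plain-Scala sketch of what I believe cogroup computes per key: for every key in either input, pair the sequence of left values with the sequence of right values (empty if the key is missing on one side). The helper name `cogroupLocal` is my own; this is just local collections, not Spark code:

```scala
// Hypothetical local analogue of RDD cogroup, ignoring partitioning/shuffle.
def cogroupLocal[K, V, W](left: Seq[(K, V)], right: Seq[(K, W)]): Map[K, (Seq[V], Seq[W])] = {
  // Group each side's values by key.
  val lg: Map[K, Seq[V]] = left.groupBy(_._1).map { case (k, kvs) => (k, kvs.map(_._2)) }
  val rg: Map[K, Seq[W]] = right.groupBy(_._1).map { case (k, kvs) => (k, kvs.map(_._2)) }
  // For every key on either side, pair up the two value sequences.
  (lg.keySet ++ rg.keySet).iterator.map { k =>
    (k, (lg.getOrElse(k, Nil), rg.getOrElse(k, Nil)))
  }.toMap
}

val a = Seq("x" -> 1, "x" -> 2, "y" -> 3)
val b = Seq("x" -> "a", "z" -> "b")
println(cogroupLocal(a, b))
```

If this matches the intended semantics, then the snippet above is just building that grouping via CoGroupedRDD and casting the two inner sequences back to their element types.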
Thanks,
rose
On Friday, January 24, 2014 4:32 PM, rose <[email protected]> wrote:
Hi all,
I want to know more about the join operation in Spark. I know it uses a hash
join, but I am not able to figure out the nature of the implementation, such as
whether it is blocking or non-blocking, and whether partitions are shared or
not shared.
If anyone knows, please reply to this post along with links to the
implementation in the Spark source files.
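To make the question concrete, this is the kind of single-partition hash join I have in mind: build a hash table over one side, then stream the other side and probe it. This is only an illustrative sketch (the name `hashJoin` is mine), not Spark's actual implementation:

```scala
// Minimal build-and-probe hash join over local collections (illustration only).
def hashJoin[K, V, W](build: Seq[(K, V)], probe: Seq[(K, W)]): Seq[(K, (V, W))] = {
  // Build phase: hash the build side by key (blocking: must finish first).
  val table: Map[K, Seq[V]] = build.groupBy(_._1).map { case (k, kvs) => (k, kvs.map(_._2)) }
  // Probe phase: stream the other side and emit a pair for every match.
  probe.flatMap { case (k, w) =>
    table.getOrElse(k, Nil).map(v => (k, (v, w)))
  }
}
```

What I am unsure about is how Spark arranges this across partitions, which is why I am asking for pointers into the source.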
Thanks,
rose
--
View this message in context:
http://apache-spark-user-list.1001560.n3.nabble.com/Hash-Join-in-Spark-tp873.html
Sent from the Apache Spark User List mailing list archive at Nabble.com.