Hi Sameer,
If you make those two IDs a Tuple2 and use it as the key of each RDD, then
you can join on that tuple.
Example:
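import org.apache.spark.SparkContext._  // pair-RDD implicits (needed for join on early Spark 1.x)
import org.apache.spark.rdd.RDD
val rdd1: RDD[Tuple3[Int, Int, String]] = ...
val rdd2: RDD[Tuple3[Int, Int, String]] = ...
// Key each RDD by (id1, id2), then join on that composite key
val resultRDD = rdd1.map(k => ((k._1, k._2), k._3)).join(
  rdd2.map(k => ((k._1, k._2), k._3)))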
Note that .join is an inner join, so you only get results for (id1, id2)
pairs that have BOTH a score1 and a score2.
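To get the exact shape asked for, (id1, id2, (score1, score2)), the joined
result can be flattened with one more map. The following is only a minimal
sketch: the object name CompositeKeyJoin, the helper joinScores, the sample
data and the local master setting are all made up for illustration, and the
scores are assumed to be Strings as in the example above.

```scala
import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.SparkContext._ // pair-RDD implicits; only required on early Spark 1.x
import org.apache.spark.rdd.RDD

// Hypothetical helper object, not part of Spark itself.
object CompositeKeyJoin {

  // Inner join on the composite key (id1, id2), then flatten the result
  // to (id1, id2, (score1, score2)).
  def joinScores(
      rdd1: RDD[(Int, Int, String)],
      rdd2: RDD[(Int, Int, String)]): RDD[(Int, Int, (String, String))] = {
    val left  = rdd1.map { case (id1, id2, s1) => ((id1, id2), s1) }
    val right = rdd2.map { case (id1, id2, s2) => ((id1, id2), s2) }
    left.join(right).map { case ((id1, id2), scores) => (id1, id2, scores) }
  }

  def main(args: Array[String]): Unit = {
    // Assumed local setup for trying the sketch out.
    val conf = new SparkConf().setAppName("composite-key-join").setMaster("local[*]")
    val sc = new SparkContext(conf)
    try {
      // Made-up sample data: only the key (1, 10) is present in both RDDs.
      val rdd1 = sc.parallelize(Seq((1, 10, "a"), (2, 20, "b")))
      val rdd2 = sc.parallelize(Seq((1, 10, "x"), (3, 30, "y")))
      joinScores(rdd1, rdd2).collect().foreach(println)
    } finally {
      sc.stop()
    }
  }
}
```

If you also need the (id1, id2) pairs that appear in only one of the RDDs,
leftOuterJoin or rightOuterJoin on the same keyed RDDs should work; the
possibly missing side then comes back wrapped in an Option.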
Andrew
On Wed, Jul 2, 2014 at 5:12 PM, Sameer Tilak ssti...@live.com wrote:
Hi everyone,
Is it possible to join RDDs using composite keys? I would like to join
these two RDDs with RDD1.id1 = RDD2.id1 and RDD1.id2 = RDD2.id2
RDD1 (id1, id2, scoretype1)
RDD2 (id1, id2, scoretype2)
I want the result to be ResultRDD = (id1, id2, (score1, score2))
I would really appreciate it if you could point me in the right direction.