Hi Sameer,

If you map each RDD so that its key is a Tuple2 of the two IDs, then you can
join on that tuple.

Example:

val rdd1: RDD[Tuple3[Int, Int, String]] = ...
val rdd2: RDD[Tuple3[Int, Int, String]] = ...

val resultRDD = rdd1.map(k => ((k._1, k._2), k._3)).join(
                rdd2.map(k => ((k._1, k._2), k._3)))
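
The join above yields elements of the form ((id1, id2), (score1, score2)).
To get the (id1, id2, (score1, score2)) shape asked for, one more map can
flatten the key. Here is a minimal sketch, assuming a local SparkContext and
made-up sample data; the names CompositeKeyJoin and joinScores are only for
illustration and are not part of Spark:

```scala
import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.SparkContext._ // pair-RDD implicits (needed before Spark 1.3)
import org.apache.spark.rdd.RDD

object CompositeKeyJoin {
  // Hypothetical helper: inner join on (id1, id2), then flatten the key.
  def joinScores(rdd1: RDD[(Int, Int, String)],
                 rdd2: RDD[(Int, Int, String)]): RDD[(Int, Int, (String, String))] = {
    val keyed1 = rdd1.map(k => ((k._1, k._2), k._3))
    val keyed2 = rdd2.map(k => ((k._1, k._2), k._3))
    keyed1.join(keyed2).map { case ((id1, id2), (s1, s2)) => (id1, id2, (s1, s2)) }
  }

  def main(args: Array[String]): Unit = {
    val conf = new SparkConf().setAppName("composite-key-join").setMaster("local[2]")
    val sc = new SparkContext(conf)
    try {
      // Sample data (made up): only (1, 10) appears on both sides.
      val rdd1 = sc.parallelize(Seq((1, 10, "a"), (2, 20, "b")))
      val rdd2 = sc.parallelize(Seq((1, 10, "x"), (3, 30, "z")))
      joinScores(rdd1, rdd2).collect().foreach(println) // prints (1,10,(a,x))
    } finally {
      sc.stop()
    }
  }
}
```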


Note that .join is an inner join, so you only get results for (id1, id2)
pairs that have BOTH a score1 and a score2.
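
If you also need the pairs that appear on only one side, the outer-join
variants on pair RDDs may be what you want. A rough sketch, assuming both
RDDs are already keyed by the (id1, id2) tuple; OuterJoinSketch, keepAllLeft
and keepAll are illustrative names, and fullOuterJoin assumes a Spark version
that provides it (1.2 or later, as far as I know):

```scala
import org.apache.spark.SparkContext._ // pair-RDD implicits (needed before Spark 1.3)
import org.apache.spark.rdd.RDD

object OuterJoinSketch {
  type Key = (Int, Int) // the composite (id1, id2) key

  // Keep every key from the left RDD; score2 is None when the right has no match.
  def keepAllLeft(left: RDD[(Key, String)],
                  right: RDD[(Key, String)]): RDD[(Key, (String, Option[String]))] =
    left.leftOuterJoin(right)

  // Keep keys from either side; a missing score shows up as None.
  def keepAll(left: RDD[(Key, String)],
              right: RDD[(Key, String)]): RDD[(Key, (Option[String], Option[String]))] =
    left.fullOuterJoin(right)
}
```

The Option values make the missing side explicit, so you can decide later
whether to drop those rows or fill in a default score.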

Andrew


On Wed, Jul 2, 2014 at 5:12 PM, Sameer Tilak <ssti...@live.com> wrote:

> Hi everyone,
>
> Is it possible to join RDDs using composite keys? I would like to join
> these two RDDs with RDD1.id1 = RDD2.id1 and RDD1.id2 = RDD2.id2
>
> RDD1 (id1, id2, scoretype1)
> RDD2 (id1, id2, scoretype2)
>
> I want the result to be ResultRDD = (id1, id2, (score1, score2))
>
> Would really appreciate if you can point me in the right direction.
>
>
