Re: RDD join: composite keys

2014-07-03 Thread Andrew Ash
Hi Sameer,

If you set those two IDs to be a Tuple2 in the key of the RDD, then you can
join on that tuple.

Example:

val rdd1: RDD[Tuple3[Int, Int, String]] = ...
val rdd2: RDD[Tuple3[Int, Int, String]] = ...

val resultRDD = rdd1.map(k => ((k._1, k._2), k._3)).join(
  rdd2.map(k => ((k._1, k._2), k._3)))


Note that .join is an inner join, so you only get results for (id1, id2)
pairs that have BOTH a score1 and a score2.
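
If you also need the (id1, id2) pairs that appear in only one RDD, an outer
join is one option. Below is a minimal sketch, not a definitive
implementation. It assumes a local SparkContext and made-up sample data, and
the object and method names (CompositeKeyJoin, innerJoin, leftJoin) are
hypothetical, chosen only for illustration. It also flattens the joined
result back into the (id1, id2, (score1, score2)) shape asked for.

```scala
import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.SparkContext._ // pair-RDD implicits on older 1.x releases; harmless on newer ones
import org.apache.spark.rdd.RDD

object CompositeKeyJoin {
  // Inner join on the composite key (id1, id2), flattened to
  // (id1, id2, (score1, score2)). Keys missing from either side are dropped.
  def innerJoin(rdd1: RDD[(Int, Int, String)],
                rdd2: RDD[(Int, Int, String)]): RDD[(Int, Int, (String, String))] =
    rdd1.map(t => ((t._1, t._2), t._3))
      .join(rdd2.map(t => ((t._1, t._2), t._3)))
      .map { case ((id1, id2), (s1, s2)) => (id1, id2, (s1, s2)) }

  // Left outer join: keeps every (id1, id2) from rdd1. The second score is
  // None when rdd2 has no row for that key.
  def leftJoin(rdd1: RDD[(Int, Int, String)],
               rdd2: RDD[(Int, Int, String)]): RDD[(Int, Int, (String, Option[String]))] =
    rdd1.map(t => ((t._1, t._2), t._3))
      .leftOuterJoin(rdd2.map(t => ((t._1, t._2), t._3)))
      .map { case ((id1, id2), (s1, s2)) => (id1, id2, (s1, s2)) }

  def main(args: Array[String]): Unit = {
    // Assumption: local mode, for demonstration only.
    val conf = new SparkConf().setAppName("composite-key-join").setMaster("local[*]")
    val sc = new SparkContext(conf)
    try {
      // Made-up sample data: only the key (1, 10) exists in both RDDs.
      val rdd1 = sc.parallelize(Seq((1, 10, "a"), (2, 20, "b")))
      val rdd2 = sc.parallelize(Seq((1, 10, "x"), (3, 30, "z")))

      innerJoin(rdd1, rdd2).collect().foreach(println)
      leftJoin(rdd1, rdd2).collect().foreach(println)
    } finally {
      sc.stop()
    }
  }
}
```

With this sample data the inner join keeps only the (1, 10) pair, while the
left outer join also keeps (2, 20) with None as its second score.
rightOuterJoin works the same way with the Option on the other side.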

Andrew


On Wed, Jul 2, 2014 at 5:12 PM, Sameer Tilak ssti...@live.com wrote:

 Hi everyone,

 Is it possible to join RDDs using composite keys? I would like to join
 these two RDDs with RDD1.id1 = RDD2.id1 and RDD1.id2 = RDD2.id2

 RDD1 (id1, id2, scoretype1)
 RDD2 (id1, id2, scoretype2)

 I want the result to be ResultRDD = (id1, id2, (score1, score2))

 Would really appreciate if you can point me in the right direction.




RDD join: composite keys

2014-07-02 Thread Sameer Tilak
Hi everyone,

Is it possible to join RDDs using composite keys? I would like to join these
two RDDs with RDD1.id1 = RDD2.id1 and RDD1.id2 = RDD2.id2

RDD1 (id1, id2, scoretype1)
RDD2 (id1, id2, scoretype2)

I want the result to be ResultRDD = (id1, id2, (score1, score2))
Would really appreciate if you can point me in the right direction.