Hi folks,

I need to join RDDs that have composite keys of the form (K1, K2, ..., Kn).

The joining rule looks like this:
* if left.K1 == right.K1, we have a "true equality": K2 ... Kn are then
equal as well.
* if left.K1 != right.K1 but left.K2 == right.K2, we have a "partial
equality", and I want the join to occur in that case too.
* if the K2 values don't match either, I test K3, and so on up to Kn.
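
To make the rule concrete, here is a minimal sketch of the predicate in plain Python (no Spark involved). The function names are made up for illustration, and it assumes keys are tuples of equal length whose components can be compared with ==.

```python
from typing import Hashable, Optional, Sequence


def first_matching_position(left_key: Sequence[Hashable],
                            right_key: Sequence[Hashable]) -> Optional[int]:
    """Return the 0-based index of the first key component on which both
    keys agree (0 means K1, 1 means K2, ...), or None if none agree."""
    for i, (l, r) in enumerate(zip(left_key, right_key)):
        if l == r:
            return i
    return None


def should_join(left_key: Sequence[Hashable],
                right_key: Sequence[Hashable]) -> bool:
    """Two rows are joined as soon as one key component matches."""
    return first_matching_position(left_key, right_key) is not None
```

For example, should_join(("a", "x"), ("b", "x")) is true because K2 matches even though K1 differs, while should_join(("a", "x"), ("b", "y")) is false.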

Is there a way to implement a custom join driven by a user-supplied
predicate like this one? (I would probably also need to provide a custom
partitioner and some sort order on the keys.)
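
One direction I can imagine, without knowing whether it is the right one, is to run one ordinary equi-join per key position and keep only the pairs that did not already match on an earlier position. The sketch below shows the idea on plain Python lists; cascade_join and the (key_tuple, value) row layout are hypothetical. In Spark I assume each pass would become a keyBy followed by a join, with the passes combined through union, but I have not tried that.

```python
from collections import defaultdict


def cascade_join(left, right, n_keys):
    """left, right: lists of (key_tuple, value) rows.
    Returns a list of (level, left_value, right_value), where level is the
    0-based position of the first key component on which the pair matched."""
    out = []
    for i in range(n_keys):
        # Hash the right side on key component i (one equi-join pass).
        index = defaultdict(list)
        for rkey, rval in right:
            index[rkey[i]].append((rkey, rval))
        for lkey, lval in left:
            for rkey, rval in index.get(lkey[i], []):
                # Skip pairs already emitted by an earlier pass.
                if all(lkey[j] != rkey[j] for j in range(i)):
                    out.append((i, lval, rval))
    return out
```

This costs n hash joins instead of one, and it treats the rule pairwise: a left row matched on K1 with one right row can still be matched on K2 with a different right row.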

The left and right RDDs each contain 1 to 10 million rows.
Any ideas?

Thanks
Mathieu
