Parallel map side join

Shi Yu Fri, 10 Jun 2011 15:56:58 -0700

Hi,

How do I configure a map-side join so that it runs in multiple mappers in parallel?


Suppose I have data sets a1, a2, a3 and data sets b1, b2, b3.

I want a1 to join with b1, a2 with b2, and a3 with b3, and I want the joins done in parallel. I think it should be possible to configure mapper 1 to join a1 with b1, mapper 2 to join a2 with b2, and so on. How should I configure this in Hadoop? Does CompositeInputFormat take multiple series of inputs? Thanks.

Shi
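
For context on the question above: as far as I understand it, CompositeInputFormat in the old org.apache.hadoop.mapred.join package (configured through the mapred.join.expr property) expects both inputs to be sorted on the join key and partitioned identically, and each map task then reads the matching partition from every source. If that holds, one map task per partition pair gives the a1/b1, a2/b2, a3/b3 behaviour asked about. The following is only a plain-Java sketch of that pairing and does not call any Hadoop API. The class PairwiseJoinSketch and its methods row, mergeJoin and joinAll are hypothetical names, and the sketch assumes keys are unique within each partition.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Plain-Java illustration only; all names here are hypothetical.
class PairwiseJoinSketch {

    // Builds one {key, value} row.
    static String[] row(String key, String value) {
        return new String[] {key, value};
    }

    // Inner merge join of two partitions sorted by key.
    // Assumes keys are unique within each partition.
    static List<String> mergeJoin(List<String[]> a, List<String[]> b) {
        List<String> out = new ArrayList<>();
        int i = 0;
        int j = 0;
        while (i < a.size() && j < b.size()) {
            int cmp = a.get(i)[0].compareTo(b.get(j)[0]);
            if (cmp < 0) {
                i++;                       // key only in a, skip it
            } else if (cmp > 0) {
                j++;                       // key only in b, skip it
            } else {
                out.add(a.get(i)[0] + "\t" + a.get(i)[1] + "\t" + b.get(j)[1]);
                i++;
                j++;
            }
        }
        return out;
    }

    // Joins partition p of the left input with partition p of the right
    // input, one task per pair, and returns the results in partition order.
    static List<List<String>> joinAll(List<List<String[]>> left,
                                      List<List<String[]>> right) throws Exception {
        if (left.size() != right.size()) {
            throw new IllegalArgumentException("inputs must have the same number of partitions");
        }
        ExecutorService pool = Executors.newFixedThreadPool(Math.max(1, left.size()));
        try {
            List<Future<List<String>>> futures = new ArrayList<>();
            for (int p = 0; p < left.size(); p++) {
                final List<String[]> a = left.get(p);
                final List<String[]> b = right.get(p);
                Callable<List<String>> task = () -> mergeJoin(a, b);
                futures.add(pool.submit(task));
            }
            List<List<String>> results = new ArrayList<>();
            for (Future<List<String>> f : futures) {
                results.add(f.get());      // wait for each pair to finish
            }
            return results;
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) throws Exception {
        List<List<String[]>> left = new ArrayList<>();
        left.add(Arrays.asList(row("k1", "a"), row("k2", "b")));
        List<List<String[]>> right = new ArrayList<>();
        right.add(Arrays.asList(row("k1", "x"), row("k3", "y")));
        for (List<String> part : joinAll(left, right)) {
            for (String line : part) {
                System.out.println(line);
            }
        }
    }
}
```

Each task only ever sees its own pair of partitions, which is why the inputs have to be split the same way up front: a key that lands in a1 must also land in b1, or the match is missed.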
