Hi, how do I configure a map-side join so that it runs across multiple mappers in parallel?
Suppose I have data sets a1, a2, a3 and data sets b1, b2, b3. I want to join a1 with b1, a2 with b2, and a3 with b3, and have the joins run in parallel. I think it should be possible to configure mapper 1 to join a1 with b1, mapper 2 to join a2 with b2, and so on. How should I configure this in Hadoop? Does CompositeInputFormat take multiple series of inputs? Thanks.
Shi