If you explicitly join 3 or more relations with a single command ("d =
join a by id, b by id, c by id;"), a and b will be buffered for each
key, while c, the rightmost relation, will be streamed.

This happens on a per-reducer basis. There is, of course, a lot of I/O
involved in getting the data from the mappers to the reducers, but none
of it is an intermediate result of joining A to B.
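
As a rough illustration only (this is not Pig's actual code; the function
name, the numeric tags, and the assumption that rows reach the reducer
ordered by relation are made up for the sketch), the per-key reducer
logic looks something like this in Python:

```python
from itertools import product


def reduce_join(key, tagged_rows):
    """Hypothetical reducer for a 3-way join on a single key.

    tagged_rows yields (tag, row) pairs. Assumption: rows tagged 0 (a)
    and 1 (b) arrive before rows tagged 2 (c, the rightmost relation).
    """
    buffered = {0: [], 1: []}  # rows of a and b held in memory for this key
    for tag, row in tagged_rows:
        if tag in buffered:
            buffered[tag].append(row)
        else:
            # c is streamed: each row is joined and emitted right away,
            # so no A-join-B intermediate result is ever materialized.
            for a_row, b_row in product(buffered[0], buffered[1]):
                yield (key, a_row, b_row, row)


rows = [(0, "a1"), (0, "a2"), (1, "b1"), (2, "c1"), (2, "c2")]
joined = list(reduce_join(7, iter(rows)))
```

Only the rows of a and b for the current key need to fit in memory; c
can be arbitrarily large for that key.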

-Dmitriy

On Tue, Feb 2, 2010 at 10:52 PM, bharath v
<[email protected]> wrote:
> Hi,
>
> I have a question about how Pig handles queries that join more than
> 2 tables.
>
> Suppose we have 3 tables A, B, C, and the plan is "((AB)C)".
> We can join A and B in one MapReduce job and then join the resulting
> table with C. Is the result of "AB" stored to disk before it is joined
> with C, or is it streamed directly into the join with C? (I don't know
> how; it's just a guess.)
>
> Any help is appreciated.
>
> Thanks
>
