If you explicitly join 3 or more relations with a single command ("d =
join a by id, b by id, c by id;"), a and b will be buffered for each
key, while c, the rightmost relation, will be streamed. This is on a
per-reducer basis. There is of course a whole lot of IO going on for
getting from the Mappers to Reducers, but none of it is the intermediate
result of joining A to B.

-Dmitriy

On Tue, Feb 2, 2010 at 10:52 PM, bharath v <[email protected]> wrote:
> Hi,
>
> I have a small doubt in how pig handles queries containing join of more than
> 2 tables.
>
> Suppose we have 3 tables A, B, C and the plan is "((AB)C)".
> We can join A, B in a map reduce job and join the resultant table with "C". I
> have a doubt whether the result of "AB" is stored to disk before joining
> with C or is it streamed directly to join with C (I don't know how, just a
> guess).
>
> Any help is appreciated,
>
> Thanks
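
To make the per-key buffering in the reply at the top concrete, here is a rough Python sketch of what a single reducer does for one join key. It is only an illustration, not Pig's actual code: the function name reduce_join, the (input_index, record) tagging, and the assumption that records arrive sorted by input index are all made up for the example.

```python
from itertools import product

def reduce_join(tagged_records, num_inputs=3):
    """Join the records of ONE key inside one reducer (illustrative only).

    tagged_records: iterable of (input_index, record) pairs, assumed to be
    sorted by input_index so the rightmost relation arrives last.
    """
    # One in-memory buffer per relation except the rightmost one.
    buffers = [[] for _ in range(num_inputs - 1)]
    for input_index, record in tagged_records:
        if input_index < num_inputs - 1:
            # Relations a and b: hold their records for this key in memory.
            buffers[input_index].append(record)
        else:
            # Relation c: never buffered; each record is combined with the
            # buffered a/b records as it streams past.
            for combo in product(*buffers):
                yield combo + (record,)

# Hypothetical records for a single key.
records = [(0, "a1"), (0, "a2"), (1, "b1"), (2, "c1"), (2, "c2")]
for row in reduce_join(records):
    print(row)
```

Nothing resembling an "A joined to B" table is written out here; only the raw a and b records for the current key sit in memory while c streams by.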
