Oh, I see where my confusion was... It's on nulls that JOIN behaves differently in Pig than in SQL, right? That's where the two differ.

On Thu, Jun 10, 2010 at 12:48 PM, Alan Gates <[email protected]> wrote:

> That's already what happens, because flattening a bag that is empty results
> in 0 rows, regardless of how many rows came out of the other bag.
>
> Alan.
>
>
> On Jun 10, 2010, at 11:09 AM, hc busy wrote:
>
>> Isn't that kind of annoying? Since JOIN in sql implicitly is an inner
>> join.
>> Would have been great if
>>
>> C = JOIN A by id, B by id;
>>
>> is alias for
>> C1 = COGROUP A by id, B by id;
>> C2 = filter C1 by NOT (IsEmpty(A) OR IsEmpty(B));
>> C = foreach C2 generate FLATTEN(A), FLATTEN(B);
>>
>>
>> On Tue, Jun 8, 2010 at 12:03 PM, Alan Gates <[email protected]> wrote:
>>
>>> Historically
>>>
>>> C = JOIN A by a, B by a;
>>>
>>> was defined in Pig Latin as shorthand for:
>>>
>>> C1 = COGROUP A by a, B by a;
>>> C = FOREACH C1 GENERATE flatten(A), flatten(B);
>>>
>>> which produces the doubling of keys.
>>>
>>> Also, given that Pig Latin does not require that key names be the same
>>> (as USING or NATURAL do in SQL) there would be issues if it did not have
>>> both keys in the output. (For the same reason ON in SQL duplicates the
>>> keys in the results.)
>>>
>>> Alan.
>>>
>>>
>>> On Jun 8, 2010, at 4:45 AM, Alexander Schätzle wrote:
>>>
>>>> Hi all,
>>>>
>>>> the JOIN operator of Pig produces duplicate columns in its output.
>>>> Let's say the statement is like this:
>>>>
>>>> C = JOIN A BY (var1, var2), B BY (var1, var2);
>>>>
>>>> Then C contains var1 and var2 two times (one for each input relation),
>>>> of course with the same content.
>>>> This is somehow not what a user "usually" expects from a Join.
>>>> Why does Pig produce such redundant entries?
>>>> If you want to get rid of these entries you have to do a FOREACH for
>>>> projection.
>>>> Otherwise you shuffle unnecessary data through MR-phases.
>>>> In my opinion this is somehow really unnecessary.
>>>> I just wonder why Pig produces the output of a Join the way it does?
>>>>
>>>> Cheers,
>>>> Alex
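
For anyone reading this thread later, the COGROUP-then-FLATTEN definition quoted above can be modelled in a few lines of Python. This is only a rough sketch of the semantics as described in the thread, not Pig's actual implementation: the names cogroup, flatten_both, join and first are made up for illustration, relations are plain lists of tuples, and null keys are deliberately left out. It shows both points raised above: a group with an empty bag on either side contributes 0 rows, and the key shows up once per input relation in each output row.

```python
from collections import defaultdict
from itertools import product


def cogroup(a, b, key_a, key_b):
    """Model of COGROUP: one group per key, holding one bag per input."""
    groups = defaultdict(lambda: ([], []))
    for row in a:
        groups[key_a(row)][0].append(row)
    for row in b:
        groups[key_b(row)][1].append(row)
    return groups


def flatten_both(groups):
    """Model of GENERATE flatten(A), flatten(B): cross product per group.

    The product of anything with an empty bag is empty, so a key that is
    missing from one side yields no output rows.
    """
    for bag_a, bag_b in groups.values():
        for row_a, row_b in product(bag_a, bag_b):
            # Both input rows are kept whole, so the key appears twice.
            yield row_a + row_b


def join(a, b, key_a, key_b):
    """Inner join expressed as cogroup followed by flatten."""
    return list(flatten_both(cogroup(a, b, key_a, key_b)))


def first(row):
    """Hypothetical key function: join on the first field."""
    return row[0]


A = [(1, "a1"), (2, "a2")]
B = [(1, "b1"), (1, "b1x"), (3, "b3")]

rows = join(A, B, first, first)
# Keys 2 and 3 drop out because one of their bags is empty;
# key 1 survives and appears in both halves of each row.
print(rows)  # [(1, 'a1', 1, 'b1'), (1, 'a1', 1, 'b1x')]
```

Dropping the duplicated key would then be a separate projection step, which corresponds to the extra FOREACH mentioned in the original question.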
