If I understand correctly, this is nothing more than an anti-join, which can be done in Pig using a cogroup.
So your SQL below:

> select * from yee a left join yer b on a.loc != b.loc;

becomes something like:

    a = load 'yee' as (loc:chararray, stuff:int);
    b = load 'yer' as (loc:chararray, stuff:int);
    c = cogroup a by loc, b by loc;
    d = foreach (filter c by IsEmpty(b)) generate FLATTEN(a);

which will result in d containing only the records from a whose 'loc' value does not appear in the 'loc' field of any record in b.

--jacob
@thedatachef
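P.S. In case the cogroup-then-filter trick is not obvious, here is a rough plain-Python sketch of what it computes. This is only an illustration of the semantics, not how Pig executes it; the helper names (cogroup, anti_join) and the sample rows are made up for the example.

```python
from collections import defaultdict

def cogroup(a, b, key=lambda rec: rec[0]):
    """Mimic Pig's COGROUP: map each key to a pair (bag of a-records, bag of b-records)."""
    groups = defaultdict(lambda: ([], []))
    for rec in a:
        groups[key(rec)][0].append(rec)
    for rec in b:
        groups[key(rec)][1].append(rec)
    return groups

def anti_join(a, b):
    """Keep the a-records whose key has an empty b-bag (the IsEmpty(b) filter),
    then flatten the surviving a-bags back into a list of records."""
    out = []
    for bag_a, bag_b in cogroup(a, b).values():
        if not bag_b:          # filter c by IsEmpty(b)
            out.extend(bag_a)  # generate FLATTEN(a)
    return out

# Hypothetical sample data in the (loc, stuff) shape used above.
yee = [("nyc", 1), ("sf", 2), ("austin", 3), ("sf", 4)]
yer = [("sf", 10), ("boston", 20)]

d = anti_join(yee, yer)
```

Both 'sf' rows are dropped because 'sf' shows up in yer, while 'boston' never appears in the output because its a-bag is empty, so flattening it contributes nothing.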
