If I understand correctly, this is nothing more than an anti-join, which
can be done in Pig using a cogroup.

So your SQL below:

> select * from yee a left join yer b on a.loc != b.loc;

becomes something like:

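-- load both relations with the same schema
a = load 'yee' as (loc:chararray, stuff:int);
b = load 'yer' as (loc:chararray, stuff:int);

-- one output record per distinct loc, holding a bag of a-records and a bag of b-records
c = cogroup a by loc, b by loc;
-- keep only the groups with no b-records, then unnest the a-records
d = foreach (filter c by IsEmpty(b)) generate FLATTEN(a);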

which will result in d containing only the records from a whose 'loc'
value does not appear in any record of b.
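
To make the cogroup/IsEmpty step concrete, here is a rough Python sketch of
what those two statements do. It is only an illustration of the semantics,
not how Pig executes it; the helper names (cogroup, anti_join) and the sample
rows are made up for this example and do not come from your data.

```python
from collections import defaultdict

def cogroup(a, b, key=lambda rec: rec[0]):
    """Group two lists of tuples by key, roughly like `cogroup a by loc, b by loc`."""
    groups = defaultdict(lambda: ([], []))
    for rec in a:
        groups[key(rec)][0].append(rec)   # bag of a-records for this key
    for rec in b:
        groups[key(rec)][1].append(rec)   # bag of b-records for this key
    return groups

def anti_join(a, b):
    """Keep the records of a whose key has an empty bag on the b side."""
    result = []
    for bag_a, bag_b in cogroup(a, b).values():
        if not bag_b:               # corresponds to IsEmpty(b)
            result.extend(bag_a)    # corresponds to FLATTEN(a)
    return result

# Hypothetical sample rows of the form (loc, stuff).
yee = [("nyc", 1), ("sfo", 2), ("aus", 3), ("aus", 4)]
yer = [("nyc", 10), ("lax", 20)]

print(anti_join(yee, yer))  # [('sfo', 2), ('aus', 3), ('aus', 4)]
```

The 'nyc' record is dropped because yer has a matching loc, and the 'lax'
group contributes nothing because its bag of a-records is empty.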

--jacob
@thedatachef
