Use a left outer join and filter on null instead of a cogroup (that way
you don't need to materialize the bags just to flatten them back out).
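
Roughly, as an untested sketch that reuses the relation and field names
from the quoted script below (the aliases C, D and E are only
illustrative, and it assumes both loads declare a schema as shown):

```pig
A = LOAD 'input1' AS (x, y);
B = LOAD 'input2' AS (u, v);

-- Left outer join keeps every row of A; rows of A with no match in B
-- come back with null in B's columns.
C = JOIN A BY x LEFT OUTER, B BY u;

-- Keep only the rows of A that found no partner in B.
D = FILTER C BY B::u IS NULL;

-- Project back down to A's columns.
E = FOREACH D GENERATE A::x AS x, A::y AS y;
```

Like the cogroup version, this keeps duplicate rows of A; add a DISTINCT
on E if you want strict set semantics.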

On Thu, May 12, 2011 at 9:14 AM,  <[email protected]> wrote:
> I saw this somewhere. 'Anti-join' doesn't seem very descriptive to me, but 
> that is what it was called.
>
>
> Anti-join (set difference) idiom in pig:
> A = load 'input1' as (x, y);
> B = load 'input2' as (u, v);
> C = cogroup A by x, B by u;
> D = filter C by IsEmpty(B);
> E = foreach D generate flatten(A);
>
>
> William F Dowling
> Sr Technical Specialist, Software Engineering
> Thomson Reuters
> 0 +1 215 823 3853
>
>
> -----Original Message-----
> From: Deepak Singh [mailto:[email protected]]
> Sent: Wednesday, May 11, 2011 9:43 PM
> To: [email protected]; [email protected]
> Subject: Set difference in Pig
>
> Hi,
>   Can we do set difference in Pig?
>
>  The set difference is defined by:
>  A - B = {x : x is an element of A and x is not an element of B}
>
>
> Thanks
> Deepak
>
>
