Use a left outer join and filter on null instead of a cogroup; that way you don't need to materialize the bags just to flatten them back out.
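
Roughly like this. It is a minimal sketch that reuses the relation and field names from the script quoted below, and it assumes the join keys are x and u as in that script; I have not run it against any particular Pig version.

```pig
A = load 'input1' as (x, y);
B = load 'input2' as (u, v);

-- Keep every row of A; rows with no match in B get nulls for B's fields.
C = join A by x left outer, B by u;

-- Rows where B's key is null are exactly the ones that found no match.
D = filter C by B::u is null;

-- Project back to A's columns (the join output also carries B::u and B::v).
E = foreach D generate A::x as x, A::y as y;
```

Rows of A whose x is null also end up in E, since a null key never matches anything in the join.
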
On Thu, May 12, 2011 at 9:14 AM, <[email protected]> wrote:
> I saw this somewhere. 'Anti-join' doesn't seem very descriptive to me, but
> that is what it was called.
>
> Anti-join (set difference) idiom in pig:
> A = load 'input1' as (x, y);
> B = load 'input2' as (u, v);
> C = cogroup A by x, B by u;
> D = filter C by IsEmpty(B);
> E = foreach D generate flatten(A);
>
> William F Dowling
> Sr Technical Specialist, Software Engineering
> Thomson Reuters
> 0 +1 215 823 3853
>
> -----Original Message-----
> From: Deepak Singh [mailto:[email protected]]
> Sent: Wednesday, May 11, 2011 9:43 PM
> To: [email protected]; [email protected]
> Subject: Set difference in Pig
>
> Hi,
> Can we do set difference in pig ?
>
> The set difference is defined by:
> A-B = {x: x element of A and x is not element of B }
>
> Thanks
> Deepak
