Dmitriy, I tried it. However, I don't seem to be getting a handle on it. Some pseudo code will be highly appreciated.
Thanks, Rakesh > Date: Sun, 18 Mar 2012 14:27:25 -0700 > Subject: Re: Selective removal of data from a relation > From: [email protected] > To: [email protected] > > Rakesh, > Just like in SQL, this is achieved by doing an outer join and > filtering for nulls (a null join key indicates absence of a matching > row). > > D > > 2012/3/18 rakesh sharma <[email protected]>: > > > > Thanks to Dan for suggesting to post it on gist. Here is the link to the > > post: > > https://raw.github.com/gist/2079527/bf68dd2f0a7ee3864ef066f126c34880b20b6b04/SelectiveDataRemoval‏ > > Please take a look and I am sure many of you have solution to this problem. > > Thanks,Rakesh > >> Date: Sun, 18 Mar 2012 12:35:33 -0600 > >> Subject: RE: Selective removal of data from a relation > >> From: [email protected] > >> To: [email protected] > >> > >> Post it on https://gist.github.com/ and email out the gist. > >> > >> Regards, > >> > >> Dan > >> On Mar 18, 2012 12:33 PM, "rakesh sharma" <[email protected]> > >> wrote: > >> > >> > > >> > All indentations get removed when message comes back from > >> > [email protected]. Any idea how I can make it work. > >> > > >> > > From: [email protected] > >> > > To: [email protected] > >> > > Subject: RE: Selective removal of data from a relation > >> > > Date: Sun, 18 Mar 2012 18:26:01 +0000 > >> > > > >> > > > >> > > I am sorry for so many re-sends. Resending in Rich text format... > >> > > Hi All, > >> > > I have two relations "mix" and "child_parent". Relation "mix" contains > >> > rows of ids. Each Id can be a parent or a child. Another relation > >> > "child-parent" has rows of children and associated parents. It may not > >> > have > >> > data for every child existing in relation "mix". Also, it can have some > >> > data for which there is no matching data in relation "mix". I need to > >> > remove all children from relation "mix" whose parent exists in the > >> > relation. Here is an example to show what I am trying to achieve:mix = > >> > load > >> > "all_data" as (id:chararray);dump mix; > >> > > 13469 > >> > > child_parent = load "mapping" as (childId:chararray, > >> > parentId:chararray);dump child_parent; > >> > > (3 1)(6 1)(9 15) > >> > > Children "3" and "6" has matching parent "1". Hence, 3 and 6 need to be > >> > removed from "all_data". However, child "9" will stay as its parent "15" > >> > does not exist in "all_data". The outcome will be:149I am having hard > >> > time > >> > in solving it due to lack of experience with pig. Any help/suggestion > >> > will > >> > be highly appreciated. > >> > > Thanks,Rakesh > >> > > >
