Rakesh,
Just like in SQL, this is achieved by doing an outer join and
filtering for nulls (a null join key indicates absence of a matching
row).
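
A minimal sketch of that approach, using the relation and file names from your post. The intermediate aliases (parents, known, to_remove, joined, kept, result) are made up for illustration, and the script assumes both files are in the default tab-delimited format. The first join finds the children whose parent is present in "mix"; the outer join plus null filter then drops them.

```pig
mix          = LOAD 'all_data' AS (id:chararray);
child_parent = LOAD 'mapping'  AS (childId:chararray, parentId:chararray);

-- Load the ids a second time (hypothetical alias) so the parent lookup
-- does not join a relation against itself.
parents = LOAD 'all_data' AS (pid:chararray);

-- Inner join: keep only the mappings whose parent is present in the id list.
known     = JOIN child_parent BY parentId, parents BY pid;
to_remove = FOREACH known GENERATE child_parent::childId AS childId;

-- Left outer join: ids with no match get a null on the right-hand side.
joined = JOIN mix BY id LEFT OUTER, to_remove BY childId;

-- A null join key means the id is not a child slated for removal.
kept   = FILTER joined BY to_remove::childId IS NULL;
result = FOREACH kept GENERATE mix::id AS id;

DUMP result;
```

With the sample data in your post this should leave 1, 4 and 9, though the order of the dumped rows is not guaranteed.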

D

2012/3/18 rakesh sharma <[email protected]>:
>
> Thanks to Dan for suggesting that I post it as a gist. Here is the link to the post:
> https://raw.github.com/gist/2079527/bf68dd2f0a7ee3864ef066f126c34880b20b6b04/SelectiveDataRemoval
> Please take a look; I am sure many of you have a solution to this problem.
> Thanks,
> Rakesh
>> Date: Sun, 18 Mar 2012 12:35:33 -0600
>> Subject: RE: Selective removal of data from a relation
>> From: [email protected]
>> To: [email protected]
>>
>> Post it on https://gist.github.com/ and email out the gist.
>>
>> Regards,
>>
>> Dan
>> On Mar 18, 2012 12:33 PM, "rakesh sharma" <[email protected]>
>> wrote:
>>
>> >
>> > All indentation gets removed when the message comes back from
>> > [email protected]. Any idea how I can make it work?
>> >
>> > > From: [email protected]
>> > > To: [email protected]
>> > > Subject: RE: Selective removal of data from a relation
>> > > Date: Sun, 18 Mar 2012 18:26:01 +0000
>> > >
>> > >
>> > > I am sorry for so many re-sends. Resending in Rich text format...
>> > > Hi All,
>> > > I have two relations, "mix" and "child_parent". Relation "mix" contains
>> > > rows of ids. Each id can be a parent or a child. The other relation,
>> > > "child_parent", has rows of children and their associated parents. It may
>> > > not have data for every child existing in relation "mix". Also, it can
>> > > have some data for which there is no matching data in relation "mix". I
>> > > need to remove all children from relation "mix" whose parent exists in
>> > > that relation. Here is an example to show what I am trying to achieve:
>> > >
>> > > mix = load 'all_data' as (id:chararray);
>> > > dump mix;
>> > > 1
>> > > 3
>> > > 4
>> > > 6
>> > > 9
>> > >
>> > > child_parent = load 'mapping' as (childId:chararray, parentId:chararray);
>> > > dump child_parent;
>> > > (3       1)
>> > > (6       1)
>> > > (9      15)
>> > >
>> > > Children "3" and "6" have a matching parent, "1". Hence, 3 and 6 need to
>> > > be removed from "all_data". However, child "9" will stay, as its parent
>> > > "15" does not exist in "all_data". The outcome will be:
>> > > 1
>> > > 4
>> > > 9
>> > >
>> > > I am having a hard time solving it due to lack of experience with Pig.
>> > > Any help or suggestion will be highly appreciated.
>> > > Thanks,
>> > > Rakesh
>> >
>
