Hi All,
I have two relations, "mix" and "child_parent". Relation "mix" contains rows of
ids. Each id can be a parent or a child. The other relation, "child_parent", has
rows of children and their associated parents. It may not have data for every
child that exists in "mix", and it may contain rows for which there is no
matching data in "mix". I need to remove from "mix" every child whose parent
also exists in "mix". Here is an example to show what I am trying to achieve:
mix = load 'all_data' as (id:chararray);
dump mix;
1
3
4
6
9
child_parent = load 'mapping' as (childId:chararray, parentId:chararray);
dump child_parent;
(3 1)
(6 1)
(9 15)
Children "3" and "6" have the matching parent "1", which exists in "all_data".
Hence, 3 and 6 need to be removed from "all_data". However, child "9" will stay,
because its parent "15" does not exist in "all_data". The outcome will be:
1
4
9
I am having a hard time solving this due to my lack of experience with Pig. Any
help or suggestions will be highly appreciated.
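One direction that might work is two joins: first find the children whose parent is present in "mix", then drop those children from "mix" with a left outer join. The following is only an untested sketch. The aliases parent_hits, removable, joined, kept and result are placeholder names made up for illustration, and the input paths are the ones from the example above.

```pig
-- Load both inputs (same schemas as in the example).
mix = LOAD 'all_data' AS (id:chararray);
child_parent = LOAD 'mapping' AS (childId:chararray, parentId:chararray);

-- Keep only the mapping rows whose parent id is present in mix.
parent_hits = JOIN child_parent BY parentId, mix BY id;

-- The children of those rows are the ids to remove; deduplicate them.
removable = FOREACH parent_hits GENERATE child_parent::childId AS childId;
removable = DISTINCT removable;

-- Anti-join: keep the rows of mix that have no match in removable.
joined = JOIN mix BY id LEFT OUTER, removable BY childId;
kept = FILTER joined BY removable::childId IS NULL;
result = FOREACH kept GENERATE mix::id AS id;

DUMP result;
```

On the example data this should leave the ids 1, 4 and 9, although Pig does not guarantee the order of the dumped rows.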
Thanks,
Rakesh