Thank you, Norbert, for your reply.


I am still not sure how, or whether, I can do this in a Pig script, though.
Given the adjacency list below,
how would I find the ancestors of R (P, C, A) in Pig?



cat testgraph.txt

A     B:C
B     D
C     P
P     Q:R



From the Pig script below I can get the immediate parent of P, i.e. C.

But wouldn’t I need multiple iterations to get the parent of C, and so on up to the root?



-----------------script------------------------

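-- Load the adjacency list: one subject per row, links separated by ':'.
A = LOAD 'testgraph.txt' USING PigStorage() AS (subject:chararray, link:chararray);

-- Split the link field into at most two parts.
B = FOREACH A GENERATE subject, FLATTEN(STRSPLIT(link,':',2)) AS L;

-- Keep rows whose first link is 'P'; the subject of such a row is the parent of P.
C = FILTER B BY $1 == 'P';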

-----------------------------------------------
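To make the question concrete, here is the iteration I have in mind, sketched in plain Python rather than Pig. It is only an illustration under my own assumptions: the names `parse_parents` and `ancestors` are hypothetical helpers, and the graph text is the testgraph.txt content above with a tab assumed between the two columns. Each pass of the loop climbs one hop, which is the step I do not know how to repeat inside a single Pig script.

```python
from collections import deque

# Contents of testgraph.txt (tab between columns is an assumption).
GRAPH_TEXT = "A\tB:C\nB\tD\nC\tP\nP\tQ:R\n"


def parse_parents(text):
    """Invert the adjacency list into a child -> [parents] map."""
    parents = {}
    for line in text.splitlines():
        if not line.strip():
            continue  # skip blank lines
        subject, links = line.split(None, 1)
        for child in links.strip().split(":"):
            parents.setdefault(child, []).append(subject)
    return parents


def ancestors(node, parents):
    """Breadth-first walk upward from node, nearest ancestors first."""
    found = []
    seen = {node}
    queue = deque([node])
    while queue:
        current = queue.popleft()
        for parent in parents.get(current, []):
            if parent not in seen:  # guard against cycles and repeats
                seen.add(parent)
                found.append(parent)
                queue.append(parent)
    return found


PARENTS = parse_parents(GRAPH_TEXT)
print(ancestors("R", PARENTS))  # ['P', 'C', 'A']
```

The number of loop passes depends on the depth of the node, which is why a fixed number of FILTER or JOIN statements does not seem to be enough.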


On Wed, Feb 29, 2012 at 10:16 PM, Norbert Burger <[email protected]> wrote:

> Prash, you can just model this tree as a simple graph adjacency list:
>
> A1,A2
> A2,A3
> A3,A4
> A4,Am
> ...
>
> For nodes with more than one child, you simply extend each row
> horizontally.  Child/parent/descendant/ancestor are straightforward
> applications of a traversal on this graph (BFS would be a good choice).
>
> Norbert
>
> On Wed, Feb 29, 2012 at 9:02 AM, prash987 prash987 <[email protected]> wrote:
>
> > Hi All,
> > How do I represent hierarchical information in a flat file and process it in
> > Pig?
> >
> > Let’s say I have objects of type A.
> > I want to have a tree representation of their parent-child
> > relationships.
> >
> > In scenario 1:
> > A1 points to A2; A2 points to A3; A3 points to A4; A4 points to Am, and
> > so on until An.
> > Given the above definition, I want to be able to answer the following:
> >
> > Child(A1) = A2
> > Parent(A4) = A3
> > Descendant(A1) = A2,A3,A4, Am… An
> > Ancestor(A4) = A3,A2,A1
> > Ancestor (An) = Am,…A4,A3,A2,A1
> >
> > Can this be represented in a text file and queried in Pig?
> >
> > Appreciate any pointers/suggestions.
> > Thanks!
> >
>
