Thank you, Norbert, for your reply.
I am still not sure how (or whether) I can do this through a Pig script, though.
Given the adjacency list below, how would I find the ancestors of R (P, C, A) in Pig?

cat testgraph.txt
A	B:C
B	D
C	P
P	Q:R

From the Pig script I can get the immediate parent, i.e. C.
But wouldn’t I need multiple iterations to get the parent of C?

-----------------script------------------------
A = LOAD 'testgraph.txt' USING PigStorage() AS (subject:chararray, link:chararray);
B = FOREACH A GENERATE subject, FLATTEN(STRSPLIT(link,':',2)) AS L;
C = FILTER B BY $1 == 'P';
-----------------------------------------------

On Wed, Feb 29, 2012 at 10:16 PM, Norbert Burger <[email protected]> wrote:

> Prash, you can just model this tree as a simple graph adjacency list:
>
> A1,A2
> A2,A3
> A3,A4
> A4,Am
> ...
>
> For nodes with more than one child, you simply extend each row
> horizontally. Child/parent/descendant/ancestor are straightforward
> applications of a traversal on this graph (BFS would be a good choice).
>
> Norbert
>
> On Wed, Feb 29, 2012 at 9:02 AM, prash987 prash987 <[email protected]> wrote:
>
> > Hi All,
> > How do I represent hierarchical information in flat file and process it in
> > Pig?
> >
> > Let’s say I have objects of type A.
> > I want to have a Tree representation with their parent-child
> > relationships.
> >
> > In scenario 1:
> > A1 points to A2; A2 points to A3; A3 points to A4; A4 points to Am and
> > so on till An.
> > Given above definition; I want to be able to answer following :
> >
> > Child(A1) = A2
> > Parent(A4) = A3
> > Descendant(A1) = A2,A3,A4, Am… An
> > Ancestor(A4) = A3,A2,A1
> > Ancestor (An) = Am,…A4,A3,A2,A1
> >
> > Can this be represented in text file and queried in Pig.
> >
> > Appreciate any pointers/suggestions.
> > Thanks!
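
For illustration only, the repeated parent lookup asked about above can be sketched in plain Python rather than Pig. Pig Latin by itself has no looping construct, so this sketch only shows the shape of the iteration: one pass through the loop per hop up the tree. The function names (`parse_adjacency`, `ancestors`) are hypothetical, and the sketch assumes the rows of testgraph.txt are whitespace-separated, children are joined by ':', and every node has at most one parent.

```python
def parse_adjacency(lines):
    """Map each child to its parent from 'parent<whitespace>child1:child2' rows."""
    parent_of = {}
    for line in lines:
        if not line.strip():
            continue  # skip blank rows
        parent, children = line.split()
        for child in children.split(":"):
            parent_of[child] = parent  # assumption: a tree, so one parent per child
    return parent_of


def ancestors(parent_of, node):
    """Walk upward one hop per iteration until a node with no parent is reached."""
    found = []
    seen = {node}
    while node in parent_of:
        node = parent_of[node]
        if node in seen:  # guard against cycles in malformed input
            break
        seen.add(node)
        found.append(node)
    return found


# Same rows as testgraph.txt in the message above.
graph = ["A\tB:C", "B\tD", "C\tP", "P\tQ:R"]
parent_of = parse_adjacency(graph)
print(ancestors(parent_of, "R"))  # ['P', 'C', 'A']
```

Each pass through the `while` loop corresponds to one more parent lookup, which is the "multiple iterations" the question refers to; how to drive that loop from Pig is left open here.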
