I am misunderstanding something. Following the intro to Pig Latin doc (p. 6), the flatten in the statement generating 'a' would produce <1,2,3,4> (and not <1,2>,<1,3>,<1,4>).
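To make the two readings concrete, here is a small Python model of the example data (a sketch only, not Pig Latin). Tuples stand in for <> and lists for {}. The names flatten_cross and flatten_inline are made up for illustration, and which of the two behaviours Pig actually implements is exactly the open question here. The join is written on the flattened id, which is what the commented results in the script quoted below imply.

```python
# Python model of the example; tuples model Pig tuples (<>), lists model bags ({}).
# flatten_cross / flatten_inline are hypothetical names, not Pig functions.

def flatten_cross(record):
    """Reading in the script's comments: one output tuple per nested id."""
    rid, ids = record
    return [(rid, fid) for fid in ids]

def flatten_inline(record):
    """Reading from the question above: nested fields spliced into one tuple."""
    rid, ids = record
    return [(rid, *ids)]

t1 = [(1, (2, 3, 4))]                 # <1, <2,3,4>>
t2 = [(2, "a"), (3, "b"), (4, "c")]   # <2,a>,<3,b>,<4,c>

# a: flatten under the "cross" reading -> <1,2>,<1,3>,<1,4>
a = [row for rec in t1 for row in flatten_cross(rec)]

# b: join a with t2 on the flattened id
b = [(rid, fid, tid, f1)
     for (rid, fid) in a
     for (tid, f1) in t2
     if fid == tid]

# d: group by the original id and keep only f1
d = {}
for rid, _fid, _tid, f1 in b:
    d.setdefault(rid, []).append(f1)

print(a)                      # [(1, 2), (1, 3), (1, 4)]
print(flatten_inline(t1[0]))  # [(1, 2, 3, 4)]
print(d)                      # {1: ['a', 'b', 'c']}
```

Only the first reading gives the join something to match on per id; under the second reading the original id and the foreign keys end up in a single tuple and the desired <1, {<a>,<b>,<c>}> cannot be built this way.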

-----Original Message-----
From: Alan Gates [mailto:[EMAIL PROTECTED]
Sent: Tuesday, August 28, 2007 12:47 PM
To: [email protected]
Cc: [EMAIL PROTECTED]
Subject: Re: looking for some help with pig syntax

Sorry, I misunderstood what you were trying to generate. Perhaps the following will come closer:

t1 = load table1 as id, listOfId; -- <1, <2,3,4>>
t2 = load table2 as id, f1; -- <2,a>,<3,b>,<4,c>
a = foreach t1 generate id, flatten(listOfId); -- <1,2>,<1,3>,<1,4>
b = join a by $1, t2 by id; -- <2,1,2,2,a>,<3,1,3,3,b>,<4,1,4,4,c>
c = group b by $1; -- <1,{<2,1,2,2,a>,<3,1,3,3,b>,<4,1,4,4,c>}>
d = foreach c generate group, c.b::$4; -- <1, {<a>,<b>,<c>}>

where <> represents a tuple and {} a bag. I'm not 100% sure of the syntax c.b::$4 for d, you may have to fiddle with that to get it right.

Alan.

Joydeep Sen Sarma wrote:
> Will it?
>
> Trying an example:
>
> t1 = {<1, <2, 3, 4>>}
> t2 = {<2, "alpha">,<3,"beta">,<4,"gamma">}
>
> desired outcome c = {<1, <"alpha", "beta", "gamma">>} /* or alternatively */
> c = {<1, <<2,"alpha">,<3,"beta">,<4,"gamma">>>}
>
> but as proposed (I hope I am reading the pig document correctly):
>
> t1a = {<2,3,4>}
> b = {<2, 2, "alpha">}
>
> // no point going further - this doesn't seem to be doing what I want ..
>
>
> -----Original Message-----
> From: Alan Gates [mailto:[EMAIL PROTECTED]
> Sent: Tuesday, August 28, 2007 10:45 AM
> To: [email protected]
> Cc: [EMAIL PROTECTED]
> Subject: Re: looking for some help with pig syntax
>
> I think the following will do what you want.
>
> t1 = load table1 as id, listOfId;
> t2 = load table2 as id, f1;
> t1a = foreach t1 generate flatten(listOfId); -- flattens the listOfId into a set of ids
> b = join t1a by $0, t2 by id; -- join the two together.
> c = foreach b generate t2.id, t2.f1; -- project just the ids and f1 entries.
>
> Alan.
>
> Joydeep Sen Sarma wrote:
>
>> Specifically, how can we express this query:
>>
>> Table1 contains: id, (list of ids)
>>
>> Table2 contains: id, f1
>>
>> Where the Table1:list is a variable length list of foreign key (id) into Table2.
>>
>> We would like to join every element of Table1:list with corresponding Table2:id. I.e., the final output should be of the form:
>>
>> Table3 contains: id, (list of f1)
>>
>> Couldn't quite figure out how to do this - does Pig Latin support nested foreach loops? If there's a more appropriate mailing list - please re-direct,
>>
>> Thanks,
>>
>> Joydeep
