I am misunderstanding something.

Following the intro to the Pig Latin doc (p. 6), the flatten that generates 'a'
would produce <1,2,3,4> (and not <1,2>,<1,3>,<1,4>).
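
The two readings can be sketched in Python. This is only a model, not Pig itself: tuples stand in for Pig tuples <>, lists stand in for bags {}, and the helper names flatten_tuple and flatten_bag are made up for illustration. It assumes flatten splices a nested tuple into the current row but emits one row per element when given a bag, which is how later Pig documentation describes it.

```python
# Hypothetical model: Python tuples stand in for Pig tuples <>, lists for bags {}.

def flatten_tuple(prefix, nested):
    """Flattening a nested tuple splices its fields into one output row."""
    return [prefix + tuple(nested)]

def flatten_bag(prefix, bag):
    """Flattening a bag emits one output row per tuple in the bag."""
    return [prefix + item for item in bag]

row_id = (1,)
as_tuple = flatten_tuple(row_id, (2, 3, 4))          # listOfId read as <2,3,4>
as_bag = flatten_bag(row_id, [(2,), (3,), (4,)])     # listOfId read as {<2>,<3>,<4>}

print(as_tuple)  # [(1, 2, 3, 4)]
print(as_bag)    # [(1, 2), (1, 3), (1, 4)]
```

Under these assumptions, which result you get depends on whether listOfId is loaded as a tuple or as a bag.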


-----Original Message-----
From: Alan Gates [mailto:[EMAIL PROTECTED] 
Sent: Tuesday, August 28, 2007 12:47 PM
To: [email protected]
Cc: [EMAIL PROTECTED]
Subject: Re: looking for some help with pig syntax

Sorry, I misunderstood what you were trying to generate.  Perhaps the 
following will come closer:

t1 = load table1 as id, listOfId; -- <1, <2,3,4>>
t2 = load table2 as id, f1; -- <2,a>,<3,b>,<4,c>
a = foreach t1 generate id, flatten(listOfId); -- <1,2>,<1,3>,<1,4>
b = join a by $1, t2 by id; -- <2,1,2,2,a>,<3,1,3,3,b>,<4,1,4,4,c>
c = group b by $1; -- <1,{<2,1,2,2,a>,<3,1,3,3,b>,<4,1,4,4,c>}>
d = foreach c generate group, c.b::$4; -- <1, {<a>,<b>,<c>}>

where <> represents a tuple and {} a bag.

I'm not 100% sure of the syntax c.b::$4 for d, you may have to fiddle
with that to get it right.
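
For checking the commented results by hand, the steps can be modelled in plain Python. This is a rough sketch, not Pig: tuples model <>, lists model {}, and it assumes listOfId is a bag, that the join matches on the flattened id (the second field of a, as the commented join output implies), and that the join key is carried as a leading field so the rows have the five-field shape shown above.

```python
from collections import defaultdict

# Tuples model Pig tuples <>, lists model bags {}. Sample data from the thread.
t1 = [(1, [(2,), (3,), (4,)])]        # assumption: listOfId is a bag of ids
t2 = [(2, "a"), (3, "b"), (4, "c")]

# a: flatten the bag, one row per list element
a = [(rid,) + item for rid, bag in t1 for item in bag]

# b: join on the flattened id; the key is kept as a leading field to
# mirror the five-field rows shown in the comments
b = [(l[1],) + l + r for l in a for r in t2 if l[1] == r[0]]

# c: group by field 1 (the original table1 id)
groups = defaultdict(list)
for row in b:
    groups[row[1]].append(row)
c = list(groups.items())

# d: project field 4 (f1) out of each group's bag
d = [(key, [(row[4],) for row in rows]) for key, rows in c]

print(d)  # [(1, [('a',), ('b',), ('c',)])]
```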

Alan.




Joydeep Sen Sarma wrote:
> Will it?
>
> Trying an example:
>
> t1 = {<1, <2, 3, 4>>}
> t2 = {<2, "alpha">,<3,"beta">,<4,"gamma">}
>
> desired outcome c = {<1, <"alpha", "beta", "gamma">>} /* or alternatively */
>                 c = {<1, <<2,"alpha">,<3,"beta">,<4,"gamma">>>}
>
> but as proposed (I hope I am reading the pig document correctly):
>
> t1a = {<2,3,4>}
> b = {<2, 2, "alpha">}
>
> // no point going further - this doesn't seem to be doing what I want ..
>
>
> -----Original Message-----
> From: Alan Gates [mailto:[EMAIL PROTECTED] 
> Sent: Tuesday, August 28, 2007 10:45 AM
> To: [email protected]
> Cc: [EMAIL PROTECTED]
> Subject: Re: looking for some help with pig syntax
>
> I think the following will do what you want.
>
> t1 = load table1 as id, listOfId;
> t2 = load table2 as id, f1;
> t1a = foreach t1 generate flatten(listOfId); -- flattens the listOfId
> into a set of ids
> b = join t1a by $0, t2 by id; -- join the two together.
> c = foreach b generate t2.id, t2.f1; -- project just the ids and f1
> entries.
>
> Alan.
>
> Joydeep Sen Sarma wrote:
>   
>> Specifically, how can we express this query:
>>
>>  
>>
>> Table1 contains: id, (list of ids)
>>
>> Table2 contains: id, f1
>>
>>  
>>
>> Where the Table1:list is a variable length list of foreign key (id) into
>> Table2.
>>
>>  
>>
>> We would like to join every element of Table1:list with the corresponding
>> Table2:id. I.e., the final output should be of the form:
>>
>>  
>>
>> Table3 contains: id, (list of f1)
>>
>>  
>>
>> Couldn't quite figure out how to do this - does Pig Latin support nested
>> foreach loops? If there's a more appropriate mailing list - please
>> re-direct,
>>
>>  
>>
>> Thanks,
>>
>>  
>>
>> Joydeep
>>
>>  
>>
>>  
>>
>>
>>   
>>     
