Dear Wiki user,

You have subscribed to a wiki page or wiki category on "Pig Wiki" for change 

The following page has been changed by Shravan Narayanamurthy:

+ __GROUP__
+ The logical operator co-group is converted to three physical operators, 
Local Rearrange, Global Rearrange and Package, as shown below:
+ attachment:Group.jpg
+ There is one Local Rearrange operator for each input. Their outputs feed a 
single Global Rearrange, which is followed by a Package, as shown below:
+ attachment:GroupPhy.jpg
+ The Local Rearrange takes an input tuple and outputs a key-value pair with 
the group field as the key and the tuple as the value. For example, (1,R) is 
converted to {1,(1,R)}. The tuple is also tagged with the index of the input it 
originated from. In our case, if (1,R) came from A it would be tagged 1, and if 
it came from B it would be tagged 2.
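
A minimal sketch of the Local Rearrange step described above, in Python rather than Pig's own code. The function name `local_rearrange`, the `key_position` parameter and the (key, (input index, tuple)) layout are assumptions made for illustration; they are not taken from the Pig implementation.

```python
from typing import Any, Iterable, Iterator, Tuple


def local_rearrange(
    tuples: Iterable[tuple],
    input_index: int,
    key_position: int = 0,  # hypothetical: position of the group field
) -> Iterator[Tuple[Any, Tuple[int, tuple]]]:
    """Emit (key, (input_index, tuple)) for every input tuple."""
    for t in tuples:
        # The group field becomes the key; the whole tuple is the value,
        # tagged with the index of the input it came from.
        yield t[key_position], (input_index, t)


# Input A is tagged 1 and input B is tagged 2, as in the text.
from_a = list(local_rearrange([(1, "R")], input_index=1))
from_b = list(local_rearrange([(1, "R")], input_index=2))
```

Here the tag is carried inside the value so that a later step can tell which input each tuple came from.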
+ The Global Rearrange converts the key-value pairs whose keys belong to a 
partition into a set of (key, list of values) pairs. The partition is decided by 
which reducer the Global Rearrange is serving. We need not implement this 
ourselves, as it is the intermediate step that happens between mapper and reducer.
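
Although this step is provided by the framework, a small simulation may help show what it produces. The sketch below is only an illustration: the name `global_rearrange` and the `hash(key) % num_reducers` partitioning rule are assumptions, and the rule a real job uses to assign keys to reducers may differ.

```python
from collections import defaultdict
from typing import Any, Dict, Iterable, List, Tuple


def global_rearrange(
    pairs: Iterable[Tuple[Any, Any]],
    num_reducers: int,
) -> List[Dict[Any, list]]:
    """Group key-value pairs into one {key: [values]} map per reducer."""
    partitions = [defaultdict(list) for _ in range(num_reducers)]
    for key, value in pairs:
        # Assumed partitioning rule: every pair with the same key lands
        # in the same partition, so one reducer sees all of its values.
        partitions[hash(key) % num_reducers][key].append(value)
    return [dict(p) for p in partitions]


# Tagged pairs as a Local Rearrange might emit them for inputs A (1) and B (2).
pairs = [
    (1, (1, (1, "R"))),
    (2, (1, (2, "G"))),
    (1, (2, (1, "B"))),
    (2, (2, (2, "Y"))),
]
parts = global_rearrange(pairs, num_reducers=2)
```

Which partition a given key ends up in depends on the assumed rule; what matters is that all values for one key are collected together.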
+ The Package takes each (key, list of values) pair and puts it in the format 
required by the co-group. Let's say we have (1,R),(2,G) in A and 
(1,B),(2,Y) in B. If there are two reducers, the Global Rearrange serving 
reducer 1 will have {1,{(1,R),(1,B)}} as its (key, list of values) pair, which 
must be converted into an output tuple for co-group based on the tagged index 
of each tuple in the list. So this is converted to {1,{(1,R)},{(1,B)}}. 
Similarly, {2,{(2,G),(2,Y)}} is converted to {2,{(2,G)},{(2,Y)}} by 
reducer 2.
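
The Package step can be sketched along the same lines. This is an illustrative Python sketch, not Pig's implementation: the name `package`, the `num_inputs` parameter and the (input index, tuple) value layout are assumptions, with input indices starting at 1 as in the example above.

```python
from typing import Any, Iterable, Tuple


def package(
    key: Any,
    tagged_values: Iterable[Tuple[int, tuple]],
    num_inputs: int,
) -> tuple:
    """Split a key's tagged values into one bag per input."""
    bags = [[] for _ in range(num_inputs)]
    for input_index, t in tagged_values:
        # Tags are 1-based (A is 1, B is 2), so shift to a list index.
        bags[input_index - 1].append(t)
    # Output tuple: the key followed by one bag per input.
    return (key, *bags)


out1 = package(1, [(1, (1, "R")), (2, (1, "B"))], num_inputs=2)
out2 = package(2, [(1, (2, "G")), (2, (2, "Y"))], num_inputs=2)
```

With the data from the text, `out1` corresponds to {1,{(1,R)},{(1,B)}} and `out2` to {2,{(2,G)},{(2,Y)}}, with Python lists standing in for bags.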
  === Comments ===
  The Physical Plan and the Logical Plan were not clear to me, probably because 
of the nested query plan. I think we need to find a better way to draw 
this, because the conditional expression is an attribute of the filter and not 
an input to the filter.
