Dear Wiki user,

You have subscribed to a wiki page or wiki category on "Pig Wiki" for change 
notification.

The following page has been changed by Shravan Narayanamurthy:
http://wiki.apache.org/pig/PigExecutionModel

------------------------------------------------------------------------------
        }
  }
  }}}
+ 
  
  {{{
  package org.apache.pig.optimization;
@@ -649, +650 @@

  }
  }}}
  
+ The above model has a problem when it works on a nested bag. 
For example, consider the following script:
+ {{{
+ A = load 'a';
+ B = group A by $1;
+ C = foreach B {
+             D = filter A by $0<=2;
+             generate D;
+     }
+ }}}
+ {{{
+ A:
+ (1,R)
+ (2,R)
+ (3,B)
+ B:
+ (R,{(1,R),(2,R)})
+ (B,{(3,B)})
+ }}}
+ 
+ For each tuple in B, the filter in the above script works on the bag nested 
inside the tuple. This is an explicit bag, and in the above model a project 
operator cannot handle an explicit bag: it would pass the bag along whole 
instead of streaming its contents. Hence the project inside the condition of 
the filter will fail, because it receives a whole bag rather than individual 
tuples. As a solution, we propose overloading the Project operator to check 
whether its input is a bag of tuples and, if so, stream its contents instead 
of passing the entire bag. 
+ 
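The proposed overloading can be illustrated with a minimal Java sketch. It is not the Pig implementation: the class ProjectSketch, the Tuple and Bag types, and the methods project and filterNested are all hypothetical names introduced here for illustration. It only shows the idea of a project step that streams a nested bag's tuples so that the filter condition $0<=2 sees one tuple at a time.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Illustrative sketch only; every name here is hypothetical, not Pig's API.
class ProjectSketch {
    // A tuple is modelled as an ordered list of fields.
    static final class Tuple {
        final List<Object> fields;
        Tuple(Object... fields) { this.fields = Arrays.asList(fields); }
        Object get(int i) { return fields.get(i); }
        @Override public String toString() { return fields.toString(); }
    }

    // A bag is modelled as a list of tuples.
    static final class Bag {
        final List<Tuple> tuples;
        Bag(Tuple... tuples) { this.tuples = Arrays.asList(tuples); }
    }

    // "Overloaded" project: if the projected column holds a bag, stream its
    // tuples one by one; otherwise emit the value as a single-field tuple.
    static List<Tuple> project(Tuple input, int col) {
        Object value = input.get(col);
        List<Tuple> out = new ArrayList<>();
        if (value instanceof Bag) {
            out.addAll(((Bag) value).tuples);
        } else {
            out.add(new Tuple(value));
        }
        return out;
    }

    // Mirrors "filter A by $0<=limit" applied to the bag nested in one group tuple.
    static List<Tuple> filterNested(Tuple group, int bagCol, int limit) {
        List<Tuple> kept = new ArrayList<>();
        for (Tuple t : project(group, bagCol)) {
            int first = (Integer) t.get(0); // $0 of the streamed tuple
            if (first <= limit) {
                kept.add(t);
            }
        }
        return kept;
    }

    public static void main(String[] args) {
        // The two tuples of B from the example data above.
        Tuple r = new Tuple("R", new Bag(new Tuple(1, "R"), new Tuple(2, "R")));
        Tuple b = new Tuple("B", new Bag(new Tuple(3, "B")));
        System.out.println(filterNested(r, 1, 2)); // prints [[1, R], [2, R]]
        System.out.println(filterNested(b, 1, 2)); // prints []
    }
}
```

With the example data, the group for R keeps both of its tuples, while the group for B yields an empty bag because 3 does not satisfy $0<=2.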
