[ 
https://issues.apache.org/jira/browse/PIG-946?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12751648#action_12751648
 ] 

Pradeep Kamath commented on PIG-946:
------------------------------------

The root cause is in the CombinerOptimizer: the code expects POPackage to have 
POForEach as its successor and then analyzes the POForEach to check whether it 
can be combined. In the above query, the OpLimitOptimizer pushes the limit 
between the cogroup and the foreach, which shows up in the MRPlan as a POLimit 
between the POPackage and the POForEach. In this case, the CombinerOptimizer 
should ignore the POLimit and still analyze the POForEach to check whether 
combiner optimization is possible.
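
A minimal sketch of the successor check described above, under loud 
assumptions: the reduce plan is modeled as a simple linear list, and OpKind, 
CombinerSketch and findForEachToAnalyze are hypothetical stand-ins for 
illustration, not Pig's actual classes or the CombinerOptimizer code. It only 
shows the idea of stepping over a POLimit that sits between the POPackage and 
the POForEach.

```java
import java.util.Arrays;
import java.util.List;

public class CombinerSketch {
    // Hypothetical stand-ins for the physical operators named in the comment.
    enum OpKind { PACKAGE, LIMIT, FOREACH, OTHER }

    /**
     * Returns the index of the foreach that should be analyzed for combiner
     * use, or -1 if the plan does not have the expected shape.
     * Assumption: the reduce plan is a linear chain starting at the package.
     */
    static int findForEachToAnalyze(List<OpKind> reducePlan) {
        if (reducePlan.isEmpty() || reducePlan.get(0) != OpKind.PACKAGE) {
            return -1;
        }
        int i = 1;
        // Step over any limit pushed in between the package and the foreach.
        while (i < reducePlan.size() && reducePlan.get(i) == OpKind.LIMIT) {
            i++;
        }
        if (i < reducePlan.size() && reducePlan.get(i) == OpKind.FOREACH) {
            return i;
        }
        return -1;
    }

    public static void main(String[] args) {
        // Shape produced for the script in this issue: package, limit, foreach.
        List<OpKind> withLimit =
            Arrays.asList(OpKind.PACKAGE, OpKind.LIMIT, OpKind.FOREACH);
        // Shape the optimizer already handles: package, foreach.
        List<OpKind> withoutLimit =
            Arrays.asList(OpKind.PACKAGE, OpKind.FOREACH);

        System.out.println(findForEachToAnalyze(withLimit));
        System.out.println(findForEachToAnalyze(withoutLimit));
    }
}
```

A check that looks only at the immediate successor of the package would reject 
the first plan; the loop is the only difference in this sketch.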

> Combiner optimizer does not optimize when limit follow group, foreach
> ---------------------------------------------------------------------
>
>                 Key: PIG-946
>                 URL: https://issues.apache.org/jira/browse/PIG-946
>             Project: Pig
>          Issue Type: Bug
>    Affects Versions: 0.3.0
>            Reporter: Pradeep Kamath
>
> The following script is combinable but is not optimized:
> a = load '/user/pig/tests/data/singlefile/studenttab10k';
> b = group a by $1;
> c = foreach b generate group, AVG(a.$2);
> d = limit c 10;
> dump d;

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
