[ https://issues.apache.org/jira/browse/PIG-946?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12751648#action_12751648 ]
Pradeep Kamath commented on PIG-946:
------------------------------------

The root cause is in the CombinerOptimizer: the code expects POPackage to have POForEach as its successor, and then analyzes that POForEach to check whether it can be combined. In the query quoted below, the OpLimitOptimizer pushes the limit between the cogroup and the foreach, which shows up in the MRPlan as a POLimit between the POPackage and the POForEach. In this case, the CombinerOptimizer should ignore the POLimit and still analyze the POForEach to check whether combiner optimization is possible.

> Combiner optimizer does not optimize when limit follow group, foreach
> ---------------------------------------------------------------------
>
>                 Key: PIG-946
>                 URL: https://issues.apache.org/jira/browse/PIG-946
>             Project: Pig
>          Issue Type: Bug
>    Affects Versions: 0.3.0
>            Reporter: Pradeep Kamath
>
> The following script is combinable but is not optimized:
> a = load '/user/pig/tests/data/singlefile/studenttab10k';
> b = group a by $1;
> c = foreach b generate group, AVG(a.$2);
> d = limit c 10;
> dump d;
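
The check proposed in the comment (skip over a POLimit that sits between the POPackage and the POForEach) can be sketched with a toy model. This is not Pig's actual Java code: the `Op` class, the `algebraic` flag, the function name `find_combinable_foreach`, and the representation of the reduce plan as a flat list are all assumptions made for illustration. Only the operator names POPackage, POLimit, and POForEach come from the comment.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Op:
    """Hypothetical stand-in for a physical operator in the reduce plan."""
    kind: str                # e.g. "POPackage", "POLimit", "POForEach"
    algebraic: bool = False  # assumed flag: foreach uses only algebraic functions (e.g. AVG)


def find_combinable_foreach(reduce_plan: List[Op]) -> Optional[Op]:
    """Return the POForEach to combine, or None if the plan shape does not qualify.

    The plan is modeled as a linear chain starting at the POPackage
    (an assumption; a real plan is a graph of operators).
    """
    if not reduce_plan or reduce_plan[0].kind != "POPackage":
        return None

    idx = 1
    # Proposed behavior: ignore a single POLimit that the limit optimizer
    # pushed between the package and the foreach.
    if idx < len(reduce_plan) and reduce_plan[idx].kind == "POLimit":
        idx += 1

    # The operator after the (optional) limit must be the foreach.
    if idx < len(reduce_plan) and reduce_plan[idx].kind == "POForEach":
        foreach = reduce_plan[idx]
        return foreach if foreach.algebraic else None
    return None


# Plan shape described in the comment: POPackage -> POLimit -> POForEach.
plan = [Op("POPackage"), Op("POLimit"), Op("POForEach", algebraic=True)]
candidate = find_combinable_foreach(plan)
```

Under these assumptions, `candidate` is the POForEach even though a POLimit precedes it, while any other operator between the package and the foreach still disables the optimization.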