Better MergeForEach rule
------------------------

                 Key: PIG-2009
                 URL: https://issues.apache.org/jira/browse/PIG-2009
             Project: Pig
          Issue Type: Improvement
    Affects Versions: 0.9.0
            Reporter: Daniel Dai
            Assignee: Daniel Dai
             Fix For: 0.10

The MergeForEach rule does not merge two consecutive ForEach operators if the second ForEach has an inner relational plan. This prevents some optimizations. For example:

{code}
A = LOAD 'input1' AS (a0, a1, a2);
B = LOAD 'input2' AS (b0, b1, b2);
C = cogroup A by a0, B by b0;
D = foreach C { E = limit A 10; F = E.a1; G = DISTINCT F; generate group, COUNT(G);};
explain D;
{code}

We add a ForEach after the cogroup to prune B; however, we cannot merge this ForEach with D. Secondary key optimization for this query is thus disabled.
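
For illustration only, the unmerged plan can be pictured as if the script had been written by hand with an explicit pruning ForEach. This is a rough sketch, not the optimizer's actual output: the alias C1 is hypothetical, and the real logical plan may differ in detail.

{code}
A = LOAD 'input1' AS (a0, a1, a2);
B = LOAD 'input2' AS (b0, b1, b2);
C = cogroup A by a0, B by b0;
-- Hypothetical alias C1: stands in for the ForEach added to prune B.
C1 = foreach C generate group, A;
-- D has an inner relational plan (limit, distinct), so the rule
-- leaves C1 and D as two separate ForEach operators.
D = foreach C1 { E = limit A 10; F = E.a1; G = DISTINCT F; generate group, COUNT(G);};
explain D;
{code}

With a better MergeForEach rule, C1 and D would collapse back into a single ForEach directly after the cogroup.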