Better MergeForEach rule
------------------------

                 Key: PIG-2009
                 URL: https://issues.apache.org/jira/browse/PIG-2009
             Project: Pig
          Issue Type: Improvement
    Affects Versions: 0.9.0
            Reporter: Daniel Dai
            Assignee: Daniel Dai
             Fix For: 0.10


MergeForEach rule will not merge two consecutive ForEach if the second ForEach 
has inner relational plan. This prevent some optimizations. Eg,
{code}
A = LOAD 'input1' AS (a0, a1, a2);
B = LOAD 'input2' AS (b0, b1, b2);
C = cogroup A by a0, B by b0;
D = foreach C { E = limit A 10; F = E.a1; G = DISTINCT F; generate group, 
COUNT(G);};
explain D;
{code}
We add ForEach after cogroup to prune B, however, we cannot merge this ForEach 
with D. Secondary key optimization for this query is thus disabled.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira

Reply via email to