Rohini Palaniswamy created PIG-4843:
---------------------------------------

             Summary: Turn off combiner in reducer vertex for Tez
                 Key: PIG-4843
                 URL: https://issues.apache.org/jira/browse/PIG-4843
             Project: Pig
          Issue Type: Improvement
            Reporter: Rohini Palaniswamy
            Assignee: Rohini Palaniswamy
             Fix For: 0.16.0


{code}
B = group A by key;
C = foreach B {
    key_value          = A.key_value;
    distinct_key_value = DISTINCT key_value;
    generate group,
             MIN(A.key_value) as min_value,
             MAX(A.key_value) as max_value,
             COUNT(distinct_key_value) as distinct_values;
}
{code}

In the above example, the combine plan holds the DISTINCT bag, which causes an
OOM when the combiner is run by the MergeManager in the reducer. We did not have
this issue with MapReduce because, with the new API, the combiner has not been
run in the reducer so far (MAPREDUCE-5221).
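
Until the combiner is turned off in the reducer vertex, a possible script-level
workaround is to disable the combiner for the whole script. This is only a
sketch: it relies on the existing pig.exec.nocombiner property, it is broader
than the change proposed here (map-side combining is lost as well), and it is an
assumption that this is acceptable for the job in question. The LOAD path,
schema and STORE path are hypothetical and added only to make the script
self-contained.

{code}
-- Assumption: disabling the combiner everywhere is acceptable for this script.
set pig.exec.nocombiner true;

-- Hypothetical input; only the field names key and key_value come from the example above.
A = load 'input' as (key:chararray, key_value:long);
B = group A by key;
C = foreach B {
    key_value          = A.key_value;
    distinct_key_value = DISTINCT key_value;
    generate group,
             MIN(A.key_value) as min_value,
             MAX(A.key_value) as max_value,
             COUNT(distinct_key_value) as distinct_values;
}
store C into 'output';
{code}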



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
