Rohini Palaniswamy created PIG-4843:
---------------------------------------
Summary: Turn off combiner in reducer vertex for Tez
Key: PIG-4843
URL: https://issues.apache.org/jira/browse/PIG-4843
Project: Pig
Issue Type: Improvement
Reporter: Rohini Palaniswamy
Assignee: Rohini Palaniswamy
Fix For: 0.16.0
{code}
B = group A by key;
C = foreach B {
key_value = A.key_value;
distinct_key_value = DISTINCT
key_value;
generate group, MIN(A.key_value) as
min_value, MAX(A.key_value) as max_value, COUNT(distinct_key_value) as
distinct_values;
}
{code}
In the above example, the combine plan holds the Distinct bag and it causes OOM
when combiner is run by the MergeManager in the reducer. We did not have this
issue with mapreduce as combiner is not running in reducer for new API till now
(MAPREDUCE-5221)
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)