[ https://issues.apache.org/jira/browse/PIG-4843?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Rohini Palaniswamy updated PIG-4843:
------------------------------------
Resolution: Fixed
Hadoop Flags: Reviewed
Status: Resolved (was: Patch Available)
> Turn off combiner in reducer vertex for Tez if bags are in combine plan
> -----------------------------------------------------------------------
>
> Key: PIG-4843
> URL: https://issues.apache.org/jira/browse/PIG-4843
> Project: Pig
> Issue Type: Improvement
> Reporter: Rohini Palaniswamy
> Assignee: Rohini Palaniswamy
> Fix For: 0.16.0
>
> Attachments: PIG-4843-1.patch
>
>
> {code}
> B = group A by key;
> C = foreach B {
>     key_value = A.key_value;
>     distinct_key_value = DISTINCT key_value;
>     generate group,
>              MIN(A.key_value) as min_value,
>              MAX(A.key_value) as max_value,
>              COUNT(distinct_key_value) as distinct_values;
> }
> {code}
> In the above example, the combine plan holds the DISTINCT bag, which causes
> an OOM when the combiner is run by the MergeManager in the reducer. We did
> not see this issue with MapReduce because, until now, the combiner has not
> been run in the reducer for the new API (MAPREDUCE-5221).