[ https://issues.apache.org/jira/browse/HADOOP-3586?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12605833#action_12605833 ]
Owen O'Malley commented on HADOOP-3586:
---------------------------------------
No. In fact, the opposite was the intent. The types in Hadoop are:
k1,v1 -> map -> k2,v2
k2,v2 -> combiner -> k2,v2
k2,v2 -> reduce -> k3,v3
precisely because the combiner is supposed to be an optimization that could run
arbitrary numbers of times. In fact, HADOOP-363 was asking for multiple level
combiners from and to memory and was filed two years ago.
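
As a minimal sketch of that type contract (plain Python, not the Hadoop API;
map_fn, combine_fn, reduce_fn, run_job and LINES are hypothetical names chosen
for illustration), a combiner whose output type matches its input type can be
applied zero, one, or many times without changing what the reduce produces:

```python
from collections import defaultdict

LINES = ["a b a", "b a c"]  # illustrative input, not from the issue

def group_by_key(pairs):
    """Collect values per key, standing in for the framework's sort/group step."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped

def map_fn(offset, line):
    # k1,v1 -> k2,v2: (offset, line) -> (word, 1)
    return [(word, 1) for word in line.split()]

def combine_fn(word, counts):
    # k2,v2 -> k2,v2: input and output types are identical
    return [(word, sum(counts))]

def reduce_fn(word, counts):
    # k2,v2 -> k3,v3: final (word, total)
    return [(word, sum(counts))]

def run_job(lines, combiner_passes):
    """Run map, then the combiner `combiner_passes` times, then reduce."""
    pairs = [kv for offset, line in enumerate(lines) for kv in map_fn(offset, line)]
    for _ in range(combiner_passes):
        pairs = [kv for key, values in group_by_key(pairs).items()
                 for kv in combine_fn(key, values)]
    return dict(kv for key, values in group_by_key(pairs).items()
                for kv in reduce_fn(key, values))

# The result does not depend on how often the combiner ran.
assert run_job(LINES, 0) == run_job(LINES, 1) == run_job(LINES, 3)
```

Because combine_fn only pre-aggregates and never changes the value type, the
framework is free to skip it or to apply it repeatedly.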
The problem is that Pig was depending on the previous semantics and it would be
difficult for them to change it to support 0.18 in a timely manner. This
backward compatibility is just a short-term workaround to allow Pig (and
possibly other applications that break the abstraction) to work with Hadoop
0.18.
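
To illustrate what breaking the abstraction can look like, here is a
hypothetical sketch in plain Python. It is not Pig's actual implementation and
does not use the Hadoop API; fragile_combine, fragile_reduce, run_fragile_job
and RECORDS are made-up names. The combiner takes plain numbers but emits
(sum, count) pairs, so the job only works if the combiner runs exactly once:

```python
from collections import defaultdict

RECORDS = [("x", 1.0), ("x", 3.0), ("y", 5.0)]  # illustrative (key, value) map output

def group_by_key(pairs):
    """Collect values per key, standing in for the framework's sort/group step."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped

def fragile_combine(key, values):
    # Violates k2,v2 -> k2,v2: consumes numbers, emits (sum, count) pairs.
    return [(key, (sum(values), len(values)))]

def fragile_reduce(key, values):
    # Assumes every value is a (sum, count) pair, i.e. one combiner pass.
    total = sum(s for s, _ in values)
    count = sum(c for _, c in values)
    return (key, total / count)

def run_fragile_job(records, combiner_passes):
    """Apply the combiner `combiner_passes` times, then reduce to per-key means."""
    pairs = list(records)
    for _ in range(combiner_passes):
        pairs = [kv for key, values in group_by_key(pairs).items()
                 for kv in fragile_combine(key, values)]
    return dict(fragile_reduce(key, values)
                for key, values in group_by_key(pairs).items())

def succeeds(records, combiner_passes):
    """Return True if the job completes, False if the type mismatch surfaces."""
    try:
        run_fragile_job(records, combiner_passes)
        return True
    except TypeError:
        return False
```

In this sketch, succeeds(RECORDS, 1) holds, while zero passes or two passes
raise a TypeError because one stage receives values of the wrong shape. Making
the map emit (value, 1) pairs would let the combiner keep the same type on both
sides and remove the dependency on the pass count.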
> keep combiner backward compatible with earlier versions of hadoop
> -----------------------------------------------------------------
>
> Key: HADOOP-3586
> URL: https://issues.apache.org/jira/browse/HADOOP-3586
> Project: Hadoop Core
> Issue Type: Bug
> Components: mapred
> Reporter: Olga Natkovich
> Assignee: Chris Douglas
> Priority: Blocker
> Fix For: 0.18.0
>
> Attachments: 3586-0.patch
>
>
> In Hadoop 0.16 and earlier, the combiner was guaranteed to run once and only
> once for each map. In 0.17 this guarantee was slightly broken: the combiner
> does not run if a single <K,V> occupies the entire sort buffer. In 0.18, this
> changed further: the combiner can be called multiple times on both the
> map and reduce sides.
> This breaks Pig's current implementation of the combiner and it is not easy
> to fix in a short period of time.
> We would like to ask for a way for an application to request
> backward-compatible behavior for some period of time until it can adjust to
> the new behavior.