[ https://issues.apache.org/jira/browse/HADOOP-1981?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12537052 ]
Doug Cutting commented on HADOOP-1981:
--------------------------------------
I'd rather keep this separate from HADOOP-2046, since it is not just
documentation, but an incompatible code change.
As for names, I still like having 'output' in them, to remove potential
confusion with join-like stuff that operates on inputs. We probably don't need
'key' in their name, since only keys are comparable anyway. So I'd vote for
outputSortComparator and outputGroupComparator. Perhaps in HADOOP-2046 we
should document "grouping" as a primary mapreduce pipeline stage: map,
(combine), sort, group, reduce?
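To make the sort/group distinction concrete, here is a minimal sketch in plain
Java. It is not Hadoop code and does not use JobConf: the class
SortGroupSketch, the composite Key type, and the SORT and GROUP comparators
are all hypothetical names chosen for illustration. It assumes the grouping
comparator is coarser than the sort comparator, so that keys it treats as
equal end up adjacent after sorting.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

public class SortGroupSketch {
    /** Hypothetical composite map-output key: a natural key plus a secondary field. */
    static final class Key {
        final String name;
        final int seq;
        Key(String name, int seq) { this.name = name; this.seq = seq; }
        @Override public String toString() { return name + ":" + seq; }
    }

    /** Sort stage: order by name, then by seq (stands in for an "output sort comparator"). */
    static final Comparator<Key> SORT = new Comparator<Key>() {
        @Override public int compare(Key a, Key b) {
            int c = a.name.compareTo(b.name);
            return c != 0 ? c : Integer.compare(a.seq, b.seq);
        }
    };

    /** Group stage: keys with the same name are equal (stands in for an "output group comparator"). */
    static final Comparator<Key> GROUP = new Comparator<Key>() {
        @Override public int compare(Key a, Key b) {
            return a.name.compareTo(b.name);
        }
    };

    /** Sorts the keys, then starts a new group whenever adjacent keys differ under the group comparator. */
    static List<List<Key>> sortThenGroup(List<Key> keys, Comparator<Key> sort, Comparator<Key> group) {
        List<Key> sorted = new ArrayList<Key>(keys);
        sorted.sort(sort);
        List<List<Key>> groups = new ArrayList<List<Key>>();
        Key previous = null;
        for (Key k : sorted) {
            if (previous == null || group.compare(previous, k) != 0) {
                groups.add(new ArrayList<Key>()); // would correspond to one reduce() call
            }
            groups.get(groups.size() - 1).add(k);
            previous = k;
        }
        return groups;
    }

    public static void main(String[] args) {
        List<Key> keys = Arrays.asList(
            new Key("b", 2), new Key("a", 3), new Key("b", 1), new Key("a", 1));
        for (List<Key> g : sortThenGroup(keys, SORT, GROUP)) {
            System.out.println(g);
        }
    }
}
```

Each inner list stands in for what a single reduce call would see: one group
per name, with the entries inside it already ordered by seq.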
> Need to document the controls for sorting and grouping into the reduce
> ----------------------------------------------------------------------
>
> Key: HADOOP-1981
> URL: https://issues.apache.org/jira/browse/HADOOP-1981
> Project: Hadoop
> Issue Type: Task
> Components: mapred
> Reporter: Owen O'Malley
> Assignee: Arun C Murthy
>
> The JavaDoc for the Reducer should document how to control the sort order of
> keys and values via the JobConf methods:
> {code}
> setOutputKeyComparatorClass
> setOutputValueGroupingComparator
> {code}
> Both methods desperately need better names. (I'd vote for
> setKeySortingComparator and setKeyGroupingComparator.)
--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.