[jira] Commented: (HADOOP-1981) Need to document the controls for sorting and grouping into the reduce

Doug Cutting (JIRA) Tue, 23 Oct 2007 09:04:16 -0700

    [ 
https://issues.apache.org/jira/browse/HADOOP-1981?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12537052
 ]


Doug Cutting commented on HADOOP-1981:
--------------------------------------

I'd rather keep this separate from HADOOP-2046, since it is not just 
documentation, but an incompatible code change.

As for names, I still like having 'output' in them, to remove potential 
confusion with join-like stuff that operates on inputs.  We probably don't need 
'key' in their name, since only keys are comparable anyway.  So I'd vote for 
outputSortComparator and outputGroupComparator.  Perhaps in HADOOP-2046 we 
should document "grouping" as a primary mapreduce pipeline stage: map, 
(combine), sort, group, reduce?
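
To make the sort/group distinction concrete, here is a minimal plain-Java sketch. It does not use any Hadoop classes, and every name in it (SortGroupSketch, Key, SORT, GROUP, sortThenGroup) is hypothetical, made up for illustration only. The assumption is that the sort comparator gives a total order over the full map-output key, while the group comparator decides which adjacent keys in that order are handed to the same reduce call.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

// Plain-Java sketch: no Hadoop classes are used; all names here are hypothetical.
class SortGroupSketch {

    // A composite map-output key: a natural key plus a secondary field.
    static final class Key {
        final String name;
        final int seq;
        Key(String name, int seq) { this.name = name; this.seq = seq; }
        @Override public String toString() { return name + ":" + seq; }
    }

    // Plays the role of the "sort" comparator: a total order over full keys.
    static final Comparator<Key> SORT =
        Comparator.comparing((Key k) -> k.name).thenComparingInt(k -> k.seq);

    // Plays the role of the "group" comparator: keys that compare equal
    // under it end up in the same group (one reduce call in this sketch).
    static final Comparator<Key> GROUP = Comparator.comparing((Key k) -> k.name);

    // Sort with `sort`, then start a new group whenever `group` reports
    // that two adjacent keys differ.
    static List<List<Key>> sortThenGroup(List<Key> keys,
                                         Comparator<Key> sort,
                                         Comparator<Key> group) {
        List<Key> sorted = new ArrayList<>(keys);
        sorted.sort(sort);
        List<List<Key>> groups = new ArrayList<>();
        Key previous = null;
        for (Key k : sorted) {
            if (previous == null || group.compare(previous, k) != 0) {
                groups.add(new ArrayList<>());
            }
            groups.get(groups.size() - 1).add(k);
            previous = k;
        }
        return groups;
    }

    public static void main(String[] args) {
        List<Key> keys = List.of(
            new Key("b", 2), new Key("a", 3), new Key("b", 1), new Key("a", 1));
        // prints [[a:1, a:3], [b:1, b:2]]
        System.out.println(sortThenGroup(keys, SORT, GROUP));
    }
}
```

With SORT and GROUP as above, each group sees its keys in seq order, which is the usual reason to set the two comparators differently. Passing SORT for both arguments would instead put every distinct key in its own group.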


> Need to document the controls for sorting and grouping into the reduce
> ----------------------------------------------------------------------
>
>                 Key: HADOOP-1981
>                 URL: https://issues.apache.org/jira/browse/HADOOP-1981
>             Project: Hadoop
>          Issue Type: Task
>          Components: mapred
>            Reporter: Owen O'Malley
>            Assignee: Arun C Murthy
>
> The JavaDoc for the Reducer should document how to control the sort order of 
> keys and values via the JobConf methods:
> {code}
>   setOutputKeyComparatorClass
>   setOutputValueGroupingComparator
> {code}
> Both methods desperately need better names. (I'd vote for 
> setKeySortingComparator and setKeyGroupingComparator.)

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
