[jira] Commented: (HADOOP-686) job.setOutputValueComparatorClass(theClass) should be supported

eric baldeschwieler (JIRA) Mon, 06 Nov 2006 20:12:52 -0800

    [ http://issues.apache.org/jira/browse/HADOOP-686?page=comments#action_12447621 ]

eric baldeschwieler commented on HADOOP-686:
--------------------------------------------


Hacks exist that achieve this effect today.  I'll try to scare up an expert 
to document them.  This would also be a good long-term feature.

> job.setOutputValueComparatorClass(theClass) should be supported
> ---------------------------------------------------------------
>
>                 Key: HADOOP-686
>                 URL: http://issues.apache.org/jira/browse/HADOOP-686
>             Project: Hadoop
>          Issue Type: New Feature
>          Components: mapred
>         Environment: all environment
>            Reporter: Feng Jiang
>
> If the input to the reduce phase is:
> K2, V3
> K2, V2
> K1, V5
> K1, V3
> K1, V4
> then in current Hadoop the reduce output could be:
> K1, (V5, V3, V4)
> K2, (V3, V2)
> I would like Hadoop to support job.setOutputValueComparatorClass(theClass), so 
> that the values for each key are in sorted order and the output becomes:
> K1, (V3, V4, V5)
> K2, (V2, V3)
> I think this feature is very important. Without it, we have to do the 
> sorting ourselves and worry about the possibility that the values for a 
> key are too large to fit into memory, which makes the code hard to read. 
> That is why I think this sorting should be done in the Hadoop framework.
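
To make the requested ordering concrete, here is a minimal sketch in plain 
Java that reproduces the example above. It does not use any Hadoop API; the 
class SortedValuesSketch and the method groupAndSort are hypothetical names 
chosen for illustration. It also holds every value for a key in memory, 
which is exactly the limitation the reporter wants the framework to remove, 
so treat it as a picture of the desired output rather than a proposed 
implementation.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical illustration only; not part of Hadoop.
class SortedValuesSketch {

    // Groups (key, value) pairs by key, then sorts each key's values with
    // the supplied comparator. The TreeMap keeps keys in natural order.
    static <K extends Comparable<K>, V> Map<K, List<V>> groupAndSort(
            List<Map.Entry<K, V>> pairs, Comparator<V> valueComparator) {
        Map<K, List<V>> grouped = new TreeMap<>();
        for (Map.Entry<K, V> pair : pairs) {
            grouped.computeIfAbsent(pair.getKey(), k -> new ArrayList<>())
                   .add(pair.getValue());
        }
        // In-memory sort: this is the step the reporter would like the
        // framework to perform during its own sort phase instead.
        for (List<V> values : grouped.values()) {
            values.sort(valueComparator);
        }
        return grouped;
    }

    public static void main(String[] args) {
        // The reduce input from the issue description.
        List<Map.Entry<String, String>> input = List.of(
                Map.entry("K2", "V3"),
                Map.entry("K2", "V2"),
                Map.entry("K1", "V5"),
                Map.entry("K1", "V3"),
                Map.entry("K1", "V4"));

        Map<String, List<String>> output =
                groupAndSort(input, Comparator.naturalOrder());

        // Prints:
        // K1, [V3, V4, V5]
        // K2, [V2, V3]
        output.forEach((key, values) -> System.out.println(key + ", " + values));
    }
}
```

The value comparator is passed in as a parameter to mirror the role that 
the proposed job.setOutputValueComparatorClass(theClass) would play.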

-- 
This message is automatically generated by JIRA.
-
If you think it was sent incorrectly contact one of the administrators: 
http://issues.apache.org/jira/secure/Administrators.jspa
-
For more information on JIRA, see: http://www.atlassian.com/software/jira
