job.setOutputValueComparatorClass(theClass) should be supported
---------------------------------------------------------------

                 Key: HADOOP-686
                 URL: http://issues.apache.org/jira/browse/HADOOP-686
             Project: Hadoop
          Issue Type: New Feature
          Components: mapred
         Environment: all environments
            Reporter: Feng Jiang


If the input to the reduce phase is:

K2, V3
K2, V2
K1, V5
K1, V3
K1, V4

then in current Hadoop the values for each key come in no particular order,
so the reduce output could be:
K1, (V5, V3, V4)
K2, (V3, V2)

I would like Hadoop to support job.setOutputValueComparatorClass(theClass), so
that the values for each key are sorted and the output would be:
K1, (V3, V4, V5) 
K2, (V2, V3)

I think this feature is very important. Without it, we have to sort the values
ourselves in the reducer and worry about the values for a single key being too
large to fit into memory, which makes the code hard to read. That is why I
think the sorting should be done in the Hadoop framework.
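
For illustration, the do-it-yourself sorting described above might look roughly
like the following sketch. It is plain Java with no Hadoop classes, and the
names ValueOrderSketch and groupAndSort are hypothetical, not part of any
Hadoop API. It buffers every value for a key in memory before sorting, which is
exactly the limitation this issue is about.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical stand-in for what a reducer has to do by hand today.
public class ValueOrderSketch {

    // Groups (key, value) pairs by key, then sorts each key's values with the
    // supplied comparator. All values for a key are held in memory at once.
    public static Map<String, List<String>> groupAndSort(
            List<String[]> pairs, Comparator<String> valueOrder) {
        // TreeMap keeps the keys themselves in sorted order.
        Map<String, List<String>> grouped = new TreeMap<>();
        for (String[] pair : pairs) {
            grouped.computeIfAbsent(pair[0], k -> new ArrayList<>()).add(pair[1]);
        }
        for (List<String> values : grouped.values()) {
            values.sort(valueOrder);
        }
        return grouped;
    }

    public static void main(String[] args) {
        // Same pairs as the reduce input in the example above.
        List<String[]> input = Arrays.asList(
            new String[] {"K2", "V3"},
            new String[] {"K2", "V2"},
            new String[] {"K1", "V5"},
            new String[] {"K1", "V3"},
            new String[] {"K1", "V4"});
        // Prints {K1=[V3, V4, V5], K2=[V2, V3]}
        System.out.println(groupAndSort(input, Comparator.<String>naturalOrder()));
    }
}
```

With a value comparator supported by the framework, the buffering and the
explicit sort would not be needed in user code.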



-- 
This message is automatically generated by JIRA.
-
If you think it was sent incorrectly contact one of the administrators: 
http://issues.apache.org/jira/secure/Administrators.jspa
-
For more information on JIRA, see: http://www.atlassian.com/software/jira

        
