[ http://issues.apache.org/jira/browse/HADOOP-686?page=comments#action_12447621 ]

eric baldeschwieler commented on HADOOP-686:
--------------------------------------------

There exist hacks to achieve this effect now. I'll try to scare up an expert to document them. Also a good long-term feature.

> job.setOutputValueComparatorClass(theClass) should be supported
> ---------------------------------------------------------------
>
>                 Key: HADOOP-686
>                 URL: http://issues.apache.org/jira/browse/HADOOP-686
>             Project: Hadoop
>          Issue Type: New Feature
>          Components: mapred
>         Environment: all environment
>            Reporter: Feng Jiang
>
> If the input of the Reduce phase is:
> K2, V3
> K2, V2
> K1, V5
> K1, V3
> K1, V4
> then in current Hadoop the reduce output could be:
> K1, (V5, V3, V4)
> K2, (V3, V2)
> But I hope Hadoop will support job.setOutputValueComparatorClass(theClass), so
> that I can keep the values in order, and the output could be:
> K1, (V3, V4, V5)
> K2, (V2, V3)
> This feature is very important, I think. Without it, we have to do the
> sorting ourselves and worry about the possibility that the values
> are too large to fit into memory. The code then becomes too hard to read.
> That is why I think this feature is so important and should be
> implemented in the Hadoop framework.
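
The comment does not say which hacks it has in mind. One commonly described approach, offered here only as an illustration, is to fold the value into the sort key so that records are ordered by (key, value), and then to group by the key alone. The sketch below simulates that idea in plain Java on the example data from the issue. It does not use any Hadoop API: the class SecondarySortSketch, its Pair type, and the sortAndGroup method are hypothetical names, and the in-memory sort is only a stand-in for the framework's own sort, which is what would let large value lists stay out of the reducer's memory.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class SecondarySortSketch {

    /** One (key, value) record as it would arrive at the reduce phase. Hypothetical type. */
    static final class Pair {
        final String key;
        final String value;

        Pair(String key, String value) {
            this.key = key;
            this.value = value;
        }
    }

    /**
     * Sorts records by key, breaking ties with the supplied value comparator,
     * then groups them by key only. Because grouping walks the sorted list,
     * each key's values come out already ordered.
     */
    static Map<String, List<String>> sortAndGroup(List<Pair> input,
                                                  Comparator<String> valueOrder) {
        // Composite ordering: primary on key, secondary on value.
        Comparator<Pair> byKeyThenValue = (a, b) -> {
            int c = a.key.compareTo(b.key);
            return c != 0 ? c : valueOrder.compare(a.value, b.value);
        };

        List<Pair> sorted = new ArrayList<>(input);
        sorted.sort(byKeyThenValue);

        // LinkedHashMap preserves the sorted key order when iterating.
        Map<String, List<String>> grouped = new LinkedHashMap<>();
        for (Pair p : sorted) {
            grouped.computeIfAbsent(p.key, k -> new ArrayList<>()).add(p.value);
        }
        return grouped;
    }

    public static void main(String[] args) {
        // Example input from the issue description.
        List<Pair> input = Arrays.asList(
            new Pair("K2", "V3"), new Pair("K2", "V2"),
            new Pair("K1", "V5"), new Pair("K1", "V3"), new Pair("K1", "V4"));

        // Prints {K1=[V3, V4, V5], K2=[V2, V3]}
        System.out.println(sortAndGroup(input, Comparator.naturalOrder()));
    }
}
```

Passing a different Comparator<String> as valueOrder changes the order of values within each key without touching the grouping, which is roughly the behaviour the requested job.setOutputValueComparatorClass(theClass) would provide.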