Thanks. I have attached the patch: http://issues.apache.org/jira/secure/ManageAttachments.jspa?id=12354993
Best regards,
Feng

On 11/8/06, Doug Cutting <[EMAIL PROTECTED]> wrote:
Feng Jiang wrote:
> I think my concern is different from request485. I mean,
> if the input of the Reduce phase is:
>
> K2, V3
> K2, V2
> K1, V5
> K1, V3
> K1, V4
>
> then in the current Hadoop, the reduce output could be:
> K1, (V5, V3, V4)
> K2, (V3, V2)
>
> But I hope Hadoop supports job.setOutputValueComparatorClass(theClass),
> so that I can make the values ordered, and the output could be:
> K1, (V3, V4, V5)
> K2, (V2, V3)

Yes, that is different. One can currently achieve what you're after by including values in keys. The only real difference between keys and values is that values are not used for sorting, and some optimizations are made because of that. But if you need to sort by value as well as key, then you can use a compound key that includes both, and a null value. Note that with block compression, repeated keys should not use too much space. Does that suffice?

Another related issue is http://issues.apache.org/jira/browse/HADOOP-475.

> but I have written the GenericWritable, which is an abstract class to
> help users wrap different Writable instances with only one byte of cost. The
> GenericObject is a demo showing how to use GenericWritable. Both of them
> are attached to this email.

The attachment did not make it. Can you please attach these to a Jira issue, as a patch file?

http://wiki.apache.org/lucene-hadoop/HowToContribute

Thanks!

Doug
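
For illustration, the compound-key idea described above can be sketched in plain Java. This is not Hadoop code: CompoundKey and sortAndGroup are hypothetical names, the values are modeled as integers (V3 becomes 3), and an in-memory sort stands in for the framework's sort phase. In a real job the compound key would need to be a type the framework can serialize and compare, with a null value as suggested above.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class CompoundKeySketch {

    /** Hypothetical compound key: the original key plus the value to sort by. */
    static final class CompoundKey implements Comparable<CompoundKey> {
        final String key;
        final int value;

        CompoundKey(String key, int value) {
            this.key = key;
            this.value = value;
        }

        @Override
        public int compareTo(CompoundKey other) {
            // Primary order: the original key.
            int byKey = key.compareTo(other.key);
            if (byKey != 0) {
                return byKey;
            }
            // Secondary order: the value that was folded into the key.
            return Integer.compare(value, other.value);
        }
    }

    /** Sorts the compound keys, then groups the values under their original key. */
    static Map<String, List<Integer>> sortAndGroup(List<CompoundKey> input) {
        List<CompoundKey> sorted = new ArrayList<>(input);
        Collections.sort(sorted);
        // LinkedHashMap keeps keys in the order they appear after sorting.
        Map<String, List<Integer>> grouped = new LinkedHashMap<>();
        for (CompoundKey ck : sorted) {
            grouped.computeIfAbsent(ck.key, k -> new ArrayList<>()).add(ck.value);
        }
        return grouped;
    }

    public static void main(String[] args) {
        // Same input order as the example in the quoted message.
        List<CompoundKey> input = new ArrayList<>();
        input.add(new CompoundKey("K2", 3));
        input.add(new CompoundKey("K2", 2));
        input.add(new CompoundKey("K1", 5));
        input.add(new CompoundKey("K1", 3));
        input.add(new CompoundKey("K1", 4));
        System.out.println(sortAndGroup(input)); // {K1=[3, 4, 5], K2=[2, 3]}
    }
}
```

Because the value takes part in the key comparison, the values come out ordered within each original key without any separate value comparator.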
