Re: Some new requests about mapreduce

Feng Jiang Tue, 19 Dec 2006 22:07:35 -0800

I saw that this bug has been fixed in 0.90, but I still think it can be improved.
Can I just attach a new patch under the same issue?


thanks,

Feng

On 11/8/06, Feng Jiang <[EMAIL PROTECTED]> wrote:


Thanks. I have attached the patch:
http://issues.apache.org/jira/secure/ManageAttachments.jspa?id=12354993

Best regards,
Feng

On 11/8/06, Doug Cutting <[EMAIL PROTECTED]> wrote:
>
> Feng Jiang wrote:
> > I think what I am concerned about is different from the request485. I mean,
> > if the input of the Reduce phase is:
> >
> > K2, V3
> > K2, V2
> > K1, V5
> > K1, V3
> > K1, V4
> >
> > in current Hadoop, the reduce output could be:
> > K1, (V5, V3, V4)
> > K2, (V3, V2)
> >
> > But I hope Hadoop supports job.setOutputValueComparatorClass(theClass),
> > so that I can make the values come out in order, and the output could be:
> > K1, (V3, V4, V5)
> > K2, (V2, V3)
>
> Yes, that is different.  One can currently achieve what you're after by
> including values in keys.  The only real difference between keys and
> values is that values are not used for sorting, and some optimizations
> are made because of that.  But if you need to sort by value as well as
> key, then you can use a compound key that includes both, and a null value.
> Note that with block compression, repeated keys should not use too
> much space.  Does that suffice?
>
> Another related issue: http://issues.apache.org/jira/browse/HADOOP-475
>
> > but I have written the GenericWritable, which is an abstract class to
> > help users wrap different Writable instances at a cost of only one byte.
> > The GenericObject is a demo showing how to use GenericWritable. Both of
> > them are attached to this email.
>
> The attachment did not make it.  Can you please attach these to a Jira
> issue, as a patch file?
>
> http://wiki.apache.org/lucene-hadoop/HowToContribute
>
> Thanks!
>
> Doug
>
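
The compound-key approach Doug describes above can be sketched roughly as follows. This is plain Java with no Hadoop classes, so it only illustrates the ordering idea: CompoundKey, CompoundKeySortDemo and sortAndGroup are hypothetical names, and the values V2..V5 from the example are represented as the integers 2..5. In a real job the compound key would need to be a key type that Hadoop can serialize and compare, and the grouping step shown here is only for display.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Illustrative only: plain Java, no Hadoop classes involved.
final class CompoundKeySortDemo {

    // Hypothetical compound key: the original key plus the value.
    static final class CompoundKey implements Comparable<CompoundKey> {
        final String key;
        final int value;

        CompoundKey(String key, int value) {
            this.key = key;
            this.value = value;
        }

        // Order by key first, then by value, so values arrive sorted per key.
        @Override
        public int compareTo(CompoundKey other) {
            int byKey = key.compareTo(other.key);
            return byKey != 0 ? byKey : Integer.compare(value, other.value);
        }
    }

    // Sorts the records by (key, value), then collects the values per key
    // in the order in which the sorted records are encountered.
    static Map<String, List<Integer>> sortAndGroup(List<CompoundKey> records) {
        List<CompoundKey> sorted = new ArrayList<>(records);
        Collections.sort(sorted);
        Map<String, List<Integer>> grouped = new LinkedHashMap<>();
        for (CompoundKey record : sorted) {
            grouped.computeIfAbsent(record.key, k -> new ArrayList<>()).add(record.value);
        }
        return grouped;
    }

    public static void main(String[] args) {
        // Same input order as the example in the thread.
        List<CompoundKey> input = Arrays.asList(
            new CompoundKey("K2", 3), new CompoundKey("K2", 2),
            new CompoundKey("K1", 5), new CompoundKey("K1", 3),
            new CompoundKey("K1", 4));
        System.out.println(sortAndGroup(input)); // {K1=[3, 4, 5], K2=[2, 3]}
    }
}
```

Because the value is part of the sort key, no separate value comparator is needed; the cost is that the value is carried in the key, which is where the remark about block compression comes in.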

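Since the attachment did not reach the list, here is a rough sketch of what "wrapping different Writable instances at a cost of only one byte" could look like. It is not the attached GenericWritable: SimpleWritable is a stand-in for Hadoop's Writable so that the code compiles without Hadoop, and TaggedWrapper, IntValue, TextValue, toBytes and fromBytes are all hypothetical names. The assumed idea is that the wrapper writes a one-byte type tag and then lets the wrapped instance serialize itself.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInput;
import java.io.DataInputStream;
import java.io.DataOutput;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;

// Illustrative only: plain Java, no Hadoop classes involved.
final class TaggedWrapperDemo {

    // Stand-in for Hadoop's Writable interface (assumption for this sketch).
    interface SimpleWritable {
        void write(DataOutput out) throws IOException;
        void readFields(DataInput in) throws IOException;
    }

    static final class IntValue implements SimpleWritable {
        int value;
        IntValue() {}
        IntValue(int value) { this.value = value; }
        @Override public void write(DataOutput out) throws IOException { out.writeInt(value); }
        @Override public void readFields(DataInput in) throws IOException { value = in.readInt(); }
    }

    static final class TextValue implements SimpleWritable {
        String value = "";
        TextValue() {}
        TextValue(String value) { this.value = value; }
        @Override public void write(DataOutput out) throws IOException { out.writeUTF(value); }
        @Override public void readFields(DataInput in) throws IOException { value = in.readUTF(); }
    }

    // Wrapper that prefixes the payload with a single byte naming its type.
    static final class TaggedWrapper implements SimpleWritable {
        SimpleWritable instance;

        TaggedWrapper() {}
        TaggedWrapper(SimpleWritable instance) { this.instance = instance; }

        @Override
        public void write(DataOutput out) throws IOException {
            if (instance instanceof IntValue) {
                out.writeByte(0);
            } else if (instance instanceof TextValue) {
                out.writeByte(1);
            } else {
                throw new IllegalStateException("unregistered or missing instance");
            }
            instance.write(out);
        }

        @Override
        public void readFields(DataInput in) throws IOException {
            byte tag = in.readByte();
            switch (tag) {
                case 0: instance = new IntValue(); break;
                case 1: instance = new TextValue(); break;
                default: throw new IOException("unknown type tag: " + tag);
            }
            instance.readFields(in);
        }
    }

    // Serializes a wrapper to a byte array.
    static byte[] toBytes(TaggedWrapper wrapper) {
        try {
            ByteArrayOutputStream buffer = new ByteArrayOutputStream();
            wrapper.write(new DataOutputStream(buffer));
            return buffer.toByteArray();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Reads a wrapper back, using the tag byte to pick the concrete type.
    static TaggedWrapper fromBytes(byte[] bytes) {
        try {
            TaggedWrapper wrapper = new TaggedWrapper();
            wrapper.readFields(new DataInputStream(new ByteArrayInputStream(bytes)));
            return wrapper;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        byte[] bytes = toBytes(new TaggedWrapper(new IntValue(42)));
        System.out.println(bytes.length); // 5: one tag byte plus a four-byte int
        IntValue back = (IntValue) fromBytes(bytes).instance;
        System.out.println(back.value); // 42
    }
}
```

Writing a class name instead of the tag would work without a fixed list of types, but it would cost many more bytes per record; the one-byte tag assumes the set of wrapped types is known in advance.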