Hi Arun,

Suppose I am running a simple wordcount and the map phase is over. After the
shuffle, the inputs to the reducer in each partition arrive sorted by key.
I want to disable this.

Take the same case of wc. I don't mind the order in which my reducer gets the
keys of a single partition. I guess Hadoop does an external sort for this. I
want to disable that.
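
To make the use case concrete, here is a rough sketch in plain Java (not the
Hadoop API; the class and method names are made up for illustration). It
assumes the per-partition totals fit in memory, and shows that wordcount-style
aggregation gives the same totals whatever order the (word, count) pairs
arrive in:

```java
import java.util.AbstractMap.SimpleEntry;
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Illustration only: UnsortedWordCount and reduceUnsorted are hypothetical
// names, not part of Hadoop.
class UnsortedWordCount {

    // Sums the counts per word using a hash map, so the pairs may arrive
    // in any order. Assumes all distinct keys of the partition fit in memory.
    static Map<String, Integer> reduceUnsorted(List<Map.Entry<String, Integer>> pairs) {
        Map<String, Integer> totals = new HashMap<>();
        for (Map.Entry<String, Integer> pair : pairs) {
            totals.merge(pair.getKey(), pair.getValue(), Integer::sum);
        }
        return totals;
    }

    public static void main(String[] args) {
        // Map output for one partition, deliberately not sorted by key.
        List<Map.Entry<String, Integer>> partition = new ArrayList<>();
        partition.add(new SimpleEntry<String, Integer>("hadoop", 1));
        partition.add(new SimpleEntry<String, Integer>("apple", 1));
        partition.add(new SimpleEntry<String, Integer>("hadoop", 1));

        System.out.println(reduceUnsorted(partition));
    }
}
```

The trade-off is memory: the sorted input lets a reducer handle one key at a
time, while this approach has to hold every distinct key of the partition.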

Thanks,
jS

On Sun, Sep 11, 2011 at 7:03 AM, Arun C Murthy <[email protected]> wrote:

> The point of a 'reduce phase' is to aggregate keys from different maps
> (i.e. all inputs).
>
> I'm not sure what you are trying to do, but a use-case will help.
>
> IAC, the only way to achieve what you are trying to do is to run two jobs,
> with the first being a map-only job (i.e. #reduces = 0).
>
> Arun
>
> On Sep 10, 2011, at 10:19 PM, john smith wrote:
>
> > Hey,
> >
> > I have reduce phases too. But for each reduce, I don't need sorted input
> > (the map output for that corresponding reduce task).
> > Setting #reduces to 0 completely removes the reduce phase.
> >
> > Am I missing something?
> >
> > Thanks,
> >
> > On Sun, Sep 11, 2011 at 12:18 AM, Arun C Murthy <[email protected]> wrote:
> >
> >> Run a map-only job with #reduces set to 0.
> >>
> >> Arun
> >>
> >> On Sep 10, 2011, at 2:06 AM, john smith wrote:
> >>
> >>> Hi,
> >>>
> >>> Some of the MR jobs I run don't need sorting of map output in each
> >>> partition. Is there some way I can disable it?
> >>>
> >>> Any help?
> >>>
> >>> Thanks
> >>> jS
> >>
> >>
>
>
