Re: reducer tasks start time issue

Lin Ma Sun, 23 Dec 2012 07:09:35 -0800

Thanks for answering my question with not only the answer but also a
detailed description. :-)


regards,
Lin

On Sun, Dec 23, 2012 at 12:15 AM, Harsh J <[email protected]> wrote:

> A reduce can't process the complete data set until it has fetched all
> partitions. And any map may produce a partition for any reducer.
> Hence, we generally wait until all maps have terminated and their
> partition outputs are ready and copied over to the reduces before we
> begin to group and process the keys.
>
> However, given that you began thinking about this, this paper on
> "Online" Hadoop may interest you:
> http://www.neilconway.org/docs/nsdi2010_hop.pdf
>
> On Sat, Dec 22, 2012 at 6:55 PM, Lin Ma <[email protected]> wrote:
> > Hi guys,
> >
> > Suppose a Hadoop job has both mappers and reducers. My question is:
> > can reducer tasks not begin until all mapper tasks complete? If so,
> > why was it designed this way?
> >
> > thanks in advance,
> > Lin
>
>
>
> --
> Harsh J
>
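
To illustrate the point quoted above (any map may produce a partition for
any reducer), here is a minimal sketch in plain Python, not Hadoop code. The
names partition, run_map and reduce_for, the two-reducer setting, and the
CRC32-based partitioner are all assumptions made up for this illustration.
It shows that a reducer which groups its keys before the last map has
finished gets an incomplete total.

```python
from collections import defaultdict
import zlib

NUM_REDUCERS = 2  # hypothetical job setting, chosen for the illustration


def partition(key, num_reducers=NUM_REDUCERS):
    """Deterministic stand-in for a hash partitioner (an assumption)."""
    return zlib.crc32(key.encode("utf-8")) % num_reducers


def run_map(split):
    """Word-count style map: emit (word, 1), bucketed by target reducer."""
    buckets = defaultdict(list)
    for word in split.split():
        buckets[partition(word)].append((word, 1))
    return buckets


def reduce_for(reducer_id, map_outputs):
    """Fetch this reducer's partition from each given map, then sum per key."""
    totals = defaultdict(int)
    for buckets in map_outputs:
        for key, value in buckets.get(reducer_id, []):
            totals[key] += value
    return dict(totals)


# Three input splits, one per (simulated) map task.
splits = ["a b a", "b c", "a c c"]
map_outputs = [run_map(s) for s in splits]

# The reducer responsible for key "a".
r = partition("a")

# Reducing after only two of the three maps misses the "a" from the last map.
early = reduce_for(r, map_outputs[:2])
final = reduce_for(r, map_outputs)

print("after 2 maps:", early["a"])  # 2
print("after 3 maps:", final["a"])  # 3
```

Since the reducer cannot know in advance which maps will emit a given key,
it can only produce a correct result for that key once every map's
partition has been fetched.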
