I would consider this a very delicate optimization with little utility
in the real world.  It is very rare to know reliably how many records
a reducer will see.  Getting the count wrong would be a disaster: the
reducer would either run on incomplete input or wait for records that
never arrive.  Getting it right would be very difficult in almost all
cases.

Moreover, the assumption that reduce starts only after every map has
finished is baked all through the map-reduce design, so changing it to
let reduce go ahead early is likely to be really tricky (not that I know
this for a fact).
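
To make the risk concrete, here is a toy Python sketch of the count-based
trigger proposed in the quoted message below.  It is not Hadoop code and
does not use any Hadoop API; EagerReducer, expected_counts, receive and
stalled_keys are hypothetical names made up for illustration, and the
per-key expected counts are simply assumed to be known up front.

```python
from collections import defaultdict


class EagerReducer:
    """Toy model: reduce a key as soon as its expected record count arrives."""

    def __init__(self, expected_counts, reduce_fn):
        # key -> number of records we *believe* will arrive (an assumption)
        self.expected = expected_counts
        self.reduce_fn = reduce_fn
        self.buffers = defaultdict(list)  # records received but not yet reduced
        self.results = {}                 # key -> reduced value

    def receive(self, key, value):
        if key in self.results:
            # The expected count was too low: the key was already reduced,
            # so this record can no longer be included in the result.
            raise RuntimeError(f"late record for already-reduced key {key!r}")
        self.buffers[key].append(value)
        if len(self.buffers[key]) == self.expected.get(key):
            # Count reached: reduce now, without waiting for remaining maps.
            self.results[key] = self.reduce_fn(self.buffers.pop(key))

    def stalled_keys(self):
        """Keys still buffered; if the count was too high they never fire."""
        return sorted(self.buffers)


reducer = EagerReducer({"a": 2, "b": 3}, sum)
reducer.receive("a", 1)
reducer.receive("a", 4)    # second of two expected records: "a" is reduced
reducer.receive("b", 10)
reducer.receive("b", 20)   # three were expected but only two ever arrive
```

With an exact count, "a" is reduced early.  An overestimate leaves "b"
stuck in the buffer, and an underestimate makes a later record for "a"
raise an error, which are the two failure modes that make this so
delicate.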


On Mon, Jul 6, 2009 at 11:14 AM, Naresh Rapolu <nareshreddy.rap...@gmail.com> wrote:

> My aim is to let the reduce move ahead with reduction as and when it gets
> the data it requires, instead of waiting for all the maps to complete.  If
> it knows how many records it needs and compares that with the number of
> records it has received so far, it can move on once the two become equal,
> without waiting for all the maps to finish.
>
