Re: ChainMapper and ChainReducer: Are the key/value pairs distributed to the nodes of the cluster before each Map phase?

Rahul Jain Fri, 29 Apr 2011 11:55:50 -0700

Your latter statement is correct:

> if the output of the Map1 phase (or Reduce phase) is immediately inserted
> to Map2 phase (or Map3 Phase) within the same node, without any
> distribution.


ChainMapper and ChainReducer are just convenience classes that let you reuse
mapper code, whether it runs as part of a chain or standalone. They do not
force the system to do any additional distribution, grouping, or sorting:
each chained mapper receives the previous stage's output directly, within
the same task.
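
To make that concrete, here is a rough plain-Java sketch of the idea. It is
not Hadoop's actual implementation and does not use the Hadoop API:
SimpleMapper, ChainSketch, and the two example mappers are made-up names used
only for illustration. The point it tries to show is that each record emitted
by Map1 is handed straight to Map2 through an in-memory collector, inside the
same task, with no partitioning or sorting in between.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.function.BiConsumer;

// Hypothetical stand-in for a mapper: takes one record, emits zero or more.
interface SimpleMapper {
    void map(String key, String value, BiConsumer<String, String> out);
}

class ChainSketch {
    // "Map1" (illustrative): split a line into words and emit (word, "1").
    static final SimpleMapper TOKENIZE = (key, value, out) -> {
        for (String word : value.trim().split("\\s+")) {
            if (!word.isEmpty()) {
                out.accept(word, "1");
            }
        }
    };

    // "Map2" (illustrative): lower-case the key, pass the value through.
    static final SimpleMapper LOWERCASE =
        (key, value, out) -> out.accept(key.toLowerCase(), value);

    // Pushes one record through the chain starting at position `index`.
    // The collector given to each mapper simply calls the next mapper,
    // so nothing leaves the current task between stages.
    static void runChain(List<SimpleMapper> chain, int index,
                         String key, String value,
                         BiConsumer<String, String> finalOut) {
        if (index == chain.size()) {
            finalOut.accept(key, value); // end of chain: real task output
            return;
        }
        chain.get(index).map(key, value,
            (k, v) -> runChain(chain, index + 1, k, v, finalOut));
    }

    // Simulates one map task processing the lines of its input split.
    static List<String> runMapTask(List<String> splitLines) {
        List<SimpleMapper> chain = Arrays.asList(TOKENIZE, LOWERCASE);
        List<String> emitted = new ArrayList<>();
        long offset = 0;
        for (String line : splitLines) {
            runChain(chain, 0, Long.toString(offset), line,
                (k, v) -> emitted.add(k + "\t" + v));
            offset += line.length() + 1;
        }
        return emitted;
    }

    public static void main(String[] args) {
        runMapTask(Arrays.asList("Hello World", "hello Hadoop"))
            .forEach(System.out::println);
    }
}
```

In this sketch, only what comes out of the last mapper in the chain would be
handed to the partitioner and shuffled to the reducers. A mapper chained
after the reducer (Map3 in your diagram) would be fed the same way, directly
from the reducer's output within the reduce task.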

-Rahul

2011/4/29 Panayotis Antonopoulos <[email protected]>

>
> Hello,
> Let's say we have a MR job that uses ChainMapper and ChainReducer like in
> the following diagram:
> Input->Map1->Map2->Reduce->Map3->Output
>
> The input is split and distributed to the nodes of the cluster before being
> processed by Map1 phase.
> Also, before the Reduce phase the key/value pairs are distributed to
> the Reducers according to the Partitions made by the Partitioner.
>
> I expected that the same thing (distribution of the keys) would happen
> before the Map2 and Map3 phases, but after reading the "Pro Hadoop" book
> I strongly doubt it.
>
> I would like to ask you if the key/value pairs emitted by the Map1 phase
> (or those emitted by the Reduce phase) are distributed to the nodes of the
> cluster before being processed by the next Map phase,
> or if the output of the Map1 phase (or Reduce phase) is immediately
> inserted to Map2 phase (or Map3 Phase) within the same node, without any
> distribution.
>
> Thank you in advance!
> Panagiotis Antonopoulos
>
