Re: Basic question on how reducer works

Arun C Murthy Mon, 09 Jul 2012 06:25:11 -0700

Robert,

On Jul 7, 2012, at 6:37 PM, Grandl Robert wrote:


> Hi,
> 
> I have some questions about basic functionality in Hadoop.
> 
> 1. When a Mapper writes its intermediate output, how does it know how many 
> partitions to create (i.e. how many reducers there will be) and how much data 
> goes into each reducer's partition?
> 
> 2. When the JobTracker assigns a task to a reducer, it also specifies the 
> locations of the intermediate output that the reducer should retrieve, right? 
> But how does a reducer know which portion to retrieve from each remote 
> location holding intermediate output?

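To add to Harsh's comment: essentially, the TT (TaskTracker) *knows* where the 
output for a given map-id/reduce-id pair is present via an output-file/index-file 
combination. The index file records where each reducer's partition sits within 
the map's output file, so the TT can serve a reducer only its own portion.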
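
As a rough illustration of the output-file/index-file idea, here is a minimal 
Python sketch. It is not Hadoop's actual code or on-disk format: the record 
layout (two 64-bit integers, offset and length, per reducer) and the helper 
names write_map_output and read_partition are assumptions made up for this 
example. It only shows how one index entry per reduce-id lets the server hand 
back just that reducer's slice of a single map output file.

```python
import struct

# Hypothetical, simplified index record: (start offset, length) stored as two
# big-endian signed 64-bit integers. The real Hadoop index format differs.
RECORD = struct.Struct(">qq")


def write_map_output(partitions):
    """Concatenate per-reducer partitions into one output blob and build its index.

    partitions[i] holds the bytes destined for reducer i.
    """
    output = bytearray()
    index = bytearray()
    for data in partitions:
        # Record where this partition starts and how long it is.
        index += RECORD.pack(len(output), len(data))
        output += data
    return bytes(output), bytes(index)


def read_partition(output, index, reduce_id):
    """Return only the slice of the map output that belongs to reduce_id."""
    offset, length = RECORD.unpack_from(index, reduce_id * RECORD.size)
    return output[offset:offset + length]


# One map task's output for a job with three reducers; reducer 1 gets nothing.
output, index = write_map_output(
    [b"apple\t1\n", b"", b"cherry\t2\ncat\t1\n"]
)
```

With this layout, a request for map M and reduce-id R needs only the R-th 
index entry of M's output, which is why the reducer itself never has to know 
the offsets.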

Arun

--
Arun C. Murthy
Hortonworks Inc.
http://hortonworks.com/
