On Jul 9, 2012, at 12:55 PM, Grandl Robert wrote:

> Thanks a lot guys for answers. 
> 
> Still I am not able to find exactly the code for the following things:
> 
> 1. reducer to read from a Map output only its partition. I looked into 
> ReduceTask#getMapOutput which do the actual read in 
> ReduceTask#shuffleInMemory, but I don't see where it specify which partition 
> to read(reduceID).
> 

Look at TaskTracker.MapOutputServlet.

> 2. still don't understand very well in which part of the code(MapTask.java) 
> the intermediate data is written do which partition. So MapOutputBuffer is 
> the one who actually writes the data to buffer and spill after buffer is 
> full. Could you please elaborate a bit on how the data is written to which 
> partition ?
> 

Essentially you can think of the partition-id as the 'primary key' and the 
actual 'key' in the map-output of <key, value> as the 'secondary key'.

hth,
Arun

> Thanks,
> Robert
> 
> From: Arun C Murthy <a...@hortonworks.com>
> To: mapreduce-user@hadoop.apache.org 
> Sent: Monday, July 9, 2012 9:24 AM
> Subject: Re: Basic question on how reducer works
> 
> Robert,
> 
> On Jul 7, 2012, at 6:37 PM, Grandl Robert wrote:
> 
>> Hi,
>> 
>> I have some questions related to basic functionality in Hadoop. 
>> 
>> 1. When a Mapper process the intermediate output data, how it knows how many 
>> partitions to do(how many reducers will be) and how much data to go in each  
>> partition for each reducer ?
>> 
>> 2. A JobTracker when assigns a task to a reducer, it will also specify the 
>> locations of intermediate output data where it should retrieve it right ? 
>> But how a reducer will know from each remote location with intermediate 
>> output what portion it has to retrieve only ?
> 
> To add to Harsh's comment. Essentially the TT *knows* where the output of a 
> given map-id/reduce-id pair is present via an output-file/index-file 
> combination.
> 
> Arun
> 
> --
> Arun C. Murthy
> Hortonworks Inc.
> http://hortonworks.com/
> 
> 
> 
> 

--
Arun C. Murthy
Hortonworks Inc.
http://hortonworks.com/


Reply via email to