Re: Spill and Map Output

Ravi Gummadi Wed, 22 Dec 2010 13:09:16 -0800

Each map task generates a single intermediate file (i.e. the map output file).
If the task had to spill more than once, this file is obtained by merging the
spills.
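
As a rough illustration of that merge (a toy sketch, not Hadoop's actual code), assume each spill holds its records grouped by partition and sorted by key within each partition. The names merge_spills, spill_0 and spill_1 are made up for this example.

```python
import heapq

def merge_spills(spills, num_partitions):
    """Merge several spills into one output, partition by partition.

    Assumption: each spill is a list with one already-sorted record
    list per partition (hypothetical in-memory stand-in for spill files).
    """
    merged = []
    for p in range(num_partitions):
        # Collect partition p's sorted run from every spill ...
        runs = [spill[p] for spill in spills]
        # ... and merge those runs into one sorted run.
        merged.append(list(heapq.merge(*runs)))
    return merged

# Two spills, two partitions (i.e. two reducers).
spill_0 = [[("a", 1), ("c", 1)], [("x", 1)]]
spill_1 = [[("b", 1)], [("y", 1), ("z", 1)]]

final_output = merge_spills([spill_0, spill_1], num_partitions=2)
```

In this sketch every spill contains a piece of every partition, and the final output contains each partition exactly once.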


The index file gives the offset and length for each reducer. The offset is the
position in the map output file where the input data for a particular reducer
starts, and the length is the size of that data, starting from the offset.
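
A minimal sketch of how such an index can be used, assuming a simplified record with only an offset and a length. IndexEntry, build_output and read_partition are hypothetical names for this example; they do not reproduce Hadoop's IndexRecord or its on-disk format.

```python
from dataclasses import dataclass

@dataclass
class IndexEntry:
    """Hypothetical, simplified stand-in for one index record."""
    offset: int  # where this reducer's data starts in the map output
    length: int  # number of bytes that belong to this reducer

def build_output(segments):
    """Concatenate per-reducer segments and record where each one lands."""
    output = bytearray()
    index = []
    for segment in segments:
        index.append(IndexEntry(offset=len(output), length=len(segment)))
        output.extend(segment)
    return bytes(output), index

def read_partition(output, index, reducer):
    """Return only the bytes destined for the given reducer."""
    entry = index[reducer]
    return output[entry.offset:entry.offset + entry.length]

# One segment per reducer; reducer 1 receives no data from this map.
segments = [b"apple,1;avocado,2;", b"", b"melon,5;"]
map_output, index = build_output(segments)
data_for_reducer_2 = read_partition(map_output, index, 2)
```

Here the index has one entry per reducer, so a reducer can read its own slice of the map output without scanning the rest of the file.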

-Ravi


On 12/23/10 2:17 AM, "Pedro Costa" <psdc1...@gmail.com> wrote:

Hi,

1 - I would like to understand how a partition works in MapReduce.
I know that MapReduce has the IndexRecord class, which indicates
the length of something. Is it the length of a partition or of a
spill?

2 - For a large map output, can a partition be a set of spills, or
is a spill simply the same thing as a partition?

Thanks,
--
Pedro
