Each map task generates a single intermediate file (i.e. the map output file). If the task had to spill to disk, this file is obtained by merging the spills.
The index file gives the offset and length for each reducer. The offset is the position in the map output file where the input data for that particular reducer starts, and the length is the size of that data starting from the offset. In other words, each index record describes one partition (the data for one reducer), not a spill.

-Ravi

On 12/23/10 2:17 AM, "Pedro Costa" <psdc1...@gmail.com> wrote:

Hi,

1 - I would like to understand how a partition works in MapReduce. I know that MapReduce contains the IndexRecord class, which indicates the length of something. Is it the length of a partition or of a spill?

2 - With a large map output, can a partition be a set of spills, or is a spill simply the same thing as a partition?

Thanks,
--
Pedro
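
To illustrate the reply above, here is a minimal sketch of how per-reducer offsets and lengths could be derived when spills are merged into one map output file. It is a simplified model in Python, not Hadoop's actual code or on-disk format: the names `merge_spills` and `read_partition` and the `(offset, length)` tuple are hypothetical, and the real merge also sorts records by key within each partition, which is skipped here.

```python
import io
from typing import Dict, List, Tuple

# Hypothetical simplified index record: (offset, length) for one reducer.
IndexRecord = Tuple[int, int]


def merge_spills(
    spills: List[Dict[int, bytes]], num_reducers: int
) -> Tuple[bytes, List[IndexRecord]]:
    """Merge spills into one map output, grouped by partition.

    Each spill maps a partition number (one per reducer) to its bytes.
    Assumption: segments are simply concatenated; no key sorting is done.
    """
    out = io.BytesIO()
    index: List[IndexRecord] = []
    for partition in range(num_reducers):
        start = out.tell()
        # Every spill may contribute a segment to this partition.
        for spill in spills:
            out.write(spill.get(partition, b""))
        index.append((start, out.tell() - start))
    return out.getvalue(), index


def read_partition(
    map_output: bytes, index: List[IndexRecord], reducer: int
) -> bytes:
    """Return the slice of the map output that belongs to one reducer."""
    offset, length = index[reducer]
    return map_output[offset:offset + length]


# Two spills, three reducers; not every spill has data for every partition.
spills = [
    {0: b"a1", 1: b"b1"},
    {0: b"a2", 2: b"c2"},
]
map_output, index = merge_spills(spills, num_reducers=3)
```

In this sketch, a partition in the final file is assembled from the matching segments of every spill, so a spill and a partition are not the same thing: a spill holds a piece of each partition, and the index records describe partitions.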