On Sep 25, 2011, at 2:01 PM, He Chen wrote:

> Hi Arun and Harsh J
> 
> Thank you for your replies.
> 
> Yes, there will be two in the end. But while the map is running, there are
> more than two.
> 
> The scenario I mentioned before will not occur with the Hadoop default
> partitioner. But if a partitioner does lead to the above problem, is there
> any security policy to prevent this?
> 

Irrespective of the partitioner used, a single file stores all keys/values 
written during each 'spill', after the records in the sort-buffer have been 
sorted.

You could have multiple spills, but each spill holds lots of keys/values 
- we never create a file per record. You'd very quickly run out of inodes.
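
To make the point concrete, here is a toy sketch in Python (not Hadoop's actual 
code). The names spill_records, buffer_limit and demo are made up for 
illustration, and the spill is triggered by a record count, whereas the real 
sort-buffer is sized in bytes. It only shows that the number of files tracks 
the number of spills, even with a partitioner that puts every key in its own 
partition.

```python
import os
import tempfile


def spill_records(records, partition, buffer_limit, out_dir):
    """Toy model of map-side spilling: ONE file per spill, never per record."""
    spill_files = []
    buffer = []

    def flush():
        if not buffer:
            return
        # Sort by (partition, key) so each partition's records are contiguous
        # inside the single spill file.
        buffer.sort(key=lambda kv: (partition(kv[0]), kv[0]))
        path = os.path.join(out_dir, f"spill{len(spill_files)}.out")
        with open(path, "w") as f:
            for key, value in buffer:
                f.write(f"{partition(key)}\t{key}\t{value}\n")
        spill_files.append(path)
        buffer.clear()

    for record in records:
        buffer.append(record)
        if len(buffer) >= buffer_limit:  # assumption: count-based threshold
            flush()
    flush()  # final spill for whatever is left in the buffer
    return spill_files


def demo():
    # Hypothetical "bad" partitioner: every key gets its own partition.
    records = [(f"key{i:03d}", i) for i in range(10)]
    keys = [k for k, _ in records]
    with tempfile.TemporaryDirectory() as d:
        files = spill_records(records, keys.index, buffer_limit=4, out_dir=d)
        return len(records), len(files)
```

With 10 records, 10 distinct partitions and a buffer of 4 records, demo() 
produces 3 spill files rather than 10.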

In the very early days we had a file per reducer and that caused huge issues, 
never mind a file per record.

Arun
