Re: Question about how Hadoop stores intermediate results

Harsh J Sun, 25 Sep 2011 12:50:57 -0700

Chen,

Files are stored per reducer partition, not per key. As a result,
there are far fewer files than you might expect. The keys are kept
sorted inside each partitioned file, so you do not lose your key
groups either.


See Partitioner, which is responsible for doing the partitioning of
your map outputs:
(http://hadoop.apache.org/common/docs/r0.20.2/mapred_tutorial.html#Partitioner)
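
To make the idea concrete, here is a rough sketch in plain Java. It is not
Hadoop code: the class and method names (PartitionSketch, partitionFor,
bucket) are made up for illustration, and only the arithmetic in
partitionFor follows what the default HashPartitioner does. It only shows
that the number of groups is fixed by the number of reducers, however many
distinct keys the map emits, and that keys stay sorted within each group.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.TreeMap;

// Illustrative only: no Hadoop dependency, all names here are hypothetical.
class PartitionSketch {

    // Same arithmetic as the default hash partitioning: clear the sign bit,
    // then take the remainder by the number of reduce tasks.
    static int partitionFor(String key, int numReduceTasks) {
        return (key.hashCode() & Integer.MAX_VALUE) % numReduceTasks;
    }

    // Builds one sorted bucket per reducer partition and counts how often
    // each key was emitted. A TreeMap keeps the keys of a bucket in order.
    static List<TreeMap<String, Integer>> bucket(String[] keys, int numReduceTasks) {
        List<TreeMap<String, Integer>> partitions = new ArrayList<>();
        for (int i = 0; i < numReduceTasks; i++) {
            partitions.add(new TreeMap<>());
        }
        for (String key : keys) {
            partitions.get(partitionFor(key, numReduceTasks)).merge(key, 1, Integer::sum);
        }
        return partitions;
    }

    public static void main(String[] args) {
        String[] keys = {"apple", "banana", "cherry", "apple", "date", "elderberry", "fig"};
        List<TreeMap<String, Integer>> partitions = bucket(keys, 4);
        // Always 4 buckets here, no matter how many distinct keys exist.
        for (int p = 0; p < partitions.size(); p++) {
            System.out.println("partition " + p + ": " + partitions.get(p));
        }
    }
}
```

With 4 reducers you get 4 buckets whether the map emits six distinct keys
or six million, so the inode count does not grow with the number of keys.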

On Sun, Sep 25, 2011 at 10:30 PM, He Chen <airb...@gmail.com> wrote:
> Hi everyone
>
> According to my understanding of Hadoop, it saves a MapReduce job's
> intermediate results into files on the mapper's hard drive. Each key will
> occupy a file. I am curious what will happen if the mapper's hard drive does
> not have enough inodes to save the generated keys, because every file needs
> an inode.
>
> Best wishes!
>
> Chen
>



-- 
Harsh J
