Hi everyone,

I'm a beginner with Hadoop.

I noticed something in the web console after running several jobs:
every job shows a Spilled Records count equal to its Map output
records, even when there are only 5 map output records.

In the reduce phase, the Spilled Records count likewise equals the
reduce input records.

I know it's better to keep Spilled Records as low as possible, but I
don't know why this is happening. Is there anything wrong with my
configuration?

I've set this property in mapred-site.xml:

<property>
    <name>mapred.child.java.opts</name>
    <value>-Xmx200m</value>
</property>
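From what I've read, the map-side sort buffer is what determines how often records spill to disk, and in the older property names (I'm assuming Hadoop 1.x here, since I'm using mapred.child.java.opts) it is controlled by io.sort.mb and io.sort.spill.percent. If that's right, something like the following in mapred-site.xml might help avoid multiple spills per record, though I'm not sure it would change anything when Spilled Records merely equals Map output records:

<property>
    <!-- size in MB of the in-memory map output buffer (assumed default: 100) -->
    <name>io.sort.mb</name>
    <value>200</value>
</property>
<property>
    <!-- fraction of the buffer filled before a background spill starts -->
    <name>io.sort.spill.percent</name>
    <value>0.80</value>
</property>

Note that io.sort.mb must fit inside the child JVM heap, so with -Xmx200m above this value may be too large; the two settings would need to be raised together.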


-- 
Best wishes,
Qiao Mu
MOE KLINNS Lab and SKLMS Lab, Xi'an Jiaotong University
Department of Computer Science and Technology, Xi'an Jiaotong University
TEL: 15991676983
E-mail: [email protected]
