"Reduce input groups" vs "Reduce input records"

Pedro Costa Fri, 25 Mar 2011 09:24:54 -0700

Hi,

In this MR example, the job output includes the counters "Reduce input groups"
and "Reduce input records". What's the difference between these two counters?



$ hadoop jar cloud9.jar edu.umd.cloud9.example.simple.DemoWordCount
data/bible+shakes.nopunc wc 1
10/07/11 22:25:42 INFO simple.DemoWordCount: Tool: DemoWordCount
10/07/11 22:25:42 INFO simple.DemoWordCount:  - input path:
data/bible+shakes.nopunc
10/07/11 22:25:42 INFO simple.DemoWordCount:  - output path: wc
10/07/11 22:25:42 INFO simple.DemoWordCount:  - number of reducers: 1
[...]
10/07/11 22:25:48 INFO mapred.JobClient: Counters: 12
10/07/11 22:25:48 INFO mapred.JobClient:   FileSystemCounters
10/07/11 22:25:48 INFO mapred.JobClient:     FILE_BYTES_READ=22907000
10/07/11 22:25:48 INFO mapred.JobClient:     FILE_BYTES_WRITTEN=5867160
10/07/11 22:25:48 INFO mapred.JobClient:   Map-Reduce Framework
10/07/11 22:25:48 INFO mapred.JobClient:     Reduce input groups=41788
10/07/11 22:25:48 INFO mapred.JobClient:     Combine output records=128253
10/07/11 22:25:48 INFO mapred.JobClient:     Map input records=156215
10/07/11 22:25:48 INFO mapred.JobClient:     Reduce shuffle bytes=0
10/07/11 22:25:48 INFO mapred.JobClient:     Reduce output records=41788
10/07/11 22:25:48 INFO mapred.JobClient:     Spilled Records=170041
10/07/11 22:25:48 INFO mapred.JobClient:     Map output bytes=15919397
10/07/11 22:25:48 INFO mapred.JobClient:     Combine input records=1820763
10/07/11 22:25:48 INFO mapred.JobClient:     Map output records=1734298
10/07/11 22:25:48 INFO mapred.JobClient:     Reduce input records=41788
10/07/11 22:25:48 INFO simple.DemoWordCount: Job Finished in 5.345 seconds


-- 
Pedro