Hi,

We are using the old API (0.20.2) from Cloudera CDH3.  When I set the 
combiner (just reusing the reducer class), it runs in both the mapper 
and the reducer.  In the mapper, it only aggregates a couple of records at 
a time, while in the reducer it aggregates 1000 at a time.  The reducer has 
some overhead, and this overhead adds up and becomes significant because a 
mapper task runs the reducer/combiner sequentially, as many times as there 
are groups (# of different output keys).  Can I turn the combiner off in the 
mapper while keeping it on in the reducer? 
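
To make the cost pattern concrete, here is a minimal plain-Java sketch. It is 
not Hadoop code, and the class name, method names, and cost constants are all 
made up for illustration. It only assumes a fixed cost per combiner/reducer 
call plus a small cost per record, and compares many small groups (map side) 
with few large groups (reduce side) over the same number of records.

```java
// Illustration only: no Hadoop classes are used, and all numbers are hypothetical.
class CombinerCostSketch {
    // Assumed fixed overhead paid on every combiner/reducer call (arbitrary units).
    static final long SETUP_COST_PER_CALL = 100;
    // Assumed cost of aggregating one record (arbitrary units).
    static final long COST_PER_RECORD = 1;

    // Total cost when the function is called once per group.
    static long totalCost(long groups, long recordsPerGroup) {
        return groups * (SETUP_COST_PER_CALL + COST_PER_RECORD * recordsPerGroup);
    }

    // Portion of the total cost that is pure per-call overhead.
    static long overheadCost(long groups) {
        return groups * SETUP_COST_PER_CALL;
    }

    public static void main(String[] args) {
        // Map side: many groups, only a couple of records in each.
        System.out.println("map side:    total=" + totalCost(1000, 2)
                + " overhead=" + overheadCost(1000));
        // Reduce side: few groups, 1000 records in each.
        System.out.println("reduce side: total=" + totalCost(2, 1000)
                + " overhead=" + overheadCost(2));
    }
}
```

With these assumed constants, both cases process 2000 records, but nearly all 
of the map-side cost is per-call overhead, while on the reduce side it is a 
small fraction.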

Zhu, Guojun
Modeling Sr Graduate
571-3824370
guojun_...@freddiemac.com
Financial Engineering
Freddie Mac
