Hi Lucian,
The combine step is not a separate task; it is part of map task execution. The combiner runs on the output of a single map task before that output is written to local disk, and the reducers then copy the combined map output.

>For example:
>If I have 2 map tasks ran on the same machine, will I have 1 combine task on that machine to combine the maps outputs, or 2 combine tasks?

In this case the combiner is executed for each map task independently of the other. The outputs of the two map tasks are never combined together, even though both ran on the same machine. Within one map task the combiner may run zero, one, or several times (for example on each spill of the map output buffer, and again when the spill files are merged), so it must be written so that applying it repeatedly does not change the final result.

You can go through the Combiner section here for more info:
http://wiki.apache.org/hadoop/HadoopMapReduce

Devaraj K

_____

From: Lucian Iordache [mailto:lucian.george.iorda...@gmail.com]
Sent: Friday, July 01, 2011 2:25 PM
To: mapreduce-user@hadoop.apache.org
Subject: Relation between Mapper and Combiner

Hello guys,

Can anybody tell me what the relation is between map tasks and combine tasks? I would like to know if there is a 1:1 relation between them, or a *:1 (many-to-one) relation.

For example:
If I have 2 map tasks running on the same machine, will I have 1 combine task on that machine to combine the map outputs, or 2 combine tasks?

Best Regards,
--
Lucian
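
P.S. To make the per-map-task behaviour from the reply concrete, here is a minimal plain-Java sketch. It is only a simulation and does not use Hadoop: the class CombinerSketch, the methods sumByKey, mapTask and job, and the two input splits are all hypothetical names made up for illustration. It assumes a word-count style job whose combiner is the same sum function as the reducer, and shows that each simulated map task combines only its own output, and that the final result is the same whether the combiner runs zero, one, or two times.

```java
import java.util.AbstractMap.SimpleEntry;
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Plain-Java simulation; none of these names are Hadoop APIs.
class CombinerSketch {

    // Sums the counts per key. Used both as the combiner and as the reducer,
    // which is safe because addition is associative and commutative.
    static Map<String, Integer> sumByKey(List<Map.Entry<String, Integer>> pairs) {
        Map<String, Integer> out = new TreeMap<>();
        for (Map.Entry<String, Integer> p : pairs) {
            out.merge(p.getKey(), p.getValue(), Integer::sum);
        }
        return out;
    }

    // One simulated map task: emits (word, 1) for its own split, then runs the
    // combiner `combinerRuns` times over its own output only.
    static List<Map.Entry<String, Integer>> mapTask(String split, int combinerRuns) {
        List<Map.Entry<String, Integer>> pairs = new ArrayList<>();
        for (String word : split.split(" ")) {
            pairs.add(new SimpleEntry<>(word, 1));
        }
        for (int i = 0; i < combinerRuns; i++) {
            pairs = new ArrayList<>(sumByKey(pairs).entrySet());
        }
        return pairs;
    }

    // Simulated reduce side: gathers the output of both map tasks and reduces it.
    static Map<String, Integer> job(int runsTask1, int runsTask2) {
        List<Map.Entry<String, Integer>> shuffled = new ArrayList<>();
        shuffled.addAll(mapTask("a b a", runsTask1));
        shuffled.addAll(mapTask("b b c", runsTask2));
        return sumByKey(shuffled);
    }

    public static void main(String[] args) {
        System.out.println(job(0, 0)); // {a=2, b=3, c=1}
        System.out.println(job(1, 2)); // {a=2, b=3, c=1}
    }
}
```

Note that the key "b" appears in both splits but is only merged across map tasks on the reduce side. A combiner that is not safe to apply repeatedly (for example one that computes an average of averages) would give different answers for different run counts.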