A MapReduce job contains a shuffle phase, in which the intermediate map 
outputs are copied to the reducer nodes. This phase is counted as part of 
the reduce phase, so the reduce counter already starts to advance before 
the map phase has finished. The actual reduce function is only invoked, 
just as you have heard, once all the map tasks have finished.
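
As a rough illustration of why the counter can show "reduce = 1%" while 
maps are still running, here is a minimal sketch. It assumes the commonly 
described weighting in which copy (shuffle), sort and reduce each 
contribute one third of a reduce task's progress. The function and 
parameter names are made up for this example and are not part of any 
Hadoop API.

```python
def reduce_progress(copied_map_outputs, total_map_outputs,
                    sort_fraction=0.0, reduce_fraction=0.0):
    """Simplified model of the reported reduce percentage.

    Assumption: copy, sort and reduce each count for one third of the
    total. The real bookkeeping in Hadoop is more involved.
    """
    # Fraction of map outputs this reducer has already fetched.
    copy_fraction = copied_map_outputs / total_map_outputs
    return (copy_fraction + sort_fraction + reduce_fraction) / 3.0


# Only a few map outputs have been fetched, and the reduce function
# itself has not run yet, but the counter is already above zero.
print(f"{reduce_progress(3, 100):.0%}")    # 1%

# All map outputs fetched, sort and reduce not started.
print(f"{reduce_progress(100, 100):.0%}")  # 33%
```

Under this model the reduce percentage can climb to about 33% from 
copying alone, without the reduce function having processed a single 
record.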


On Wednesday, April 23, 2014 1:18:40 PM UTC+2, Kishore kumar wrote:
>
> Hi All,
>
> I heard that the reduce job is started only after all map tasks have 
> finished 100%, but in my Hive query the reduce job started at the stage 
> below. Please explain why this is (I copied the line below while the job 
> was running).
>  
> 2014-04-22 21:15:12,803 Stage-1 map = 83%, reduce = 1%, Cumulative CPU 
> 4194.4 sec
>
> -- 
>
>
> *Kishore *
>  
