Hi,

I have set setNumReduceTasks(16) programmatically in 
MeanShiftCanopyDriver.java. When the job runs, the number of reducers is set 
correctly (16, as shown on the JobTracker screen), but on digging deeper I 
see that one node has most of the bytes to process, while the rest of the 
nodes get only a nominal amount. As a result, the reduce phase becomes very 
slow after 98% completion.

I am running this on a cluster of 18 nodes. The load is distributed evenly 
in the map phase, but not in the reduce phase. This happens with both the 0.4 
and 0.5 versions of Mahout. Has anyone faced this problem, and is there a way 
to get around it?
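
A minimal sketch of how this kind of reduce-side skew can arise, in case it 
helps narrow things down. It assumes the job uses Hadoop's default 
HashPartitioner, which sends a record to reducer 
(key.hashCode() & Integer.MAX_VALUE) % numReduceTasks. If most map output 
shares a single key, one reducer receives nearly everything no matter what 
setNumReduceTasks() is set to. The class name and the key strings below are 
hypothetical, and whether the MeanShiftCanopy mappers actually emit a single 
key in 0.4/0.5 is an assumption that has not been checked here.

```java
import java.util.Arrays;

public class ReducerSkewSketch {

    // Same formula as Hadoop's default HashPartitioner.getPartition().
    static int partition(Object key, int numReduceTasks) {
        return (key.hashCode() & Integer.MAX_VALUE) % numReduceTasks;
    }

    // Counts how many records each reducer would receive for the given keys.
    static long[] countPerReducer(String[] keys, int numReduceTasks) {
        long[] counts = new long[numReduceTasks];
        for (String key : keys) {
            counts[partition(key, numReduceTasks)]++;
        }
        return counts;
    }

    public static void main(String[] args) {
        int reducers = 16;

        // Hypothetical case 1: every record carries the same key.
        String[] singleKey = new String[1000];
        Arrays.fill(singleKey, "canopy");

        // Hypothetical case 2: each record carries a distinct key.
        String[] distinctKeys = new String[1000];
        for (int i = 0; i < distinctKeys.length; i++) {
            distinctKeys[i] = "canopy-" + i;
        }

        System.out.println("single key:    "
                + Arrays.toString(countPerReducer(singleKey, reducers)));
        System.out.println("distinct keys: "
                + Arrays.toString(countPerReducer(distinctKeys, reducers)));
    }
}
```

With a single key, all 1000 records land on one of the 16 reducers and the 
other 15 stay idle, which would match a reduce phase that stalls near the 
end. If this is the cause, the number of distinct keys emitted by the mappers 
would need to change, not the reducer count.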
Thanks a lot in advance,
Sohini

