Any inputs will be helpful. Thanks

________________________________
From: Sengupta, Sohini IN BLR SISL
Sent: Wednesday, June 22, 2011 5:15 PM
To: [email protected]
Cc: Sengupta, Sohini IN BLR SISL
Subject: meanshift reduce task problem
Hi,

I have programmatically specified setNumReduceTasks(16) in MeanShiftCanopyDriver.java. On execution the number of reducers is set correctly (16, as shown on the JobTracker screen), but on digging deeper I see that one node has by far the most bytes to process, while the amount is nominal for the rest of the nodes. As a result, the reduce phase is very slow after 98% completion.

I am running this on a cluster of 18 nodes. The load is distributed evenly in the map phase but not in the reduce phase. This happens on both the 0.4 and 0.5 versions of Mahout.

Has anyone faced this problem, and is there a way to get around it?

Thanks a lot in advance,
Sohini
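One possible explanation for this kind of skew, offered only as a sketch: Hadoop's default HashPartitioner assigns each record to a reducer with (key.hashCode() & Integer.MAX_VALUE) % numReduceTasks, so if the mapper emits the same key for every record, all of them go to a single reducer regardless of how many reducers are configured. Whether the mean shift mapper really emits a single key is an assumption here and has not been checked against the Mahout source. The class name PartitionSkewSketch, the key "canopy", and the helper methods are hypothetical; plain Java strings stand in for Hadoop's Text keys.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

public class PartitionSkewSketch {

    // Same formula as Hadoop's default HashPartitioner.getPartition.
    // Note: String.hashCode differs from Text.hashCode, but the skew argument is the same.
    static int partitionFor(String key, int numReduceTasks) {
        return (key.hashCode() & Integer.MAX_VALUE) % numReduceTasks;
    }

    // Counts how many records each reducer would receive for the given keys.
    static int[] countPerReducer(List<String> keys, int numReduceTasks) {
        int[] counts = new int[numReduceTasks];
        for (String key : keys) {
            counts[partitionFor(key, numReduceTasks)]++;
        }
        return counts;
    }

    public static void main(String[] args) {
        // Hypothetical case: every map output record carries the same key.
        List<String> sameKey = Collections.nCopies(1000, "canopy");
        int[] counts = countPerReducer(sameKey, 16);
        // Exactly one of the 16 slots holds all 1000 records.
        System.out.println(Arrays.toString(counts));
    }
}
```

If the job does behave like this, raising the reducer count alone would not help; the map output keys (or a custom Partitioner) would have to spread the records across reducers.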
