If the problem is only the number of reduce tasks, you can try reducing the
dfs block size; that might help in triggering multiple reducers.
Also check the size of the mapper's output: multiple reducers will only be
triggered if it is greater than the block size (or if the mapper output is
spread across multiple files).
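For example, something along these lines could be used to check the total
size and number of the files a job wrote and compare them against the block
size (just a sketch; the path below is a placeholder, and "dfs.block.size"
with a 64 MB default is the 1.x property name):

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.fs.FileStatus;
    import org.apache.hadoop.fs.FileSystem;
    import org.apache.hadoop.fs.Path;

    // Sum the sizes of the output files and compare against the block size.
    Configuration conf = new Configuration();
    FileSystem fs = FileSystem.get(conf);
    long blockSize = conf.getLong("dfs.block.size", 64L * 1024 * 1024);
    long totalBytes = 0;
    FileStatus[] parts = fs.listStatus(new Path("/user/hadoop/parallelcounting"));
    for (FileStatus part : parts) {
      totalBytes += part.getLen();
    }
    System.out.println(parts.length + " files, " + totalBytes
        + " bytes, block size " + blockSize);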
HTH,
Paritosh
On 30-08-2012 12:08, C.V.Krishnakumar Iyer wrote:
Hi,
I've already tried setting it in the code using job.setNumReduceTasks() and
conf.set("mapred.reduce.tasks","100").
However, the setting does not seem to be picked up at all, even for the job
that does the parallel counting. Any advice would be appreciated.
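For reference, these are the two mechanisms I tried, shown here on a plain
driver (a sketch, not the actual PFPGrowth code):

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.mapreduce.Job;

    Configuration conf = new Configuration();
    // Via the configuration property. Job copies the Configuration when it is
    // created, so this has to happen before new Job(...) for it to be seen.
    conf.set("mapred.reduce.tasks", "100");
    Job job = new Job(conf, "parallel counting");
    // Or directly on the Job.
    job.setNumReduceTasks(100);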
Regards,
Krishnakumar.
On Aug 29, 2012, at 11:28 PM, 戴清灏 <[email protected]> wrote:
I suspect you need to specify the config in the Hadoop config XML file.
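i.e. something along the lines of the following in mapred-site.xml:

    <property>
      <name>mapred.reduce.tasks</name>
      <value>100</value>
    </property>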
--
Regards,
Q
2012/8/30 C.V. Krishnakumar Iyer <[email protected]>
Hi,
Quick question regarding PFPGrowth in Mahout 0.6:
I see that there are no options to set the number of reducers in the
parallel counting phase of PFPGrowth. It is just a simple word count, so
I'm guessing it should be parallelized. But for some reason it is not!
Is that intentional?
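For comparison, in a plain word-count driver the knob would normally be a
single call; this is just a generic Hadoop sketch using the stock
TokenCounterMapper/IntSumReducer classes, not the Mahout code:

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.io.IntWritable;
    import org.apache.hadoop.io.Text;
    import org.apache.hadoop.mapreduce.Job;
    import org.apache.hadoop.mapreduce.lib.map.TokenCounterMapper;
    import org.apache.hadoop.mapreduce.lib.reduce.IntSumReducer;

    Configuration conf = new Configuration();
    Job job = new Job(conf, "word count (sketch)");
    job.setMapperClass(TokenCounterMapper.class);
    job.setCombinerClass(IntSumReducer.class);
    job.setReducerClass(IntSumReducer.class);
    job.setOutputKeyClass(Text.class);
    job.setOutputValueClass(IntWritable.class);
    job.setNumReduceTasks(100);  // the option that seems to be missing here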
Regards,
Krishnakumar.