I am actually not sure how to control the use of combiners in Hadoop. All I can say is that the code does make extensive use of combiners, but they were always "on" for me. I had no idea one could turn them off.
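
On the map-side spilling itself, here is a minimal, untested sketch. It assumes the ssvd driver passes Hadoop generic `-D` options through to its jobs when they are placed before the job-specific flags, and it uses the Hadoop 1.x property names io.sort.mb (map-side sort buffer, in MB) and io.sort.factor. The values are illustrative only, not recommendations, and io.sort.mb has to fit inside the task JVM heap.

```shell
# Assumption: generic -D options are accepted before the ssvd-specific flags.
# io.sort.mb and io.sort.factor are Hadoop 1.x property names; values are examples.
mahout ssvd \
  -Dio.sort.mb=200 \
  -Dio.sort.factor=30 \
  --rank 400 --computeU true --computeV true --reduceTasks 3 \
  --input ${INPUT} --output ${OUTPUT} -ow --tempDir /tmp/ssvdtmp/
```

A larger sort buffer lets each map task hold more output in memory before it spills, so it should reduce the number of spill files, provided the setting actually reaches the Bt job.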

On Wed, May 22, 2013 at 6:17 AM, Jakub Pawłowski <[email protected]> wrote:

> Yes, I was manipulating io.sort.factor too; it speeds up the reducer, and
> values around 30 give good results for me.
> But my problem is not the reducer; my problem is the Bt-job map tasks that
> spill to disk.
>
> You mentioned a Combiner; how can I turn it on? I'm running my job from
> the console like this:
>
> mahout ssvd --rank 400 --computeU true --computeV true --reduceTasks 3
> --input ${INPUT} --output ${OUTPUT} -ow --tempDir /tmp/ssvdtmp/
>
> The document at
> https://cwiki.apache.org/MAHOUT/stochastic-singular-value-decomposition.data/SSVD-CLI.pdf
> doesn't mention anything about a combiner.
>
> Thanks for your answer.
>
> On 22.05.2013 14:59, Sean Owen wrote:
>
>> I feel like I've seen this too and it's just a bug. You're not running
>> out of memory.
>>
>> Are you also setting io.sort.factor? That can help too. You might try
>> as high as 100.
>>
>> Also, have you tried a Combiner? If you can apply it, it should help too,
>> as it is designed to reduce the amount of stuff spilled.
