Exciting. Yeah, I also set my heap size to 2G, in the hadoop-config.sh file. Did you do it there, or did you instead set mapred.child.java.opts in conf/mapred-site.xml? That would be my next step if I were actually getting memory errors, but I wasn't even sure that real data could be produced.
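For reference, here is a rough sketch of what the mapred-site.xml route would look like, assuming a Hadoop 0.20-style conf directory. The -Xmx2048m value is only an assumption that mirrors the 2G heap mentioned above; I haven't verified that it is the right size for this job.

```xml
<!-- conf/mapred-site.xml (sketch): heap for each spawned map/reduce child JVM -->
<configuration>
  <property>
    <name>mapred.child.java.opts</name>
    <!-- assumed value, chosen to match the 2G heap discussed in this thread -->
    <value>-Xmx2048m</value>
  </property>
</configuration>
```

As far as I understand it, this property applies to the task JVMs rather than to the Hadoop daemons, so it may behave differently from a heap size set in the shell scripts.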
Kinda scary that it'll exit successfully without results. Does Mahout ever return "wrong" results? That is, there should be 120,000 results, but because of some memory config somewhere it successfully returns just 100,000? Has anyone ever seen that, and if so, how do you deal with it?

On Fri, May 21, 2010 at 10:36 AM, Jeff Eastman <[email protected]> wrote:

> On 5/20/10 9:51 PM, Mike Roberts wrote:
>
>> ./bin/mahout seqdumper --seqFile patterns/fpgrowth/part-r-00000
>>
> After reconfiguring a 4-node cluster to set the java heapsize to 2g I got
> 92144 in patterns/fpgrowth/part-r-00000 and got Count: 359 and volumes of
> output after seqdumper. But its only using a single mapper/reducer in all
> the steps (probably why it OMEs with the default heap). I also tried Drew's
> -Dmapred.reduce.tasks=2 trick but bin/mahout barfs on that.
>
