Try changing HADOOP_HEAPSIZE in hadoop-env.sh to something bigger. ~Sarang
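A minimal sketch of that change (the 2000 MB value is illustrative, not a recommendation; size it to your data). Since `mahout rowid` reads all input vectors in the client JVM launched by the `hadoop` script, raising the client heap there is what matters:

```shell
# In conf/hadoop-env.sh -- heap for JVMs launched by bin/hadoop, in MB.
# Default is 1000; 2000 below is just an example value.
export HADOOP_HEAPSIZE=2000
```

Note this is separate from the heap setting inside the bin/mahout script that was already tried; the bin/hadoop launcher applies its own -Xmx from HADOOP_HEAPSIZE.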
-----Original Message-----
From: satish verma [mailto:[email protected]]
Sent: Tuesday, October 02, 2012 9:18 PM
To: [email protected]
Subject: Heap Error: Mahout RowIdJob

I am trying to run the Mahout rowid command as follows:

./exp/mahout/current/bin/mahout rowid -i /tmp/satish/t2c/60k/vectors_m/vectors_m/ -o /tmp/satish/t2c/60k/matirx_73731_131093

I always get 'Exception in thread "main" java.lang.OutOfMemoryError: Java heap space' when I try larger inputs such as 70,000 vectors, each of size 130k. I tried increasing the Mahout heap size in the bin/mahout script, but it does not help. Top shows that memory is not being over-utilized. How can I solve or debug this problem?

12/10/02 23:06:08 INFO common.AbstractJob: Command line arguments: {--endPhase=[2147483647], --input=[/tmp/satish/t2c/60k/vectors_m/vectors_m/], --output=[/tmp/satish/t2c/60k/matirx_73731_131093], --startPhase=[0], --tempDir=[temp]}
12/10/02 23:06:11 INFO util.NativeCodeLoader: Loaded the native-hadoop library
12/10/02 23:06:11 INFO zlib.ZlibFactory: Successfully loaded & initialized native-zlib library
12/10/02 23:06:11 INFO compress.CodecPool: Got brand-new compressor
12/10/02 23:06:28 INFO compress.CodecPool: Got brand-new compressor
12/10/02 23:06:33 INFO compress.CodecPool: Got brand-new decompressor
Exception in thread "main" java.lang.OutOfMemoryError: Java heap space
        at org.apache.hadoop.io.compress.DecompressorStream.<init>(DecompressorStream.java:43)
        at org.apache.hadoop.io.compress.DefaultCodec.createInputStream(DefaultCodec.java:71)
        at org.apache.hadoop.io.SequenceFile$Reader.init(SequenceFile.java:1520)
        at org.apache.hadoop.io.SequenceFile$Reader.<init>(SequenceFile.java:1428)
        at org.apache.hadoop.io.SequenceFile$Reader.<init>(SequenceFile.java:1417)
        at org.apache.hadoop.io.SequenceFile$Reader.<init>(SequenceFile.java:1412)
        at org.apache.mahout.common.iterator.sequencefile.SequenceFileIterator.<init>(SequenceFileIterator.java:58)
        at org.apache.mahout.common.iterator.sequencefile.SequenceFileDirIterator$1.apply(SequenceFileDirIterator.java:110)
        at org.apache.mahout.common.iterator.sequencefile.SequenceFileDirIterator$1.apply(SequenceFileDirIterator.java:106)
        at com.google.common.collect.Iterators$8.next(Iterators.java:765)
        at com.google.common.collect.Iterators$5.hasNext(Iterators.java:526)
        at com.google.common.collect.ForwardingIterator.hasNext(ForwardingIterator.java:43)
        at org.apache.mahout.utils.vectors.RowIdJob.run(RowIdJob.java:75)
        at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:65)
        at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:79)
        at org.apache.mahout.utils.vectors.RowIdJob.main(RowIdJob.java:98)
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
        at java.lang.reflect.Method.invoke(Method.java:597)
        at org.apache.hadoop.util.ProgramDriver$ProgramDescription.invoke(ProgramDriver.java:68)
        at org.apache.hadoop.util.ProgramDriver.driver(ProgramDriver.java:139)
        at org.apache.mahout.driver.MahoutDriver.main(MahoutDriver.java:195)
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
        at java.lang.reflect.Method.invoke(Method.java:597)
        at org.apache.hadoop.util.RunJar.main(RunJar.java:156)
