Try changing HADOOP_HEAPSIZE in hadoop-env.sh to something bigger. ~Sarang
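A minimal sketch of that change (the 2000 MB value is illustrative, not a recommendation; size it to your data). Since `mahout rowid` reads all input vectors in the client JVM launched by the `hadoop` script, raising the client heap there is what matters:

```shell
# In conf/hadoop-env.sh -- heap for JVMs launched by bin/hadoop, in MB.
# Default is 1000; 2000 below is just an example value.
export HADOOP_HEAPSIZE=2000
```

Note this is separate from the heap setting inside the bin/mahout script that was already tried; the bin/hadoop launcher applies its own -Xmx from HADOOP_HEAPSIZE.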
-----Original Message-----
From: satish verma [mailto:[email protected]]
Sent: Tuesday, October 02, 2012 9:18 PM
To: [email protected]
Subject: Heap Error: Mahout RowIdJob

I am trying to run the Mahout rowid command as follows:

./exp/mahout/current/bin/mahout rowid -i /tmp/satish/t2c/60k/vectors_m/vectors_m/ -o /tmp/satish/t2c/60k/matirx_73731_131093

I always get 'Exception in thread "main" java.lang.OutOfMemoryError: Java heap space' when I try larger inputs such as 70,000 vectors, each of size 130k. I tried increasing the Mahout heap size in the bin/mahout script, but it does not help. Top shows that memory is not being over-utilized. How can I solve or debug this problem?

12/10/02 23:06:08 INFO common.AbstractJob: Command line arguments: {--endPhase=[2147483647], --input=[/tmp/satish/t2c/60k/vectors_m/vectors_m/], --output=[/tmp/satish/t2c/60k/matirx_73731_131093], --startPhase=[0], --tempDir=[temp]}
12/10/02 23:06:11 INFO util.NativeCodeLoader: Loaded the native-hadoop library
12/10/02 23:06:11 INFO zlib.ZlibFactory: Successfully loaded & initialized native-zlib library
12/10/02 23:06:11 INFO compress.CodecPool: Got brand-new compressor
12/10/02 23:06:28 INFO compress.CodecPool: Got brand-new compressor
12/10/02 23:06:33 INFO compress.CodecPool: Got brand-new decompressor
Exception in thread "main" java.lang.OutOfMemoryError: Java heap space
        at org.apache.hadoop.io.compress.DecompressorStream.<init>(DecompressorStream.java:43)
        at org.apache.hadoop.io.compress.DefaultCodec.createInputStream(DefaultCodec.java:71)
        at org.apache.hadoop.io.SequenceFile$Reader.init(SequenceFile.java:1520)
        at org.apache.hadoop.io.SequenceFile$Reader.<init>(SequenceFile.java:1428)
        at org.apache.hadoop.io.SequenceFile$Reader.<init>(SequenceFile.java:1417)
        at org.apache.hadoop.io.SequenceFile$Reader.<init>(SequenceFile.java:1412)
        at org.apache.mahout.common.iterator.sequencefile.SequenceFileIterator.<init>(SequenceFileIterator.java:58)
        at org.apache.mahout.common.iterator.sequencefile.SequenceFileDirIterator$1.apply(SequenceFileDirIterator.java:110)
        at org.apache.mahout.common.iterator.sequencefile.SequenceFileDirIterator$1.apply(SequenceFileDirIterator.java:106)
        at com.google.common.collect.Iterators$8.next(Iterators.java:765)
        at com.google.common.collect.Iterators$5.hasNext(Iterators.java:526)
        at com.google.common.collect.ForwardingIterator.hasNext(ForwardingIterator.java:43)
        at org.apache.mahout.utils.vectors.RowIdJob.run(RowIdJob.java:75)
        at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:65)
        at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:79)
        at org.apache.mahout.utils.vectors.RowIdJob.main(RowIdJob.java:98)
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
        at java.lang.reflect.Method.invoke(Method.java:597)
        at org.apache.hadoop.util.ProgramDriver$ProgramDescription.invoke(ProgramDriver.java:68)
        at org.apache.hadoop.util.ProgramDriver.driver(ProgramDriver.java:139)
        at org.apache.mahout.driver.MahoutDriver.main(MahoutDriver.java:195)
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
        at java.lang.reflect.Method.invoke(Method.java:597)
        at org.apache.hadoop.util.RunJar.main(RunJar.java:156)
