How did you set the heap sizes? If you are running on a cluster you need to add
properties to your mapred-site.xml. Something like this:
<property>
<name>mapred.map.child.java.opts</name>
<value>-Xmx1500m</value>
<description>Java opts for the map tasks.
MapR:
Default heapsize(-Xmx) is determined by memory reserved for mapreduce at
tasktracker.
Default memory for a mapreduce task =
(Total Memory reserved for mapreduce) * (2*#reduceslots / (#mapslots +
2*#reduceslots))
</description>
</property>
<property>
<name>mapred.reduce.child.java.opts</name>
<value>-Xmx4000m</value>
<description>Java opts for the reduce tasks.
MapR:
Default heapsize(-Xmx) is determined by memory reserved for mapreduce at
tasktracker.
Reduce task is given more memory than map task.
Default memory for a reduce task =
(Total Memory reserved for mapreduce) * (2*#reduceslots / (#mapslots +
2*#reduceslots))
</description>
</property>
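To see what the default quoted in the reduce description works out to, here is a minimal sketch that just evaluates that expression. The class name, method name, and the node figures (12000 MB reserved, 4 map slots, 2 reduce slots) are hypothetical and only for illustration; check the numbers your own tasktracker reports before relying on it.

```java
public class ReduceMemoryShare {
    // Evaluates the expression quoted in the property description:
    // (total reserved for mapreduce) * (2*#reduceslots / (#mapslots + 2*#reduceslots))
    static double reduceShareMb(double totalMb, int mapSlots, int reduceSlots) {
        double weightedReduceSlots = 2.0 * reduceSlots;
        return totalMb * (weightedReduceSlots / (mapSlots + weightedReduceSlots));
    }

    public static void main(String[] args) {
        // Hypothetical node: 12000 MB reserved, 4 map slots, 2 reduce slots.
        double shareMb = reduceShareMb(12000, 4, 2);
        System.out.println(shareMb); // prints 6000.0
    }
}
```

An explicit -Xmx in the <value> element, as in the properties above, is what you would set instead of relying on this computed default.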
-----Original Message-----
From: [email protected] [mailto:[email protected]]
Sent: Wednesday, November 16, 2011 5:06 PM
To: [email protected]
Subject: OutofMemoryError when running kmeans or fuzzykmeans cluster method
Hi guys,
I am trying to run the synthetic_control example on Hadoop. To test the
performance of Mahout, I simulated more synthetic control data to increase
the data size to a 1080000*60 matrix (divided into 9 files). After
uploading the data into HDFS, I ran the following command:
mahout org.apache.mahout.clustering.syntheticcontrol.kmeans.Job
It ran fine at first, until I hit an OutOfMemoryError in iteration 10. I
tried several times, and the problem is still there even after I changed the
Java heap size to 4GB. Any ideas about this kind of problem would be
appreciated, thanks a lot. The error information is as follows:
Exception in thread "main" java.lang.OutOfMemoryError: Java heap space
    at org.apache.mahout.math.map.OpenIntDoubleHashMap.rehash(OpenIntDoubleHashMap.java:434)
    at org.apache.mahout.math.map.OpenIntDoubleHashMap.put(OpenIntDoubleHashMap.java:387)
    at org.apache.mahout.math.RandomAccessSparseVector.setQuick(RandomAccessSparseVector.java:138)
    at org.apache.mahout.math.VectorWritable.readFields(VectorWritable.java:113)
    at org.apache.mahout.clustering.WeightedVectorWritable.readFields(WeightedVectorWritable.java:55)
    at org.apache.hadoop.io.SequenceFile$Reader.getCurrentValue(SequenceFile.java:1751)
    at org.apache.hadoop.io.SequenceFile$Reader.next(SequenceFile.java:1879)
    at org.apache.mahout.common.iterator.sequencefile.SequenceFileIterator.computeNext(SequenceFileIterator.java:95)
    at org.apache.mahout.common.iterator.sequencefile.SequenceFileIterator.computeNext(SequenceFileIterator.java:38)
    at com.google.common.collect.AbstractIterator.tryToComputeNext(AbstractIterator.java:136)
    at com.google.common.collect.AbstractIterator.hasNext(AbstractIterator.java:131)
zou.cl via foxmail