How did you set the heap sizes? If you are running on a cluster, you need to 
set the task heap sizes through properties in your mapred-site.xml. Something 
like this:

<property>
   <name>mapred.map.child.java.opts</name>
   <value>-Xmx1500m</value>
   <description>Java opts for the map tasks.
     MapR:
     Default heapsize(-Xmx) is determined by memory reserved for mapreduce at tasktracker.
     Default memory for a mapreduce task =
       (Total Memory reserved for mapreduce) * (2*#reduceslots / (#mapslots + 2*#reduceslots))
   </description>
 </property>

 <property>
   <name>mapred.reduce.child.java.opts</name>
   <value>-Xmx4000m</value>
   <description>Java opts for the reduce tasks.     
     MapR:
     Default heapsize(-Xmx) is determined by memory reserved for mapreduce at tasktracker.
     Reduce task is given more memory than map task.
     Default memory for a reduce task =
       (Total Memory reserved for mapreduce) * (2*#reduceslots / (#mapslots + 2*#reduceslots))
   </description>
 </property>
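
To get a feel for what the default would be if you leave -Xmx out, here is a 
minimal sketch that just evaluates the formula quoted in the descriptions 
above. The class name, the method name, and all the numbers (8192 MB reserved, 
4 map slots, 2 reduce slots) are made up for illustration. Both descriptions, 
as quoted, carry the same expression, so the sketch uses a single method.

```java
// Worked example of the default-memory formula quoted in the <description>
// text above. Every input value here is hypothetical.
final class DefaultTaskMemory {

    // (total memory reserved for mapreduce)
    //   * (2 * #reduceslots / (#mapslots + 2 * #reduceslots)),
    // transcribed as written in the property descriptions.
    static double defaultTaskMemoryMb(double reservedMb, int mapSlots, int reduceSlots) {
        double weightedReduceSlots = 2.0 * reduceSlots;
        return reservedMb * (weightedReduceSlots / (mapSlots + weightedReduceSlots));
    }

    public static void main(String[] args) {
        double reservedMb = 8192; // hypothetical memory reserved for mapreduce
        int mapSlots = 4;         // hypothetical slot counts
        int reduceSlots = 2;

        double mb = defaultTaskMemoryMb(reservedMb, mapSlots, reduceSlots);
        System.out.println("Default from formula: " + mb + " MB");
    }
}
```

If the number that comes out for your slot counts is smaller than what the job 
needs, that is the case where setting -Xmx explicitly, as in the properties 
above, makes a difference.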


-----Original Message-----
From: [email protected] [mailto:[email protected]] 
Sent: Wednesday, November 16, 2011 5:06 PM
To: [email protected]
Subject: OutofMemoryError when running kmeans or fuzzykmeans cluster method

Hi guys,

     I am trying to run the synthetic_control example on Hadoop. To test the 
performance of Mahout, I simulated additional synthetic control data to grow 
the data set to a 1080000*60 matrix (split across 9 files). After uploading 
the data into HDFS, I ran the following command:
     Mahout org.apache.mahout.clustering.syntheticcontrol.kmeans.Job
     
      It started out fine, but hit an OutOfMemoryError in iteration 10. I 
tried several times, and the problem is still there even after I changed the 
Java heap size to 4GB. Any ideas about this kind of problem would be 
appreciated, thanks a lot. The error is as follows:
     
     
Exception in thread "main" java.lang.OutOfMemoryError: Java heap space
        at org.apache.mahout.math.map.OpenIntDoubleHashMap.rehash(OpenIntDoubleHashMap.java:434)
        at org.apache.mahout.math.map.OpenIntDoubleHashMap.put(OpenIntDoubleHashMap.java:387)
        at org.apache.mahout.math.RandomAccessSparseVector.setQuick(RandomAccessSparseVector.java:138)
        at org.apache.mahout.math.VectorWritable.readFields(VectorWritable.java:113)
        at org.apache.mahout.clustering.WeightedVectorWritable.readFields(WeightedVectorWritable.java:55)
        at org.apache.hadoop.io.SequenceFile$Reader.getCurrentValue(SequenceFile.java:1751)
        at org.apache.hadoop.io.SequenceFile$Reader.next(SequenceFile.java:1879)
        at org.apache.mahout.common.iterator.sequencefile.SequenceFileIterator.computeNext(SequenceFileIterator.java:95)
        at org.apache.mahout.common.iterator.sequencefile.SequenceFileIterator.computeNext(SequenceFileIterator.java:38)
        at com.google.common.collect.AbstractIterator.tryToComputeNext(AbstractIterator.java:136)
        at com.google.common.collect.AbstractIterator.hasNext(AbstractIterator.java:131)
        




zou.cl via foxmail
