Hi guys,

     I am trying to run the synthetic_control example on Hadoop. To test
the performance of Mahout, I simulated additional synthetic control data to
increase the data size to a 1,080,000 x 60 matrix (divided into 9 files).
After uploading the data into HDFS, I ran the following command:
     mahout org.apache.mahout.clustering.syntheticcontrol.kmeans.Job
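
For a sense of scale, here is a rough sketch of the raw size of that data. It assumes every value is held as an 8-byte double and ignores all object and hash-map overhead, so the real heap footprint of the vectors will be larger. The class and method names are made up for illustration and are not part of Mahout.

```java
public class DataSizeEstimate {

    // Raw payload in bytes of a dense rows x cols matrix of 8-byte doubles.
    // This is a lower bound only: per-object and per-vector overhead is not counted.
    static long rawBytes(long rows, long cols) {
        return rows * cols * Double.BYTES;
    }

    public static void main(String[] args) {
        long rows = 1_080_000L; // number of points, as stated in this post
        long cols = 60L;        // values per point, as stated in this post
        long bytes = rawBytes(rows, cols);
        System.out.println("values:    " + (rows * cols));
        System.out.println("raw bytes: " + bytes);
    }
}
```

That comes to 64,800,000 values and 518,400,000 bytes of raw payload, before any overhead.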
     
      It ran fine until it hit an OutOfMemoryError in iteration 10. I tried
several times, and the problem persists even after I increased the Java heap
size to 4 GB. Any ideas about this kind of problem would be appreciated,
thanks a lot. The error output is as follows:
     
     
Exception in thread "main" java.lang.OutOfMemoryError: Java heap space
        at org.apache.mahout.math.map.OpenIntDoubleHashMap.rehash(OpenIntDoubleHashMap.java:434)
        at org.apache.mahout.math.map.OpenIntDoubleHashMap.put(OpenIntDoubleHashMap.java:387)
        at org.apache.mahout.math.RandomAccessSparseVector.setQuick(RandomAccessSparseVector.java:138)
        at org.apache.mahout.math.VectorWritable.readFields(VectorWritable.java:113)
        at org.apache.mahout.clustering.WeightedVectorWritable.readFields(WeightedVectorWritable.java:55)
        at org.apache.hadoop.io.SequenceFile$Reader.getCurrentValue(SequenceFile.java:1751)
        at org.apache.hadoop.io.SequenceFile$Reader.next(SequenceFile.java:1879)
        at org.apache.mahout.common.iterator.sequencefile.SequenceFileIterator.computeNext(SequenceFileIterator.java:95)
        at org.apache.mahout.common.iterator.sequencefile.SequenceFileIterator.computeNext(SequenceFileIterator.java:38)
        at com.google.common.collect.AbstractIterator.tryToComputeNext(AbstractIterator.java:136)
        at com.google.common.collect.AbstractIterator.hasNext(AbstractIterator.java:131)

zou.cl via foxmail