Hi guys,
I am trying to run the synthetic_control example on Hadoop. To test the
performance of Mahout, I simulated additional synthetic control data to
increase the data size to a 1080000*60 matrix (split across 9 files). After
uploading the data into HDFS, I ran the following command:
mahout org.apache.mahout.clustering.syntheticcontrol.kmeans.Job
It ran fine until it hit an OutOfMemoryError at iteration 10. I tried several
times, and the problem persists even after I increased the Java heap size to
4GB. Any ideas about this kind of problem would be appreciated, thanks a lot.
The error output is as follows:
Exception in thread "main" java.lang.OutOfMemoryError: Java heap space
    at org.apache.mahout.math.map.OpenIntDoubleHashMap.rehash(OpenIntDoubleHashMap.java:434)
    at org.apache.mahout.math.map.OpenIntDoubleHashMap.put(OpenIntDoubleHashMap.java:387)
    at org.apache.mahout.math.RandomAccessSparseVector.setQuick(RandomAccessSparseVector.java:138)
    at org.apache.mahout.math.VectorWritable.readFields(VectorWritable.java:113)
    at org.apache.mahout.clustering.WeightedVectorWritable.readFields(WeightedVectorWritable.java:55)
    at org.apache.hadoop.io.SequenceFile$Reader.getCurrentValue(SequenceFile.java:1751)
    at org.apache.hadoop.io.SequenceFile$Reader.next(SequenceFile.java:1879)
    at org.apache.mahout.common.iterator.sequencefile.SequenceFileIterator.computeNext(SequenceFileIterator.java:95)
    at org.apache.mahout.common.iterator.sequencefile.SequenceFileIterator.computeNext(SequenceFileIterator.java:38)
    at com.google.common.collect.AbstractIterator.tryToComputeNext(AbstractIterator.java:136)
    at com.google.common.collect.AbstractIterator.hasNext(AbstractIterator.java:131)
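For scale, here is a rough back-of-the-envelope sketch of the raw data size. It assumes every value is held as an 8-byte double and ignores all JVM object and hash-map overhead, so the real footprint would be larger. The class and method names are made up for illustration and are not part of Mahout.

```java
public class HeapEstimate {
    /** Raw bytes for a rows x cols matrix of 8-byte doubles (assumption: no JVM overhead counted). */
    static long rawBytes(long rows, long cols) {
        return rows * cols * Double.BYTES;
    }

    public static void main(String[] args) {
        // Matrix size from the description above: 1080000 rows, 60 columns.
        long bytes = rawBytes(1_080_000L, 60L);
        System.out.printf("raw values: %d bytes (about %.1f MiB)%n",
                bytes, bytes / (1024.0 * 1024.0));
    }
}
```

The values alone come to 518,400,000 bytes, before any per-vector or hash-map overhead.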
zou.cl via foxmail