Hi, 

I was running the example kmeans program, following the link here: 

https://cwiki.apache.org/MAHOUT/clustering-of-synthetic-control-data.html 

I increased the size of the input file synthetic_control.data from around 200 KB to 1.2 GB by repeatedly copying the data into itself (roughly as sketched below).
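
Something along these lines (the output file name here is just a placeholder, not the actual one I used):

    # append the ~200 KB file to a copy ~6000 times to reach ~1.2 GB
    for i in $(seq 1 6000); do
        cat synthetic_control.data >> synthetic_control_big.data
    done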
The max iteration count is set to 10, and after all 10 iterations finished, I got:

Exception in thread "main" java.lang.OutOfMemoryError: Java heap space. 

I have boosted -Xmx for the Hadoop child JVMs to 4g, and set JAVA_HEAP_SIZE in 
bin/mahout to -Xmx5g, but the error still happens. I am a bit confused about 
where this JVM is being started and how to pass in the Java heap size options 
to prevent the error.
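
Concretely, these are the two changes I made (the mapred-site.xml property below is what I understand to be the standard child-JVM heap setting for a Hadoop 1.x-style config; the variable name near the top of bin/mahout may differ between Mahout versions):

    <!-- mapred-site.xml: heap for the Hadoop child (task) JVMs -->
    <!-- (property name assumes Hadoop 1.x / 0.20-style configuration) -->
    <property>
      <name>mapred.child.java.opts</name>
      <value>-Xmx4g</value>
    </property>

    # bin/mahout: heap for the JVM launched by the mahout script itself
    JAVA_HEAP_SIZE=-Xmx5g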

Thanks

Here is the full stack trace:

        at org.apache.mahout.math.map.OpenIntDoubleHashMap.rehash(OpenIntDoubleHashMap.java:430)
        at org.apache.mahout.math.map.OpenIntDoubleHashMap.put(OpenIntDoubleHashMap.java:383)
        at org.apache.mahout.math.RandomAccessSparseVector.setQuick(RandomAccessSparseVector.java:139)
        at org.apache.mahout.math.VectorWritable.readFields(VectorWritable.java:118)
        at org.apache.mahout.clustering.classify.WeightedVectorWritable.readFields(WeightedVectorWritable.java:56)
        at org.apache.hadoop.io.SequenceFile$Reader.getCurrentValue(SequenceFile.java:1809)
        at org.apache.hadoop.io.SequenceFile$Reader.next(SequenceFile.java:1937)
        at org.apache.mahout.common.iterator.sequencefile.SequenceFileIterator.computeNext(SequenceFileIterator.java:95)
        at org.apache.mahout.common.iterator.sequencefile.SequenceFileIterator.computeNext(SequenceFileIterator.java:38)
        at com.google.common.collect.AbstractIterator.tryToComputeNext(AbstractIterator.java:143)
        at com.google.common.collect.AbstractIterator.hasNext(AbstractIterator.java:138)
        at com.google.common.collect.Iterators$6.hasNext(Iterators.java:630)
        at com.google.common.collect.ForwardingIterator.hasNext(ForwardingIterator.java:43)
        at org.apache.mahout.utils.clustering.ClusterDumper.readPoints(ClusterDumper.java:293)
        at org.apache.mahout.utils.clustering.ClusterDumper.init(ClusterDumper.java:246)
        at org.apache.mahout.utils.clustering.ClusterDumper.<init>(ClusterDumper.java:94)
        at org.apache.mahout.clustering.syntheticcontrol.kmeans.Job.run(Job.java:137)
        at org.apache.mahout.clustering.syntheticcontrol.kmeans.Job.main(Job.java:59)
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
        at java.lang.reflect.Method.invoke(Method.java:597)
        at org.apache.hadoop.util.ProgramDriver$ProgramDescription.invoke(ProgramDriver.java:68)
        at org.apache.hadoop.util.ProgramDriver.driver(ProgramDriver.java:139)
        at org.apache.mahout.driver.MahoutDriver.main(MahoutDriver.java:195)
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
        at java.lang.reflect.Method.invoke(Method.java:597)
        at org.apache.hadoop.util.RunJar.main(RunJar.java:156)
