Thanks Paritosh. The clusterpp command helped to dump instances per cluster, and I then used vectordump to convert the vectors to text.
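For anyone hitting the same OOM, the working sequence described above would look roughly like the sketch below. The directory names are placeholders, the exact clusterpp/vectordump flags should be checked against your Mahout version, and MAHOUT_HEAPSIZE is the variable the bin/mahout launcher script actually reads (in MB) to size the local JVM heap:

```shell
# MAHOUT_HEAPSIZE (in MB) is read by the bin/mahout launcher and sets
# the heap for local (non-MapReduce) Mahout drivers.
export MAHOUT_HEAPSIZE=4096

# Placeholder paths; adjust to your own clustering output directories.
CLUSTERS=clean-kmeans-clusters
OUT=cluster-vectors.txt

if command -v mahout >/dev/null 2>&1; then
  # Group clustered points into one output directory per cluster.
  # clusterpp also has a MapReduce implementation, so unlike
  # clusterdump it is not limited by a single JVM's heap.
  mahout clusterpp -i "$CLUSTERS" -o "$CLUSTERS/bottomLevelClusters"

  # Dump the per-cluster vector sequence files as plain text.
  mahout vectordump -i "$CLUSTERS/bottomLevelClusters" -o "$OUT"
fi

echo "MAHOUT_HEAPSIZE=$MAHOUT_HEAPSIZE"
```

The `command -v` guard just keeps the sketch from erroring on machines without Mahout on the PATH.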
Thanks
Rajesh

On Fri, Oct 12, 2012 at 9:34 PM, paritosh ranjan <[email protected]> wrote:

> I think this much memory should fix the problem.
> However, if you still face OOM, then try using the clusterpp command instead of
> clusterdump; it does not have the same memory limitations, as it also has a
> MapReduce version. You can find clusterpp's usage here:
> https://cwiki.apache.org/MAHOUT/top-down-clustering.html
>
> On Fri, Oct 12, 2012 at 9:13 PM, Rajesh Nikam <[email protected]> wrote:
>
> > Hi,
> >
> > I have used canopy and k-means clustering to cluster around 1.2 M
> > instances. The CSV file size is around 425 MB. However, when I run the
> > "mahout clusterdump" command as below, I am getting a
> > Java OutOfMemory error.
> >
> > mahout clusterdump -dt sequencefile -i
> > clean-kmeans-clusters/clusters-1-final/part-r-00000 -n 20 -b 100 -o
> > cdump-clean.txt -p clean-kmeans-clusters/clusteredPoints/
> >
> > Exception in thread "main" java.lang.OutOfMemoryError: Java heap space
> >     at org.apache.mahout.math.DenseVector.<init>(DenseVector.java:44)
> >     at org.apache.mahout.math.DenseVector.<init>(DenseVector.java:39)
> >     at org.apache.mahout.math.VectorWritable.readFields(VectorWritable.java:99)
> >     at org.apache.mahout.clustering.classify.WeightedVectorWritable.readFields(WeightedVectorWritable.java:56)
> >
> > I have switched to 64-bit Ubuntu and even tried setting 4 GB/8 GB/12 GB of
> > memory for Java:
> >
> > JAVA_HEAP_MAX=-Xmx4g
> > JAVA_HEAP_MAX=-Xmx8g
> > JAVA_HEAP_MAX=-Xmx12g
> >
> > Not sure how to increase the memory available to the Java runtime.
> >
> > How can I check whether the Java on Ubuntu is 64-bit or not?
> >
> > Thanks
> > Rajesh
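To close out the open question in the quoted mail: a 64-bit HotSpot/OpenJDK build prints "64-Bit Server VM" in its `java -version` banner, and on a 32-bit JVM large `-Xmx` values (4g and above) simply cannot take effect, which would explain the persistent OOM. A minimal check, assuming a standard `java` on the PATH (the sample banner string below is only an illustration):

```shell
# Sample banner line as printed by a 64-bit OpenJDK (illustrative only):
BANNER='OpenJDK 64-Bit Server VM (build 25.292-b10, mixed mode)'

# The same check you would run against the real JVM:
#   java -version 2>&1 | grep -q '64-Bit'
if printf '%s\n' "$BANNER" | grep -q '64-Bit'; then
  echo "64-bit JVM"
fi

# Against the actually installed JVM, if present:
if command -v java >/dev/null 2>&1; then
  java -version 2>&1 | grep '64-Bit' \
    || echo "no 64-Bit marker: likely a 32-bit JVM, large -Xmx will not work"
fi
```

On Debian/Ubuntu, `file "$(readlink -f "$(command -v java)")"` is another way to see whether the binary itself is a 32- or 64-bit ELF executable.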
