I'm getting "*No input clusters found in
reuters-kmeans-clusters/part-randomSeed. Check your -c argument*" while
running the k-means example from the "Mahout in Action" samples. I searched
on Google but didn't find a solution. I'm using Mahout 0.7. How can I get
k-means clustering to run?
Command:
taner@taner:~/Development/mahout-distribution-0.7/examples$ /home/taner/Development/mahout-distribution-0.7/bin/mahout kmeans \
    -i reuters-vectors/tfidf-vectors/ \
    -c reuters-kmeans-clusters \
    -o reuters-kmeans-clusters \
    -dm org.apache.mahout.common.distance.SquaredEuclideanDistanceMeasure \
    -cd 1.0 -k 20 -x 20 -cl
Output:
Running on hadoop, using /home/taner/Development/hadoop-1.2.0/bin/hadoop and HADOOP_CONF_DIR=
MAHOUT-JOB: /home/taner/Development/mahout-distribution-0.7/examples/target/mahout-examples-0.7-job.jar
13/07/24 00:56:33 INFO common.AbstractJob: Command line arguments: {--clustering=null, --clusters=[reuters-kmeans-clusters], --convergenceDelta=[1.0], --distanceMeasure=[org.apache.mahout.common.distance.SquaredEuclideanDistanceMeasure], --endPhase=[2147483647], --input=[reuters-vectors/tfidf-vectors/], --maxIter=[20], --method=[mapreduce], --numClusters=[20], --output=[reuters-kmeans-clusters], --startPhase=[0], --tempDir=[temp]}
13/07/24 00:56:33 INFO util.NativeCodeLoader: Loaded the native-hadoop library
13/07/24 00:56:33 INFO zlib.ZlibFactory: Successfully loaded & initialized native-zlib library
13/07/24 00:56:33 INFO compress.CodecPool: Got brand-new compressor
13/07/24 00:56:33 INFO kmeans.RandomSeedGenerator: Wrote 20 Klusters to reuters-kmeans-clusters/part-randomSeed
13/07/24 00:56:33 INFO kmeans.KMeansDriver: Input: reuters-vectors/tfidf-vectors Clusters In: reuters-kmeans-clusters/part-randomSeed Out: reuters-kmeans-clusters Distance: org.apache.mahout.common.distance.SquaredEuclideanDistanceMeasure
13/07/24 00:56:33 INFO kmeans.KMeansDriver: convergence: 1.0 max Iterations: 20 num Reduce Tasks: org.apache.mahout.math.VectorWritable Input Vectors: {}
13/07/24 00:56:34 INFO compress.CodecPool: Got brand-new decompressor
*Exception in thread "main" java.lang.IllegalStateException: No input clusters found in reuters-kmeans-clusters/part-randomSeed. Check your -c argument.*
    at org.apache.mahout.clustering.kmeans.KMeansDriver.buildClusters(KMeansDriver.java:218)
    at org.apache.mahout.clustering.kmeans.KMeansDriver.run(KMeansDriver.java:149)
    at org.apache.mahout.clustering.kmeans.KMeansDriver.run(KMeansDriver.java:108)
    at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:65)
    at org.apache.mahout.clustering.kmeans.KMeansDriver.main(KMeansDriver.java:49)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:601)
    at org.apache.hadoop.util.ProgramDriver$ProgramDescription.invoke(ProgramDriver.java:68)
    at org.apache.hadoop.util.ProgramDriver.driver(ProgramDriver.java:139)
    at org.apache.mahout.driver.MahoutDriver.main(MahoutDriver.java:195)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:601)
    at org.apache.hadoop.util.RunJar.main(RunJar.java:160)