I think you are trying to run the example given in the quickstart section.
It says "Finally, run kMeans with 20 clusters", which corresponds to the -k 20 option in your command.

There are two ways you can run K-Means:
a) By providing initial clusters, which is done by passing the -c argument.
b) By specifying the initial number of clusters with -k.

You are using both (-k and -c); using just one of them will do.

You will have to either give the initial cluster centroids, i.e. -c (these can be generated by the Canopy algorithm: https://cwiki.apache.org/confluence/display/MAHOUT/Canopy+Clustering ),
or just provide -k 20 (the initial number of randomly generated clusters).
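
As a rough sketch of the two variants described above, the following dry-run script only prints the two command lines rather than executing them. The paths, -x 10 and -ow are copied from your command; the function names are made up for illustration. Whether -c can really be omitted when -k is given is an assumption here: in your log, -k 20 caused 20 random seed vectors to be written under the -c path, so your Mahout version may still expect -c as the place to store them. Please check the kmeans help output before relying on variant (b).

```shell
#!/bin/sh
# Dry run: prints the two command variants instead of executing them.
# Paths and the -x 10 / -ow settings are copied from the quoted command.
INPUT=./examples/bin/work/reuters-out-seqdir-sparse/tfidf-vectors/
CLUSTERS=./examples/bin/work/clusters
OUTPUT=./examples/bin/work/reuters-kmeans

# (a) Start from existing centroids stored under $CLUSTERS; no -k.
cmd_with_centroids() {
  echo "bin/mahout kmeans -i $INPUT -c $CLUSTERS -o $OUTPUT -x 10 -ow"
}

# (b) Ask for 20 randomly chosen initial clusters; no -c.
# Assumption: some Mahout versions may still require -c as the location
# where the random seeds are written.
cmd_with_random_seeds() {
  echo "bin/mahout kmeans -i $INPUT -o $OUTPUT -x 10 -k 20 -ow"
}

cmd_with_centroids
cmd_with_random_seeds
```

For variant (a), the -c directory must already contain cluster centroids before the job starts; an empty directory would not be enough.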

On 29-02-2012 23:52, manish dunani wrote:
Hi,
I am doing k-means clustering on a Hadoop cluster, following
https://cwiki.apache.org/confluence/display/MAHOUT/K-Means+Clustering

While running k-means clustering on Hadoop with the following command, I got
this error:

hduser@ubuntu:/opt/mahout$ bin/mahout kmeans -i ./examples/bin/work/reuters-out-seqdir-sparse/tfidf-vectors/ -c ./examples/bin/work/clusters -o ./examples/bin/work/reuters-kmeans -x 10 -k 20 -ow
MAHOUT_LOCAL is not set; adding HADOOP_CONF_DIR to classpath.
Running on hadoop, using HADOOP_HOME=/usr/local/hadoop
HADOOP_CONF_DIR=/usr/local/hadoop/conf
MAHOUT-JOB: /opt/mahout/examples/target/mahout-examples-0.7-SNAPSHOT-job.jar
12/02/29 12:42:23 INFO common.AbstractJob: Command line arguments: {--clusters=[./examples/bin/work/clusters], --convergenceDelta=[0.5], --distanceMeasure=[org.apache.mahout.common.distance.SquaredEuclideanDistanceMeasure], --endPhase=[2147483647], --input=[./examples/bin/work/reuters-out-seqdir-sparse/tfidf-vectors/], --maxIter=[10], --method=[mapreduce], --numClusters=[20], --output=[./examples/bin/work/reuters-kmeans], --overwrite=null, --startPhase=[0], --tempDir=[temp]}
12/02/29 12:42:23 INFO common.HadoopUtil: Deleting examples/bin/work/reuters-kmeans
12/02/29 12:42:23 INFO common.HadoopUtil: Deleting examples/bin/work/clusters
12/02/29 12:42:24 INFO util.NativeCodeLoader: Loaded the native-hadoop library
12/02/29 12:42:24 INFO zlib.ZlibFactory: Successfully loaded & initialized native-zlib library
12/02/29 12:42:24 INFO compress.CodecPool: Got brand-new compressor
12/02/29 12:42:24 INFO kmeans.RandomSeedGenerator: Wrote 20 vectors to examples/bin/work/clusters/part-randomSeed
12/02/29 12:42:24 INFO kmeans.KMeansDriver: Input: examples/bin/work/reuters-out-seqdir-sparse/tfidf-vectors Clusters In: examples/bin/work/clusters/part-randomSeed Out: examples/bin/work/reuters-kmeans Distance: org.apache.mahout.common.distance.SquaredEuclideanDistanceMeasure
12/02/29 12:42:24 INFO kmeans.KMeansDriver: convergence: 0.5 max Iterations: 10 num Reduce Tasks: org.apache.mahout.math.VectorWritable Input Vectors: {}
12/02/29 12:42:24 INFO kmeans.KMeansDriver: K-Means Iteration 1
12/02/29 12:42:26 INFO input.FileInputFormat: Total input paths to process : 1
12/02/29 12:42:27 INFO mapred.JobClient: Running job: job_201202290930_0012
12/02/29 12:42:28 INFO mapred.JobClient:  map 0% reduce 0%
12/02/29 12:42:42 INFO mapred.JobClient: Task Id : attempt_201202290930_0012_m_000000_0, Status : FAILED
java.lang.IllegalStateException: No clusters found. Check your -c path.
     at org.apache.mahout.clustering.kmeans.KMeansMapper.setup(KMeansMapper.java:59)
     at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:142)
     at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:621)
     at org.apache.hadoop.mapred.MapTask.run(MapTask.java:305)
     at org.apache.hadoop.mapred.Child.main(Child.java:170)

12/02/29 12:42:48 INFO mapred.JobClient: Task Id : attempt_201202290930_0012_m_000000_1, Status : FAILED
java.lang.IllegalStateException: No clusters found. Check your -c path.
     at org.apache.mahout.clustering.kmeans.KMeansMapper.setup(KMeansMapper.java:59)
     at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:142)
     at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:621)
     at org.apache.hadoop.mapred.MapTask.run(MapTask.java:305)
     at org.apache.hadoop.mapred.Child.main(Child.java:170)

12/02/29 12:42:54 INFO mapred.JobClient: Task Id : attempt_201202290930_0012_m_000000_2, Status : FAILED
java.lang.IllegalStateException: No clusters found. Check your -c path.
     at org.apache.mahout.clustering.kmeans.KMeansMapper.setup(KMeansMapper.java:59)
     at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:142)
     at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:621)
     at org.apache.hadoop.mapred.MapTask.run(MapTask.java:305)
     at org.apache.hadoop.mapred.Child.main(Child.java:170)

12/02/29 12:43:03 INFO mapred.JobClient: Job complete: job_201202290930_0012
12/02/29 12:43:03 INFO mapred.JobClient: Counters: 3
12/02/29 12:43:03 INFO mapred.JobClient:   Job Counters
12/02/29 12:43:03 INFO mapred.JobClient:     Launched map tasks=4
12/02/29 12:43:03 INFO mapred.JobClient:     Data-local map tasks=4
12/02/29 12:43:03 INFO mapred.JobClient:     Failed map tasks=1
Exception in thread "main" java.lang.InterruptedException: K-Means Iteration failed processing examples/bin/work/clusters/part-randomSeed
     at org.apache.mahout.clustering.kmeans.KMeansDriver.runIteration(KMeansDriver.java:373)
     at org.apache.mahout.clustering.kmeans.KMeansDriver.buildClustersMR(KMeansDriver.java:317)
     at org.apache.mahout.clustering.kmeans.KMeansDriver.buildClusters(KMeansDriver.java:239)
     at org.apache.mahout.clustering.kmeans.KMeansDriver.run(KMeansDriver.java:154)
     at org.apache.mahout.clustering.kmeans.KMeansDriver.run(KMeansDriver.java:112)
     at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:65)
     at org.apache.mahout.clustering.kmeans.KMeansDriver.main(KMeansDriver.java:61)
     at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
     at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
     at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
     at java.lang.reflect.Method.invoke(Method.java:616)
     at org.apache.hadoop.util.ProgramDriver$ProgramDescription.invoke(ProgramDriver.java:68)
     at org.apache.hadoop.util.ProgramDriver.driver(ProgramDriver.java:139)
     at org.apache.mahout.driver.MahoutDriver.main(MahoutDriver.java:188)
     at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
     at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
     at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
     at java.lang.reflect.Method.invoke(Method.java:616)
     at org.apache.hadoop.util.RunJar.main(RunJar.java:156)


I also created a "clusters" directory in /opt/mahout/examples/work,
but after that I still got the same error.

What should I do to resolve this error?


