That error is thrown when the mapper is initialized and finds no
initial clusters (the error message should say "No clusters found").
Check your command line -c argument. It should name the directory
containing the initial clusters (output/clusters-0 if you used canopy to
produce them). Please post your exact command line arguments if you
still have problems, and I will help you debug them. K-Means has been
pretty well tested in some production environments, and errors are
usually caused by incorrect arguments.
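
As a quick sanity check before rerunning the job, you can confirm that the directory you pass with -c really contains something. The sketch below is only an illustration, not Mahout code: the class name InitialClustersCheck and the method hasInitialClusters are made up for this example, it looks at the local filesystem only, and it assumes that entries starting with "_" or "." are bookkeeping files rather than cluster data. If your clusters live on HDFS, running "hadoop fs -ls" on the same path gives you the equivalent information.

```java
import java.io.IOException;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

// Hypothetical helper (not part of Mahout): checks a LOCAL directory only.
public class InitialClustersCheck {

    /**
     * Returns true if dir exists, is a directory, and holds at least one
     * entry whose name does not start with "_" or ".".
     * Assumption: such prefixed entries are bookkeeping files, not clusters.
     */
    static boolean hasInitialClusters(Path dir) throws IOException {
        if (!Files.isDirectory(dir)) {
            return false; // missing path, or a plain file instead of a directory
        }
        try (DirectoryStream<Path> entries = Files.newDirectoryStream(dir)) {
            for (Path entry : entries) {
                String name = entry.getFileName().toString();
                if (!name.startsWith("_") && !name.startsWith(".")) {
                    return true; // found something that could be cluster data
                }
            }
        }
        return false; // directory exists but has no usable entries
    }

    public static void main(String[] args) throws IOException {
        // Defaults to the canopy output directory mentioned above.
        Path dir = Paths.get(args.length > 0 ? args[0] : "output/clusters-0");
        if (hasInitialClusters(dir)) {
            System.out.println("Found initial clusters in " + dir);
        } else {
            System.out.println("No clusters found in " + dir);
        }
    }
}
```

If this reports no clusters for the path you gave to -c, the mapper will fail the same way, so fix the path (or rerun canopy) first.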
On 9/20/10 3:47 PM, Zhen Guo (JIRA) wrote:
Kmeans clustering error
-----------------------
Key: MAHOUT-504
URL: https://issues.apache.org/jira/browse/MAHOUT-504
Project: Mahout
Issue Type: Bug
Reporter: Zhen Guo
I tried the Kmeans algorithm on the Synthetic Control data. The following error
appears. I tried the Canopy algorithm, it is fine. This error is from Mapper. I
am using Trunk.
10/09/20 19:40:06 INFO mapred.JobClient: Task Id : attempt_201008261432_1324_m_000000_0, Status : FAILED
java.lang.IllegalStateException: Cluster is empty!
    at org.apache.mahout.clustering.kmeans.KMeansClusterMapper.setup(KMeansClusterMapper.java:57)
    at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:142)
    at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:583)
    at org.apache.hadoop.mapred.MapTask.run(MapTask.java:305)
    at org.apache.hadoop.mapred.Child.main(Child.java:170)