Yes, you must use trunk. I've tested the current build in stand-alone mode
and on a 1-node cluster, and both work correctly. IIRC, 0.3 had problems
with synthetic control, but that was long ago. Mahout is changing so fast
that we always recommend using trunk.
On 9/27/10 2:06 PM, Zhen Guo (JIRA) wrote:
[ https://issues.apache.org/jira/browse/MAHOUT-504?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12915400#action_12915400 ]
Zhen Guo commented on MAHOUT-504:
---------------------------------
Is this change available in Trunk?
I tested as described in the Quick Start document, using the following command:
$MAHOUT_HOME/bin/mahout org.apache.mahout.clustering.syntheticcontrol.kmeans.Job
It failed and the error messages are the same as above.
Kmeans clustering error
-----------------------
Key: MAHOUT-504
URL: https://issues.apache.org/jira/browse/MAHOUT-504
Project: Mahout
Issue Type: Bug
Reporter: Zhen Guo
Assignee: Robin Anil
Fix For: 0.4
I tried the Kmeans algorithm on the Synthetic Control data and the following
error appears. The Canopy algorithm works fine on the same data. The error
comes from the Mapper. I am using Trunk.
10/09/20 19:40:06 INFO mapred.JobClient: Task Id : attempt_201008261432_1324_m_000000_0, Status : FAILED
java.lang.IllegalStateException: Cluster is empty!
at org.apache.mahout.clustering.kmeans.KMeansClusterMapper.setup(KMeansClusterMapper.java:57)
at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:142)
at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:583)
at org.apache.hadoop.mapred.MapTask.run(MapTask.java:305)
at org.apache.hadoop.mapred.Child.main(Child.java:170)
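
For readers unfamiliar with this failure mode: the trace shows the exception being raised in the mapper's setup step, which Hadoop's Mapper.run calls before any input records are mapped. The sketch below is an illustration only, not Mahout's code. The class name EmptyClusterGuard and its methods are hypothetical, and it simply assumes that a setup step loads a list of initial clusters and refuses to continue when that list is empty.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical stand-in for illustration; this is NOT Mahout's KMeansClusterMapper. */
public class EmptyClusterGuard {
    // Cluster centers the mapper would compare each input point against.
    private final List<double[]> clusters = new ArrayList<>();

    /**
     * Mimics a setup step: take whatever initial clusters were loaded
     * (assumed to come from some prior job output) and fail fast if none exist.
     */
    public void setup(List<double[]> loadedClusters) {
        clusters.addAll(loadedClusters);
        if (clusters.isEmpty()) {
            throw new IllegalStateException("Cluster is empty!");
        }
    }

    public int clusterCount() {
        return clusters.size();
    }

    public static void main(String[] args) {
        EmptyClusterGuard guard = new EmptyClusterGuard();
        try {
            // No clusters were loaded, so setup rejects the run.
            guard.setup(new ArrayList<>());
        } catch (IllegalStateException e) {
            System.out.println("setup failed: " + e.getMessage());
        }
    }
}
```

Under that assumption, a message like the one above would point at the initial clusters not reaching the mapper rather than at the input points themselves.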