Hi All,

I am trying to build a classifier for a set of data that I have collected
myself. I am very new to Mahout and would be very grateful if someone could
help me with the steps to get started.

The documents I have come across so far explain how to run the sample code,
but when I tried converting my text to vectors (using seqdirectory and
seq2sparse) and running the kmeans algorithm, I got the error below. I have
not even been able to find the source code for the "kmeans" or "seq2sparse"
executables to begin fixing the issue. Pointers to good reading material
would also help. Any help at all will be greatly appreciated.

dipti@dipti-laptop:~$ mahout kmeans -i seq-output2 -c temp -o cluster-output -k 20 -cd 0.01 -x 20
Running on hadoop, using HADOOP_HOME=/usr/lib/hadoop-0.20.2/
HADOOP_CONF_DIR=/usr/lib/hadoop-0.20.2/conf
11/05/02 22:02:55 INFO common.AbstractJob: Command line arguments: {--clusters=temp, --convergenceDelta=0.01, --distanceMeasure=org.apache.mahout.common.distance.SquaredEuclideanDistanceMeasure, --endPhase=2147483647, --input=seq-output2, --maxIter=20, --method=mapreduce, --numClusters=20, --output=cluster-output, --startPhase=0, --tempDir=temp}
11/05/02 22:02:55 INFO common.HadoopUtil: Deleting temp
11/05/02 22:02:55 INFO util.NativeCodeLoader: Loaded the native-hadoop library
11/05/02 22:02:55 INFO zlib.ZlibFactory: Successfully loaded & initialized native-zlib library
11/05/02 22:02:55 INFO compress.CodecPool: Got brand-new compressor
Exception in thread "main" java.lang.ClassCastException: class org.apache.hadoop.io.IntWritable
    at java.lang.Class.asSubclass(Class.java:3039)
    at org.apache.mahout.clustering.kmeans.RandomSeedGenerator.buildRandom(RandomSeedGenerator.java:86)
    at org.apache.mahout.clustering.kmeans.KMeansDriver.run(KMeansDriver.java:96)
    at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:65)
    at org.apache.mahout.clustering.kmeans.KMeansDriver.main(KMeansDriver.java:54)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:616)
    at org.apache.hadoop.util.ProgramDriver$ProgramDescription.invoke(ProgramDriver.java:68)
    at org.apache.hadoop.util.ProgramDriver.driver(ProgramDriver.java:139)
    at org.apache.mahout.driver.MahoutDriver.main(MahoutDriver.java:184)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:616)
    at org.apache.hadoop.util.RunJar.main(RunJar.java:156)

Regards,
Dipti Mathur
