Hi Dipti,

In the future, please ask on the Mahout user mailing list, as this list is 
primarily about the internals of Mahout and the user list is primarily about 
issues interacting with Mahout.

That said, I have a few questions inline below.

On May 2, 2011, at 12:44 PM, Dipti Mathur wrote:

> Hi All,
> 
> I am trying to build a classifier for a set of data that I have collected
> myself. I am very new to mahout and would be very grateful if someone could
> help me with the steps to get started.
> 
> The documents I have come across so far explain how to run the sample codes
> but when I tried converting my text to vectors ( using seqdirectory and
> seq2sparse) and run the kmeans algorithm, I get errors like below. I am not
> even able to find the source code to "kmeans" or "seq2sparse" executables to
> begin fixing the issue. Pointers to good reads will also help. Any help at
> all will be greatly appreciated.

What version of Mahout are you using?  Also, what commands did you run to build 
the input to KMeans?


> 
> dipti@dipti-laptop:~$ mahout kmeans -i seq-output2 -c temp -o cluster-output
> -k 20 -cd 0.01 -x 20
> Running on hadoop, using HADOOP_HOME=/usr/lib/hadoop-0.20.2/
> HADOOP_CONF_DIR=/usr/lib/hadoop-0.20.2/conf
> 11/05/02 22:02:55 INFO common.AbstractJob: Command line arguments:
> {--clusters=temp, --convergenceDelta=0.01,
> --distanceMeasure=org.apache.mahout.common.distance.SquaredEuclideanDistanceMeasure,
> --endPhase=2147483647, --input=seq-output2, --maxIter=20,
> --method=mapreduce, --numClusters=20, --output=cluster-output,
> --startPhase=0, --tempDir=temp}
> 11/05/02 22:02:55 INFO common.HadoopUtil: Deleting temp
> 11/05/02 22:02:55 INFO util.NativeCodeLoader: Loaded the native-hadoop
> library
> 11/05/02 22:02:55 INFO zlib.ZlibFactory: Successfully loaded & initialized
> native-zlib library
> 11/05/02 22:02:55 INFO compress.CodecPool: Got brand-new compressor
> Exception in thread "main" java.lang.ClassCastException: class
> org.apache.hadoop.io.IntWritable
> at java.lang.Class.asSubclass(Class.java:3039)
> at
> org.apache.mahout.clustering.kmeans.RandomSeedGenerator.buildRandom(RandomSeedGenerator.java:86)
> at
> org.apache.mahout.clustering.kmeans.KMeansDriver.run(KMeansDriver.java:96)
> at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:65)
> at
> org.apache.mahout.clustering.kmeans.KMeansDriver.main(KMeansDriver.java:54)
> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> at
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
> at
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> at java.lang.reflect.Method.invoke(Method.java:616)
> at
> org.apache.hadoop.util.ProgramDriver$ProgramDescription.invoke(ProgramDriver.java:68)
> at org.apache.hadoop.util.ProgramDriver.driver(ProgramDriver.java:139)
> at org.apache.mahout.driver.MahoutDriver.main(MahoutDriver.java:184)
> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> at
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
> at
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> at java.lang.reflect.Method.invoke(Method.java:616)
> at org.apache.hadoop.util.RunJar.main(RunJar.java:156)
> 
> Regards,
> Dipti Mathur

--------------------------
Grant Ingersoll
Lucene Revolution -- Lucene and Solr User Conference
May 25-26 in San Francisco
www.lucenerevolution.org
