$MAHOUT_HOME/src/conf

-----Original Message-----
From: Lance Norskog [mailto:[email protected]]
Sent: Monday, May 02, 2011 4:36 PM
To: [email protected]
Subject: Re: Just beginning with Mahout classifiers

Where is this mapping?

On Mon, May 2, 2011 at 4:16 PM, Daniel McEnnis <[email protected]> wrote:
> Seq2Sparse maps to
> org.apache.mahout.vectorizer.SparseVectorsFromSequenceFiles in the
> core source branch.
>
> Daniel.
>
> On Mon, May 2, 2011 at 12:44 PM, Dipti Mathur <[email protected]> wrote:
>> Hi All,
>>
>> I am trying to build a classifier for a set of data that I have collected
>> myself. I am very new to mahout and would be very grateful if someone could
>> help me with the steps to get started.
>>
>> The documents I have come across so far explain how to run the sample codes
>> but when I tried converting my text to vectors ( using seqdirectory and
>> seq2sparse) and run the kmeans algorithm, I get errors like below. I am not
>> even able to find the source code to "kmeans" or "seq2sparse" executables to
>> begin fixing the issue. Pointers to good reads will also help. Any help at
>> all will be greatly appreciated.
>>
>> dipti@dipti-laptop:~$ mahout kmeans -i seq-output2 -c temp -o cluster-output
>> -k 20 -cd 0.01 -x 20
>> Running on hadoop, using HADOOP_HOME=/usr/lib/hadoop-0.20.2/
>> HADOOP_CONF_DIR=/usr/lib/hadoop-0.20.2/conf
>> 11/05/02 22:02:55 INFO common.AbstractJob: Command line arguments:
>> {--clusters=temp, --convergenceDelta=0.01,
>> --distanceMeasure=org.apache.mahout.common.distance.SquaredEuclideanDistanceMeasure,
>> --endPhase=2147483647, --input=seq-output2, --maxIter=20,
>> --method=mapreduce, --numClusters=20, --output=cluster-output,
>> --startPhase=0, --tempDir=temp}
>> 11/05/02 22:02:55 INFO common.HadoopUtil: Deleting temp
>> 11/05/02 22:02:55 INFO util.NativeCodeLoader: Loaded the native-hadoop
>> library
>> 11/05/02 22:02:55 INFO zlib.ZlibFactory: Successfully loaded & initialized
>> native-zlib library
>> 11/05/02 22:02:55 INFO compress.CodecPool: Got brand-new compressor
>> Exception in thread "main" java.lang.ClassCastException: class
>> org.apache.hadoop.io.IntWritable
>>        at java.lang.Class.asSubclass(Class.java:3039)
>>        at
>> org.apache.mahout.clustering.kmeans.RandomSeedGenerator.buildRandom(RandomSeedGenerator.java:86)
>>        at
>> org.apache.mahout.clustering.kmeans.KMeansDriver.run(KMeansDriver.java:96)
>>        at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:65)
>>        at
>> org.apache.mahout.clustering.kmeans.KMeansDriver.main(KMeansDriver.java:54)
>>        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>>        at
>> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
>>        at
>> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>>        at java.lang.reflect.Method.invoke(Method.java:616)
>>        at
>> org.apache.hadoop.util.ProgramDriver$ProgramDescription.invoke(ProgramDriver.java:68)
>>        at org.apache.hadoop.util.ProgramDriver.driver(ProgramDriver.java:139)
>>        at org.apache.mahout.driver.MahoutDriver.main(MahoutDriver.java:184)
>>        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>>        at
>> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
>>        at
>> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>>        at java.lang.reflect.Method.invoke(Method.java:616)
>>        at org.apache.hadoop.util.RunJar.main(RunJar.java:156)
>>
>> Regards,
>> Dipti Mathur
>>
>

--
Lance Norskog
[email protected]
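
On the stack trace quoted above: the exception is thrown from java.lang.Class.asSubclass, reached from RandomSeedGenerator.buildRandom, and the message names org.apache.hadoop.io.IntWritable. The thread does not confirm the cause, but one plausible reading is that something under seq-output2 stores IntWritable where a vector type is expected. The sketch below only illustrates how asSubclass fails in that situation. The nested Writable, VectorWritable and IntWritable types are hypothetical stand-ins so the example runs without Hadoop or Mahout on the classpath; it is not the Mahout source.

```java
public class AsSubclassDemo {
    // Hypothetical stand-ins for the Hadoop/Mahout types named in the trace.
    interface Writable {}
    static class VectorWritable implements Writable {}
    static class IntWritable implements Writable {}

    // Returns true if valueClass can be treated as a VectorWritable.
    // Class.asSubclass throws ClassCastException when it cannot, which is
    // the same exception type that appears at the top of the trace.
    static boolean isVectorClass(Class<?> valueClass) {
        try {
            valueClass.asSubclass(VectorWritable.class);
            return true;
        } catch (ClassCastException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        System.out.println(isVectorClass(VectorWritable.class)); // accepted
        System.out.println(isVectorClass(IntWritable.class));    // rejected
    }
}
```

If this reading is right, checking which key and value classes the files under the input directory actually declare would be a reasonable next step before rerunning kmeans.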
