I got a slightly different error, one line further down in KMeansDriver.java
(line 102 rather than 101), running on OS X Snow Leopard:

11/06/08 16:02:12 INFO compress.CodecPool: Got brand-new compressor
Exception in thread "main" java.lang.ClassCastException: org.apache.hadoop.io.IntWritable cannot be cast to org.apache.mahout.math.VectorWritable
    at org.apache.mahout.clustering.kmeans.RandomSeedGenerator.buildRandom(RandomSeedGenerator.java:90)
    at org.apache.mahout.clustering.kmeans.KMeansDriver.run(KMeansDriver.java:102)
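
Both this trace and the one quoted below fail inside RandomSeedGenerator.buildRandom: one on a cast, the other on an empty list. What follows is only a minimal plain-Java sketch of the kind of up-front check that turns either failure into a readable message. It does not use Hadoop or Mahout and makes no claim about how buildRandom actually reads its input; IntValue, VectorValue and requireVectors are hypothetical names invented for illustration.

```java
import java.util.ArrayList;
import java.util.List;

/** Illustration only: hypothetical stand-ins, not Hadoop or Mahout classes. */
public class SeedInputCheck {

    /** Hypothetical stand-in for a non-vector value type such as IntWritable. */
    static final class IntValue {
        final int value;
        IntValue(int value) { this.value = value; }
    }

    /** Hypothetical stand-in for a vector value type such as VectorWritable. */
    static final class VectorValue {
        final double[] values;
        VectorValue(double[] values) { this.values = values; }
    }

    /**
     * Returns the records as vectors, or fails with a descriptive message
     * rather than a bare IndexOutOfBoundsException or ClassCastException.
     */
    static List<VectorValue> requireVectors(List<Object> records) {
        if (records.isEmpty()) {
            // Corresponds to the "Index: 0, Size: 0" case: nothing was read.
            throw new IllegalStateException("input is empty: no vectors to pick seeds from");
        }
        List<VectorValue> vectors = new ArrayList<>();
        for (Object record : records) {
            if (!(record instanceof VectorValue)) {
                // Corresponds to the ClassCastException case: wrong value type.
                throw new IllegalStateException(
                        "expected VectorValue but found " + record.getClass().getSimpleName());
            }
            vectors.add((VectorValue) record);
        }
        return vectors;
    }

    public static void main(String[] args) {
        List<Object> records = new ArrayList<>();
        records.add(new IntValue(7)); // wrong value type on purpose
        try {
            requireVectors(records);
        } catch (IllegalStateException e) {
            System.out.println(e.getMessage());
        }
    }
}
```

Either message would suggest that the input directory handed to k-means is not the vector output it expects, which fits the local-versus-HDFS path confusion described further down the thread.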


On Sun, Jun 5, 2011 at 9:31 PM, Jeff Eastman <[email protected]> wrote:

> IIRC, Reuters used to run on a cluster but no longer does due to some
> obscure Lucene changes. In 0.5 it only works in local mode. I really hope
> this can be repaired by 0.6 as Reuters is a key entry point into Mahout
> clustering for many users.
>
> -----Original Message-----
> From: Sean Owen [mailto:[email protected]]
> Sent: Sunday, June 05, 2011 11:56 AM
> To: [email protected]
> Subject: Re: Problems running examples
>
> This all sounds a lot like things that were fixed a little while ago. Are
> you on version 0.5, or better yet, SVN HEAD?
>
> The rest, I don't know, would have to defer to the author of that bit.
>
> On Sun, Jun 5, 2011 at 7:07 PM, Mark <[email protected]> wrote:
>
> > Hi all. I'm trying to run examples/bin/build-reuters.sh but I continue
> > to run into the following exception.
> >
> > INFO: Deleting mahout-work/reuters-kmeans-clusters
> > Jun 5, 2011 10:29:37 AM org.apache.hadoop.util.NativeCodeLoader <clinit>
> > WARNING: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
> > Jun 5, 2011 10:29:37 AM org.apache.hadoop.io.compress.CodecPool getCompressor
> > INFO: Got brand-new compressor
> > Exception in thread "main" java.lang.IndexOutOfBoundsException: Index: 0, Size: 0
> >    at java.util.ArrayList.RangeCheck(ArrayList.java:547)
> >    at java.util.ArrayList.get(ArrayList.java:322)
> >    at org.apache.mahout.clustering.kmeans.RandomSeedGenerator.buildRandom(RandomSeedGenerator.java:108)
> >    at org.apache.mahout.clustering.kmeans.KMeansDriver.run(KMeansDriver.java:101)
> >    at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:65)
> >    at org.apache.mahout.clustering.kmeans.KMeansDriver.main(KMeansDriver.java:58)
> >    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> >    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
> >    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
> >    at java.lang.reflect.Method.invoke(Method.java:597)
> >    at org.apache.hadoop.util.ProgramDriver$ProgramDescription.invoke(ProgramDriver.java:68)
> >    at org.apache.hadoop.util.ProgramDriver.driver(ProgramDriver.java:139)
> >    at org.apache.mahout.driver.MahoutDriver.main(MahoutDriver.java:187)
> >
> > I am also confused reading the build-reuters.sh code itself. There seems
> > to be some mismatch between what is expected to be local and what should
> > be on HDFS. For example, the comments on lines 77-79 are:
> >
> > # we know reuters-out-seqdir exists on a local disk at
> > # this point, if we're running in clustered mode,
> > # copy it up to hdfs
> >
> > However, upon inspection you'll notice that reuters-out-seqdir is
> > actually on HDFS. It seems like seqdirectory will never write to
> > local disk... even with the MAHOUT_LOCAL=true flag set.
> >
> > Any ideas?
> >
> > Thanks
> >
>



-- 
Yee Yang Li Hector
http://hectorgon.blogspot.com/ (tech + travel)
http://hectorgon.com (book reviews)
