I was following the book examples, and k-means, Dirichlet, and LDA all have
this casting problem. It may be a Mac issue; I'm not sure. I suspect
seq2sparse may be messing up the inputs, maybe because of a wrong version. It
outputs the regular part-r-* files, but the LDA driver expects a file called
data.
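
One way to test the seq2sparse suspicion is to look at which key and value
classes a part-r-* file actually declares, since a SequenceFile names both in
its header. Below is a minimal sketch, not anything from Mahout itself: it
assumes a SequenceFile of version 4 or later (where class names are stored as
Text strings), and the function names are made up for illustration. The demo
header at the bottom is synthetic, not real seq2sparse output.

```python
import io
import struct


def read_vint(stream):
    """Decode a Hadoop variable-length integer (same scheme as WritableUtils)."""
    first = struct.unpack("b", stream.read(1))[0]
    if first >= -112:
        # Small values are stored directly in a single byte.
        return first
    # Otherwise the first byte encodes the sign and the total encoded size.
    size = (-119 - first) if first < -120 else (-111 - first)
    value = int.from_bytes(stream.read(size - 1), "big")
    return ~value if first < -120 else value


def read_text(stream):
    """Read a length-prefixed UTF-8 string (Hadoop Text.writeString layout)."""
    length = read_vint(stream)
    return stream.read(length).decode("utf-8")


def sequence_file_classes(stream):
    """Return (version, key class, value class) from a SequenceFile header."""
    if stream.read(3) != b"SEQ":
        raise ValueError("not a Hadoop SequenceFile")
    version = stream.read(1)[0]
    if version < 4:
        # Assumption: older versions store class names differently.
        raise ValueError("unsupported SequenceFile version: %d" % version)
    key_class = read_text(stream)
    value_class = read_text(stream)
    return version, key_class, value_class


# Synthetic header for demonstration only; real files continue after this.
key = b"org.apache.hadoop.io.Text"
value = b"org.apache.hadoop.io.IntWritable"
header = b"SEQ" + bytes([6]) + bytes([len(key)]) + key + bytes([len(value)]) + value

print(sequence_file_classes(io.BytesIO(header)))
```

To check a real file, open it with open(path, "rb") and pass the file object
to sequence_file_classes. If the value class is IntWritable rather than
VectorWritable, the driver is being pointed at the wrong output.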

Sent from my iPad

On Jun 9, 2011, at 7:40 AM, Mark <[email protected]> wrote:

> Forgot to mention... great book :)
> 
> On 6/9/11 7:30 AM, Mark wrote:
>> KMeans is busted? What do you mean by this? The algorithm simply won't work 
>> or just the reuters example?
>> 
>> Thanks
>> 
>> On 6/9/11 12:28 AM, Sean Owen wrote:
>>> (Assuming you are on HEAD,) I think KMeans is busted -- this has come up
>>> before. I don't know if it is being maintained.  Anyone who's willing to
>>> step up and fix it is also welcome to overhaul it IMHO.
>>> 
>>> On Thu, Jun 9, 2011 at 12:03 AM, Hector Yee<[email protected]>  wrote:
>>> 
>>>> I got a slightly different error on the next line of KMeansDriver.java
>>>> (running on OS X Snow Leopard)
>>>> 
>>>> 11/06/08 16:02:12 INFO compress.CodecPool: Got brand-new compressor
>>>> Exception in thread "main" java.lang.ClassCastException: org.apache.hadoop.io.IntWritable cannot be cast to org.apache.mahout.math.VectorWritable
>>>>  at org.apache.mahout.clustering.kmeans.RandomSeedGenerator.buildRandom(RandomSeedGenerator.java:90)
>>>>  at org.apache.mahout.clustering.kmeans.KMeansDriver.run(KMeansDriver.java:102)
>>>> 
>>>> 
>>>> On Sun, Jun 5, 2011 at 9:31 PM, Jeff Eastman<[email protected]>  wrote:
>>>> 
>>>>> IIRC, Reuters used to run on a cluster but no longer does due to some
>>>>> obscure Lucene changes. In 0.5 it only works in local mode. I really hope
>>>>> this can be repaired by 0.6 as Reuters is a key entry point into Mahout
>>>>> clustering for many users.
>>>>> 
