Hello. First, sorry for my English.
I'm new to Mahout and Hadoop. I want to run k-means clustering on Hadoop in pseudo-distributed mode. I have 5 million vectors in a .mat file, with 38 numeric features per vector, like this: 0 0 1 0 0 0 0 0 0 0 0 0 ...

I've run the examples I've found, such as Reuters (https://mahout.apache.org/users/clustering/k-means-clustering.html) and the synthetic data one. I know I have to convert these vectors to a SequenceFile, but I don't know whether I need to do anything else first.

I'm using Mahout 0.7 and Hadoop 1.2.1.

Thanks.

--
*Gómez Muñoz, Adrián.*
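
For illustration, here is a minimal sketch of the parsing step that would come before the SequenceFile conversion. It assumes the data has first been exported from the .mat file to plain text, one vector per line with whitespace-separated values; that export is an assumption, not something stated above. The class name VectorLineParser and the method parseLine are hypothetical helpers, not part of Mahout or Hadoop. Each resulting array would then typically be wrapped in a Mahout vector and written out with Hadoop's SequenceFile writer, which is not shown here.

```java
import java.util.Arrays;

// Hypothetical helper for illustration only; not part of Mahout or Hadoop.
final class VectorLineParser {

    // Parses one whitespace-separated line of numbers into a double[].
    // Throws IllegalArgumentException if the number of values differs
    // from expectedFeatures, or if a token is not a valid number.
    static double[] parseLine(String line, int expectedFeatures) {
        String[] tokens = line.trim().split("\\s+");
        if (tokens.length != expectedFeatures) {
            throw new IllegalArgumentException(
                "Expected " + expectedFeatures + " values but found " + tokens.length);
        }
        double[] values = new double[expectedFeatures];
        for (int i = 0; i < tokens.length; i++) {
            values[i] = Double.parseDouble(tokens[i]);
        }
        return values;
    }

    public static void main(String[] args) {
        // Shortened sample line; the real data would have 38 values per line.
        double[] vector = parseLine("0 0 1 0 0", 5);
        System.out.println(Arrays.toString(vector)); // prints [0.0, 0.0, 1.0, 0.0, 0.0]
    }
}
```

Checking the feature count on every line is a cheap way to catch a bad export before 5 million vectors are written out.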
