K-Means on Hadoop Cluster

Adri Gómez Sat, 24 May 2014 11:36:17 -0700

Hello.

First, sorry for my English.


I'm new to Mahout and Hadoop. I want to run k-means clustering on Hadoop in
pseudo-distributed mode. I have 5 million vectors in a .mat file, with 38
numeric features per vector, like this: 0 0 1 0 0 0 0 0 0 0 0
0 ...

I've run the examples that I've found, like Reuters (
https://mahout.apache.org/users/clustering/k-means-clustering.html) and the
synthetic data one. I know I have to convert these vectors to a SequenceFile,
but I don't know whether I need to do anything else first.
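
To make the conversion step concrete, here is a minimal sketch of the parsing
half only, assuming the .mat file is plain text with one vector per line and
38 whitespace-separated numbers on each line (if it is a binary MATLAB file it
would need to be exported to text first). The class name VectorLineParser and
the method parseLine are hypothetical names used for illustration; they are
not part of Mahout.

```java
import java.util.Arrays;

public class VectorLineParser {
    // Assumption: every vector has exactly 38 features, as described above.
    static final int NUM_FEATURES = 38;

    // Parses one whitespace-separated line into an array of the expected length.
    // Throws IllegalArgumentException if the line has the wrong number of values.
    static double[] parseLine(String line, int expected) {
        String[] tokens = line.trim().split("\\s+");
        if (tokens.length != expected) {
            throw new IllegalArgumentException(
                "expected " + expected + " values but found " + tokens.length);
        }
        double[] values = new double[expected];
        for (int i = 0; i < expected; i++) {
            values[i] = Double.parseDouble(tokens[i]);
        }
        return values;
    }

    public static void main(String[] args) {
        // Short 4-value line used only to keep the demo readable;
        // real input would be parsed with NUM_FEATURES.
        double[] v = parseLine("0 0 1 0", 4);
        System.out.println(Arrays.toString(v)); // prints [0.0, 0.0, 1.0, 0.0]
    }
}
```

Each parsed array would then presumably be wrapped in Mahout's DenseVector and
VectorWritable and appended to a Hadoop SequenceFile. That part is left out
here because it needs the Mahout and Hadoop jars on the classpath.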

I'm using Mahout 0.7 and Hadoop 1.2.1.

Thanks.

-- 
*Gómez Muñoz, Adrián.*
