On 18-09-2012 21:04, Rajesh Nikam wrote:
I have a question related to Mahout clustering and classification algorithms:
1. I have a CSV file with attributes for each instance.
How can I use the CSV file as input to Mahout canopy clustering to identify
the number of clusters?
See the seqdirectory and seq2sparse commands, or just write your own code to
generate vectors in sequence files; it's pretty simple.
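For the "write your own code" route, here is a minimal sketch of the parsing
half only, assuming every CSV column is numeric and fields contain no quoted
commas. The class CsvRows and its methods are hypothetical helpers, not part
of Mahout; only the JDK is used.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper (not part of Mahout): turns CSV text into numeric rows.
final class CsvRows {

    // Parse one comma-separated line of numeric attributes into a double[].
    // Assumes no quoted fields and no missing values.
    static double[] parseLine(String line) {
        String[] fields = line.split(",");
        double[] values = new double[fields.length];
        for (int i = 0; i < fields.length; i++) {
            values[i] = Double.parseDouble(fields[i].trim());
        }
        return values;
    }

    // Parse all lines, optionally skipping a header row and ignoring blank lines.
    static List<double[]> parse(List<String> lines, boolean hasHeader) {
        List<double[]> rows = new ArrayList<>();
        for (int i = hasHeader ? 1 : 0; i < lines.size(); i++) {
            String line = lines.get(i).trim();
            if (!line.isEmpty()) {
                rows.add(parseLine(line));
            }
        }
        return rows;
    }
}
```

Each double[] row can then be wrapped in a Mahout DenseVector and a
VectorWritable (both in org.apache.mahout.math) and appended to a Hadoop
SequenceFile.Writer. The exact writer setup depends on your Hadoop version,
so it is left out here.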
2. How do I separate instances into clusters after Mahout k-means
clustering?
Use the clusterdump command or the clusterpp command for it.
https://cwiki.apache.org/MAHOUT/cluster-dumper.html
https://cwiki.apache.org/MAHOUT/top-down-clustering.html
3. I am using Mahout Stochastic Gradient Descent (SGD) to create a model, and
this model is serialized.
The model is stored in a binary format. I have a requirement to use this
model in a non-Java (C/C++) application.
How can I use this model in that application?
Your valuable comments are appreciated!
Thanks,
Rajesh