KMeansDriver has a method, clusterData, which you can invoke from a
Java program to cluster (classify) your new data against the old clusters.
For this to work, the new vectors must be the same size as the ones used
for training, and their elements must denote the same attributes. There is
currently no CLI to invoke this step independently of the buildClusters
(training) step; that is under development.
As Paritosh indicates, we are planning to refactor all of the
clusterData implementations into an independent job so the redundant
implementations in the various clustering algorithms can be consolidated.
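
To make the idea concrete, here is a minimal sketch in plain Java of what
classifying new data against existing k-means clusters amounts to: each new
vector goes to the cluster whose center is closest, and a size mismatch is
rejected up front. This is not Mahout code and does not call KMeansDriver.
The class NearestCentroid and its methods are hypothetical names used only
for illustration, the centroid values are made up, and squared Euclidean
distance is an assumption; use whichever distance measure you trained with.

```java
import java.util.Arrays;

/** Illustration only: assigns new vectors to the nearest existing centroid. */
final class NearestCentroid {

    /** Squared Euclidean distance; both vectors must have the same length. */
    static double squaredDistance(double[] a, double[] b) {
        if (a.length != b.length) {
            throw new IllegalArgumentException(
                "Vector sizes differ: " + a.length + " vs " + b.length);
        }
        double sum = 0.0;
        for (int i = 0; i < a.length; i++) {
            double d = a[i] - b[i];
            sum += d * d;
        }
        return sum;
    }

    /** Returns the index of the centroid closest to the given point. */
    static int assign(double[] point, double[][] centroids) {
        if (centroids.length == 0) {
            throw new IllegalArgumentException("No centroids given");
        }
        int best = 0;
        double bestDist = squaredDistance(point, centroids[0]);
        for (int k = 1; k < centroids.length; k++) {
            double dist = squaredDistance(point, centroids[k]);
            if (dist < bestDist) {
                best = k;
                bestDist = dist;
            }
        }
        return best;
    }

    public static void main(String[] args) {
        // Made-up centers standing in for the clusters from the training run.
        double[][] centroids = { {0.0, 0.0}, {10.0, 10.0} };
        // The new (small) dataset to classify.
        double[][] newPoints = { {1.0, 2.0}, {9.0, 8.5} };
        for (double[] p : newPoints) {
            System.out.println(Arrays.toString(p) + " -> cluster " + assign(p, centroids));
        }
    }
}
```

With these made-up values the first point lands in cluster 0 and the second
in cluster 1. Passing a vector of the wrong size throws an
IllegalArgumentException rather than silently producing a meaningless
assignment.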
On 12/19/11 3:46 AM, Paritosh Ranjan wrote:
This feature is in development.
In the meantime, try using ClusterClassifier. Populate it with the clusters
you already have as its models.
Then use ClusterIterator with KMeansClusteringPolicy.
I hope that solves your problem.
On 19-12-2011 15:11, Faizan(Aroha) wrote:
Yes, you are correct. Do you have any suggestions?
-----Original Message-----
From: Paritosh Ranjan [mailto:[email protected]]
Sent: Monday, December 19, 2011 1:27 PM
To: [email protected]
Subject: Re: Clustering - k-means as a search
You want to classify the new vectors (smaller dataset) with the old
clusters (huge dataset). Am I correct?
Paritosh
On 19-12-2011 13:32, Faizan(Aroha) wrote:
Hello,
I'm trying to implement k-means as a search.
I've performed k-means clustering on a huge dataset.
Now, if I have a new (small) dataset or document, how do I determine
which cluster it belongs to?
Thanks in advance.
Regards,
Faizan Shaikh
Aroha Labs(Private) Ltd