KMeansDriver has a clusterData method that you can invoke from a Java program to classify your new data against the existing clusters. For this to work, the new vectors must be the same size as the training vectors, and each element must denote the same attribute. There is currently no CLI for invoking this step independently of the buildClusters (training) step; that is under development.

As Paritosh indicates, we are planning to refactor all of the clusterData implementations into an independent job so the redundant implementations in the various clustering algorithms can be consolidated.
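
To make the idea concrete, here is a minimal plain-Java sketch of what assigning a new vector to an existing k-means cluster amounts to: find the centroid nearest to the vector. It deliberately does not use the Mahout API. The class and method names (NearestClusterSketch, squaredDistance, nearest) are hypothetical, the sample centroids are made up, and squared Euclidean distance is an assumption; use whichever distance measure you trained with.

```java
import java.util.Arrays;

// Hypothetical helper, not part of Mahout: assigns a new vector to the
// nearest of a set of previously computed k-means centroids.
final class NearestClusterSketch {

    // Squared Euclidean distance (an assumed measure). Both vectors must
    // have the same length, mirroring the "same size" requirement above.
    static double squaredDistance(double[] a, double[] b) {
        if (a.length != b.length) {
            throw new IllegalArgumentException(
                "Vector sizes differ: " + a.length + " vs " + b.length);
        }
        double sum = 0.0;
        for (int i = 0; i < a.length; i++) {
            double d = a[i] - b[i];
            sum += d * d;
        }
        return sum;
    }

    // Returns the index of the centroid closest to the given vector.
    static int nearest(double[][] centroids, double[] vector) {
        if (centroids.length == 0) {
            throw new IllegalArgumentException("No centroids supplied");
        }
        int best = 0;
        double bestDist = squaredDistance(centroids[0], vector);
        for (int i = 1; i < centroids.length; i++) {
            double dist = squaredDistance(centroids[i], vector);
            if (dist < bestDist) {
                best = i;
                bestDist = dist;
            }
        }
        return best;
    }

    public static void main(String[] args) {
        // Made-up centroids standing in for the output of the training run.
        double[][] centroids = { {0.0, 0.0}, {10.0, 10.0} };
        double[][] newVectors = { {1.0, 2.0}, {9.0, 8.5} };
        for (double[] v : newVectors) {
            System.out.println(Arrays.toString(v) + " -> cluster " + nearest(centroids, v));
        }
    }
}
```

In a real setup the centroids would be read from the final clusters directory of the training run rather than hard-coded.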

On 12/19/11 3:46 AM, Paritosh Ranjan wrote:
This feature is in development.

Try using ClusterClassifier. Populate it with the clusters you have as models.
Then use ClusterIterator with KMeansClusteringPolicy.

Hope this solves your problem.

On 19-12-2011 15:11, Faizan(Aroha) wrote:
Yes, you are correct. Do you have any suggestions?

-----Original Message-----
From: Paritosh Ranjan [mailto:[email protected]]
Sent: Monday, December 19, 2011 1:27 PM
To: [email protected]
Subject: Re: Clustering - k-means as a search

You want to classify the new vectors (smaller dataset) with the old
clusters (huge dataset). Am I correct?

Paritosh

On 19-12-2011 13:32, Faizan(Aroha) wrote:
Hello,

I'm trying to implement k-means as a search.

I've performed k-means clustering on a huge dataset.

Now, if I have a new (small) dataset or document, how do I determine
which cluster it belongs to?

Thanks in advance.

Regards,
Faizan Shaikh
Aroha Labs (Private) Ltd