Hi, I implemented something similar in the following way.
I created a class that implements org.apache.commons.math3.ml.clustering.Clusterable, with just two member variables, double[] point and long id, plus getter/setter functions. I iterated through the data, created an instance of this class for each row, and added the instances to a list.

Then I instantiated KMeansPlusPlusClusterer (4 clusters, at most 100 iterations, Canberra distance) as below:

    org.apache.commons.math3.ml.clustering.KMeansPlusPlusClusterer<CustomPoint> clusterer =
        new KMeansPlusPlusClusterer<CustomPoint>(4, 100, new org.apache.commons.math3.ml.distance.CanberraDistance());

Then I called KMeansPlusPlusClusterer.cluster as follows:

    List<CentroidCluster<CustomPoint>> clusterList = clusterer.cluster(points);

I was able to get the clusters this way. Because each CustomPoint keeps its id, the line number of every row can be read back from the points returned by each cluster's getPoints().

I don't know whether this is the right approach, but it worked for me.

Regards,
Anand.C

-----Original Message-----
From: syed kather [mailto:[email protected]]
Sent: Tuesday, June 18, 2013 3:23 PM
To: [email protected]
Subject: K Mean Clustering on Two columns

Hi Team,

How do I do K-Means clustering on 2 selected columns?

Line No,age,income,sex,city
1,22,1500,1,xxx,
2,54,13450,2,yyy
- - - - -

The input looks like this, but I need to cluster on columns 2 and 3 only. How can I do that? I tried using synthatic kmean Means, but I am not able to extract the cluster ID corresponding to each Line No.

Please help me.

Thanks and regards
Syed Abdul Kather

Thanks and Regards,
S SYED ABDUL KATHER
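
The data-preparation step described in the reply (one point object per row, holding the line number and only columns 2 and 3) might look roughly like the sketch below. CustomPoint is the class name used in the reply; ColumnSelector and parse are hypothetical names introduced only for illustration. To keep the sketch compilable with the JDK alone, the class does not declare "implements Clusterable"; in a real Commons Math 3 setup that clause would be added, and getPoint() is the method that interface requires. The sketch also assumes the header row has already been skipped and that every data row is well formed.

```java
import java.util.ArrayList;
import java.util.List;

// Holder for one input row: the line number plus the columns to cluster on.
// Assumption: with Commons Math 3 on the classpath this class would declare
// "implements org.apache.commons.math3.ml.clustering.Clusterable".
class CustomPoint {
    private final long id;        // "Line No" column, kept so clusters map back to rows
    private final double[] point; // selected columns only (age, income)

    CustomPoint(long id, double[] point) {
        this.id = id;
        this.point = point;
    }

    long getId() {
        return id;
    }

    // Method name matches the one required by Clusterable.
    double[] getPoint() {
        return point;
    }
}

// Hypothetical helper: turns CSV data rows into CustomPoint instances.
class ColumnSelector {
    // Expects data rows only (no header), e.g. "1,22,1500,1,xxx,".
    static List<CustomPoint> parse(List<String> rows) {
        List<CustomPoint> points = new ArrayList<CustomPoint>();
        for (String row : rows) {
            String[] fields = row.split(",");
            long id = Long.parseLong(fields[0].trim());        // column 1: Line No
            double age = Double.parseDouble(fields[1].trim()); // column 2: age
            double income = Double.parseDouble(fields[2].trim()); // column 3: income
            points.add(new CustomPoint(id, new double[] {age, income}));
        }
        return points;
    }
}
```

The list returned by ColumnSelector.parse(...) is what would be passed to clusterer.cluster(points). Calling getId() on the members of each resulting cluster then gives the line numbers belonging to that cluster.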
