I am not too familiar with affinity propagation and am just trying it out. The problem is to cluster using a metric that is Euclidean distance with a cutoff: when the distance is greater than some threshold, the metric is -Inf. In other words, a point can be accepted into a cluster only if the distance from the point to the cluster center is less than that threshold.
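To make the condition concrete, here is a minimal sketch of the thresholded metric and of a check for points that break the rule. It assumes NumPy only; the names `thresholded_similarity` and `violations` are hypothetical helpers for illustration, and a large negative constant stands in for -Inf. It is not part of scikit-learn.

```python
import numpy as np

def thresholded_similarity(pts, max_dist=1.0, far=-1e6):
    """Negative squared Euclidean distance; pairs beyond max_dist get `far`.

    `far` is a finite stand-in for -Inf (an assumption of this sketch).
    """
    diff = pts[:, None, :] - pts[None, :, :]   # pairwise differences via broadcasting
    sq = np.sum(diff ** 2, axis=-1)            # squared distances, shape (n, n)
    return np.where(sq > max_dist ** 2, far, -sq)

def violations(pts, labels, center_indices, max_dist=1.0):
    """Indices of points farther than max_dist from their assigned center."""
    centers = pts[np.asarray(center_indices)][np.asarray(labels)]
    dist = np.linalg.norm(pts - centers, axis=1)
    return np.flatnonzero(dist > max_dist)

# Tiny hand-made example: the third point is too far from center 0.
pts = np.array([[0.0, 0.0], [0.5, 0.0], [3.0, 0.0]])
sim = thresholded_similarity(pts)
bad = violations(pts, labels=[0, 0, 0], center_indices=[0])
```

With a fitted model, passing `labels_` and `cluster_centers_indices_` to `violations` would list exactly the points that violate the condition, which is easier to inspect than a scatter plot.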
It seems my test with affinity propagation sometimes produces a correct result, but other times the result violates the condition. In the example code below, a couple of outlier points end up in clusters that are not close to them at all. I've tried playing with parameters (such as preference) without eliminating the problem. Any suggestions?

---------

import numpy as np
import seaborn as sns
import matplotlib.pyplot as plt
from numpy.random import RandomState
from sklearn.cluster import AffinityPropagation

# from randomgen import RandomGenerator, Xoroshiro128
# rs = RandomGenerator (Xoroshiro128 (0))
rs = RandomState(3)
pts = rs.uniform (-5, 5, (50,2))

def distance (ax, ay, bx, by):
    # Negative squared Euclidean distance; large negative value beyond the threshold
    d = (ax - bx)**2 + (ay - by)**2
    if d > 1:
        return -1e6
    else:
        return -d

# Precomputed similarity matrix for all pairs of points
d = np.empty ((pts.shape[0], pts.shape[0]))
for i in range(pts.shape[0]):
    for j in range(pts.shape[0]):
        d[i,j] = distance(pts[i,0], pts[i,1], pts[j,0], pts[j,1])

preference = -20 #np.mean (d[d > -1e6])
print ('preference:', preference)

clustering = AffinityPropagation(affinity='precomputed', verbose=True, preference=preference)
res = clustering.fit(d)
c = clustering

colors = np.array(sns.color_palette("hls", np.max(c.labels_)+1))
print('n_clusters:', np.max(c.labels_)+1)
centers = pts[c.cluster_centers_indices_]

# Points colored by cluster, cluster centers marked with an X
plt.scatter (pts[:,0], pts[:,1], c=colors[c.labels_])
plt.scatter (centers[:,0], centers[:,1], marker='X', s=100, c=colors)
plt.show()

_______________________________________________
scikit-learn mailing list
scikit-learn@python.org
https://mail.python.org/mailman/listinfo/scikit-learn