I have a set of points in 1D, represented by a list X of floating-point numbers. The list has one dense section and the rest is sparse, and I want to find the dense part. I can't release the actual data, but here is a simulation:

    import random
    import matplotlib.pyplot as plt

    N = 100
    start = 0
    points = []

    # Sparse section: exponential gaps with mean 1/rate = 10
    rate = 0.1
    for i in range(N):
        points.append(start)
        start = start + random.expovariate(rate)

    # Dense section: exponential gaps with mean 1/rate = 0.1
    rate = 10
    for i in range(N*10):
        points.append(start)
        start = start + random.expovariate(rate)

    # Sparse section again
    rate = 0.1
    for i in range(N):
        points.append(start)
        start = start + random.expovariate(rate)

    plt.hist(points, bins=100)
    plt.show()

I would like to use scikit-learn to find the dense region. This feels a little like outlier detection, or like the task of finding one cluster with noise. Is there a suitable method in scikit-learn for this task?

Raphael
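
For illustration only, here is a minimal sketch of the "one cluster with noise" framing, assuming DBSCAN from sklearn.cluster is an acceptable choice. The helper simulate() is a hypothetical name that mirrors the simulation above, and eps=2.0 and min_samples=10 are hand-picked guesses matched to the simulated rates, not values derived from the real data. It treats the largest DBSCAN cluster as the dense region and everything else as noise.

```python
import random

import numpy as np
from sklearn.cluster import DBSCAN

random.seed(0)  # fixed seed so the sketch is repeatable


def simulate(n=100):
    """Sparse / dense / sparse points, mirroring the simulation above."""
    pts, start = [], 0.0
    for rate, count in [(0.1, n), (10, 10 * n), (0.1, n)]:
        for _ in range(count):
            pts.append(start)
            start += random.expovariate(rate)
    return pts


points = simulate()
X = np.asarray(points).reshape(-1, 1)  # scikit-learn expects a 2-D array

# Assumed values: eps sits between the dense gap (~0.1) and the sparse
# gap (~10); min_samples is far below the ~40 neighbours a dense point has.
labels = DBSCAN(eps=2.0, min_samples=10).fit_predict(X)

clustered = labels[labels >= 0]  # label -1 marks noise
if clustered.size == 0:
    raise RuntimeError("DBSCAN found no cluster; try a larger eps")

# Take the largest cluster as the dense region.
dense_label = np.bincount(clustered).argmax()
dense = X[labels == dense_label, 0]
lo, hi = dense.min(), dense.max()
print(f"dense region: [{lo:.1f}, {hi:.1f}] with {dense.size} points")
```

On real data the two parameters would need tuning, for example by looking at the distribution of gaps between consecutive sorted points before choosing eps.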