I have a set of points in 1D, represented by a list X of floating-point
numbers. The list has one dense section and the rest is sparse, and I
want to find the dense part. I can't release the actual data, but here
is a simulation:

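import random

import matplotlib.pyplot as plt

N = 100

start = 0
points = []
# Sparse section: N points, mean gap 1/rate = 10.
rate = 0.1
for i in range(N):
    points.append(start)
    start = start + random.expovariate(rate)
# Dense section: 10*N points, mean gap 1/rate = 0.1.
rate = 10
for i in range(N*10):
    points.append(start)
    start = start + random.expovariate(rate)
# Sparse section again: N points, mean gap 10.
rate = 0.1
for i in range(N):
    points.append(start)
    start = start + random.expovariate(rate)
# The dense section shows up as a single tall spike in the histogram.
plt.hist(points, bins = 100)
plt.show()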

I would like to use scikit-learn to find the dense region. This feels
a little like outlier detection, or like the task of finding one cluster
with noise.

Is there a suitable method in scikit-learn for this task?
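
To make the "one cluster with noise" framing concrete, here is a rough
sketch using DBSCAN, which labels low-density points as noise (-1). It
is only an illustration, not a claim that this is the right method. The
helper names (simulate, dense_interval), the fixed seed, and the values
eps=2.0 and min_samples=10 are my own assumptions, picked by hand for
the simulated gaps above; real data would need its own tuning.

```python
import random

import numpy as np
from sklearn.cluster import DBSCAN


def simulate(n=100, seed=0):
    """Recreate the simulation above: sparse, dense, sparse sections."""
    rng = random.Random(seed)  # fixed seed is an assumption, for repeatability
    pos, points = 0.0, []
    for count, rate in [(n, 0.1), (10 * n, 10), (n, 0.1)]:
        for _ in range(count):
            points.append(pos)
            pos += rng.expovariate(rate)
    return np.array(points)


def dense_interval(points, eps=2.0, min_samples=10):
    """Return (low, high, size) of the largest DBSCAN cluster, or None.

    eps and min_samples are hand-picked guesses for the simulated data.
    """
    # scikit-learn expects a 2D array of shape (n_samples, n_features).
    X = np.asarray(points, dtype=float).reshape(-1, 1)
    labels = DBSCAN(eps=eps, min_samples=min_samples).fit_predict(X)
    clustered = labels[labels >= 0]  # drop noise points (label -1)
    if clustered.size == 0:
        return None
    biggest = np.bincount(clustered).argmax()
    members = X[labels == biggest, 0]
    return members.min(), members.max(), members.size


points = simulate()
low, high, size = dense_interval(points)
print(f"dense region: [{low:.1f}, {high:.1f}] containing {size} points")
```

The interval is just the minimum and maximum of the largest cluster, so
it depends heavily on eps: too small and the dense section may split
into several clusters, too large and sparse neighbours get absorbed.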

Raphael
_______________________________________________
scikit-learn mailing list
scikit-learn@python.org
https://mail.python.org/mailman/listinfo/scikit-learn
